Unable to use RPA.PDF library

Hi, I was trying to do the level two certification of Robocorp in a Python-based template instead of a standard robot template, and I found that the RPA.PDF import is not working in Python, even though the documentation clearly gives an example of how to use it in Python as well.

I just created a standard python template robot with Robocorp’s VSCode extension and overwrote the task.py with:

from RPA.PDF import PDF

def minimal_task():
    text = PDF().get_text_from_pdf("report.pdf")
    print(text)

if __name__ == "__main__":
    minimal_task()

It did work correctly. What error do you get? How/where are you running it? What do you have in conda.yaml?


Hi @Teppo
I encountered the same issue as above. I can import the library with “from RPA.PDF import PDF” and instantiate the class with pdf = PDF().
But when I try to use any function from the PDF library, no functions show up whatsoever.
In particular, I wanted to use pdf.switch_to_pdf, but the function is not recognized (see the attached screenshot).

conda.yaml file looks like this:

# For more details on the format and content:
# Tip: Adding a link to the release notes of the packages helps maintenance and security.

channels:
  - conda-forge

dependencies:
  # Define conda-forge packages here
  # When available, prefer the conda-forge packages over pip as installations are more efficient.
  - python=3.9.13
  - pip=22.1.2
  - pip:
      # Define pip packages here →
      - rpaframework==22.0.0


Could you explain what you mean by not recognized. I had no issue running your conda with code:

from RPA.PDF import PDF

pdf = PDF()
pdf.switch_to_pdf("testpdf.pdf")

Hello raivo,
Thank you so much for being here. What I mean by "not recognized" is that when I type "pdf." I should get the list of functions I can use from the RPA.PDF library, correct? Well, I don't see anything. See the screenshot below for how the colors look when I use driver.click_button, for example, and how it looks for pdf.switch_to_pdf() (for pdf it is white, as if it were plain text).
(And yes, I did import the library :P Just to get this out of the way in case you wonder, I use "from RPA.PDF import PDF".)

Not all of our RPA libraries are fully supported when developing in Python, as they were made to work with Robot Framework. They will still function when the robot is run!

“They will still function” > I cannot run the robot, as I get this error: “SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated \UXXXXXXXX escape”

@raivo, could it be that we are missing something here? In the Robocorp docs I am learning from, I see these functions from the PDF library used in Python examples. See below.

If you get this error with html_to_pdf, then you need to make sure that the HTML you are passing to the function is valid. Input errors do not affect library usability. My previous statement about full support referred to IntelliSense (auto-complete); the library can be used from Python, as the examples show.
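One plausible reason auto-complete can stay empty while the calls still run is that a library attaches its keyword methods at runtime instead of defining them on the class itself. The sketch below is a toy illustration of that pattern only, not RPA.PDF's actual code; DynamicLibrary and the DocumentKeywords stand-in are hypothetical names made up for this example.

```python
class DocumentKeywords:
    """Stand-in keyword component (hypothetical, not the real RPA class)."""

    def switch_to_pdf(self, source_path):
        return f"active: {source_path}"


class DynamicLibrary:
    """Toy library that exposes component methods only at runtime."""

    def __init__(self, components):
        # Map each public method name to the bound method of its component.
        self._keywords = {
            name: getattr(component, name)
            for component in components
            for name in dir(component)
            if not name.startswith("_") and callable(getattr(component, name))
        }

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self.__dict__["_keywords"][name]
        except KeyError:
            raise AttributeError(name) from None


pdf = DynamicLibrary([DocumentKeywords()])

# The method is not defined on the class, so a static tool has nothing to list...
print("switch_to_pdf" in vars(type(pdf)))

# ...yet the call resolves at runtime through __getattr__.
print(pdf.switch_to_pdf("testpdf.pdf"))
```

An editor that only inspects the class definition would show no completions for pdf here, even though the call succeeds when the script runs.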

@sorinmnx I faced exactly the same issue today. Importing the RPA.PDF library looks fine, but its keywords/methods are not recognized. I haven't figured out the root cause or a proper solution.

I just used a workaround to make it work (and it worked for me): import the module that defines those keywords/methods.
E.g.
If used in Python:
from RPA.PDF.keywords import DocumentKeywords

If used in Robot:
Library DocumentKeywords

Hopefully we can find the real reason soon.


Hey @lijunhe.canada and @raivo
Coming back to this issue after playing around and researching more, I found a solution that lets us work with RPA.PDF.

Steps:
1. Press CTRL+SHIFT+P.
2. Search for: Robot Framework: Clear caches and restart Robot Framework Language Server
3. Wait until the restart ends.
At first sight nothing changes in the editor when you use functions from RPA.PDF. However, after clearing the caches and restarting the language server, if you hover over a function such as “html_to_pdf”, it is still shown in white, but the hover now looks like the screenshot below. Strangely enough, the interpreter now sees it as a “function”.
4. All functions will still look white, but they will work, at least for me.

Yes @raivo, you were right: the error “SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated \UXXXXXXXX escape” was caused by my output_path in the screenshot above. Because I copy-pasted the path of my folder, it used \ instead of /.

@lijunhe.canada Hope the above works for you as well.


The “SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated \UXXXXXXXX escape” error occurs when Python encounters a string literal with an incomplete Unicode escape sequence, in this case “\U”, which must be followed by eight hexadecimal digits. It typically appears with pasted Windows paths that begin like “C:\Users”, where “\U” is read as the start of an escape. To fix it, either write each backslash as “\\”, use forward slashes, or use a raw string by prefixing the literal with “r” (for example r"C:\Users\me") so that backslashes are not interpreted as escape characters.
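A minimal sketch of the escape problem described above, assuming a hypothetical Windows folder C:\Users\me\output (the folder name and the compiles helper are made up for illustration). It compiles the offending line from text so the script itself stays valid, then shows three spellings of the same path that avoid the error.

```python
from pathlib import PureWindowsPath

# Source text of a line with a pasted Windows path (hypothetical folder).
# The doubled backslashes below become single ones in the text, so
# bad_line holds exactly: path = "C:\Users\me\output"
bad_line = 'path = "C:\\Users\\me\\output"'


def compiles(source: str) -> bool:
    """Return True if the source compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Three ways to write the same path safely.
raw_path = r"C:\Users\me\output"        # raw string: backslashes kept literally
escaped_path = "C:\\Users\\me\\output"  # each backslash doubled
slash_path = "C:/Users/me/output"       # forward slashes

print(compiles(bad_line))        # False: "\U" starts a truncated escape
print(raw_path == escaped_path)  # True: both spell the same string
print(PureWindowsPath(slash_path) == PureWindowsPath(raw_path))  # True
```

Any of the three safe spellings can be used for a value such as output_path; which one to pick is mostly a matter of taste.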