Can Robot Framework detect text from a scanned file?

We need to extract data from a lot of PDF files and images, and I found that the robot can't seem to detect the words in a scanned PDF. Is there a way to let the robot detect text in a scanned file (PDF)?

Here is an example of a scanned file: https://th.bing.com/th/id/OIP.rednfKM5zP-d-q5zgF7TcQHaFu?pid=ImgDet&rs=1

See IDP - Robocorp Portal for ideas.

Also see this developer guides article about reading PDFs.

When I try to run the robot task ‘Extract Structured Data With AI’ from ‘GitHub - robocorp/example-parse-pdf-invoice: Extract information from PDF invoices’, it shows this error:

Screenshot 2023-04-13 152934
What should I do?

I have never learned anything about APIs before, so I don't know how to fix this issue.

You need to set up a vault in your local settings, or link to Control Room and set up the secrets there.
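
For the local option, here is a rough Python sketch of what that setup could look like. It assumes the file-based vault that rpaframework supports, where `devdata/env.json` sets `RPA_SECRET_MANAGER` to `RPA.Robocorp.Vault.FileSecrets` and `RPA_SECRET_FILE` to the path of a JSON file holding the secrets. The helper name `write_local_vault` and the secret name and key in the usage note are made up for illustration; use the names the example robot actually asks for.

```python
import json
from pathlib import Path


def write_local_vault(robot_dir, vault_path, secrets):
    """Write a local vault file and point devdata/env.json at it.

    Hypothetical helper: `secrets` maps each secret name to a dict of
    key/value pairs. Placeholder names only, not the real ones.
    """
    vault_path = Path(vault_path)
    vault_path.parent.mkdir(parents=True, exist_ok=True)
    vault_path.write_text(json.dumps(secrets, indent=2))

    env_path = Path(robot_dir) / "devdata" / "env.json"
    env_path.parent.mkdir(parents=True, exist_ok=True)

    # Keep whatever variables env.json already defines.
    env = json.loads(env_path.read_text()) if env_path.exists() else {}
    env["RPA_SECRET_MANAGER"] = "RPA.Robocorp.Vault.FileSecrets"
    env["RPA_SECRET_FILE"] = str(vault_path.resolve())
    env_path.write_text(json.dumps(env, indent=2))
    return env_path
```

Calling something like `write_local_vault(".", "vault.json", {"my_secret": {"api_key": "..."}})` from the robot folder would create both files. Keep the vault file out of version control, since it holds the secrets in plain text.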

I read the article but I can’t really understand it. What do I need to do to get the robot working? Do I need to change something in settings.json?

It may be easier to use Control Room for storing secrets. Then there is no need to change anything in settings.json. To get it working you must:

  1. Have an account in Control Room: https://cloud.robocorp.com/
  2. Have the Robocorp extensions installed in VSCode
  3. Link VSCode to Control Room and connect to the Control Room Vault (see Robocorp Visual Studio Code Overview | Robocorp documentation)