OCR for PDF files

Here is an example Robot that runs OCR on all PDF files attached to a Work Item: https://github.com/robocorp/robot-ocr-my-pdf

The original PDF files should be image-based (scanned). The robot converts them into text-based PDF files, whose text can be selected and copy/pasted, and attaches the results to the work item.

This robot can run in the Robocorp Cloud container engine or on your local computer. It uses the OCRmyPDF Python library under the hood: https://github.com/jbarlow83/OCRmyPDF
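
For anyone curious what the OCR step can look like in plain Python, here is a minimal sketch. It is not the code from the example robot: the helper name `ocr_pdfs`, the `_ocr.pdf` suffix, and the `ocr_func` parameter are my own assumptions for illustration. It only assumes that the `ocrmypdf` package is installed and that `ocrmypdf.ocr(input_file, output_file)` is used for the conversion. Fetching files from the work item and attaching the results are left out.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Optional, Union

PathLike = Union[str, Path]


def ocr_pdfs(
    paths: Iterable[PathLike],
    output_dir: PathLike,
    ocr_func: Optional[Callable[[Path, Path], object]] = None,
) -> List[Path]:
    """Run OCR on every PDF in `paths` and return the output file paths.

    `ocr_func` is a hypothetical hook so the loop can be tried without
    OCRmyPDF installed; by default it falls back to `ocrmypdf.ocr`.
    """
    if ocr_func is None:
        # Imported lazily so the module loads even if OCRmyPDF is missing.
        import ocrmypdf

        ocr_func = ocrmypdf.ocr

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    results: List[Path] = []
    for path in map(Path, paths):
        # Skip anything that is not a PDF (e.g. other work item attachments).
        if path.suffix.lower() != ".pdf":
            continue
        # Assumed naming scheme: invoice.pdf -> invoice_ocr.pdf
        target = out_dir / f"{path.stem}_ocr.pdf"
        ocr_func(path, target)
        results.append(target)
    return results
```

Calling `ocr_pdfs(["scan1.pdf", "scan2.pdf"], "output")` would then write `output/scan1_ocr.pdf` and `output/scan2_ocr.pdf`, provided OCRmyPDF and its dependencies (Tesseract, Ghostscript) are available on the machine.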

Please, share if you find ways to improve this example!
