Expanding The Impact Of OCR: Using Text Extraction To Take RPA Even Further

June 13, 2019

The industry has only scratched the surface when it comes to the use of OCR in automation – but Ciphix is working to change that.

Leveraging cutting-edge text extraction tools, our team is discovering new ways for OCR and RPA to work hand in hand. Although OCR (Optical Character Recognition) is a widely used technology, its potential in automation has only recently begun to be explored. Innovations in text extraction and recognition are rapidly advancing OCR, allowing software to convert images of text into machine-encoded text more effectively. Two applications in particular, ABBYY FlexiCapture and Google Cloud Vision, have made valuable contributions in this field, and we believe that combining them with RPA (Robotic Process Automation) technology will unlock a new realm of possibilities.

The Ciphix OCR Initiatives

Over the last several months, the Ciphix team has been researching these industry-changing tools and the most effective ways to use them. Several qualities put ABBYY FlexiCapture and Google Cloud Vision ahead of the curve. Both applications extract text from a wide variety of structured and unstructured documents and forms, such as questionnaires and letters, while being significantly easier to use than competing products. They also offer greater customizability, more extensive documentation and, perhaps most importantly, improved accuracy.

At Ciphix, we aim to take leading-edge tools like these even further by pioneering new ways to integrate them into digital workforces. Karan, one of our solutions architects, has been heading up our exploration of these two tools. He recently completed the ABBYY Basic Training course and is now working on integrating ABBYY with RPA processes. Our initial findings have been exciting, but Karan emphasizes that ABBYY's ability to customize text extraction means the best is yet to come.

The MRZ Reader Custom Activity

Alongside Robbert, another of our solutions architects, Karan has also investigated the combination of Google Cloud Vision with RPA. Their research project focused on using the Google Cloud Vision API to build a custom activity that detects and extracts text from the MRZ (machine-readable zone) on identification cards. With this custom activity, extracting MRZ text from identification documents can be combined with RPA tasks that process the extracted text.
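The extraction step can be illustrated with a minimal sketch in Python. It assumes the OCR service has already returned the full text of the card as a string, and that the card uses the TD1 layout from ICAO Doc 9303 (three MRZ lines of 30 characters, first character A, C or I). The function name `find_td1_mrz` is hypothetical and this is not the Ciphix activity's actual implementation.

```python
import re
from typing import List, Optional

# A TD1 MRZ line holds exactly 30 characters drawn from A-Z, 0-9 and the filler '<'.
MRZ_LINE = re.compile(r"^[A-Z0-9<]{30}$")


def find_td1_mrz(ocr_text: str) -> Optional[List[str]]:
    """Return the three TD1 MRZ lines found in OCR output, or None if absent."""
    # OCR engines often insert stray spaces inside the MRZ; drop them before matching.
    lines = [line.replace(" ", "") for line in ocr_text.splitlines()]
    for i in range(len(lines) - 2):
        window = lines[i:i + 3]
        # Heuristic: three consecutive well-formed lines, the first starting with A, C or I.
        if all(MRZ_LINE.match(line) for line in window) and window[0][0] in "ACI":
            return window
    return None


# Specimen-style data (fictional state "UTO"), built with explicit filler counts.
sample_ocr = "\n".join([
    "IDENTITY CARD",
    "I<UTOD231458907" + "<" * 15,
    "7408122F1204159UTO" + "<" * 11 + "6",
    "ERIKSSON<<ANNA<MARIA" + "<" * 10,
])
mrz_lines = find_td1_mrz(sample_ocr)
```

The match is deliberately strict. A real OCR result may confuse characters such as `O` and `0`, so a production activity would likely need some tolerance or correction before this filter.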

The activity also parses the extracted text, providing a detailed description of the data contained within the MRZ. With RPA handling tedious tasks such as reading the MRZ and writing its contents into another system, the whole process can be automated end to end. This is only one use case for the Google Cloud Vision API, according to Karan and Robbert. As with ABBYY, there are clearly many more opportunities to be explored.
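To make the parsing step concrete, here is a hedged Python sketch of how a TD1 MRZ could be split into fields and validated. The field positions and the check digit rule (weights 7, 3, 1; letters A to Z valued 10 to 35; `<` valued 0) follow ICAO Doc 9303. The names `Td1Mrz`, `check_digit` and `parse_td1` are hypothetical, and the sketch skips the composite check digit and document numbers longer than nine characters.

```python
from dataclasses import dataclass
from typing import List


def check_digit(field: str) -> int:
    """ICAO 9303 check digit: repeating weights 7, 3, 1, summed modulo 10."""
    total = 0
    for i, ch in enumerate(field):
        if ch.isdigit():
            value = int(ch)
        elif ch == "<":
            value = 0
        else:
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        total += value * (7, 3, 1)[i % 3]
    return total % 10


@dataclass
class Td1Mrz:
    document_type: str
    issuing_state: str
    document_number: str
    date_of_birth: str   # YYMMDD, as printed
    sex: str
    date_of_expiry: str  # YYMMDD, as printed
    nationality: str
    surname: str
    given_names: str
    checks_ok: bool      # True if the three field check digits all match


def parse_td1(lines: List[str]) -> Td1Mrz:
    """Parse three 30-character TD1 MRZ lines into named fields."""
    if len(lines) != 3 or any(len(line) != 30 for line in lines):
        raise ValueError("expected three MRZ lines of 30 characters each")
    top, middle, bottom = lines
    number, birth, expiry = top[5:14], middle[0:6], middle[8:14]
    checks_ok = (
        top[14] == str(check_digit(number))
        and middle[6] == str(check_digit(birth))
        and middle[14] == str(check_digit(expiry))
    )
    # The name line separates surname and given names with a double filler.
    surname, _, given = bottom.partition("<<")
    return Td1Mrz(
        document_type=top[0:2].rstrip("<"),
        issuing_state=top[2:5],
        document_number=number.rstrip("<"),
        date_of_birth=birth,
        sex=middle[7],
        date_of_expiry=expiry,
        nationality=middle[15:18],
        surname=surname.replace("<", " "),
        given_names=given.strip("<").replace("<", " "),
        checks_ok=checks_ok,
    )


# Specimen-style data (fictional state "UTO"), built with explicit filler counts.
sample = [
    "I<UTOD231458907" + "<" * 15,
    "7408122F1204159UTO" + "<" * 11 + "6",
    "ERIKSSON<<ANNA<MARIA" + "<" * 10,
]
record = parse_td1(sample)
print(record)
```

An RPA workflow could then pass the resulting fields to downstream tasks, and route documents with `checks_ok` set to `False` to manual review, since a failed check digit usually points to an OCR misread.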

We look forward to continuing this research, and ultimately putting our findings to use for organizations around the world. Stay tuned for our next developments in integrating OCR with RPA – a combination we’re confident will help organizations and humans thrive.

Karan Ramsodit – Solution Architect