From OCR to Computer Vision — The journey

September 23, 2019 - Razia Kuvale Zubair

It’s a typical day at work for Samantha: checking, verifying and classifying a multitude of sales invoices, a time-consuming and tedious job. How she wishes for automation software that could extract the information from the images, allowing her to focus on more strategic tasks!

Are you also having trouble converting bank statements, sales invoices, and computerized receipts into digital format? Optical Character Recognition, or OCR, is the answer.

In this blog, we will delve deeper into what OCR and Computer Vision are and why enterprises should make the shift from OCR to Computer Vision.

What is Optical Character Recognition (OCR)? — The basic concept

OCR refers to the process of converting different types of documents, including PDF files, printed documents or images, into editable, accessible and searchable formats for computers. OCR is powerful: it can read documents in multiple languages and formats and convert them into text-searchable data, thereby improving accuracy, reducing manual effort, driving analytics, and enabling enterprises to adapt to ever-evolving business needs.

The history of OCR and its related technologies

Did you know that the history of OCR dates back to 1914? That year, Emanuel Goldberg developed a machine that read characters and converted them into standard telegraph code. During the late 1920s, Goldberg, who had many inventions to his name, developed a “Statistical Machine” that used an optical code recognition system to search microfilm archives; the patent for this invention was later acquired by IBM. In the 2000s, OCR became widely available “as-a-service”, and its use grew significantly.

A related technology is the CAPTCHA program, which was developed to keep out bots and spammers. With tools ranging from HPE’s Haven OnDemand to OpenCV, OCR remains a field of artificial intelligence, pattern recognition and computer vision that enterprises continue to explore.

Whether you’re looking to convert handwritten scans to machine-encoded text or automate data entry tasks, OCR has got you covered. Optical Character Recognition is the most common means of extracting business-critical data, translating data into digital forms.

Limitations of OCR and the need for enterprises to adopt Computer Vision

Indisputably, OCR helps enterprises save time and effort in scanning, processing and editing documents of all forms. With OCR, you can extract information from a printed contract or image without having to retype it by hand.

As Intelligent Automation has evolved, it requires a deeper understanding of where data is located on a document and of the various forms the same data may take across different types of documents. Traditional OCR, however, generally depends on templates and rules that define document layout, so it struggles when the layout changes.
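To make the template dependence concrete, here is a minimal sketch of rule-based field extraction. It assumes a hypothetical OCR step has already produced words with pixel coordinates; the `Word` type, the `INVOICE_TEMPLATE` regions, and the sample values are invented for illustration and do not come from any real OCR product.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: int  # left edge of the word, in pixels
    y: int  # top edge of the word, in pixels


# Hypothetical template: each field is a fixed rectangle (x0, y0, x1, y1).
INVOICE_TEMPLATE = {
    "invoice_number": (400, 40, 600, 70),
    "total": (400, 700, 600, 730),
}


def extract_fields(words, template):
    """Return the text found inside each template rectangle."""
    fields = {}
    for name, (x0, y0, x1, y1) in template.items():
        inside = [w for w in words if x0 <= w.x < x1 and y0 <= w.y < y1]
        inside.sort(key=lambda w: (w.y, w.x))  # approximate reading order
        fields[name] = " ".join(w.text for w in inside)
    return fields


# A document whose layout matches the template.
matching = [Word("INV-1042", 410, 50), Word("$1,250.00", 420, 710)]
# The same data on a different layout: the total sits 100 pixels higher.
shifted = [Word("INV-1042", 410, 50), Word("$1,250.00", 420, 610)]

print(extract_fields(matching, INVOICE_TEMPLATE))
print(extract_fields(shifted, INVOICE_TEMPLATE))
```

In this sketch the second document yields an empty `total` field even though the amount is printed on the page, because the fixed rectangle no longer covers it. Every new layout would need its own template.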

Is OCR 100% accurate? Does it work with all types of documents? Can OCR differentiate between characters? These are some of the perplexing questions that enterprises are trying to answer. How can Computer Vision solve these challenges?

What exactly is Computer Vision?

Picture this: Google Lens, face recognition, Snapchat filters, and Google Maps aerial imaging.

They all have one thing in common — Deep Learning-based computer vision algorithms.

According to the SSON report on Computer Vision and Cognitive Automation, Computer Vision (CV) refers to the ability to see, read and recognize specific objects or data within an unstructured format. It falls under the broad area of Artificial Intelligence.

Computer Vision works by digesting massive quantities of related images in order to recognize specific characteristics and patterns. It helps computers interpret digital images by extracting information from their pixels. In document processing, the primary role of Computer Vision is to identify the ‘areas’ or ‘regions’ of interest in a given document and pass this information on to an OCR engine, where it is converted into a structured format.
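The hand-off from region detection to OCR can be sketched in a few lines. This is only a toy illustration under stated assumptions: a simple connected-component search stands in for a trained Deep Learning detector, and `recognize` is a placeholder lookup table rather than a real OCR engine. The names `PAGE`, `find_regions`, `crop`, `GLYPHS` and `recognize` are all hypothetical.

```python
from collections import deque

# Toy binary "page": '#' is ink, '.' is background.
PAGE = [
    "..........",
    ".###...#..",
    ".#.#...#..",
    ".###...#..",
    "..........",
]


def find_regions(page):
    """Return bounding boxes (top, left, bottom, right) of connected ink blobs."""
    rows, cols = len(page), len(page[0])
    seen = set()
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if page[r][c] != "#" or (r, c) in seen:
                continue
            # Breadth-first search over 4-connected ink pixels.
            queue = deque([(r, c)])
            seen.add((r, c))
            top, left, bottom, right = r, c, r, c
            while queue:
                cr, cc = queue.popleft()
                top, bottom = min(top, cr), max(bottom, cr)
                left, right = min(left, cc), max(right, cc)
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    in_bounds = 0 <= nr < rows and 0 <= nc < cols
                    if in_bounds and page[nr][nc] == "#" and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda box: box[1])  # left to right


def crop(page, box):
    """Cut the region of interest out of the page."""
    top, left, bottom, right = box
    return tuple(row[left:right + 1] for row in page[top:bottom + 1])


# Placeholder for an OCR engine: a lookup table of known glyph shapes.
GLYPHS = {
    ("###", "#.#", "###"): "0",
    ("#", "#", "#"): "1",
}


def recognize(glyph):
    """Return the character for a cropped glyph, or '?' if it is unknown."""
    return GLYPHS.get(glyph, "?")


regions = find_regions(PAGE)
text = "".join(recognize(crop(PAGE, box)) for box in regions)
print(regions)
print(text)
```

The design point is the separation of concerns: the detector decides where to look, and the recognizer only ever sees the cropped regions. In a real system each half would be replaced by a far more capable model.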

In the 1970s, David Marr, a neuroscientist at MIT, laid the building blocks for modern Computer Vision and is therefore often regarded as its father.

Take your document/image conversion to the next level with CV’s capabilities such as Deep Learning. From translating text into several languages to tagging friends in photos, Computer Vision delivers strong performance and, on some benchmark tasks, matches or surpasses human-level accuracy.

According to a MarketsandMarkets report (see References), the AI in Computer Vision market is expected to reach 25.32 billion U.S. dollars by 2023, at a compound annual growth rate (CAGR) of 47.54%.

With such staggering growth, it’s no wonder that Computer Vision with AI-enhanced capabilities is applied across a plethora of industry sectors, including consumer, healthcare, automotive, and sports and entertainment. AI in Computer Vision is no longer a pie-in-the-sky goal but an emerging technology that is driving business growth, strategic partnerships, collaborations, and increased revenue.

With data growing at a mammoth pace, there is a huge opportunity for enterprises to leverage new technologies such as Computer Vision to find patterns and make sense of the available data.

Why should enterprises make the shift from OCR to Computer Vision?

Adopting OCR in business processes signals a new wave of modern enterprises, where ensuring customer satisfaction and improving user experience are critical. Because the technology reads text from images and extracts value from it, it widens customers’ access to data and replaces traditional manual systems, thereby reducing errors and improving cost efficiency, productivity, speed, and accuracy.

In this age of digitization, it’s not surprising that enterprises are gearing up for a future that is conducive to human-digital collaboration, one where humans and bots work alongside each other, taking automation and AI capabilities to the next level.

That’s where breakthrough technologies such as Computer Vision and Intelligent Character Recognition (ICR), coupled with analytics, are expanding the scope of process automation across the enterprise.

Computer Vision is an area of artificial intelligence that can be used to simplify paper-driven processes across the enterprise including financial services (loan applications, vendor onboarding, receipt processing), manufacturing (accounts payable, sales order purchasing), insurance (claims handling), healthcare (billings and claims management), and government (passport applications).

Using deep learning models, computers will be able to accurately collect, analyze, and classify data. It’s time for enterprises to transition from tried-and-tested techniques to ML-backed technologies that will strengthen their processes and systems.

Don’t be left behind! Get started with data-driven solutions using computer vision software today.

For further insights on Computer Vision and Cognitive Automation, download the SSON report — Enabling Intelligent Automation using Computer Vision now.

References:

https://en.wikipedia.org/wiki/Optical_character_recognition

https://towardsdatascience.com/computer-vision-an-introduction-bbc81743a2f7

https://www.marketsandmarkets.com/Market-Reports/ai-in-computer-vision-market-141658064.html

https://medium.com/@hdinhofer/optical-character-recognition-ocr-a-branch-of-computer-vision-76887e1d6ab0