OCR Using Azure Computer Vision API

I have been getting some good feedback on Azure’s Computer Vision API, in particular, the OCR functionality. Although I am not working on any project that requires this functionality at the moment, I thought it would be a good idea to check out the service – just to be “future ready”!

This article is not meant to be a detailed review of the OCR service; it merely shares my first experience with it.

Go to the Azure Portal and sign in.

Click “All services” in the menu bar and select “Cognitive Services” under the group “AI + Machine Learning”. 

Cognitive Services

Now select “Computer Vision”

Computer Vision

Click “Create”. The following screen requires you to configure the resource:

Configuring Computer Vision

Fill in the various fields and click “Create”.

It will take a minute or two to deploy the service.

Go to the Dashboard and click on the newly created resource, “OCR-Test”. Then click “Keys” under the “Resource Management” group. You will see two keys:

Keys

Copy the value of the “KEY 1” field (“KEY 2” works just as well). You will need it to make the REST API call.
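One way to keep the copied key out of the source file is to read it from an environment variable and build the request headers from it. This is only a sketch: the variable name `AZURE_CV_KEY` and the helper `build_headers` are my own assumptions, not something the portal sets up. The `Ocp-Apim-Subscription-Key` header is the one Cognitive Services REST calls use to carry the key.

```python
import os


def build_headers(key=None):
    """Return HTTP headers for a REST call that uploads raw image bytes.

    AZURE_CV_KEY is a hypothetical environment variable name chosen for
    this sketch; use whatever name suits your setup.
    """
    key = key or os.environ.get("AZURE_CV_KEY")
    if not key:
        raise ValueError("No subscription key: pass one or set AZURE_CV_KEY")
    return {
        # Carries KEY 1 (or KEY 2) copied from the portal.
        "Ocp-Apim-Subscription-Key": key,
        # The request body will be the binary content of the image file.
        "Content-Type": "application/octet-stream",
    }
```

Either key can be passed in directly, for example `build_headers("<value of KEY 1>")`, which is handy if one of the two keys is ever regenerated.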

I wrote a Python program that sends a JPG file to the OCR endpoint and saves the returned data in a text file:

Python Program
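Since the program appears only as a screenshot, here is a minimal sketch of what such a program could look like. It is not the exact code from the screenshot. It assumes the v2.0 `ocr` operation of the Computer Vision REST API and uses only the standard library; the function names, the default `api_version`, and the endpoint shown in the usage note are assumptions that depend on the region and API version of your resource.

```python
import json
import urllib.parse
import urllib.request


def build_ocr_request(endpoint, key, image_bytes, language="unk", api_version="v2.0"):
    """Build (but do not send) the POST request for the OCR operation.

    endpoint is the base URL of the resource; api_version is an assumption
    and may differ for your resource.
    """
    # "unk" asks the service to detect the language automatically.
    query = urllib.parse.urlencode({"language": language, "detectOrientation": "true"})
    url = f"{endpoint.rstrip('/')}/vision/{api_version}/ocr?{query}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,            # KEY 1 or KEY 2 from the portal
        "Content-Type": "application/octet-stream",  # body is the raw image
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


def ocr_image_to_file(endpoint, key, image_path, output_path):
    """Send one image to the OCR endpoint and save the returned JSON as text."""
    with open(image_path, "rb") as image_file:
        image_bytes = image_file.read()

    request = build_ocr_request(endpoint, key, image_bytes)
    with urllib.request.urlopen(request) as response:
        result = json.loads(response.read().decode("utf-8"))

    # Save the returned data, pretty-printed, in a text file.
    with open(output_path, "w", encoding="utf-8") as output_file:
        json.dump(result, output_file, indent=2, ensure_ascii=False)
    return result
```

A call would then look something like `ocr_image_to_file("https://westus.api.cognitive.microsoft.com", key, "page.jpg", "page.txt")`, where the region in the URL and the file names are placeholders.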

To check the service, I used my iPhone X camera to capture one page of a homeopathy book that I have. Here is the image:

The Image to be Converted

I then ran my Python program to send this image to Azure and got the converted text back. Here it is:

Converted Text

You can see that the conversion is nearly perfect, except for the marked area, where there is an extra character “t”. If you look carefully at the captured image, you will notice that there is some “noise” due to the text on the reverse side of this page. My guess is that the extra “t” comes from the text on the reverse side. But other than that, it is perfect. I am quite pleased!
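For anyone trying this out: the OCR operation returns JSON rather than plain text, with the recognized words nested as regions, then lines, then words. The sketch below shows one way to flatten that structure into readable text. The helper name `ocr_result_to_text` is my own, and the sample data in the usage note is made up for illustration, not the output for the page above.

```python
def ocr_result_to_text(result):
    """Flatten an OCR response (regions -> lines -> words) into plain text."""
    lines = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            # Each word is an object whose "text" field holds the recognized word.
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))
        lines.append("")  # blank line between regions
    return "\n".join(lines).strip()
```

Given a made-up response such as `{"regions": [{"lines": [{"words": [{"text": "Hello"}, {"text": "world"}]}]}]}`, the function returns the single line `Hello world`.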

I used the “free” tier, which allows 5,000 pages of OCR per month. Pretty generous, I would say!

There is also a separate endpoint for handling handwritten text, but I did not explore that feature. I will try it one of these days.

You can download my Python program here.

Have a nice day!