OCR Using Azure Computer Vision API

Written on March 28, 2019 in Image Processing, Programming, Python

I have been getting some good feedback on Azure’s Computer Vision API, in particular, the OCR functionality. Although I am not working on any project that requires this functionality at the moment, I thought it would be a good idea to check out the service – just to be “future ready”!

This article is not a detailed review of the OCR service; it merely shares my first experience with it.

Go to the Azure Portal and sign in.

Click “All services” in the menu bar and select “Cognitive Services” under the group “AI + Machine Learning”. 

Cognitive Services

Now select “Computer Vision”.

Computer Vision

Click “Create”. The following screen requires you to configure the resource:

Configuring Computer Vision

Fill in the various fields and click “Create”.

It will take a minute or two to deploy the service.

Go to the Dashboard and click on the newly created resource “OCR-Test”. Click on the item “Keys” under the “Resource Management” group. You will see two keys:

Keys

Copy the value of “KEY 1” field (you can use “KEY 2” as well). You will need this to make the REST API call.
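
One way to keep the key out of the source code is to read it from an environment variable and build the request headers from it. The sketch below is only an illustration: the variable name `AZURE_CV_KEY` and the helper `build_headers` are hypothetical, while `Ocp-Apim-Subscription-Key` is the header in which Cognitive Services expects the key.

```python
import os

# Hypothetical environment variable name; pick whatever suits your setup.
KEY_ENV_VAR = "AZURE_CV_KEY"


def build_headers(env=os.environ):
    """Return the HTTP headers for a binary image upload, reading the key from env."""
    key = env.get(KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {KEY_ENV_VAR} to the value of KEY 1 or KEY 2")
    return {
        # Header in which Cognitive Services expects the subscription key.
        "Ocp-Apim-Subscription-Key": key,
        # The image is sent as raw bytes in the request body.
        "Content-Type": "application/octet-stream",
    }
```

Either of the two keys should work here, so one can be regenerated while the other stays in use.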

I wrote a Python program to send a JPG file to the OCR endpoint and save the returned data in a text file:

Python Program
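
For readers who want something to copy and paste, here is a minimal sketch of what such a program could look like. It is not necessarily the program in the screenshot. The endpoint URL shape, the `v2.0` version, the region, and the function names `extract_text`, `ocr_image` and `save_ocr_text` are assumptions for illustration; check the resource's overview page for the actual endpoint. The response is assumed to be the OCR JSON layout of regions, lines and words.

```python
import json
import urllib.request

# Assumed endpoint shape: the region must match the resource's region and
# the API version may differ from "v2.0".
OCR_URL = "https://{region}.api.cognitive.microsoft.com/vision/v2.0/ocr"


def extract_text(result):
    """Join the words of an OCR response into lines of plain text."""
    lines = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(w["text"] for w in line.get("words", [])))
    return "\n".join(lines)


def ocr_image(image_path, key, region):
    """POST the raw image bytes to the OCR endpoint and return the parsed JSON."""
    with open(image_path, "rb") as f:
        body = f.read()
    request = urllib.request.Request(
        # "unk" asks the service to detect the language itself.
        OCR_URL.format(region=region) + "?language=unk&detectOrientation=true",
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def save_ocr_text(image_path, output_path, key, region):
    """Run OCR on one image and write the recognized text to a file."""
    text = extract_text(ocr_image(image_path, key, region))
    with open(output_path, "w", encoding="utf-8") as out:
        out.write(text)
```

A call such as `save_ocr_text("page.jpg", "page.txt", key, "westus")` would then write the recognized text to `page.txt`; the file names and region here are placeholders.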

To check the service, I used my iPhone X camera to capture one page of a homeopathy book that I have. Here is the image:

The Image to be Converted

I then ran my Python program to send this image to Azure and got the converted text. Here it is:

Converted Text

You can see that the conversion is nearly perfect, except for the marked area, where there is an extra character “t”. If you look carefully at the captured image, you will notice that there is some “noise” due to the text on the reverse side of this page. My guess is that the extra “t” comes from the text on the reverse side. But other than that, it is perfect. I am quite pleased!

I used the “free” tier, so I am eligible for 5000 pages of OCR in a month. Pretty generous I would say!

There is also a separate endpoint for handling handwritten images, but I did not explore that feature. I will try that one of these days.

You can download my Python program here.

Have a nice day!
