#28/3 Daphne OCR - Google Vision API

Google Vision API

Daphne's optical character recognition (OCR) uses Google's Vision API. Let's see how it works with a few examples. The OCR application runs on the server side and extracts the text content of uploaded image files.
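To make the server-side step more concrete, here is a minimal sketch of how an agent could ask the Vision API for the text of one image. It assumes the `google-cloud-vision` Python package; the function names `make_vision_client` and `ocr_image_bytes` are hypothetical, and this is not Daphne's actual code.

```python
from typing import Any

try:
    from google.cloud import vision  # provided by the google-cloud-vision package
except ImportError:  # lets the sketch load on machines without the package
    vision = None


def make_vision_client() -> Any:
    """Create a Vision API client (needs the package and valid credentials)."""
    if vision is None:
        raise RuntimeError("google-cloud-vision is not installed")
    return vision.ImageAnnotatorClient()


def ocr_image_bytes(content: bytes, client: Any) -> str:
    """Return the text recognised in a single image (hypothetical helper)."""
    # Prefer the library's Image message; fall back to the equivalent dict form.
    image = vision.Image(content=content) if vision else {"content": content}
    # document_text_detection is the Vision feature aimed at dense text
    # such as book pages and handwriting.
    response = client.document_text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text
```

An agent would call `ocr_image_bytes(file_bytes, make_vision_client())` and store the returned string with the document revision. Passing the client in as a parameter keeps the function easy to exercise with a stand-in client.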

As a first step, let's check in Daphne's folder definition screen that OCR is enabled for the image files in the folder.

Let's upload a book page as an image file. A printed book page is an easy case for OCR. When we save the file, a server-side agent runs the OCR process.

The OCR agent fills in the OCR Text tab in the Revision detail of the document.

As you can see, a complete success.

In the second example, let's look at OCR of a handwritten text.

Handwriting is a difficult case for OCR.

Although there were some minor problems, the OCR process was largely successful.

OCR makes it very easy to search for words in files, and small errors like these will not get in the way.
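One way a search can tolerate small recognition errors is approximate matching. The sketch below uses Python's standard `difflib` module; the function name `find_word` and the 0.8 cutoff are assumptions chosen for illustration, and Daphne's own search may work differently.

```python
import difflib
import re
from typing import List


def find_word(ocr_text: str, query: str, cutoff: float = 0.8) -> List[str]:
    """Return words from the OCR text that equal the query or closely resemble it."""
    # Split the text into lowercase words; digits are kept because OCR
    # sometimes confuses letters with digits (for example "o" and "0").
    words = sorted(set(re.findall(r"\w+", ocr_text.lower())))
    # get_close_matches keeps candidates whose similarity ratio is at least
    # `cutoff`, ordered from most to least similar.
    return difflib.get_close_matches(query.lower(), words, n=5, cutoff=cutoff)
```

With this approach a query such as "recognition" can still find a word that the OCR returned with one wrong character. A lower cutoff finds more damaged words but also returns more false matches.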

For the last example, let's upload a TIFF file.

TIFF is widely used with scanning, faxing, word processing, and optical character recognition.

A TIFF file can contain one or more images. In this example we look at a multi-page scanned text in a foreign language. Since the TIFF file contains 14 pages, the OCR process will take longer.
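To illustrate how one TIFF file holds several images: each page is described by an image file directory (IFD), and the IFDs form a linked chain inside the file. The following sketch counts them with the standard library only. The function name `count_tiff_pages` is hypothetical, it handles classic TIFF only (not BigTIFF), and some files also store thumbnails as extra IFDs, so the count is an estimate of the page count.

```python
import struct


def count_tiff_pages(data: bytes) -> int:
    """Count the IFDs in a classic TIFF; each page normally has one."""
    if data[:2] == b"II":
        endian = "<"  # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF file")
    # Bytes 2-3 hold the magic number 42, bytes 4-7 the offset of the first IFD.
    magic, offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled)")
    pages = 0
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("IFD chain loops back on itself")
        seen.add(offset)
        (entry_count,) = struct.unpack_from(endian + "H", data, offset)
        # Each entry is 12 bytes; the offset of the next IFD follows the entries.
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * entry_count)
        pages += 1
    return pages
```

A server-side agent could use a count like this to estimate how long a file will take, since every page needs its own recognition pass.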

We see that the OCR is not yet complete.

OCR jobs are queued, and the server-side agent processes them one by one by connecting to the Google Cloud service.
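The queue-and-agent arrangement can be sketched with Python's standard `queue` and `threading` modules. This is an illustrative model only, not Daphne's implementation: the name `run_ocr_agent`, the `(document_id, image_bytes)` job shape, and the `None` stop signal are all assumptions.

```python
import queue
import threading
from typing import Callable, Dict, Optional


def run_ocr_agent(
    jobs: "queue.Queue",
    ocr: Callable[[bytes], str],
    results: Dict[str, Optional[str]],
) -> None:
    """Process queued (document_id, image_bytes) jobs until None arrives."""
    while True:
        job = jobs.get()
        if job is None:  # stop signal
            jobs.task_done()
            break
        doc_id, content = job
        try:
            results[doc_id] = ocr(content)  # e.g. a call to the cloud OCR service
        except Exception:
            results[doc_id] = None  # mark the job as failed, keep the agent alive
        finally:
            jobs.task_done()


if __name__ == "__main__":
    jobs: "queue.Queue" = queue.Queue()
    results: Dict[str, Optional[str]] = {}
    # A stand-in OCR function so the sketch runs without network access.
    worker = threading.Thread(
        target=run_ocr_agent,
        args=(jobs, lambda data: data.decode("utf-8"), results),
        daemon=True,
    )
    worker.start()
    jobs.put(("doc-1", b"text of page one"))
    jobs.put(None)
    jobs.join()  # returns once every queued job has been handled
    print(results)
```

Because uploads only add a job to the queue, saving a file returns immediately, and the OCR text appears later when the agent reaches that job. This matches what we see with the 14-page file.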

To use the service, add a Service Account in the Google Cloud Console. The key.json file contains the Google authorisation information for that account.
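Before handing key.json to the agent, it can be worth checking that the file looks like a service account key. The sketch below does that with the standard library. The function name `load_service_account_key` is hypothetical, and the field list covers only some of the fields such a key file normally contains.

```python
import json
from typing import Any, Dict

# A subset of the fields found in a Google service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def load_service_account_key(path: str) -> Dict[str, Any]:
    """Load key.json and check that it looks like a service account key."""
    with open(path, encoding="utf-8") as fh:
        key = json.load(fh)
    missing = [name for name in REQUIRED_FIELDS if name not in key]
    if missing:
        raise ValueError("key file is missing fields: " + ", ".join(missing))
    if key["type"] != "service_account":
        raise ValueError("key file is not a service account key")
    return key


# With the google-cloud-vision package installed, the same file can be given
# to the client directly (not executed here):
#   client = vision.ImageAnnotatorClient.from_service_account_file("key.json")
# Alternatively, point the GOOGLE_APPLICATION_CREDENTIALS environment
# variable at the file before the agent starts.
```

Since key.json contains a private key, it should be readable only by the account that runs the server-side agent and should not be stored with the documents.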

When we check the OCR tab again, we can see that the result has arrived.