Machine-learning-based OCR techniques allow you to extract printed or handwritten text from images such as posters, street signs and product labels, as well as from documents like articles, reports, forms, and invoices.
The text is typically extracted as words, text lines, and paragraphs or text blocks, giving you a digital version of the scanned text. This eliminates or significantly reduces the need for manual data entry.
Overview:
OCR or Optical Character Recognition is also referred to as text recognition or text extraction.
If you are new to AI, refer to the following links for more information on Azure AI Services before diving into OCR.
Getting started with Azure AI Services
Getting started with AI vision
How is OCR related to Intelligent Document Processing (IDP)?
Intelligent Document Processing (IDP) uses OCR as its foundational technology and goes further, extracting structure, relationships, key-value pairs, entities, and other document-centric insights with an advanced machine-learning-based AI service such as Document Intelligence. Document Intelligence includes a document-optimized version of Read as its OCR engine and delegates to other models for higher-level insights. If you are extracting text from scanned and digital documents, use Document Intelligence Read OCR.
OCR engine
Microsoft’s Read OCR engine is composed of multiple advanced machine-learning-based models supporting global languages.
OCR (Read) editions
Select the Read edition that best fits your requirements.
| Input | Examples | Read edition | Benefit |
| --- | --- | --- | --- |
| Images: General, in-the-wild images | Labels, street signs, and posters | OCR for images (version 4.0) | Optimized for general, non-document images with a performance-enhanced synchronous API that makes it easier to embed OCR in your user experience scenarios. |
| Documents: Digital and scanned, including images | Books, articles, and reports | Document Intelligence read model | Optimized for text-heavy scanned and digital documents with an asynchronous API to help automate intelligent document processing at scale. |
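To make the "synchronous API" point concrete, here is a minimal sketch of what a single OCR call for an image could look like over REST, using only the Python standard library. It assumes the Image Analysis 4.0 route `computervision/imageanalysis:analyze`, the API version `2023-10-01`, the `features=read` query parameter, and the `Ocp-Apim-Subscription-Key` header; the endpoint, key, and image URL are placeholders. Treat it as an illustration and check the current API reference before relying on it.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: GA API version of Image Analysis 4.0; verify against the docs.
API_VERSION = "2023-10-01"


def build_read_request(endpoint: str, key: str, image_url: str) -> urllib.request.Request:
    """Build (but do not send) a synchronous OCR request for one image URL."""
    query = urlencode({"api-version": API_VERSION, "features": "read"})
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # resource key from the Azure portal
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def run_read(request: urllib.request.Request) -> dict:
    """Send the request; the OCR result comes back in the same response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder values: replace with your own resource endpoint and key.
    req = build_read_request(
        "https://<your-resource>.cognitiveservices.azure.com",
        "<your-key>",
        "https://example.com/street-sign.jpg",
    )
    print(req.get_method(), req.full_url)
```

Because the call is synchronous, there is no operation to poll: `run_read` returns the extracted text in one round trip, which is what makes this edition convenient inside interactive user experiences.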
Demo:
Let’s try out OCR by using Vision Studio.
https://portal.vision.cognitive.azure.com/demo/extract-text-from-images
or log in to Vision Studio (azure.com).
Click on OCR.

Before trying it out, create an Azure AI Vision (Cognitive Services) resource.

OCR supported languages
Both Read versions available today in Azure AI Vision support several languages for printed and handwritten text.
OCR for printed text includes support for English, French, German, Italian, Portuguese, Spanish, Chinese, Japanese, Korean, Russian, Arabic, Hindi, and other international languages that use Latin, Cyrillic, Arabic, and Devanagari scripts.
OCR for handwritten text includes support for English, Chinese Simplified, French, German, Italian, Japanese, Korean, Portuguese, and Spanish languages.
OCR common features
The Read OCR model is available in Azure AI Vision and Document Intelligence with common baseline capabilities while optimizing for respective scenarios. The following list summarizes the common features:
- Printed and handwritten text extraction in supported languages
- Pages, text lines and words with location and confidence scores
- Support for mixed languages and mixed mode (print and handwritten)
- Available as a distroless Docker container for on-premises deployment
Use the OCR cloud APIs or deploy on-premises
The cloud APIs are the preferred option for most customers because of their ease of integration and fast productivity out of the box. Azure and the Azure AI Vision service handle scale, performance, data security, and compliance needs while you focus on meeting your customers’ needs.
For on-premises deployment, the Read Docker container enables you to deploy the Azure AI Vision v3.2 generally available OCR capabilities in your own local environment. Containers are great for specific security and data governance requirements.
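The v3.2 Read API that the container exposes is asynchronous: you submit an image, receive an operation URL in the `Operation-Location` response header, and poll that URL until the status is final. The sketch below shows only the polling step, under the assumption that the status values are `notStarted`, `running`, `succeeded`, and `failed`. The function name `wait_for_read_result` and the injected `get_json` callable are hypothetical; passing the HTTP call in as a parameter keeps the loop independent of any particular client library.

```python
import time
from typing import Callable

# Assumed final states of a v3.2 Read operation (compared case-insensitively).
TERMINAL_STATES = {"succeeded", "failed"}


def wait_for_read_result(
    operation_url: str,
    get_json: Callable[[str], dict],
    interval_s: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll operation_url until the Read operation finishes or attempts run out.

    get_json is any callable that performs an authenticated GET on the URL
    and returns the decoded JSON body as a dict.
    """
    for _ in range(max_attempts):
        payload = get_json(operation_url)
        status = str(payload.get("status", "")).lower()
        if status in TERMINAL_STATES:
            return payload
        sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"Read operation still pending after {max_attempts} polls")


if __name__ == "__main__":
    # Simulated responses stand in for a running container or cloud endpoint.
    fake_responses = iter([
        {"status": "notStarted"},
        {"status": "running"},
        {"status": "succeeded", "analyzeResult": {}},
    ])
    result = wait_for_read_result(
        "http://localhost:5000/hypothetical-operation",  # placeholder URL
        lambda url: next(fake_responses),
        sleep=lambda seconds: None,  # skip real waiting in the demo
    )
    print(result["status"])
```

The same loop applies whether the operation URL points at the cloud service or at a container running in your own environment; only the host in the URL changes.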
OCR data privacy and security
As with all of the Azure AI services, developers using the Azure AI Vision service should be aware of Microsoft’s policies on customer data.
Conclusion
OCR is a vision capability in Microsoft’s Azure AI Vision service that enables developers to extract text from images. It is the foundation of IDP: Document Intelligence builds on the same Read OCR engine to recognize text before extracting higher-level insights.
Different Read editions are available to suit different application requirements. With Vision Studio, developers can try OCR visually, without writing code, before integrating text extraction into their applications.
References:
https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview
https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-ocr
