27 Aug

Using Form Recognizer to create a Document Management System

Form Recognizer is one of Microsoft’s Cognitive Services. It applies machine learning to extract text, key/value pairs, and table data from typed or handwritten documents.

Functions of Form Recognizer:

It can analyze many kinds of input, including phone-captured images, scanned documents, and digital PDFs, whether typed or handwritten, and it returns a structured data representation of each document.
It offers prebuilt models that extract key information from receipts, invoices, and identity documents. We can also train custom models that extract information from the specific forms we care about, starting with as few as five sample documents.
Using Form Recognizer via the REST API or SDK saves time and reduces manual data entry errors, while also making it easier to perform additional analysis of the extracted data.
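As a rough illustration of the REST route, the sketch below only builds the URL and headers for an analyze call; it does not send anything. It assumes the v2.1 REST path layout and the `Ocp-Apim-Subscription-Key` header. The resource name, the key placeholder, and the helper name `build_analyze_request` are hypothetical.

```python
from dataclasses import dataclass

# Assumption: the 2.1 GA service, the same version the FOTT tool in this post targets.
API_VERSION = "v2.1"


@dataclass
class AnalyzeRequest:
    url: str
    headers: dict


def build_analyze_request(endpoint, api_key, model_path, content_type="application/pdf"):
    """Describe the POST that submits a document for analysis.

    model_path examples: "layout", "prebuilt/receipt", "custom/models/<model-id>".
    """
    base = endpoint.rstrip("/")
    url = f"{base}/formrecognizer/{API_VERSION}/{model_path}/analyze"
    headers = {
        "Ocp-Apim-Subscription-Key": api_key,  # key from the Keys and Endpoint tab
        "Content-Type": content_type,          # type of the file sent in the body
    }
    return AnalyzeRequest(url=url, headers=headers)


# Hypothetical resource name and key placeholder, for illustration only.
request = build_analyze_request(
    "https://example-resource.cognitiveservices.azure.com/", "<api-key>", "layout"
)
print(request.url)
```

The document bytes would go in the body of the POST. Keeping the request description separate from the network call makes the URL logic easy to check without an Azure subscription.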

With composed models, we can combine multiple custom models, each corresponding to a specific type of form, under a single model ID. When we submit a document to the composed model, the Form Recognizer service first performs a classification step to determine the form type and then routes the document to the corresponding custom model.
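To see which model the classification step chose, the analyze result can be inspected. The sketch below assumes the v2.1 response shape, where each entry in `documentResults` carries `docType`, `modelId`, and `docTypeConfidence`. The sample payload is hand-written with made-up IDs; it is not real service output.

```python
def routed_models(analyze_result):
    """Return (doc_type, model_id, confidence) for each analyzed document."""
    routes = []
    for doc in analyze_result.get("documentResults", []):
        routes.append(
            (doc.get("docType"), doc.get("modelId"), doc.get("docTypeConfidence"))
        )
    return routes


# Hand-written sample; the model ID and document type are hypothetical.
composed_sample = {
    "documentResults": [
        {
            "docType": "custom:leave-request",
            "modelId": "11111111-2222-3333-4444-555555555555",
            "docTypeConfidence": 0.97,
            "fields": {},
        }
    ]
}

for doc_type, model_id, confidence in routed_models(composed_sample):
    print(doc_type, model_id, confidence)
```

A low `docTypeConfidence` is a useful signal that a document may have been routed to the wrong model and needs review.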

Mesh 3.0, an employee experience platform, uses Form Recognizer to provide a Document Management System to which users can submit any type of form. The application categorizes each document by type and extracts the associated key information, so users can search for that information across documents.
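The search idea can be pictured with a toy in-memory index over the extracted key/value pairs. This is only a sketch of the general technique; it is not how Mesh 3.0 is implemented, and the class name, document IDs, and field values are hypothetical.

```python
from collections import defaultdict


class DocumentIndex:
    """Toy inverted index: token -> IDs of documents whose extracted values contain it."""

    def __init__(self):
        self._docs = {}
        self._tokens = defaultdict(set)

    def add(self, doc_id, doc_type, fields):
        """Store the extracted fields and index every word of every value."""
        self._docs[doc_id] = {"type": doc_type, "fields": dict(fields)}
        for value in fields.values():
            for token in str(value).lower().split():
                self._tokens[token].add(doc_id)

    def search(self, query):
        """Return sorted IDs of documents that contain all query words."""
        tokens = query.lower().split()
        if not tokens:
            return []
        hits = set.intersection(*(self._tokens.get(t, set()) for t in tokens))
        return sorted(hits)


index = DocumentIndex()
index.add("inv-001", "invoice", {"Vendor": "Contoso Ltd", "Total": "120.00"})
index.add("id-007", "idDocument", {"FirstName": "Avery", "LastName": "Contoso"})
print(index.search("contoso"))
```

A production system would more likely push the extracted fields into a dedicated search service, but the flow is the same: classify, extract, then index by extracted value.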

Form Recognizer Features:

Layout API:

The Layout API is used as part of the custom models to detect and extract text, tables, selection marks, and structure information from documents (PDF, TIFF) and images (JPG, PNG, BMP).
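When calling the Layout API over REST, the request has to declare the file type. A minimal sketch, assuming the standard MIME types for the formats listed above; the helper name `content_type_for` is hypothetical.

```python
from pathlib import Path

# Standard MIME types for the file formats named in the text.
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".bmp": "image/bmp",
}


def content_type_for(filename):
    """Return the Content-Type header value for a supported file, else raise."""
    suffix = Path(filename).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"Unsupported file type: {filename}")
    return CONTENT_TYPES[suffix]


print(content_type_for("scan.TIFF"))
```

Rejecting unsupported extensions before upload avoids a round trip to the service for files it cannot process.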

The JSON returned by the Layout API contains the following nodes:

“readResults” – It contains all the text with its respective bounding box placement on the page.
“selectionMarks” – It has every selection mark (checkbox, radio mark), whether it is “selected” or “unselected”.
“pageResults” – It includes the tables that are extracted.
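The three nodes can be read with a few small helpers. The sketch below runs against a hand-written sample that mimics the v2.1 layout response. It assumes that selection marks are nested inside each `readResults` page and that table cells carry `rowIndex`, `columnIndex`, and `text`; adjust the lookups if your payload differs.

```python
# Hand-written sample that mimics the response shape; not real service output.
layout_sample = {
    "analyzeResult": {
        "readResults": [
            {
                "page": 1,
                "lines": [
                    {"text": "Leave request form", "boundingBox": [1.0, 1.0, 4.0, 1.0, 4.0, 1.4, 1.0, 1.4]},
                    {"text": "Approved", "boundingBox": [1.5, 2.0, 2.6, 2.0, 2.6, 2.3, 1.5, 2.3]},
                ],
                "selectionMarks": [
                    {"state": "selected", "confidence": 0.99, "boundingBox": [1.0, 2.0, 1.3, 2.0, 1.3, 2.3, 1.0, 2.3]},
                    {"state": "unselected", "confidence": 0.98, "boundingBox": [3.0, 2.0, 3.3, 2.0, 3.3, 2.3, 3.0, 2.3]},
                ],
            }
        ],
        "pageResults": [
            {
                "page": 1,
                "tables": [
                    {
                        "rows": 2,
                        "columns": 2,
                        "cells": [
                            {"rowIndex": 0, "columnIndex": 0, "text": "Day"},
                            {"rowIndex": 0, "columnIndex": 1, "text": "Hours"},
                            {"rowIndex": 1, "columnIndex": 0, "text": "Mon"},
                            {"rowIndex": 1, "columnIndex": 1, "text": "8"},
                        ],
                    }
                ],
            }
        ],
    }
}


def page_lines(result):
    """All text lines, in reading order, across pages."""
    pages = result["analyzeResult"]["readResults"]
    return [line["text"] for page in pages for line in page.get("lines", [])]


def selection_states(result):
    """State of every selection mark: 'selected' or 'unselected'."""
    pages = result["analyzeResult"]["readResults"]
    return [mark["state"] for page in pages for mark in page.get("selectionMarks", [])]


def tables_as_grids(result):
    """Rebuild each extracted table as a list of rows of cell text."""
    grids = []
    for page in result["analyzeResult"]["pageResults"]:
        for table in page.get("tables", []):
            grid = [["" for _ in range(table["columns"])] for _ in range(table["rows"])]
            for cell in table["cells"]:
                grid[cell["rowIndex"]][cell["columnIndex"]] = cell["text"]
            grids.append(grid)
    return grids


print(page_lines(layout_sample))
print(selection_states(layout_sample))
print(tables_as_grids(layout_sample))
```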

Prebuilt models:

The prebuilt receipt model supports sales receipts from Australia, Canada, Great Britain, India, and the United States. Other prebuilt models support business cards, invoices in various formats, and identity documents, extracting key information from worldwide passports and US driver’s licenses.

Let’s see an example of a prebuilt model that extracts information from a US driver’s license.

1. Use the open-source labelling tool, part of the Form OCR Test Toolset (FOTT)

https://fott-2-1.azurewebsites.net/ (for Form Recognizer 2.1 GA services)

2. To work with prebuilt models, click on “Use prebuilt model to get data”

3. Go to the Form Recognizer resource created in the Azure portal and copy the service endpoint and API key from the Keys and Endpoint tab.

4. Provide the Form Recognizer service endpoint, the API key, and the form type to analyze (ID, in our case), and choose the file for analysis.

5. Once we click on Run analysis, the data is extracted as key/value pairs.
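The same key/value pairs are available in the JSON that the service returns. The sketch below flattens them into a plain dictionary. It assumes the v2.1 result shape, where each field has a typed `value...` entry (for example `valueString` or `valueDate`) alongside the raw `text`. The sample is hand-written with made-up values.

```python
def flatten_fields(analyze_result):
    """Collapse documentResults[].fields into {field name: value}."""
    pairs = {}
    for doc in analyze_result.get("documentResults", []):
        for name, field in doc.get("fields", {}).items():
            if field is None:  # the service may return null for fields it did not find
                continue
            # Prefer the typed value (valueString, valueDate, ...) over the raw text.
            typed = [field[key] for key in field if key.startswith("value")]
            pairs[name] = typed[0] if typed else field.get("text")
    return pairs


# Hand-written sample with made-up values; not real service output.
id_sample = {
    "documentResults": [
        {
            "docType": "prebuilt:idDocument:driverLicense",
            "fields": {
                "FirstName": {"type": "string", "valueString": "AVERY", "text": "AVERY", "confidence": 0.98},
                "LastName": {"type": "string", "valueString": "SAMPLE", "text": "SAMPLE", "confidence": 0.97},
                "DateOfBirth": {"type": "date", "valueDate": "1990-01-06", "text": "01/06/1990", "confidence": 0.99},
                "Address": None,
            },
        }
    ]
}

print(flatten_fields(id_sample))
```

Each field also carries a `confidence` score, which an application could use to flag values for manual review.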

Custom models

Custom models let us analyze the specific forms we are interested in. We need only five sample forms of the same type to train a custom model, using either labelled or unlabelled data.
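Training can also be started over REST rather than through the labelling tool. The sketch below builds the request only and sends nothing. It assumes the v2.1 train operation (`POST .../custom/models`) and its body fields `source`, `sourceFilter`, `useLabelFile`, and `modelName`. The endpoint, SAS URL, and model name are placeholders.

```python
import json


def build_train_request(endpoint, api_key, container_sas_url, use_label_file,
                        folder_prefix="", model_name=None):
    """Return (url, headers, body) for a custom model training request."""
    url = f"{endpoint.rstrip('/')}/formrecognizer/v2.1/custom/models"
    body = {
        "source": container_sas_url,  # SAS URL of the blob container holding the forms
        "sourceFilter": {"prefix": folder_prefix, "includeSubFolders": False},
        "useLabelFile": use_label_file,  # True: labelled data, False: unlabelled data
    }
    if model_name:
        body["modelName"] = model_name
    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(body)


# Placeholder values, for illustration only.
train_url, train_headers, train_body = build_train_request(
    "https://example-resource.cognitiveservices.azure.com",
    "<api-key>",
    "<container-sas-url>",
    use_label_file=True,
    model_name="leave-request",
)
print(train_url)
print(train_body)
```

Setting `useLabelFile` to `True` expects the label files produced by the labelling tool to sit next to the forms in the container.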

Train a Custom model

1. Use the open-source labelling tool, part of the Form OCR Test Toolset (FOTT), to label our custom forms, train a model, and analyze forms with that model

https://fott-2-1.azurewebsites.net/ (for Form Recognizer 2.1 GA services)

2. Enable cross-origin resource sharing (CORS) for the storage account: open the CORS tab and fill in the required values.

3. Create an Azure Blob Storage container, upload the forms, and generate the SAS URI for the container with all permissions selected.

4. To create a connection to the Azure storage container, go to the Connections page and click the add icon. Provide a display name and the SAS URL generated in the previous step.

5. Create a new project with the following settings:

Display Name – The project display name
Security Token – Each project generates a security token that is used to encrypt and decrypt sensitive project settings.
Source Connection – Select the connection to the Azure Blob Storage container created in the previous step.
Folder Path – “Optional” – Specify the folder name here if the forms are in a sub-folder of the blob container.
Form Recognizer Service Uri, API Key – Follow step 3 of the prebuilt model walkthrough to get the service URI and API key.
Description – “Optional” – Project description

6. Once the project is created, the forms are displayed in the Tags editor tab. Click “Run OCR on all files” on the left side to detect the text and tables in all the documents.

7. Label the forms by creating tags and assigning the associated text to each tag. Specify the tag type and format to get better results.

8. Train the custom model by navigating to the Train page and clicking the “Train” button. Give the model a name so that it is easy to identify in the compose models tab, and examine the average accuracy. If it is low, label more forms and train again.

9. Compose a model by clicking the “Compose” icon, selecting the model IDs you want to combine into a single model, clicking “Compose” in the upper left corner, and giving the composed model a name.

When the operation completes, your newly composed model will appear in the list.

10. Any form submitted to the composed model goes through the classification step, which matches it to the corresponding model ID.

Tip: To reduce classification failures, label the forms with a greater number of tags.
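The compose step from step 9 can also be expressed as a REST request. The sketch below builds the request only. It assumes the v2.1 compose operation (`POST .../custom/models/compose`) with the body fields `modelIds` and `modelName`. The two-model minimum is a choice made in this sketch, not a documented service rule, and the IDs are placeholders.

```python
import json


def build_compose_request(endpoint, api_key, model_ids, model_name):
    """Return (url, headers, body) for composing several custom models into one."""
    unique_ids = list(dict.fromkeys(model_ids))  # drop duplicates, keep order
    if len(unique_ids) < 2:
        # Sketch-level check: composing is only useful with two or more models.
        raise ValueError("Provide at least two distinct model IDs to compose")
    url = f"{endpoint.rstrip('/')}/formrecognizer/v2.1/custom/models/compose"
    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/json",
    }
    body = json.dumps({"modelIds": unique_ids, "modelName": model_name})
    return url, headers, body


# Placeholder IDs; real ones come from the training step.
compose_url, compose_headers, compose_body = build_compose_request(
    "https://example-resource.cognitiveservices.azure.com",
    "<api-key>",
    ["<model-id-a>", "<model-id-b>", "<model-id-a>"],
    "all-hr-forms",
)
print(compose_url)
print(compose_body)
```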

Steps to resume a project:

1. Click the share project icon at the top right to get the shared project token.

2. Create a connection to the same blob storage container to restore the project.

3. Go to the main page, click “Open Cloud Project”, and paste the shared project token.

Now that we have covered the steps to create custom and composed models, we can use them in Form Recognizer applications to build automated data processing software.
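In an automated pipeline, analysis is asynchronous: the submit call returns an operation URL, and the client polls it until the result is ready. The sketch below shows one way to write that loop. It assumes the v2.1 status values `notStarted`, `running`, `succeeded`, and `failed`. The `fetch_status` callable is hypothetical and is stubbed here so the example runs offline.

```python
import time


def wait_for_result(fetch_status, poll_seconds=1.0, max_polls=60, sleep=time.sleep):
    """Poll until the analysis succeeds, fails, or the poll budget runs out.

    fetch_status is any callable that returns the parsed JSON of the operation.
    """
    for _ in range(max_polls):
        payload = fetch_status()
        status = payload.get("status")
        if status == "succeeded":
            return payload["analyzeResult"]
        if status == "failed":
            raise RuntimeError(f"Analysis failed: {payload}")
        sleep(poll_seconds)  # still notStarted or running
    raise TimeoutError("Analysis did not finish within the poll budget")


# Stubbed responses standing in for repeated GETs on the operation URL.
fake_responses = iter([
    {"status": "notStarted"},
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {"version": "2.1.0"}},
])
result = wait_for_result(lambda: next(fake_responses), sleep=lambda seconds: None)
print(result)
```

In real use, `fetch_status` would issue a GET with the subscription key header against the URL that the submit call returns in its `Operation-Location` response header.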

Abbreviations:

REST: Representational State Transfer
API: Application Programming Interface
SDK: Software Development Kit
OCR: Optical Character Recognition
GA: General Availability
CORS: Cross-Origin Resource Sharing
SAS: Shared Access Signature
URI: Uniform Resource Identifier

Kovida Vegi

2024 Acuvate. All rights reserved
