DOCUMENT ENTITY EXTRACTION USING DOCUMENT REGION DETECTION

Bibliographic details
Main authors:	Singh, Harvarinder; Zeng, Zhihong; Chouksey, Ankit; Chaudhry, Anwar; Chandrasekhar, Rajesh; Kumar, Sandeep
Format:	Patent
Language:	English
Subjects:	CALCULATING; COMPUTING; COUNTING / ELECTRIC DIGITAL DATA PROCESSING / PHYSICS

Description
Abstract:	In some embodiments, techniques for document entity extraction are provided. For example, a process may involve processing document images to detect a plurality of regions of interest that includes text objects and non-text objects; for each of the plurality of regions of interest, producing a corresponding text string; and processing the text strings to identify entities. Processing the document images may involve applying a text object detection model to the document images to detect the text objects; and applying at least one non-text object detection model to the document images to detect the non-text objects. Prior to processing the document images, at least two object detection models among the text object detection model and the at least one non-text object detection model were generated by fine-tuning respective instances of a pre-trained object detection model.
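
The three stages named in the abstract (region detection with separate text and non-text detectors, one text string per region, entity identification over the strings) can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`Region`, `Entity`, `detect_regions`, `extract_entities`, and the `toy_*` stand-ins) is hypothetical, the detectors and the entity finder are trivial placeholders rather than fine-tuned object detection models, and the "image" is a plain dictionary so that the example runs without any imaging library.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical (x0, y0, x1, y1) pixel box


@dataclass
class Region:
    kind: str  # e.g. "text" or "table"; labels are assumptions
    box: Box


@dataclass
class Entity:
    label: str
    value: str


Detector = Callable[[dict], List[Region]]


def detect_regions(image: dict, text_detector: Detector,
                   non_text_detectors: List[Detector]) -> List[Region]:
    """Apply the text detector, then each non-text detector, to one image."""
    regions = list(text_detector(image))
    for detector in non_text_detectors:
        regions.extend(detector(image))
    return regions


def extract_entities(images, text_detector, non_text_detectors,
                     region_to_text, find_entities) -> List[Entity]:
    """Detect regions, produce one text string per region, identify entities."""
    strings = []
    for image in images:
        for region in detect_regions(image, text_detector, non_text_detectors):
            strings.append(region_to_text(image, region))
    return find_entities(strings)


# --- Toy stand-ins (assumptions, not real models) -------------------------
# The "image" maps a region kind to {box: text content} so no OCR is needed.
page = {
    "text": {(0, 0, 100, 20): "Invoice date: 2021-03-04"},
    "table": {(0, 30, 100, 80): "Total: 125.50 USD"},
}


def toy_text_detector(image):
    return [Region("text", box) for box in image["text"]]


def toy_table_detector(image):
    return [Region("table", box) for box in image["table"]]


def toy_region_to_text(image, region):
    # Stands in for OCR or table parsing of the cropped region.
    return image[region.kind][region.box]


def toy_find_entities(strings):
    # Stands in for an entity recognition model; uses two simple patterns.
    entities = []
    for s in strings:
        entities += [Entity("DATE", m) for m in re.findall(r"\d{4}-\d{2}-\d{2}", s)]
        entities += [Entity("AMOUNT", m) for m in re.findall(r"\d+\.\d{2}", s)]
    return entities


if __name__ == "__main__":
    for entity in extract_entities([page], toy_text_detector, [toy_table_detector],
                                   toy_region_to_text, toy_find_entities):
        print(entity.label, entity.value)
```

In a real system, each toy detector would presumably be replaced by a separately fine-tuned instance of one pre-trained object detection model, as the abstract describes, and `toy_region_to_text` by OCR or a region-specific parser.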