DOCUMENT ENTITY EXTRACTION USING DOCUMENT REGION DETECTION

Bibliographic details
Main authors: Singh, Harvarinder; Zeng, Zhihong; Chouksey, Ankit; Chaudhry, Anwar; Chandrasekhar, Rajesh; Kumar, Sandeep
Format: Patent
Language: English
Description
Summary: In some embodiments, techniques for document entity extraction are provided. For example, a process may involve processing document images to detect a plurality of regions of interest that includes text objects and non-text objects; for each of the plurality of regions of interest, producing a corresponding text string; and processing the text strings to identify entities. Processing the document images may involve applying a text object detection model to the document images to detect the text objects, and applying at least one non-text object detection model to the document images to detect the non-text objects. Prior to processing the document images, at least two object detection models among the text object detection model and the at least one non-text object detection model were generated by fine-tuning respective instances of a pre-trained object detection model.
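
The three stages named in the summary (region detection, one text string per region, entity identification) can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in and not the patented implementation: the `Region` type, the toy detectors that read from a dict instead of running fine-tuned object detection models on pixels, the `[checkbox:...]` string encoding, and the regex-based entity step are all assumptions made for illustration. The fine-tuning step is not shown.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Region:
    """A detected region of interest (hypothetical structure)."""
    page: int
    kind: str     # "text" or a non-text label such as "checkbox"
    content: str  # stand-in for the pixels inside the region


# A detector maps (page index, page image) to the regions it finds.
# Here a "page image" is a plain dict so the sketch runs without any
# model; real detectors would be fine-tuned object detection models.
Detector = Callable[[int, dict], List[Region]]


def detect_text(page_idx: int, image: dict) -> List[Region]:
    # Assumed stand-in for the text object detection model.
    return [Region(page_idx, "text", t) for t in image.get("text", [])]


def detect_checkboxes(page_idx: int, image: dict) -> List[Region]:
    # Assumed stand-in for one non-text object detection model.
    return [Region(page_idx, "checkbox", c) for c in image.get("checkboxes", [])]


def region_to_string(region: Region) -> str:
    # Produce one text string per region; non-text regions get an
    # assumed bracketed encoding so later stages can tell them apart.
    if region.kind == "text":
        return region.content
    return f"[{region.kind}:{region.content}]"


def identify_entities(strings: List[str]) -> List[Dict[str, str]]:
    # Toy entity step: ISO dates and checkbox states only.
    entities = []
    for s in strings:
        for m in re.finditer(r"\b\d{4}-\d{2}-\d{2}\b", s):
            entities.append({"type": "DATE", "value": m.group()})
        if s.startswith("[checkbox:"):
            entities.append({"type": "CHECKBOX", "value": s[len("[checkbox:"):-1]})
    return entities


def extract_entities(images: List[dict],
                     text_detector: Detector,
                     non_text_detectors: List[Detector]) -> List[Dict[str, str]]:
    regions: List[Region] = []
    for idx, image in enumerate(images):
        regions.extend(text_detector(idx, image))
        for detector in non_text_detectors:
            regions.extend(detector(idx, image))
    strings = [region_to_string(r) for r in regions]
    return identify_entities(strings)


pages = [{"text": ["Invoice date: 2023-05-01"], "checkboxes": ["checked"]}]
result = extract_entities(pages, detect_text, [detect_checkboxes])
print(result)
```

Passing the detectors in as separate callables mirrors the summary's separation between one text object detection model and one or more non-text object detection models; further non-text detectors could be appended to the list without changing the rest of the sketch.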