SYSTEM AND METHOD FOR EXTRACTING TABULAR DATA FROM A DOCUMENT

Bibliographic details
Main authors: Yellapragada, Krishna Prasad; URS, Veena Srikanth Raje; AGGARWAL, Amit
Format: Patent
Language: eng
Description
Summary: The present invention relates to a method for extracting tabular data from a document. The method includes identifying a bordered table or a borderless table in a received document and an image of the document. The tabular data in the identified bordered table is extracted using a first and a second set of pixel coordinates from the plurality of pixel coordinates. Further, upon identifying the borderless table in the document, a first set of document coordinates of at least one row of the borderless table is determined. Furthermore, a second set of document coordinates of the at least one column corresponding to the at least one row is determined. Finally, the tabular data in the identified borderless table is extracted from the document based on the determined first and second set of document coordinates.
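
Illustration: the summary does not say how the row and column coordinates of a borderless table are determined, so the following is only a minimal sketch of one common approach, not the patented method. It assumes each word is already available with a bounding box in document coordinates (for example from a PDF text layer or OCR). The `Word` tuple layout, the function names `row_bands`, `column_spans` and `extract_table`, and the thresholds `tol` and `min_gap` are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# Hypothetical word record: (text, x0, y0, x1, y1) in document coordinates.
Word = Tuple[str, float, float, float, float]


def row_bands(words: List[Word], tol: float = 3.0) -> List[List[Word]]:
    """Group words into rows by vertical centre (assumed heuristic).

    A word joins the current row when its centre lies within `tol`
    of the centre of the first word placed in that row.
    """
    rows: List[dict] = []
    for w in sorted(words, key=lambda w: (w[2] + w[4]) / 2):
        cy = (w[2] + w[4]) / 2
        if rows and abs(cy - rows[-1]["cy"]) <= tol:
            rows[-1]["words"].append(w)
        else:
            rows.append({"cy": cy, "words": [w]})
    return [r["words"] for r in rows]


def column_spans(words: List[Word], min_gap: float = 10.0) -> List[Tuple[float, float]]:
    """Merge horizontal word extents into column spans (assumed heuristic).

    Extents separated by more than `min_gap` start a new column.
    """
    spans: List[List[float]] = []
    for x0, x1 in sorted((w[1], w[3]) for w in words):
        if spans and x0 - spans[-1][1] <= min_gap:
            spans[-1][1] = max(spans[-1][1], x1)
        else:
            spans.append([x0, x1])
    return [(a, b) for a, b in spans]


def extract_table(words: List[Word], tol: float = 3.0, min_gap: float = 10.0) -> List[List[str]]:
    """Place each word in the cell given by its row band and column span."""
    cols = column_spans(words, min_gap)
    table: List[List[str]] = []
    for row in row_bands(words, tol):
        cells: List[List[str]] = [[] for _ in cols]
        for w in sorted(row, key=lambda w: w[1]):  # left-to-right reading order
            cx = (w[1] + w[3]) / 2
            for i, (a, b) in enumerate(cols):
                if a <= cx <= b:
                    cells[i].append(w[0])
                    break
        table.append([" ".join(c) for c in cells])
    return table
```

The row bands play the role of the first set of document coordinates and the column spans the role of the second set; a cell is the intersection of the two. Fixed thresholds such as these would likely need tuning per document, since font size and column spacing vary.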