SYSTEM AND METHOD FOR EXTRACTING TABULAR DATA FROM A DOCUMENT
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The present invention relates to a method for extracting tabular data from a document. The method includes identifying a bordered table or a borderless table in a received document and in an image of the document. The tabular data in an identified bordered table is extracted using a first and a second set of pixel coordinates drawn from a plurality of pixel coordinates. When a borderless table is identified in the document, a first set of document coordinates of at least one row of the borderless table is determined, followed by a second set of document coordinates of at least one column corresponding to that row. Finally, the tabular data in the identified borderless table is extracted from the document based on the determined first and second sets of document coordinates. |
---|
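
The borderless-table step in the summary can be illustrated with a minimal Python sketch. It is not the patented method, whose details the summary does not disclose. It assumes word bounding boxes in document coordinates are already available (for example from a PDF text layer or OCR), and it uses a simple projection heuristic: merge the vertical extents of the words into row bands, merge the horizontal extents into column bands, and place each word in the cell containing its centre. The names `Word`, `merge_intervals`, `extract_borderless_table`, `row_gap` and `col_gap` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A text fragment with its bounding box in document coordinates (assumed input)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def merge_intervals(intervals, gap=0.0):
    """Merge 1-D intervals that overlap or lie within `gap` of each other."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1] + gap:
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return [tuple(band) for band in merged]


def extract_borderless_table(words, row_gap=2.0, col_gap=5.0):
    """Group words into a grid of cells using row and column coordinate bands."""
    # First set of coordinates: vertical bands, one per row.
    rows = merge_intervals([(w.y0, w.y1) for w in words], row_gap)
    # Second set of coordinates: horizontal bands, one per column.
    cols = merge_intervals([(w.x0, w.x1) for w in words], col_gap)

    table = [["" for _ in cols] for _ in rows]
    # Visit words in reading order so multi-word cells keep their word order.
    for w in sorted(words, key=lambda w: (w.y0, w.x0)):
        cy = (w.y0 + w.y1) / 2
        cx = (w.x0 + w.x1) / 2
        r = next(i for i, (lo, hi) in enumerate(rows) if lo <= cy <= hi)
        c = next(i for i, (lo, hi) in enumerate(cols) if lo <= cx <= hi)
        table[r][c] = (table[r][c] + " " + w.text).strip()
    return table
```

The gap thresholds are arbitrary defaults and would need tuning to the font size and layout of a real document. This heuristic also derives the columns from all rows at once, whereas the summary speaks of columns corresponding to a given row.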