IMAGE READING SYSTEMS, METHODS AND STORAGE MEDIUM FOR PERFORMING GEOMETRIC EXTRACTION
Main authors: | , , , , , |
---|---|
Format: | Patent |
Language: | eng |
Summary: | Geometric extraction is performed on an unstructured document by recognizing textual blocks on at least a portion of a page of the unstructured document, generating bounding boxes that surround and correspond to the textual blocks, determining search paths having coordinates of two endpoints and connecting at least two bounding boxes, and generating a graph representation of the at least a portion of the page, the graph representation including the plurality of textual blocks, the coordinates of the vertices of each bounding box and the coordinates of the two endpoints of each search path. |
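
The steps named in the summary (textual blocks, bounding boxes, search paths, graph representation) can be illustrated with a minimal Python sketch. It is not the patented method: it assumes text recognition has already produced blocks with axis-aligned boxes, and the names `TextBlock` and `build_page_graph`, the choice of box centers as search-path endpoints, and the `max_distance` threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class TextBlock:
    """A recognized textual block with an axis-aligned bounding box (assumed input)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float

    def vertices(self):
        # Four corners of the bounding box, clockwise from the top-left.
        return [(self.x0, self.y0), (self.x1, self.y0),
                (self.x1, self.y1), (self.x0, self.y1)]

    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


def build_page_graph(blocks, max_distance=200.0):
    """Build a graph holding the blocks, their box vertices, and search paths.

    Assumption: a search path is a straight segment between the centers of
    two boxes, kept only if it is no longer than max_distance.
    """
    nodes = [{"text": b.text, "vertices": b.vertices()} for b in blocks]
    search_paths = []
    for (i, a), (j, b) in combinations(enumerate(blocks), 2):
        (ax, ay), (bx, by) = a.center(), b.center()
        if hypot(ax - bx, ay - by) <= max_distance:
            search_paths.append({"boxes": (i, j),
                                 "endpoints": ((ax, ay), (bx, by))})
    return {"nodes": nodes, "search_paths": search_paths}


if __name__ == "__main__":
    page = [
        TextBlock("Invoice No.", 10, 10, 90, 30),
        TextBlock("12345", 100, 10, 150, 30),
        TextBlock("Total", 10, 500, 60, 520),
    ]
    graph = build_page_graph(page)
    for path in graph["search_paths"]:
        print(path["boxes"], path["endpoints"])
```

With the sample page, only the label "Invoice No." and the adjacent value "12345" lie close enough to be joined by a search path; the distant "Total" block remains a node without edges. A real system would likely choose endpoints and path criteria differently, for example by reading direction or alignment.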