APPROXIMATING THE LAYOUT OF A PAPER DOCUMENT

Bibliographic details
Main author: Bellert, Darrell Eugene
Format: Patent
Language: English
Description
Summary: An image processing method to generate a layout of searchable content from a physical document. The method includes generating extracted content blocks in the physical document; generating, based on a bounding box of a text block, a layout rectangle that identifies where machine-encoded text is placed in the layout of the searchable content; generating, based on a bounding box of a non-text block, an avoidance region that identifies where the machine-encoded text is prohibited in the layout of the searchable content; generating, based on the layout rectangle and the avoidance region, a draft layout of the searchable content; and iteratively adjusting a point size of the machine-encoded text in the draft layout to generate the layout of the searchable content.
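
The steps in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how text is measured or how the point size is adjusted, so the names `Box`, `estimated_height`, `fit_point_size`, and `draft_layout` are hypothetical, as are the text metric (glyphs 0.5 em wide, lines 1.2 em tall), the 12 pt starting size, the 4 pt floor, and the 0.5 pt step. Layout rectangles and avoidance regions are taken directly from the bounding boxes, which is also an assumption.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in points; (x, y) is the top-left corner."""
    x: float
    y: float
    width: float
    height: float

    def overlaps(self, other: "Box") -> bool:
        # Boxes that only share an edge are not treated as overlapping.
        return (self.x < other.x + other.width and other.x < self.x + self.width
                and self.y < other.y + other.height and other.y < self.y + self.height)


def estimated_height(text: str, point_size: float, width: float) -> float:
    """Crude metric (assumption): glyphs are 0.5 em wide, lines are 1.2 em tall."""
    chars_per_line = max(1, int(width // (0.5 * point_size)))
    lines = math.ceil(len(text) / chars_per_line)
    return lines * 1.2 * point_size


def fit_point_size(text: str, rect: Box, start: float = 12.0,
                   minimum: float = 4.0, step: float = 0.5) -> float:
    """Iteratively shrink the point size until the text fits the layout rectangle."""
    size = start
    while size > minimum and estimated_height(text, size, rect.width) > rect.height:
        size -= step
    return max(size, minimum)


def draft_layout(text_blocks, non_text_boxes):
    """text_blocks: list of (text, bounding Box); returns (layout rectangle, point size) pairs."""
    avoidance = list(non_text_boxes)  # avoidance regions = non-text bounding boxes (assumption)
    layout = []
    for text, bbox in text_blocks:
        # Text is prohibited inside avoidance regions, so reject overlapping rectangles.
        if any(bbox.overlaps(region) for region in avoidance):
            raise ValueError("layout rectangle intersects an avoidance region")
        layout.append((bbox, fit_point_size(text, bbox)))
    return layout
```

A real system would measure text with actual font metrics and would likely reshape a layout rectangle around an avoidance region rather than reject it; the sketch only shows how the rectangle, the avoidance check, and the iterative point-size loop fit together.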