Estimation of document structure
Main author: | |
---|---|
Format: | Patent |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | A system and method for estimating the structure of a document. The method extracts one or more candidate elements describing the document structure from the document, groups the candidate elements into a group, and builds one or more trees for the group. Each tree has a root node and a leaf node selected from the candidate elements in the group. The method then prunes the one or more trees while leaving a path from the root node to the leaf node, based on whether the text corresponding to the path to the leaf node is accommodated in a single group of words. |
---|
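
The build-and-prune steps in the summary can be illustrated with a small sketch. This is not the patented method: the record gives no implementation details, so everything here is an assumption made for illustration. In particular, the `Node` class, the `(depth, text)` candidate format, the functions `build_tree` and `prune`, and the reading of "accommodated in a single group of words" as "the joined path text occurs inside one word-group string" are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One candidate element (hypothetical representation)."""
    text: str
    children: List["Node"] = field(default_factory=list)


def build_tree(group: List[Tuple[int, str]]) -> Node:
    """Build a tree from a group of (depth, text) candidates in document order.

    Assumption: the first candidate is the root and depth encodes nesting.
    """
    root = Node(group[0][1])
    stack = [(group[0][0], root)]
    for depth, text in group[1:]:
        node = Node(text)
        # Climb back up until the top of the stack is a shallower node.
        while len(stack) > 1 and stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1].children.append(node)
        stack.append((depth, node))
    return root


def prune(node: Node, word_groups: List[str], prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return the root-to-leaf paths that survive pruning.

    Assumption: a path survives if its joined text occurs within a single
    word group; all other paths are dropped.
    """
    path = prefix + (node.text,)
    if not node.children:
        joined = " ".join(path)
        return [path] if any(joined in g for g in word_groups) else []
    kept: List[Tuple[str, ...]] = []
    for child in node.children:
        kept.extend(prune(child, word_groups, path))
    return kept


# Toy input: one group of candidates and two word groups from the document.
candidates = [(0, "Chapter 1"), (1, "Section 1.1"), (1, "Section 1.2")]
word_groups = ["Chapter 1 Section 1.1 Overview", "Section 1.2 Details"]
print(prune(build_tree(candidates), word_groups))
# prints [('Chapter 1', 'Section 1.1')]
```

In this toy input, the path through "Section 1.2" is pruned because no single word group contains the full path text, while the path through "Section 1.1" is kept.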