FORM AND TEMPLATE DETECTION
Main authors: | , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | Methods, systems, and computer program products for content management systems. A content management system is configured to manage a plurality of content objects. Unsupervised learning is performed over the plurality of content objects to identify document templates that are associated with content objects taken from the plurality of content objects. When a document template is identified, template metadata is associated with the document template. Additional content objects that are similar to the document template can take on the template metadata as well. In this way, many documents can be automatically populated with template metadata that corresponds to the identified document template. All or portions of the template metadata can be applied to policies, which serve to marshal ongoing document handling operations. During learning, document features are extracted and analyzed so as to define feature clusters, which in turn are used to form document template clusters. |
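The flow described in the summary (extract features, cluster similar documents into templates, then let similar documents inherit the template metadata) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the feature choice (lines ending in a colon, treated as form-field labels), the Jaccard similarity measure, the greedy single-pass clustering, the 0.6 threshold, and all names such as `extract_features`, `TemplateCluster`, and `metadata_for`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def extract_features(text: str) -> frozenset[str]:
    """Hypothetical feature extractor: lines ending in ':' count as field labels."""
    return frozenset(
        line.strip().lower()
        for line in text.splitlines()
        if line.strip().endswith(":")
    )


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Jaccard similarity of two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class TemplateCluster:
    features: frozenset[str]  # features of the first member, used as representative
    members: list[str] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)


def best_match(feats: frozenset[str], clusters: list[TemplateCluster],
               threshold: float) -> TemplateCluster | None:
    """Return the most similar cluster if it meets the threshold, else None."""
    best = max(clusters, key=lambda c: jaccard(c.features, feats), default=None)
    if best is not None and jaccard(best.features, feats) >= threshold:
        return best
    return None


def cluster_documents(docs: dict[str, str],
                      threshold: float = 0.6) -> list[TemplateCluster]:
    """Greedy single-pass clustering; a stand-in for the unsupervised learning step."""
    clusters: list[TemplateCluster] = []
    for doc_id, text in docs.items():
        feats = extract_features(text)
        match = best_match(feats, clusters, threshold)
        if match is not None:
            match.members.append(doc_id)
        else:
            clusters.append(TemplateCluster(features=feats, members=[doc_id]))
    return clusters


def metadata_for(text: str, clusters: list[TemplateCluster],
                 threshold: float = 0.6) -> dict[str, str]:
    """A new document takes on the metadata of the template it resembles, if any."""
    match = best_match(extract_features(text), clusters, threshold)
    return dict(match.metadata) if match is not None else {}
```

In this sketch, metadata would be attached to a cluster once it is recognized as a template (for example `clusters[0].metadata = {"template": "tax-form", "retention": "7y"}`), after which `metadata_for` returns a copy of that metadata for any sufficiently similar new document. A policy engine could then read keys such as the hypothetical `retention` entry to drive document handling.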