Method and system for document indexing and retrieval


Detailed description

Bibliographic details
Main authors: Ansari, Saad; Patel, Hemil; Tripathy, Saswati Soumya; Thakare, Shreya Sanjay; Rana, Rahul; Shah, Pranav Champaklal; Poojary, Sudhakara Deva
Format: Patent
Language: English
Description
Summary: Existing systems for document processing are based on a supervised approach using annotated tags. These systems identify section-based data in unstructured documents without considering the statistical variations in content, which results in highly inaccurate content extraction. The disclosure herein generally relates to document processing and, more particularly, to a method and system for document indexing and retrieval. The system provides a mechanism to correlate unique words in a document with the different topics identified in that document, based on a word pattern identified from the document. The correlations are captured in a knowledge graph and can be further used in applications such as, but not limited to, document retrieval.
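
The summary does not say how the word pattern or the topics are computed, so the following Python sketch is only an illustration of the general idea, not the patented method. It assumes that each document is already split into sections labelled with a topic, and it swaps in plain co-occurrence counting for the word pattern. The names `tokenize`, `build_graph`, `retrieve`, the stopword list, and the sample documents are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_graph(documents):
    """Build a word -> topic -> {doc_id: count} structure.

    Assumption: `documents` maps a document id to {topic: section_text},
    i.e. topics are given up front rather than discovered.
    """
    graph = defaultdict(lambda: defaultdict(dict))
    for doc_id, sections in documents.items():
        for topic, text in sections.items():
            for word in tokenize(text):
                counts = graph[word][topic]
                counts[doc_id] = counts.get(doc_id, 0) + 1
    return graph


def retrieve(graph, query):
    """Rank document ids by the summed edge weights of the query words."""
    scores = defaultdict(int)
    for word in tokenize(query):
        for docs in graph.get(word, {}).values():
            for doc_id, count in docs.items():
                scores[doc_id] += count
    # Highest score first; ties are broken alphabetically by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Made-up sample data for illustration only.
docs = {
    "doc1": {
        "payment": "Invoice payment is due in thirty days",
        "liability": "The supplier accepts liability for defects",
    },
    "doc2": {"payment": "Payment by bank transfer"},
}
graph = build_graph(docs)
```

Calling `retrieve(graph, "liability defects")` walks the edges for both query words and returns only the documents that contain them. Storing the counts per topic keeps the word-to-topic correlation visible, so a query could also be restricted to a single topic.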