METHOD AND APPARATUS FOR AUTOMATIC ENTITY DISAMBIGUATION

Bibliographic details
Main authors: BLUME, MATTHIAS; ZOLDI, SCOTT; FREITAG, DAYNE; CALMBACH, RICHARD; ROHWER, RICHARD
Format: Patent
Language: English; French; German
Description
Summary: Entity disambiguation resolves which names, words, or phrases in text correspond to distinct persons, organizations, locations, or other entities in the context of an entire corpus. The invention is based largely on language-independent algorithms. It is therefore applicable not only to unstructured text in arbitrary human languages, but also to semi-structured data such as citation databases, and to disambiguating named entities mentioned in wire transfer transaction records in order to detect money-laundering activity. The system uses multiple types of context as evidence for determining whether two mentions correspond to the same entity, and it automatically learns the weight of evidence of each context item from corpus statistics. The invention uses multiple search keys to find pairs of mentions that correspond to the same entity efficiently, skipping billions of unnecessary comparisons. The result is a system with very high throughput that can be applied to very large data sets.
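
The two ideas in the summary, context items weighted by corpus statistics and search keys that limit which mention pairs are compared, can be illustrated with a minimal Python sketch. It is not the patented method: the inverse-document-frequency weighting, the name-derived search keys, the toy mentions, the threshold, and all function names are assumptions chosen for illustration only.

```python
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical toy mentions: (mention id, surface name, set of context items).
MENTIONS = [
    ("m1", "J. Smith", {"acme corp", "boston", "wire transfer"}),
    ("m2", "John Smith", {"acme corp", "boston"}),
    ("m3", "John Smith", {"wire transfer", "lisbon"}),
    ("m4", "Jane Doe", {"wire transfer", "globex"}),
]


def context_weights(mentions):
    """Assumed weighting: inverse document frequency over mentions,
    so context items that are rare in the corpus carry more evidence."""
    doc_freq = defaultdict(int)
    for _, _, context in mentions:
        for item in context:
            doc_freq[item] += 1
    total = len(mentions)
    return {item: math.log(total / count) for item, count in doc_freq.items()}


def search_keys(name):
    """Assumed keys: the surname, and the first initial joined to the surname."""
    parts = name.lower().replace(".", "").split()
    return {parts[-1], parts[0][0] + "_" + parts[-1]}


def candidate_pairs(mentions):
    """Pair up only mentions that share at least one search key,
    instead of comparing every mention with every other mention."""
    index = defaultdict(set)
    for mention_id, name, _ in mentions:
        for key in search_keys(name):
            index[key].add(mention_id)
    pairs = set()
    for ids in index.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def score(context_a, context_b, weights):
    """Sum the weights of the context items that both mentions share."""
    return sum(weights[item] for item in context_a & context_b)


def disambiguate(mentions, threshold):
    """Return candidate pairs whose shared-context score reaches the threshold."""
    weights = context_weights(mentions)
    contexts = {mention_id: context for mention_id, _, context in mentions}
    matches = {}
    for a, b in candidate_pairs(mentions):
        pair_score = score(contexts[a], contexts[b], weights)
        if pair_score >= threshold:
            matches[(a, b)] = pair_score
    return matches


if __name__ == "__main__":
    for pair, pair_score in sorted(disambiguate(MENTIONS, threshold=1.0).items()):
        print(pair, round(pair_score, 3))
```

In this toy corpus the search keys yield three candidate pairs instead of the six possible ones, because "Jane Doe" shares no key with the Smith mentions. Of those three, only m1 and m2 share enough rare context to pass the assumed threshold; the common item "wire transfer" alone contributes too little weight to link m1 and m3.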