Disambiguation of massive graph databases

Bibliographic details
Main authors: Srinivas, Sudhir; Geraghty, Kevin
Format: Patent
Language: English
Description
Summary: Certain aspects provide techniques for disambiguating graph data. In one example, a method includes receiving entity data from a data source in a first format; converting the entity data in the first format to a second format, wherein the second format is a standardized input format for a disambiguation pipeline; determining a blocked data set from the entity data in the second format based on a blocking parameter, wherein: the blocked data set comprises data regarding a first plurality of entities, and the first plurality of entities is a subset of a second plurality of entities represented in the entity data from the data source; matching at least two entities in the first plurality of entities in the blocked data set; merging the at least two entities into a single entity; generating a unique ID for the single entity; and importing the single entity into a graph database.
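
The steps named in the summary (convert, block, match, merge, assign an ID, import) can be illustrated with a minimal Python sketch. This is not the patented implementation: the record fields (Name, City), the choice of city as the blocking parameter, the string-similarity threshold, the merge rule, and the in-memory dict standing in for a graph database are all assumptions made only for illustration.

```python
import uuid
from collections import defaultdict
from difflib import SequenceMatcher


def to_standard(record):
    """Convert a source record to a hypothetical standardized format."""
    return {
        "name": record["Name"].strip().lower(),
        "city": record["City"].strip().lower(),
    }


def block(records, key):
    """Group records by the blocking parameter so only small subsets are compared."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[rec[key]].append(rec)
    return blocks


def is_match(a, b, threshold=0.85):
    """Assumed matching rule: names are similar enough as strings."""
    return SequenceMatcher(None, a["name"], b["name"]).ratio() >= threshold


def merge(cluster):
    """Assumed merge rule: keep the longest name and count the merged records."""
    return {
        "name": max((rec["name"] for rec in cluster), key=len),
        "city": cluster[0]["city"],
        "sources": len(cluster),
    }


def disambiguate(raw_records, blocking_key="city"):
    """Run the sketched pipeline and return a dict standing in for a graph database."""
    standardized = [to_standard(rec) for rec in raw_records]
    graph = {}  # entity ID -> merged entity (placeholder for a real graph store)
    for blocked in block(standardized, blocking_key).values():
        clusters = []
        for rec in blocked:
            # Compare against the first record of each existing cluster.
            for cluster in clusters:
                if is_match(cluster[0], rec):
                    cluster.append(rec)
                    break
            else:
                clusters.append([rec])
        for cluster in clusters:
            entity = merge(cluster)
            graph[str(uuid.uuid4())] = entity  # unique ID for the single entity
    return graph


if __name__ == "__main__":
    raw = [
        {"Name": "Acme Corp", "City": "Austin"},
        {"Name": "Acme Corp.", "City": "Austin"},
        {"Name": "Globex", "City": "Austin"},
        {"Name": "Acme Corp", "City": "Boston"},
    ]
    for entity_id, entity in disambiguate(raw).items():
        print(entity_id, entity)
```

Blocking keeps the comparison cost down: records are compared only within the same block rather than across the whole data set, which is the reason the summary describes the blocked data set as a subset of the entities in the source data.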