SYSTEM AND METHOD FOR AUTOMATIC DATA ENRICHMENT FROM MULTIPLE PUBLIC DATASETS IN DATA INTEGRATION TOOLS

Bibliographic Details
Main authors: Ramos, Jo A; Bhide, Manish A
Format: Patent
Language: English
Description
Summary: A source dataset is enriched by standardization of address data, date and time analysis, and demographic analysis. The enriched source dataset is used to form one or more distinct clusters, each of which is a unique combination of values for one or more attributes of the enriched source dataset. One or more related datasets are found for each cluster, and the related datasets are merged into the enriched source dataset using a distributed join operation. The distributed join allows each row of the source dataset to be joined with a different related dataset, namely the one closest to the cluster to which the row belongs.
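
The clustering and per-cluster join described in the summary might be sketched roughly as follows. This is an illustrative, single-process Python sketch, not the patented implementation: the function names, the `descriptor` and `lookup` fields, and the closeness measure (counting matching attribute values) are all assumptions introduced here, and no actual distribution of work across nodes is shown.

```python
from collections import defaultdict


def form_clusters(rows, attributes):
    """Group row indices by their unique combination of attribute values."""
    clusters = defaultdict(list)
    for index, row in enumerate(rows):
        key = tuple(row[name] for name in attributes)
        clusters[key].append(index)
    return dict(clusters)


def closest_dataset(cluster_key, related_datasets):
    """Pick the related dataset closest to a cluster.

    Assumption: closeness is the number of positions at which the dataset's
    hypothetical 'descriptor' tuple equals the cluster key. The source does
    not say how closeness is measured.
    """
    def overlap(dataset):
        return sum(1 for a, b in zip(cluster_key, dataset["descriptor"]) if a == b)

    # On a tie, max() keeps the first dataset in the list.
    return max(related_datasets, key=overlap)


def enrich(rows, attributes, related_datasets, join_attribute):
    """Join every row with the related dataset chosen for its own cluster."""
    enriched = [dict(row) for row in rows]  # copy so the input is not mutated
    for key, indices in form_clusters(rows, attributes).items():
        dataset = closest_dataset(key, related_datasets)
        for i in indices:
            # 'lookup' maps a join-attribute value to the extra columns to add.
            extra = dataset["lookup"].get(enriched[i][join_attribute], {})
            enriched[i].update(extra)
    return enriched
```

Because the related dataset is chosen per cluster rather than once for the whole table, two rows with the same join value can receive different extra columns when they fall into different clusters.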