SYSTEM AND METHOD FOR AUTOMATIC DATA ENRICHMENT FROM MULTIPLE PUBLIC DATASETS IN DATA INTEGRATION TOOLS
Main authors: | , |
---|---|
Format: | Patent |
Language: | eng |
Summary: | A source dataset is enriched by standardization of address data, date and time analysis, and demographic analysis. The enriched source dataset is used to form one or more distinct clusters that are unique combinations of values for one or more attributes of the enriched source dataset. One or more related datasets are found for each of the clusters, and the related datasets are merged into the enriched source dataset using a distributed join operation, wherein the distributed join allows each row of the source dataset to be joined with a different one of the related datasets, where the different one of the related datasets is closest to the cluster to which the row belongs. |
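The clustering and per-row join described in the summary can be illustrated with a minimal single-process sketch. It is not the patented implementation: the function names (`form_clusters`, `closest_dataset`, `enrich`), the "closeness" measure (number of matching key values), and all sample values are hypothetical, and the enrichment steps (address standardization, date and time analysis, demographic analysis) as well as the distributed execution of the join are left out.

```python
from collections import defaultdict


def form_clusters(rows, attributes):
    """Group row indices by their unique combination of attribute values."""
    clusters = defaultdict(list)
    for index, row in enumerate(rows):
        key = tuple(row[name] for name in attributes)
        clusters[key].append(index)
    return dict(clusters)


def closest_dataset(cluster_key, related):
    """Return the related dataset whose key shares the most values with the cluster.

    Assumption: closeness is the count of matching key positions. The source
    does not define the measure.
    """
    def score(dataset):
        return sum(1 for a, b in zip(cluster_key, dataset["key"]) if a == b)

    return max(related, key=score)


def enrich(rows, attributes, related):
    """Join each row with the related dataset closest to the row's cluster."""
    enriched = [dict(row) for row in rows]  # copy so the input is untouched
    for key, indices in form_clusters(rows, attributes).items():
        dataset = closest_dataset(key, related)
        for index in indices:
            enriched[index].update(dataset["columns"])
            enriched[index]["joined_from"] = dataset["name"]
    return enriched


# Made-up sample data, for illustration only.
rows = [
    {"id": 1, "state": "CA", "year": 2020},
    {"id": 2, "state": "NY", "year": 2020},
    {"id": 3, "state": "CA", "year": 2020},
]
related = [
    {"name": "ca_2020", "key": ("CA", 2020), "columns": {"value": 100}},
    {"name": "ny_2020", "key": ("NY", 2020), "columns": {"value": 200}},
]

enriched = enrich(rows, ["state", "year"], related)
for row in enriched:
    print(row["id"], row["joined_from"], row["value"])
```

Rows 1 and 3 fall into the same cluster and are joined with one related dataset, while row 2 is joined with a different one, which is the per-row behavior the summary attributes to the join.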