Improvement of snapshot differential algorithm based on hadoop platform

Bibliographic details
Main authors: Guoyong Yuan, Bin Li, Taiyang Xiao
Format: Conference proceedings
Language: English
Description
Summary: The snapshot differential algorithm is one way of extracting deltas from views in a data warehouse in a data integration setting. Because the views in a data warehouse can be very large, running the snapshot differential algorithm can take a long time and become the bottleneck of system performance. To improve the efficiency of the snapshot differential algorithm, this paper modifies the traditional Partition Hash algorithm to run on a massive data processing platform, reducing computation time. The paper closes with a test that demonstrates the efficiency improvement after the modification.
DOI:10.1109/CSQRWC.2011.6037179
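
The summary names the technique but does not spell it out, so the following is only a minimal, single-process sketch of a generic partitioned-hash snapshot differential, not the authors' modified algorithm. It assumes each snapshot is a list of (key, value) records with unique keys. The function names `partition`, `diff_partition`, and `snapshot_diff`, the CRC32 hash, and the partition count are all illustrative assumptions. The partition step plays the role a map phase would play on a platform such as Hadoop, and the per-partition comparison plays the role of a reduce phase.

```python
import zlib
from collections import defaultdict


def partition(records, num_partitions):
    """Route each (key, value) record to a bucket by hashing its key.

    Assumption: keys are unique within a snapshot. CRC32 is used only
    because it is deterministic across runs; any stable hash would do.
    """
    buckets = defaultdict(dict)
    for key, value in records:
        idx = zlib.crc32(str(key).encode("utf-8")) % num_partitions
        buckets[idx][key] = value
    return buckets


def diff_partition(old_bucket, new_bucket):
    """Compare one old bucket with the matching new bucket.

    Both buckets hold the same slice of the key space, so they can be
    compared independently of every other partition.
    """
    delta = []
    for key, new_value in new_bucket.items():
        if key not in old_bucket:
            delta.append(("insert", key, new_value))
        elif old_bucket[key] != new_value:
            delta.append(("update", key, new_value))
    for key, old_value in old_bucket.items():
        if key not in new_bucket:
            delta.append(("delete", key, old_value))
    return delta


def snapshot_diff(old_snapshot, new_snapshot, num_partitions=4):
    """Return the sorted delta that turns old_snapshot into new_snapshot."""
    old_parts = partition(old_snapshot, num_partitions)
    new_parts = partition(new_snapshot, num_partitions)
    delta = []
    for idx in range(num_partitions):
        # On a cluster, each iteration could run as a separate task.
        delta.extend(
            diff_partition(old_parts.get(idx, {}), new_parts.get(idx, {}))
        )
    return sorted(delta)


if __name__ == "__main__":
    old = [(1, "a"), (2, "b"), (3, "c")]
    new = [(1, "a"), (2, "B"), (4, "d")]
    for change in snapshot_diff(old, new):
        print(change)
```

Because both snapshots are partitioned with the same hash function, a given key always lands in the same partition on both sides, which is what allows the partitions to be compared independently and, in principle, in parallel.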