Hadoop offline data incremental updating method and system and storable medium

Bibliographic details
Main authors: LIU YUAN, WANG JICHUAN, BI YONGHUI, ZHOU CHENGZU, YAN XIAOZHENG, TANG CHENGWU, PENG CHONGLIN
Format: Patent
Language: chi; eng
Description
Summary: The invention provides a method and system for incrementally updating Hadoop offline data, and a storage medium. A physical table, a temporary table and a deletion table are designed in an intermediate database and, in combination with a data monitoring service and a data updating service of the Hadoop cluster, are used to update the data of the ORC and Parquet files written into the Hadoop cluster's offline database. The offline ORC and Parquet files are recorded at the same time, the data monitoring service monitors data changes, and the data updating service incrementally updates the offline data. This solves the problem of incrementally updated data being stored repeatedly in the Hadoop cluster and ensures the accuracy and freshness of operation results. The method is suitable for scenarios with high requirements on data quality,
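
The summary does not spell out how the three tables interact, so the following is only a minimal sketch of one plausible reading, not the patented procedure. It assumes that rows carry a unique key, that the temporary table holds new or changed rows, that the deletion table holds the keys of removed rows, and that a deletion takes precedence over an update to the same key. The function name `apply_increment` and the `id` key column are hypothetical.

```python
from typing import Dict, Iterable, Set

Row = Dict[str, object]


def apply_increment(
    physical: Dict[object, Row],
    temporary: Iterable[Row],
    deletion: Set[object],
    key: str = "id",
) -> Dict[object, Row]:
    """Return a new snapshot of the physical table after one increment.

    Assumed semantics (not taken from the patent text):
    - rows in `temporary` are upserted by `key`, so a changed row
      replaces its old version instead of being stored twice;
    - keys listed in `deletion` are removed, even if they were
      also present in `temporary`.
    """
    merged = dict(physical)  # copy, so the input snapshot is left untouched
    for row in temporary:
        merged[row[key]] = row  # upsert keyed on the primary key
    for deleted_key in deletion:
        merged.pop(deleted_key, None)  # ignore keys that are already absent
    return merged


physical_table = {1: {"id": 1, "v": "a"}, 2: {"id": 2, "v": "b"}}
temporary_table = [{"id": 2, "v": "b2"}, {"id": 3, "v": "c"}]
deletion_table = {1}

snapshot = apply_increment(physical_table, temporary_table, deletion_table)
```

Keying the merge on a unique identifier is what keeps an updated row from appearing twice in the result, which is the duplicate-storage problem the summary refers to. In a real cluster the merged snapshot would be written back as ORC or Parquet files, a step this sketch leaves out.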