Hadoop offline data incremental updating method and system and storable medium
Main authors: | , , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online access: | Order full text |
Summary: | The invention provides a method and system for incrementally updating Hadoop offline data, and a storage medium. A physical table, a temporary table and a deletion table are designed in an intermediate database. Together with a data monitoring service and a data updating service of the Hadoop cluster, these tables are used to update the data of ORC and Parquet files that have already been written into the Hadoop cluster's offline database. The offline ORC and Parquet files are recorded at the same time, the data monitoring service monitors data changes, and the data updating service applies incremental updates to the offline data. This solves the problem of incrementally updated data being stored repeatedly in the Hadoop cluster and ensures the accuracy and freshness of computation results. The method is suitable for scenarios that place high requirements on data quality, |
---|
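
The merge step described in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the function name `apply_increment`, the `id` key column, and the rule that deletions are applied after upserts are all assumptions made for illustration. Plain Python dictionaries stand in for the rows of an ORC or Parquet file, for the temporary table (staged changes), and for the deletion table (keys to remove).

```python
from typing import Any, Dict, Hashable, Iterable, List

Row = Dict[str, Any]


def apply_increment(
    base_rows: Iterable[Row],
    staged_rows: Iterable[Row],
    deleted_keys: Iterable[Hashable],
    key: str = "id",  # hypothetical primary-key column
) -> List[Row]:
    """Merge staged changes into existing rows without duplicating keys."""
    # Index the existing offline rows by primary key.
    merged: Dict[Hashable, Row] = {row[key]: row for row in base_rows}

    # Rows from the (assumed) temporary table overwrite or extend the base.
    for row in staged_rows:
        merged[row[key]] = row

    # Keys from the (assumed) deletion table are dropped last,
    # so a deletion wins over a staged update for the same key.
    for k in deleted_keys:
        merged.pop(k, None)

    return list(merged.values())


base = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
staged = [{"id": 2, "v": "B"}, {"id": 4, "v": "d"}]
result = apply_increment(base, staged, deleted_keys=[3])
print(result)
```

Because every key appears at most once in the merged result, rewriting the file from `result` avoids storing both the old and the new version of an updated row, which is the duplication problem the summary refers to.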