Loading method and system for automatically loading big data

Bibliographic details
Main authors: HU XIAOMING, ZHANG YUNLIANG, WEI DONGWANG, LIAO YUNWEN, TANG JIN
Format: Patent
Language: chi; eng
Description
Abstract: The invention provides a method and system for automatically loading big data. The method comprises the following steps: checking a data file and its data mark file and uploading both to the Hadoop Distributed File System (HDFS); carrying out initialization configuration, processing the data produced by the check-and-upload step, slicing it, and writing the slices into cache regions on different machines of the cluster; and writing the data in the cache regions into HDFS to form a new file, and automatically reading, through Hive, the new file written into the HDFS directory. According to the method, the automatic loading speed of big data is increased, the fault tolerance of MapReduce is improved, the possibility of data skew is reduced, and the requirements of textFi, Parquet, ORC and other different formats required after the big data is read in are basically covered, so that an automatic loadin
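
The check, slice, and write steps named in the abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation, and everything specific in it is an assumption made for illustration: the mark file is modeled as a dictionary holding a byte size and an MD5 digest, slices are assigned by hashing each whole record (one simple way to spread records evenly and so limit skew), and the function names check_against_mark, slice_records, and flush are hypothetical. No Hadoop or Hive API is called; the returned dictionary only stands in for files that would be written to HDFS.

```python
import hashlib
from collections import defaultdict


def check_against_mark(data: bytes, mark: dict) -> bool:
    """Return True if the data matches the size and MD5 recorded in the mark.

    Assumption: the mark file carries a byte size and an MD5 digest.
    The source does not say what the data mark file contains.
    """
    return (
        len(data) == mark["size"]
        and hashlib.md5(data).hexdigest() == mark["md5"]
    )


def slice_records(records, num_buffers):
    """Assign each record to one of num_buffers in-memory buffers.

    Assumption: the slice index comes from a stable hash of the whole
    record, so identical keys do not pile up in a single buffer.
    """
    buffers = defaultdict(list)
    for rec in records:
        digest = hashlib.md5(rec.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "big") % num_buffers
        buffers[idx].append(rec)
    return buffers


def flush(buffers, prefix="part"):
    """Return {file_name: content}, one entry per non-empty buffer.

    The dictionary stands in for new files written to an HDFS directory.
    """
    return {
        f"{prefix}-{idx:05d}": "\n".join(recs) + "\n"
        for idx, recs in sorted(buffers.items())
    }


if __name__ == "__main__":
    data = b"1,a\n2,b\n3,c\n"
    mark = {"size": len(data), "md5": hashlib.md5(data).hexdigest()}
    if check_against_mark(data, mark):
        files = flush(slice_records(data.decode("utf-8").splitlines(), 2))
        for name, content in files.items():
            print(name, repr(content))
```

In a real deployment, a Hive external table whose location is the output directory would pick up such files without a separate load step; that wiring is outside this sketch.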