Loading method and system for automatically loading big data
Format: | Patent |
---|---|
Language: | chi ; eng |
Summary: | The invention provides a loading method and system for automatically loading big data. The loading method comprises the following steps: checking a data file and a data mark file and uploading them to the Hadoop Distributed File System (HDFS); performing initialization configuration, processing the data produced in the checking and uploading step, slicing the processed data, and writing the slices into cache regions on different machines in the cluster; and writing the data in the cache regions to HDFS as a new file, which Hive then reads automatically from the HDFS directory it was written to. According to the method, the automatic loading speed of big data is increased, the fault tolerance of MapReduce is improved, the possibility of data skew is reduced, and the different formats required after the big data is read in, such as TextFile, Parquet and ORC, are basically covered, so that an automatic loadin |
---|
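
The three steps named in the summary (check against a mark file, slice into per-machine cache regions, write the buffers out as new files) can be illustrated with a small in-memory sketch. This is not the patented implementation: the mark-file fields (`rows`, `md5`), the hash-based slicing rule, the `part-NNNNN` file naming, and all function names are assumptions made for illustration, and plain Python dictionaries stand in for the cluster caches and for HDFS.

```python
import hashlib
from collections import defaultdict


def check_against_mark(data_lines, mark):
    """Return True if the data matches the (hypothetical) mark-file fields."""
    digest = hashlib.md5("\n".join(data_lines).encode("utf-8")).hexdigest()
    return len(data_lines) == mark["rows"] and digest == mark["md5"]


def slice_records(data_lines, num_buffers, key_index=0, delimiter=","):
    """Assign each record to one of num_buffers buffers by hashing a key field.

    Hash slicing is an assumption; the summary does not say how slices are formed.
    """
    buffers = defaultdict(list)
    for line in data_lines:
        key = line.split(delimiter)[key_index]
        # A stable digest (rather than Python's per-process hash()) keeps
        # the slicing reproducible across runs and machines.
        bucket = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % num_buffers
        buffers[bucket].append(line)
    return dict(buffers)


def flush_buffers(buffers, target_dir):
    """Map each buffer to the path of the new file it would be written to."""
    return {
        f"{target_dir}/part-{bucket:05d}": lines
        for bucket, lines in sorted(buffers.items())
    }


if __name__ == "__main__":
    lines = ["a,1", "b,2", "a,3", "c,4"]
    mark = {
        "rows": len(lines),
        "md5": hashlib.md5("\n".join(lines).encode("utf-8")).hexdigest(),
    }
    if check_against_mark(lines, mark):
        files = flush_buffers(slice_records(lines, num_buffers=3), "/warehouse/demo")
        for path, rows in files.items():
            print(path, rows)
```

In a real deployment the final step would write to an HDFS directory; one common way to have Hive pick up such files without a separate load step is an external table whose `LOCATION` is that directory, though the summary does not state that the invention works this way.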