Optimization method for loading Hive data to local file system

Bibliographic details
Main authors:	LYU YANKUI, GAO JINGJUN, GAO HAILING
Format:	Patent
Language:	Chinese; English
Subjects:	CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Abstract:	The invention relates to the field of data processing, and in particular to an optimization method for loading Hive data to a local file system. The method comprises the following steps: S1, for each MapReduce task, copying an input file from a data source into an execution unit through Hive; S2, running the MapReduce task and generating an output file; S3, merging the output files through Hive; S4, writing the resulting merged file into a temporary directory of the HDFS file system; and S5, loading the result file into a target directory of the local file system and deleting the temporary directory. The method optimizes the step in which, while data is loaded to the local file system through Hive, a directory is created at the same path in HDFS, thereby avoiding failures caused by lacking the corresponding HDFS file system permissions.
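
Steps S3 to S5 of the abstract (merge the task outputs, stage the merged file in a temporary directory, load it into the local target directory, then delete the temporary directory) can be illustrated with a minimal Python sketch. This is not the patented implementation and it calls neither Hive nor HDFS: ordinary directories stand in for both the HDFS temporary directory and the local target, and the names `merge_outputs`, `export_to_local`, `staging_root`, `local_target` and `merged_result` are hypothetical, chosen only for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def merge_outputs(part_files, merged_path):
    """S3 (stand-in): concatenate the per-task output files into one file."""
    with open(merged_path, "wb") as merged:
        for part in sorted(part_files):
            with open(part, "rb") as src:
                shutil.copyfileobj(src, merged)
    return merged_path


def export_to_local(part_files, staging_root, local_target):
    """S3-S5 (stand-in): plain directories play the role of HDFS here."""
    staging_root = Path(staging_root)
    local_target = Path(local_target)

    # S4: write the merged file into a dedicated temporary directory
    # (assumption: one fresh directory per export, created under staging_root).
    tmp_dir = Path(tempfile.mkdtemp(prefix="hive_export_", dir=staging_root))
    try:
        merged = merge_outputs(part_files, tmp_dir / "merged_result")

        # S5: load the result file into the target directory.
        local_target.mkdir(parents=True, exist_ok=True)
        result = local_target / merged.name
        shutil.copyfile(merged, result)
    finally:
        # S5: delete the temporary directory, even if the copy failed.
        shutil.rmtree(tmp_dir, ignore_errors=True)
    return result
```

Because the merged file is staged in a dedicated temporary directory, the only location outside the target that needs write access is that staging area, which is the property the abstract attributes to the method.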