File merging method and device

Bibliographic details
Author: LIU JIZHE
Format: Patent
Language: Chinese; English
Description
Summary: The invention relates to the field of data processing, in particular to a file merging method and device. The method comprises: obtaining an application log of a Spark application, where the application log comprises calling information generated when each of a plurality of Spark tasks calls a file among a plurality of files; the calling information comprises the data size of the file data called by the Spark task, and that data size is no larger than the data size of the file being called, since the file data called by a Spark task is all or part of the data in the file it calls; obtaining the data volume of the plurality of files based on the calling information of the plurality of Spark tasks; and merging at least two of the plurality of files into one file based on the data volume of the plurality of files. The method allows file sizes to be identified efficiently so that files can be merged.
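The steps in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the `CallRecord` layout, the function names, the `small_threshold` and `target_size` parameters, and the greedy grouping rule are all hypothetical. It also assumes that the tasks reading one file read disjoint parts of it, so that summing the bytes read per file approximates the file's data volume; the summary does not state how the volume is derived.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, NamedTuple


class CallRecord(NamedTuple):
    """Hypothetical parsed log entry: one task read `bytes_read` bytes of `path`."""
    task_id: int
    path: str
    bytes_read: int


def estimate_file_sizes(records: Iterable[CallRecord]) -> dict[str, int]:
    """Approximate each file's data volume by summing the bytes its tasks read.

    Assumption: tasks read disjoint parts of a file, so the sum does not
    double-count data.
    """
    sizes: dict[str, int] = defaultdict(int)
    for rec in records:
        sizes[rec.path] += rec.bytes_read
    return dict(sizes)


def plan_merges(
    sizes: dict[str, int], small_threshold: int, target_size: int
) -> list[list[str]]:
    """Group files smaller than `small_threshold` into merge candidates.

    Files are taken smallest first and packed greedily until adding the next
    file would exceed `target_size`. Only groups of at least two files are
    returned, since a merge needs at least two inputs.
    """
    small = sorted(
        (path for path, size in sizes.items() if size < small_threshold),
        key=lambda path: (sizes[path], path),
    )
    groups: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for path in small:
        if current and current_size + sizes[path] > target_size:
            if len(current) >= 2:
                groups.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += sizes[path]
    if len(current) >= 2:
        groups.append(current)
    return groups


# Example input: file "a" is read by two tasks, the others by one task each.
records = [
    CallRecord(0, "a", 10),
    CallRecord(1, "a", 20),
    CallRecord(2, "b", 5),
    CallRecord(3, "c", 200),
    CallRecord(4, "d", 40),
]
sizes = estimate_file_sizes(records)
merge_plan = plan_merges(sizes, small_threshold=100, target_size=60)
```

The sketch only produces a merge plan (lists of file paths); actually rewriting the files, and reading the calling information out of a real Spark application log, are left out because the summary gives no detail on either.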