Distributed small file treatment method and device, equipment and storage medium

Bibliographic details
Main authors: HAO WEILIANG, XU CHAO, FENG FANGWEI, LIANG WEIXIONG, LI ZI'AO
Format: Patent
Language: chi; eng
Description
Summary: The invention discloses a distributed small file management method and device, equipment and a storage medium, and relates to the technical field of big data. The method comprises the following steps: grouping all small files to be managed in each partition to obtain N file groups; constructing a hash table from the directory path and the file groups of each partition; reading all file groups of each partition into memory to obtain a plurality of original data sets, and adding to each original data set a new column whose field name is the group number; merging all original data sets belonging to the same partition to obtain a merged data set, and writing the data that share a group number in each merged data set into the same file to obtain merged files; and replacing all small files to be managed in the corresponding partition with the merged files, and determining from the result of the replacement whether to roll the data back. According to the method a
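
The steps listed in the summary can be illustrated with a minimal single-machine sketch in Python. It is not the patented implementation. The names `group_files` and `compact_partition`, the round-robin grouping, the line-per-record text format, and the `merged-NNNNN.txt` file naming are all assumptions made for illustration; the summary does not state how files are assigned to groups, which storage format is used, or whether the group number column is kept in the output (here it is dropped on write).

```python
import os
import shutil
import tempfile
from collections import defaultdict


def group_files(paths, n_groups):
    """Assign files to n_groups groups round-robin (assumed strategy)."""
    groups = defaultdict(list)
    for i, path in enumerate(sorted(paths)):
        groups[i % n_groups].append(path)
    return dict(groups)


def compact_partition(partition_dir, n_groups):
    """Merge the small files of one partition into at most n_groups files."""
    small_files = [
        os.path.join(partition_dir, name)
        for name in os.listdir(partition_dir)
        if os.path.isfile(os.path.join(partition_dir, name))
    ]
    # Hash table keyed by the partition's directory path.
    table = {partition_dir: group_files(small_files, n_groups)}

    # Read every group into memory; the first tuple element plays the
    # role of the added "group number" column.
    merged = []
    for group_no, paths in table[partition_dir].items():
        for path in paths:
            with open(path, encoding="utf-8") as fh:
                merged.extend((group_no, line.rstrip("\n")) for line in fh)

    # Records with the same group number go to the same merged file.
    by_group = defaultdict(list)
    for group_no, record in merged:
        by_group[group_no].append(record)

    staging = tempfile.mkdtemp(prefix="staging_")
    backup = tempfile.mkdtemp(prefix="backup_")
    moved_out, moved_in = [], []
    try:
        for group_no, records in sorted(by_group.items()):
            name = f"merged-{group_no:05d}.txt"
            with open(os.path.join(staging, name), "w", encoding="utf-8") as fh:
                fh.write("".join(r + "\n" for r in records))
        # Replace: move the originals aside, then move the merged files in.
        for path in small_files:
            saved = os.path.join(backup, os.path.basename(path))
            shutil.move(path, saved)
            moved_out.append((path, saved))
        for name in sorted(os.listdir(staging)):
            dest = os.path.join(partition_dir, name)
            shutil.move(os.path.join(staging, name), dest)
            moved_in.append(dest)
    except OSError:
        # Rollback: drop partially placed merged files, restore originals.
        for dest in moved_in:
            os.remove(dest)
        for original, saved in moved_out:
            shutil.move(saved, original)
        raise
    finally:
        shutil.rmtree(staging, ignore_errors=True)
        shutil.rmtree(backup, ignore_errors=True)
    return moved_in
```

Calling `compact_partition(path, 2)` on a directory that holds five one-line files leaves two merged files in that directory and returns their paths. If the replacement step raises an `OSError`, the original small files are moved back before the error propagates.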