Distributed small file treatment method and device, equipment and storage medium
Main authors: | |
---|---|
Format: | Patent |
Language: | Chinese ; English |
Abstract: | The invention discloses a distributed small-file management method, device, equipment, and storage medium in the technical field of big data. The method comprises the following steps: grouping all to-be-managed small files in each partition to obtain N file groups; constructing a hash table based on the directory path and the file groups of each partition; reading all file groups corresponding to each partition into memory to obtain a plurality of original data sets, and adding to each original data set a new column whose field name is the group number; merging all original data sets corresponding to the same partition to obtain a merged data set, and writing the data with the same group number in each merged data set into the same file to obtain merged files; and replacing all to-be-managed small files in the corresponding partition with the merged files, and determining from the replacement result whether to perform a data rollback. According to the method a |
---|
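
The steps listed in the abstract can be illustrated with a small single-machine sketch. This is not the patented implementation. It assumes CSV files with a shared header, a round-robin grouping rule, and local directories standing in for partitions; the abstract specifies none of these. All names (`GROUP_COLUMN`, `group_small_files`, `build_hash_table`, `read_group`, `write_merged_files`, `replace_with_rollback`, `compact`) are hypothetical.

```python
import csv
import os
import shutil
import tempfile
from collections import defaultdict

GROUP_COLUMN = "group_number"  # hypothetical field name for the added column


def group_small_files(partition_dir, n_groups):
    """Split the CSV files of one partition into at most n_groups groups.
    Round-robin assignment is an assumption; the abstract gives no rule."""
    names = sorted(n for n in os.listdir(partition_dir) if n.endswith(".csv"))
    groups = defaultdict(list)
    for index, name in enumerate(names):
        groups[index % n_groups].append(os.path.join(partition_dir, name))
    return dict(groups)


def build_hash_table(partition_dirs, n_groups):
    """Map each partition directory path to its file groups."""
    return {d: group_small_files(d, n_groups) for d in partition_dirs}


def read_group(paths, group_number):
    """Read one file group into memory and tag every row with its group number."""
    rows = []
    for path in paths:
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                row[GROUP_COLUMN] = group_number
                rows.append(row)
    return rows


def write_merged_files(merged_rows, staging_dir):
    """Write rows that share a group number into the same merged file."""
    by_group = defaultdict(list)
    for row in merged_rows:
        by_group[row[GROUP_COLUMN]].append(row)
    merged_paths = []
    for group_number, rows in sorted(by_group.items()):
        # Assumption: the helper column is dropped again when writing.
        fields = [f for f in rows[0] if f != GROUP_COLUMN]
        path = os.path.join(staging_dir, f"merged_{group_number}.csv")
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fields, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(rows)
        merged_paths.append(path)
    return merged_paths


def replace_with_rollback(small_files, merged_paths, partition_dir):
    """Swap small files for merged files; restore the originals if a move fails."""
    backup_dir = tempfile.mkdtemp(prefix="backup_")
    moved, installed = [], []
    try:
        for path in small_files:
            backup = os.path.join(backup_dir, os.path.basename(path))
            shutil.move(path, backup)
            moved.append((backup, path))
        for path in merged_paths:
            target = os.path.join(partition_dir, os.path.basename(path))
            shutil.move(path, target)
            installed.append(target)
        return True
    except OSError:
        # Rollback: remove merged files already installed, put originals back.
        for target in installed:
            os.remove(target)
        for backup, original in moved:
            shutil.move(backup, original)
        return False
    finally:
        shutil.rmtree(backup_dir, ignore_errors=True)


def compact(partition_dirs, n_groups):
    """Run the whole flow per partition; return {partition_dir: success flag}."""
    results = {}
    for partition_dir, groups in build_hash_table(partition_dirs, n_groups).items():
        merged_rows = []  # merged data set for this partition
        for group_number, paths in groups.items():
            merged_rows.extend(read_group(paths, group_number))
        staging_dir = tempfile.mkdtemp(prefix="staging_")
        merged_paths = write_merged_files(merged_rows, staging_dir)
        small_files = [p for paths in groups.values() for p in paths]
        results[partition_dir] = replace_with_rollback(
            small_files, merged_paths, partition_dir
        )
        shutil.rmtree(staging_dir, ignore_errors=True)
    return results
```

Calling `compact(["/data/table/dt=2024-01-01"], 4)` on a hypothetical partition directory would leave at most four `merged_*.csv` files in place of the original small files. Merged files are written to a staging directory first so that the originals are only touched during the final replacement step, which keeps the rollback simple.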