MBFS: a parallel metadata search method based on Bloomfilters using MapReduce for large-scale file systems

The metadata search is an important way to access and manage file systems. Many solutions have been proposed to tackle performance issue of metadata search. However, the existing solutions build a separate metadata index at the internal or external file system through the related data structure or d...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:The Journal of supercomputing 2016-08, Vol.72 (8), p.3006-3032
Hauptverfasser: Huo, Zhisheng, Xiao, Limin, Zhong, Qiaoling, Li, Shupan, Li, Ang, Ruan, Li, Wang, Shouxin, Fu, Lihong
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The metadata search is an important way to access and manage file systems. Many solutions have been proposed to tackle performance issue of metadata search. However, the existing solutions build a separate metadata index at the internal or external file system through the related data structure or database use semantics and event-notification method to construct the index structure, utilize the sampling-based method to conduct direct metadata search on the namespace, face problems of the high I/O overhead for maintaining consistency between metadata indexes and metadata, have enormous space overhead for metadata indexes storing and low accuracy of results and so on. To address these problems, this paper presents MBFS, a fast, accurate and lightweight metadata search method based on multi-dimensional Bloomfilters. We create a multi-dimensional Bloomfilter structure on the basis of the directory entry that can prune sub-trees to narrow the search scope of namespace. MBFS is capable of producing fast and accurate answers for a class of complex search over a file system after consuming a small number of disk accesses. MBFS residing in the file system does not need additional I/O overhead to maintain consistency. MBFS consists of Bloomfilters which are composed of bits, so it is a lightweight metadata search method that consumes marginal space overhead. Moreover, MBFS employs MapReduce for speeding up search under the environment of multiple metadata servers. Extensive experiments are conducted to prove the effectiveness of MBFS. The experimental results show that MBFS can achieve an excellent performance not only on the search latency, but also on the accuracy of results with low space and time overhead.
ISSN:0920-8542
1573-0484
DOI:10.1007/s11227-015-1464-2