Don't Thrash: How to Cache Your Hash on Flash
Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 11, pp. 1627-1637 (2012) This paper presents new alternatives to the well-known Bloom filter data structure. The Bloom filter, a compact data structure supporting set insertion and membership queries, has found wide application in databases, sto...
Gespeichert in:
Hauptverfasser: | , , , , , , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 11, pp.
1627-1637 (2012) This paper presents new alternatives to the well-known Bloom filter data
structure. The Bloom filter, a compact data structure supporting set insertion
and membership queries, has found wide application in databases, storage
systems, and networks. Because the Bloom filter performs frequent random reads
and writes, it is used almost exclusively in RAM, limiting the size of the sets
it can represent. This paper first describes the quotient filter, which
supports the basic operations of the Bloom filter, achieving roughly comparable
performance in terms of space and time, but with better data locality.
Operations on the quotient filter require only a small number of contiguous
accesses. The quotient filter has other advantages over the Bloom filter: it
supports deletions, it can be dynamically resized, and two quotient filters can
be efficiently merged. The paper then gives two data structures, the buffered
quotient filter and the cascade filter, which exploit the quotient filter
advantages and thus serve as SSD-optimized alternatives to the Bloom filter.
The cascade filter has better asymptotic I/O performance than the buffered
quotient filter, but the buffered quotient filter outperforms the cascade
filter on small to medium data sets. Both data structures significantly
outperform recently-proposed SSD-optimized Bloom filter variants, such as the
elevator Bloom filter, buffered Bloom filter, and forest-structured Bloom
filter. In experiments, the cascade filter and buffered quotient filter
performed insertions 8.6-11 times faster than the fastest Bloom filter variant
and performed lookups 0.94-2.56 times faster. |
---|---|
DOI: | 10.48550/arxiv.1208.0290 |