Bloom Filter With a False Positive Free Zone

Bloom filters and their variants are widely used as space-efficient probabilistic data structures for representing sets and are very popular in networking applications. They support fast element insertion and deletion, along with membership queries with the drawback of false positives. Bloom filters...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE eTransactions on network and service management 2021-06, Vol.18 (2), p.2334-2349
Hauptverfasser: Kiss, Sandor Z., Hosszu, Eva, Tapolcai, Janos, Ronyai, Lajos, Rottenstreich, Ori
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Bloom filters and their variants are widely used as space-efficient probabilistic data structures for representing sets and are very popular in networking applications. They support fast element insertion and deletion, along with membership queries with the drawback of false positives. Bloom filters can be designed to match the false positive rates that are acceptable for the application domain. However, in many applications, a common engineering solution is to set the false positive rate very small and ignore the existence of the very unlikely false positive answers. This article is devoted to close the gap between the two design concepts of unlikely and not having false positives. We propose a data structure called EGH filter that supports the Bloom filter operations, and besides, it can guarantee false positive free operations for a finite universe and a restricted number of elements stored in the filter. We refer to the limited universe and filter size as the false positive free zone of the filter. We describe necessary conditions for the false-positive free zone of a filter. We then generalize the filter to support the listing of the elements through the use of counters rather than bits. We detail networking applications of the filter and discuss potential generalizations. We evaluate the performance of the filter in comparison with the traditional Bloom filters. We also evaluate the price in terms of memory that needs to be paid to guarantee real false positive-free operations for having a deterministic Bloom filter-like behavior. Our data structure is based on recently developed combinatorial group testing techniques.
ISSN:1932-4537
1932-4537
DOI:10.1109/TNSM.2021.3059075