Efficient Multiset Synchronization
Published in: | IEEE/ACM Transactions on Networking 2017-04, Vol.25 (2), p.1190-1205 |
---|---|
Format: | Article |
Language: | English |
Summary: | Set synchronization is an essential task for distributed applications. In many cases, given two sets A and B, applications need to identify the elements that appear in set A but not in set B, and vice versa. The Bloom filter, a space-efficient data structure for representing a set and supporting membership queries, has been employed as a lightweight method to realize set synchronization with a low false positive probability. Unfortunately, Bloom filters and their variants can only be applied to simple sets rather than to more general multisets, which allow elements to appear multiple times. In this paper, we first examine the potential of addressing the multiset synchronization problem with two existing variants of the Bloom filter: the invertible Bloom filter (IBF) and the counting Bloom filter (CBF). We then design a novel data structure, the invertible CBF (ICBF), which represents a multiset using a vector of cells. Each cell contains two fields, id and count, which record, respectively, the identifiers and the number of elements mapped into that cell. Given the encodings of two multisets, the ICBF executes dedicated subtracting and decoding operations to identify both the elements that differ and the differences in element multiplicities between the two multisets. We conduct comprehensive experiments to evaluate and compare the three dedicated multiset synchronization approaches proposed in this paper. The evaluation results indicate that the ICBF-based approach outperforms the other two approaches in terms of synchronization accuracy, time consumption, and communication overhead. |
ISSN: | 1063-6692 1558-2566 |
DOI: | 10.1109/TNET.2016.2618006 |
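The summary describes the ICBF only at a high level: a vector of cells with id and count fields, plus encode, subtract, and decode operations. The following is a minimal Python sketch of one way such a structure could work, not the paper's actual design. Everything beyond the two field names is an assumption made for illustration: the class name `ICBFSketch`, integer element identifiers, an id field that stores the multiplicity-weighted sum of identifiers, SHA-256-based hashing into disjoint sub-tables, and a peeling decoder that treats a cell as "pure" when id divided by count yields an identifier that hashes back to that cell. With only two fields per cell, a mixed cell can occasionally pass this purity check, so the decoder is heuristic.

```python
import hashlib
from collections import Counter


class ICBFSketch:
    """Illustrative cell vector with id and count fields (assumed design)."""

    def __init__(self, cells_per_hash=64, num_hashes=3):
        self.w = cells_per_hash          # cells in each sub-table
        self.k = num_hashes              # one sub-table per hash function
        self.id_sum = [0] * (self.k * self.w)   # "id" field (assumed: weighted sum)
        self.count = [0] * (self.k * self.w)    # "count" field

    def _cells(self, x):
        # Map x to one cell in each of the k disjoint sub-tables,
        # so an element always occupies k distinct cells.
        cells = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{x}".encode()).digest()
            cells.append(i * self.w + int.from_bytes(digest[:8], "big") % self.w)
        return cells

    def add(self, x, m=1):
        # Insert m copies of x (m may be negative when peeling).
        for c in self._cells(x):
            self.id_sum[c] += m * x
            self.count[c] += m

    @classmethod
    def encode(cls, multiset, **kwargs):
        sketch = cls(**kwargs)
        for x, m in Counter(multiset).items():
            sketch.add(x, m)
        return sketch

    def subtract(self, other):
        # Cell-wise difference; both sketches must share the same dimensions.
        assert (self.w, self.k) == (other.w, other.k)
        diff = ICBFSketch(self.w, self.k)
        diff.id_sum = [a - b for a, b in zip(self.id_sum, other.id_sum)]
        diff.count = [a - b for a, b in zip(self.count, other.count)]
        return diff

    def decode(self):
        # Peel apparently pure cells; this consumes (mutates) the sketch.
        # Returns ({element: multiplicity difference}, fully_decoded flag).
        found = {}
        for _ in range(len(self.count) + 1):    # guard against endless peeling
            progress = False
            for c in range(len(self.count)):
                n, s = self.count[c], self.id_sum[c]
                if n == 0 or s % n != 0:
                    continue
                x = s // n
                if c not in self._cells(x):     # purity check (heuristic)
                    continue
                found[x] = found.get(x, 0) + n
                self.add(x, -n)
                progress = True
            if not progress:
                break
        complete = all(n == 0 for n in self.count) and all(s == 0 for s in self.id_sum)
        return {x: m for x, m in found.items() if m != 0}, complete


# Usage: positive values mean extra copies in A, negative values extra copies in B.
a = ICBFSketch.encode([1, 1, 1, 2, 5, 7, 7])
b = ICBFSketch.encode([1, 2, 2, 5, 9])
difference, complete = a.subtract(b).decode()
print(difference, complete)
```

In this sketch only the cell vectors would need to be exchanged between the two parties, which is the source of the low communication overhead the summary mentions. If `complete` is false, decoding stalled and a larger table (or a fallback transfer) would be needed.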