Partition and Code: learning how to compress graphs
Proc. Adv. Neur. Inf. Process. Syst. (NeurIPS), vol. 34, pp. 18603–18619 (2021)
Saved in:
| Main authors: | |
|---|---|
| Format: | Article |
| Language: | English |
| Subjects: | |
| Online access: | Order full text |
Summary: Proc. Adv. Neur. Inf. Process. Syst. (NeurIPS), vol. 34, pp. 18603–18619 (2021). Can we use machine learning to compress graph data? The absence of ordering in graphs poses a significant challenge to conventional compression algorithms, limiting their attainable gains as well as their ability to discover relevant patterns. On the other hand, most graph compression approaches rely on domain-dependent handcrafted representations and cannot adapt to different underlying graph distributions. This work aims to establish the necessary principles a lossless graph compression method should follow to approach the entropy storage lower bound. Instead of making rigid assumptions about the graph distribution, we formulate the compressor as a probabilistic model that can be learned from data and generalise to unseen instances. Our "Partition and Code" (PnC) framework entails three steps: first, a partitioning algorithm decomposes the graph into subgraphs; then these are mapped to the elements of a small dictionary on which we learn a probability distribution; and finally, an entropy encoder translates the representation into bits. All the components (partitioning, dictionary and distribution) are parametric and can be trained with gradient descent. We theoretically compare the compression quality of several graph encodings and prove, under mild conditions, that PnC achieves compression gains that grow either linearly or quadratically with the number of vertices. Empirically, PnC yields significant compression improvements on diverse real-world networks.
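The three-step pipeline sketched in the abstract (partition → dictionary lookup → entropy coding) can be illustrated with a deliberately simplified toy. The fixed-size block partitioner, the canonicalisation by sorted edge tuples, and the example graph below are illustrative assumptions, not the paper's learned components; a real PnC instance learns the partitioner, dictionary, and distribution with gradient descent.

```python
from collections import Counter
import math

def partition(edges, block):
    """Hypothetical partitioner: group vertices into fixed-size blocks
    and keep only the edges internal to each block (a real scheme must
    also encode the cross-block edges)."""
    groups = {}
    for u, v in edges:
        if u // block == v // block:
            groups.setdefault(u // block, []).append((u % block, v % block))
    return list(groups.values())

def canonical(subgraph):
    """Map a subgraph to a dictionary atom: a canonical sorted edge tuple,
    so isomorphic blocks (under this relabelling) share one atom."""
    return tuple(sorted(tuple(sorted(e)) for e in subgraph))

def code_length_bits(atoms):
    """Shannon code length, in bits, of the atom sequence under the
    empirical distribution over dictionary atoms."""
    counts = Counter(atoms)
    n = len(atoms)
    return sum(-c * math.log2(c / n) for c in counts.values())

# Toy graph: two triangles and one lone edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (6, 7)]
atoms = [canonical(g) for g in partition(edges, block=3)]
bits = code_length_bits(atoms)  # both triangles map to the same atom
```

Because the two triangles collapse onto a single dictionary atom, the entropy coder spends fewer bits on them than on a sequence of three distinct atoms, which is the source of the compression gain the paper formalises.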
DOI: 10.48550/arxiv.2107.01952