Distance labeling: on parallelism, compression, and ordering
Distance labeling approaches are widely adopted to speed up the online performance of shortest-distance queries. The construction of the distance labeling, however, can be exhaustive, especially on big graphs. For a major category of large graphs, small-world networks, the state-of-the-art approach...
Gespeichert in:
Veröffentlicht in: | The VLDB journal 2022, Vol.31 (1), p.129-155 |
---|---|
Hauptverfasser: | , , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Distance labeling approaches are widely adopted to speed up the online performance of shortest-distance queries. The construction of the distance labeling, however, can be exhaustive, especially on big graphs. For a major category of large graphs, small-world networks, the state-of-the-art approach is pruned landmark labeling (
PLL
).
PLL
prunes distance labels based on a node order and directly constructs the pruned labels by performing breadth-first searches in the node order. The pruning technique, as well as the index construction, has a strong sequential nature which hinders
PLL
from being parallelized. It becomes an urgent issue on massive small-world networks whose index can hardly be constructed by a single thread within a reasonable time. This paper first scales distance labeling on small-world networks by proposing a parallel shortest-distance labeling (
PSL
) scheme.
PSL
insightfully converts the
PLL
’s node-order dependency to a shortest-distance dependence, which leads to a propagation-based parallel labeling in
D
rounds where
D
denotes the diameter of the graph. To further scale up
PSL
, it is critical to reduce the index size. This paper proposes effective index compression techniques based on graph properties as well as label properties; it also explores best practices in using betweenness-based node order to reduce the index size. The efficient betweenness estimation of the graph nodes proposed may be of independent interest to graph practitioners. Extensive experimental results verify our efficiency on billion-scale graphs, near-linear speedup in a multi-core environment, and up to
94
%
reduction in the index size. |
---|---|
ISSN: | 1066-8888 0949-877X |
DOI: | 10.1007/s00778-021-00694-1 |