Hyper-distance oracles in hypergraphs
Published in: | The VLDB journal 2024-09, Vol.33 (5), p.1333-1356 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | We study point-to-point distance estimation in hypergraphs, where the query is parameterized by a positive integer s, which defines the required level of overlap for two hyperedges to be considered adjacent. To answer s-distance queries, we first explore an oracle based on the line graph of the given hypergraph and discuss its limitations: the line graph is typically orders of magnitude larger than the original hypergraph. We then introduce HypED, a landmark-based oracle with a predefined size, built directly on the hypergraph, thus avoiding the materialization of the line graph. Our framework can approximately answer vertex-to-vertex, vertex-to-hyperedge, and hyperedge-to-hyperedge s-distance queries for any value of s. A key observation underlying our framework is that as s increases, the hypergraph becomes more fragmented. We show how this can be exploited to improve the placement of landmarks, by identifying the s-connected components of the hypergraph. For this latter task, we devise an efficient algorithm based on the union-find technique and a dynamic inverted index. We experimentally evaluate HypED on several real-world hypergraphs and demonstrate its versatility in answering s-distance queries for different values of s. Our framework answers such queries in fractions of a millisecond while allowing fine-grained control of the trade-off between index size and approximation error at creation time. Finally, we demonstrate the usefulness of the s-distance oracle in two applications, namely hypergraph-based recommendation and the approximation of the s-closeness centrality of vertices and hyperedges in the context of protein-protein interactions. |
---|---|
ISSN: | 1066-8888 0949-877X |
DOI: | 10.1007/s00778-024-00851-2 |
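
The notions of s-adjacency, s-connected components, and s-distance mentioned in the summary can be illustrated with a minimal Python sketch. This is not the HypED oracle and not the authors' algorithm: it is an exact, brute-force baseline written under the assumption that two hyperedges are s-adjacent when they share at least s vertices. The function names, the dictionary-based hypergraph representation, and the example data are all hypothetical. It borrows only the two ingredients the summary names, an inverted index (vertex to hyperedges) and union-find.

```python
from collections import defaultdict, deque


def build_inverted_index(hyperedges):
    """Map each vertex to the ids of the hyperedges that contain it."""
    index = defaultdict(set)
    for edge_id, vertices in hyperedges.items():
        for v in vertices:
            index[v].add(edge_id)
    return index


def s_neighbors(edge_id, hyperedges, index, s):
    """Hyperedges sharing at least s vertices with edge_id (assumed rule)."""
    overlap = defaultdict(int)
    for v in hyperedges[edge_id]:
        for other in index[v]:
            if other != edge_id:
                overlap[other] += 1
    return [e for e, count in overlap.items() if count >= s]


def s_connected_components(hyperedges, s):
    """Group hyperedge ids into s-connected components using union-find."""
    parent = {e: e for e in hyperedges}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    index = build_inverted_index(hyperedges)
    for e in hyperedges:
        for other in s_neighbors(e, hyperedges, index, s):
            parent[find(e)] = find(other)

    components = defaultdict(set)
    for e in hyperedges:
        components[find(e)].add(e)
    return list(components.values())


def s_distance(hyperedges, source, target, s):
    """Exact hyperedge-to-hyperedge s-distance by BFS; None if unreachable."""
    index = build_inverted_index(hyperedges)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        e = queue.popleft()
        if e == target:
            return dist[e]
        for other in s_neighbors(e, hyperedges, index, s):
            if other not in dist:
                dist[other] = dist[e] + 1
                queue.append(other)
    return None


# Hypothetical toy hypergraph: hyperedge id -> set of vertices.
example = {"e1": {1, 2, 3}, "e2": {2, 3, 4}, "e3": {4, 5}, "e4": {5, 6}}
print(s_connected_components(example, 1))
print(s_connected_components(example, 2))
print(s_distance(example, "e1", "e4", 1))
print(s_distance(example, "e1", "e4", 2))
```

On the toy data, raising s from 1 to 2 splits the single component into several, which is the fragmentation effect the summary describes. A landmark-based oracle would replace the per-query BFS with precomputed distances to a bounded set of landmarks.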