Temporal SIR-GN: Efficient and Effective Structural Representation Learning for Temporal Graphs

Node representation learning (NRL) generates numerical vectors (embeddings) for the nodes of a graph. Structural NRL specifically assigns similar node embeddings for those nodes that exhibit similar structural roles. This is in contrast with its proximity-based counterpart, wherein similarity betwee...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Proceedings of the VLDB Endowment 2023-05, Vol.16 (9), p.2075-2089
Hauptverfasser: Layne, Janet, Carpenter, Justin, Serra, Edoardo, Gullo, Francesco
Format: Artikel
Sprache:eng
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Node representation learning (NRL) generates numerical vectors (embeddings) for the nodes of a graph. Structural NRL specifically assigns similar node embeddings for those nodes that exhibit similar structural roles. This is in contrast with its proximity-based counterpart, wherein similarity between embeddings reflects spatial proximity among nodes. Structural NRL is useful for tasks such as node classification where nodes of the same class share structural roles, though there may exist a distant, or no path between them. Athough structural NRL has been well-studied in static graphs, it has received limited attention in the temporal setting. Here, the embeddings are required to represent the evolution of nodes' structural roles over time. The existing methods are limited in terms of efficiency and effectiveness: they scale poorly to even moderate number of timestamps, or capture structural role only tangentially. In this work, we present a novel unsupervised approach to structural representation learning for temporal graphs that overcomes these limitations. For each node, our approach clusters then aggregates the embedding of a node's neighbors for each timestamp, followed by a further temporal aggregation of all timestamps. This is repeated for (at most) d iterations, so as to acquire information from the d -hop neighborhood of a node. Our approach takes linear time in the number of overall temporal edges, and possesses important theoretical properties that formally demonstrate its effectiveness. Extensive experiments on synthetic and real datasets show superior performance in node classification and regression tasks, and superior scalability of our approach to large graphs.
ISSN:2150-8097
2150-8097
DOI:10.14778/3598581.3598583