How to Realize Efficient and Scalable Graph Embeddings via an Entropy-Driven Mechanism

Bibliographic details
Published in: IEEE Transactions on Big Data, 2023-02, Vol. 9 (1), pp. 358-371
Authors: Fang, Peng; Wang, Fang; Shi, Zhan; Jiang, Hong; Feng, Dan; Xu, Xianghao; Yin, Wei
Format: Article
Language: English
Abstract
Graph embedding is becoming widely adopted as an efficient way to learn the graph representations required to solve graph analytics problems. However, owing to computation-efficiency challenges on large-scale graphs, most existing graph embedding methods employ a one-size-fits-all strategy to extract information, resulting in a large amount of redundant or inaccurate representations. In this work, we propose HuGE+, an efficient and scalable graph embedding method enabled by an entropy-driven mechanism. Specifically, HuGE+ leverages a hybrid-property heuristic random walk to capture node features, which considers both the information content of nodes and the number of common neighbors at each walking step. More importantly, to guarantee the information effectiveness of sampling, HuGE+ adopts two heuristic methods to decide the random walk length and the number of walks per node, respectively. Extensive experiments on real-world graphs demonstrate that HuGE+ achieves both efficiency and performance advantages over recent popular graph embedding approaches. On three downstream graph tasks, our approach not only offers >10% average gains, but also exhibits 23x-127x speedup over existing sampling-based methods. In addition, HuGE+ reduces memory footprint by an average of 68.9%, facilitating training of billion-node-scale graph embeddings.
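To make the entropy-driven idea concrete, the sketch below illustrates one plausible reading of the walk-length heuristic: extend a random walk only while each new step still adds information (i.e., the Shannon entropy of the visit distribution keeps growing). The function names, the threshold `eps`, and the stopping rule are illustrative assumptions for exposition, not the paper's exact formulation.

```python
import math
import random
from collections import Counter

def shannon_entropy(counts):
    """Shannon entropy (in bits) of an empirical visit distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_driven_walk(adj, start, eps=0.01, max_len=100, seed=0):
    """Random walk that stops once the per-step entropy gain of the visit
    distribution falls below eps -- an illustrative stand-in for an
    entropy-based walk-length rule, not HuGE+'s exact heuristic."""
    rng = random.Random(seed)
    walk = [start]
    counts = Counter(walk)
    prev_h = 0.0  # entropy of a single-node distribution is 0
    while len(walk) < max_len:
        nxt = rng.choice(adj[walk[-1]])  # uniform neighbor choice for simplicity
        walk.append(nxt)
        counts[nxt] += 1
        h = shannon_entropy(counts)
        if h - prev_h < eps:  # diminishing information gain: stop the walk
            break
        prev_h = h
    return walk

# Toy graph: a 5-node cycle, adjacency as a dict of neighbor lists.
adj = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
walk = entropy_driven_walk(adj, start=0)
```

A per-node walk-count heuristic could be built the same way, repeating walks from a node until the aggregate entropy across its walks stops improving.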
ISSN: 2332-7790
EISSN: 2372-2096
DOI: 10.1109/TBDATA.2022.3164575