Large Scale Network Embedding: A Separable Approach
Many successful methods have been proposed for learning low-dimensional representations on large-scale networks, while almost all existing methods are designed in inseparable processes, learning embeddings for entire networks even when only a small proportion of nodes are of interest. This leads to...
Gespeichert in:
Veröffentlicht in: | IEEE transactions on knowledge and data engineering 2022-04, Vol.34 (4), p.1829-1842 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Many successful methods have been proposed for learning low-dimensional representations on large-scale networks, while almost all existing methods are designed in inseparable processes, learning embeddings for entire networks even when only a small proportion of nodes are of interest. This leads to great inconvenience, especially on large-scale or dynamic networks, where these methods become almost impossible to implement. In this paper, we formalize the problem of separated matrix factorization, based on which we elaborate a novel objective function that preserves both local and global information. We compare our SMF framework with approximate SVD algorithms and demonstrate SMF can capture more information when factorizing a given matrix. We further propose SepNE, a simple and flexible network embedding algorithm which independently learns representations for different subsets of nodes in separated processes. By implementing separability, our algorithm reduces the redundant efforts to embed irrelevant nodes, yielding scalability to large networks. To further incorporate complex information into SepNE, we discuss several methods that can be used to leverage high-order proximities in large networks. We demonstrate the effectiveness of SepNE on several real-world networks with different scales and subjects. With comparable accuracy, our approach significantly outperforms state-of-the-art baselines in running times on large networks. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2020.3002700 |