Generating post-hoc explanations for Skip-gram-based node embeddings by identifying important nodes with bridgeness

Node representation learning in a network is an important machine learning technique for encoding relational information in a continuous vector space while preserving the inherent properties and structures of the network. Recently, unsupervised node embedding methods such as DeepWalk (Perozzi et al....

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Neural networks 2023-07, Vol.164, p.546-561
Hauptverfasser: Park, Hogun, Neville, Jennifer
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Node representation learning in a network is an important machine learning technique for encoding relational information in a continuous vector space while preserving the inherent properties and structures of the network. Recently, unsupervised node embedding methods such as DeepWalk (Perozzi et al., 2014), LINE (Tang et al., 2015), struc2vec (Ribeiro et al., 2017), PTE (Tang et al., 2015), UserItem2vec (Wu et al., 2020), and RWJBG (Li et al., 2021) have emerged from the Skip-gram model (Mikolov et al., 2013) and perform better performance in several downstream tasks such as node classification and link prediction than the existing relational models. However, providing post-hoc explanations of unsupervised embeddings remains a challenging problem because of the lack of explanation methods and theoretical studies applicable for embeddings. In this paper, we first show that global explanations to the Skip-gram-based embeddings can be found by computing bridgeness under a spectral cluster-aware local perturbation. Moreover, a novel gradient-based explanation method, which we call GRAPH-wGD, is proposed that allows the top-q global explanations about learned graph embedding vectors more efficiently. Experiments show that the ranking of nodes by scores using GRAPH-wGD is highly correlated with true bridgeness scores. We also observe that the top-q node-level explanations selected by GRAPH-wGD have higher importance scores and produce more changes in class label prediction when perturbed, compared with the nodes selected by recent alternatives, using five real-world graphs.
ISSN:0893-6080
1879-2782
DOI:10.1016/j.neunet.2023.04.029