Self-supervised graph clustering via attention auto-encoder with distribution specificity

Bibliographic details
Published in: Multimedia Systems 2024-06, Vol. 30 (3), Article 150
Main authors: Li, Zishi; Zhu, Changming
Format: Article
Language: English
Description
Summary: Graph clustering, an essential unsupervised learning task in data mining, has garnered significant attention in recent years, and deep learning has brought considerable progress to the field. However, existing methods have several limitations: (1) Most models employ Graph Convolutional Networks (GCNs) as encoders. GCNs assign equal weight to each neighboring node and have been shown to suffer from oversmoothing, which harms clustering performance. (2) Most algorithms do not fully utilize the original graph content and structural information, leading to incomplete embedding features. (3) These methods account neither for the cluster-specific distribution of the embedding features nor for the benefit that staged pseudo-labels bring to clustering tasks. In this study, we propose a novel end-to-end graph clustering model built on graph attention encoders. Specifically, we first employ a graph attention encoder to extract the inherent information from the original graph. This process assigns varying weights to different nodes, thereby avoiding excessive smoothing. We also fully utilize the guidance of periodic pseudo-labels to facilitate the learning of latent features that are beneficial for clustering. In addition, to improve the model’s clustering performance, we introduce a regularization term that distributes the node features of different classes across distinct low-dimensional spaces. Furthermore, to prevent the embedding features from straying from the original graph features, we design an information consistency module. Experimental results on graph datasets show that our model outperforms other state-of-the-art algorithms.
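Note: The summary contrasts GCN encoders, which weight all neighbors equally, with a graph attention encoder that assigns varying weights. The following is a minimal sketch of that idea only, assuming the standard single-head graph attention formulation (a LeakyReLU-scored softmax over each node's neighborhood); the record does not give the paper's actual encoder, so the toy graph, the projection matrix `W`, the attention vector `a`, and all function names are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    # LeakyReLU used to score a (node, neighbor) pair.
    return x if x > 0 else slope * x

def matvec(W, h):
    # Project a feature vector h with the weight matrix W.
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def attention_weights(features, neighbors, W, a):
    # Returns projected features z and, per node i, a dict {j: alpha_ij}
    # where the alpha_ij are a softmax over the neighbors j of i.
    z = [matvec(W, h) for h in features]
    alphas = {}
    for i, nbrs in neighbors.items():
        # Score each neighbor from the concatenation [z_i || z_j].
        scores = {
            j: leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])))
            for j in nbrs
        }
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        alphas[i] = {j: e / total for j, e in exps.items()}
    return z, alphas

def aggregate(z, alphas):
    # New embedding of node i: attention-weighted sum of neighbor features.
    out = []
    for i in sorted(alphas):
        dim = len(z[i])
        out.append([
            sum(alpha * z[j][k] for j, alpha in alphas[i].items())
            for k in range(dim)
        ])
    return out

# Hypothetical toy graph: 3 nodes, 2 features each, self-loops included.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [0, 1, 2], 1: [0, 1], 2: [0, 2]}
W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection, for readability
a = [0.5, -0.5, 1.0, 0.25]     # attention vector, arbitrary values

z, alphas = attention_weights(features, neighbors, W, a)
embeddings = aggregate(z, alphas)
```

In a GCN-style mean aggregation each of node 0's three neighbors would receive weight 1/3; here the weights in `alphas[0]` differ per neighbor while still summing to 1. In a trained model `W` and `a` would be learned rather than fixed.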
ISSN: 0942-4962, 1432-1882
DOI: 10.1007/s00530-024-01346-4