Disentangling clusters from non-Euclidean data via graph frequency reorganization

Bibliographic details
Published in: Information sciences 2024-03, Vol.662, p.120288, Article 120288
Main authors: Geng, Yangli-ao; Chi, Chong-Yung; Sun, Wenju; Zhang, Jing; Li, Qingyong
Format: Article
Language: English
Online access: Full text
Description
Summary: In light of the growing need for non-Euclidean data analysis, graphs have been recognized as an effective tool for characterizing the distribution and correlation of such data, thus inspiring many graph-based developments for various applications, such as clustering of non-Euclidean data. However, under unsupervised scenarios, the construction of graphs from unlabeled data often involves numerous noisy links, consequently leading to serious performance degradation in the applications concerned. To resolve this issue, we propose a novel method, referred to as Graph Frequency Reorganization (GFR), to enhance the discriminability of potential clusters and the associated graph quality. GFR shows capability far beyond the suboptimality in unsupervised graph construction. Furthermore, a fast version of GFR is proposed to reduce its computation overhead for large-scale datasets. Consequently, the obtained unsupervised clustering results can be significantly improved using the GFR data (i.e., the data after GFR processing). To evaluate the effectiveness of GFR, experimental results on ten real-world datasets are provided, demonstrating that the overall clustering performance of a simple k-means using the GFR data is superior to several state-of-the-art graph-based clustering methods.
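
The summary does not specify how GFR operates, so the following is only a rough sketch of the general pipeline it describes: build a graph from unlabeled data, process the data in the graph frequency domain, then run a simple k-means. The frequency-domain step here is a generic heat-kernel low-pass filter used as a stand-in and is not the GFR algorithm. All function names (`knn_graph`, `low_pass_filter`, `kmeans`) and parameters (`k`, `alpha`) are hypothetical and chosen for illustration.

```python
import numpy as np

def knn_graph(X, k=5):
    """Symmetric, unweighted k-nearest-neighbour adjacency matrix."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    np.fill_diagonal(d2, np.inf)                         # exclude self-loops
    idx = np.argsort(d2, axis=1)[:, :k]                  # k nearest per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                            # symmetrize

def low_pass_filter(X, W, alpha=1.0):
    """Attenuate high graph frequencies of X.

    Assumption: a heat-kernel filter exp(-alpha * lambda) on the
    combinatorial Laplacian L = D - W. This is a generic stand-in,
    NOT the GFR operator from the paper.
    """
    L = np.diag(W.sum(axis=1)) - W
    lam, U = np.linalg.eigh(L)          # graph frequencies and Fourier basis
    h = np.exp(-alpha * lam)            # frequency response, 1 at lambda = 0
    return U @ (h[:, None] * (U.T @ X))

def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Toy data: two Gaussian blobs (synthetic, not one of the paper's datasets).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (40, 2)), rng.normal(10.0, 1.0, (40, 2))])
W = knn_graph(X, k=5)
X_smooth = low_pass_filter(X, W, alpha=1.0)
labels = kmeans(X_smooth, k=2)
```

The filter pulls each point toward the mean of its graph neighbourhood, which is one simple way graph-domain processing can make clusters easier for k-means to separate. How GFR itself reorganizes graph frequencies, and how it handles noisy links, is described in the full text rather than in this record.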
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2024.120288