Optimal centroids model approach for many-feature data structure prediction

Bibliographic details
Published in: Evolutionary intelligence 2023-08, Vol.16 (4), p.1353-1367
Main authors: Cam Binh, Le Thi; Nha, Pham Van
Format: Article
Language: English
Description
Abstract: Data clustering is a popular machine learning technique widely applied to data structure analysis. Among the major techniques, fuzzy co-clustering (FCoC) is known for its ability to handle complex data characterized by many features, large size, and uncertainty. In some cases, FCoC has outperformed traditional clustering methods. However, FCoC has limitations: it is sensitive to the initial cluster centers and can become stuck at local optima. Particle swarm optimization (PSO) is a multidisciplinary optimization algorithm that has been used to find suitable initial cluster centers for clustering algorithms. However, this use of PSO remains limited and unclear, especially for algorithms with complex structures and many parameters. In this paper, we propose a new fuzzy co-clustering algorithm, called FCOCM, that uses the optimal solution of the PSO algorithm as the initial centroids for the FCoC algorithm. We use symbols and mathematical language to describe the process of finding the initial centroids, which we call the optimal centroids model (OCM). Our work thus makes two contributions: the OCM model and the FCOCM algorithm. Experiments on benchmark data sets from the UCI Machine Learning Repository and the School of Computing, University of Eastern Finland, demonstrate the superior performance of the proposed method.
ISSN: 1864-5909, 1864-5917
DOI: 10.1007/s12065-022-00747-6
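
The abstract describes using the best solution found by PSO as the initial centroids for a clustering algorithm. The sketch below illustrates that general idea only; it is not the paper's OCM or FCOCM. Because the abstract does not state the FCoC objective, a hard k-means-style sum of squared distances is swapped in as the fitness function. The function names (`sse`, `pso_initial_centroids`) and all parameter values (swarm size, inertia, acceleration coefficients) are assumptions chosen for illustration.

```python
import random


def sse(centroids, points):
    """Sum of squared distances from each point to its nearest centroid.

    Stand-in fitness (assumption): the paper's FCoC objective is not
    given in the abstract.
    """
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


def pso_initial_centroids(points, k, n_particles=20, n_iters=50,
                          inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for k initial centroids with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle is one candidate set of k centroids, seeded from data points.
    positions = [[list(rng.choice(points)) for _ in range(k)]
                 for _ in range(n_particles)]
    velocities = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in pos] for pos in positions]
    pbest_cost = [sse(pos, points) for pos in positions]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = [c[:] for c in pbest[g]], pbest_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard velocity update: inertia + cognitive + social terms.
                    velocities[i][j][d] = (
                        inertia * velocities[i][j][d]
                        + c1 * r1 * (pbest[i][j][d] - positions[i][j][d])
                        + c2 * r2 * (gbest[j][d] - positions[i][j][d])
                    )
                    positions[i][j][d] += velocities[i][j][d]
            cost = sse(positions[i], points)
            if cost < pbest_cost[i]:
                pbest[i] = [c[:] for c in positions[i]]
                pbest_cost[i] = cost
                if cost < gbest_cost:
                    gbest, gbest_cost = [c[:] for c in positions[i]], cost
    # The global best would then be handed to the clustering algorithm
    # as its starting centroids.
    return gbest, gbest_cost


# Toy data (illustrative only): two well-separated groups in 2-D.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, cost = pso_initial_centroids(points, k=2)
```

The returned `centroids` would serve as the starting point for the downstream clustering run, in place of a random initialization. A fixed `seed` keeps the search reproducible.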