cs-means: Determining optimal number of clusters based on a level-of-similarity

Bibliographic details
Published in:	SN applied sciences 2020-11, Vol.2 (11), p.1774, Article 1774
Main authors:	Lamsal, Rabindra; Katiyar, Shubham
Format:	Article
Language:	eng
Subjects:	Algorithms; Applied and Technical Physics; Centroids; Chemistry/Food Science; Cluster analysis; Clustering; Earth Sciences; Engineering; Engineering: Young Investigators in Computational Science and Engineering; Environment; Materials Science; Research Article; Similarity; Similarity measures
Online access:	Full text

Description
Abstract:	This paper proposes cs-means, a centroid-based clustering algorithm that can cluster data points with n features without the number of clusters having to be specified in advance. The core logic of the algorithm is a similarity measure that collectively decides whether to assign an incoming data point to a pre-existing cluster or to create a new cluster and assign the data point to it. The algorithm is application-specific: it applies when a stream of data points must be clustered such that the similarity between an incoming data point and the cluster it is assigned to is higher than a predefined level-of-similarity (cluster strictness). The algorithm was evaluated on 4 public datasets and 10 isotropic Gaussian blobs. The cluster analysis strongly confirms the objectives of the proposed clustering algorithm.
ISSN:	2523-3963, 2523-3971
DOI:	10.1007/s42452-020-03582-5
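
Illustrative sketch:	The assignment rule described in the abstract (join the most similar existing cluster if the similarity exceeds the level-of-similarity, otherwise open a new cluster) can be sketched roughly as follows. The abstract does not define the similarity measure or the centroid update, so both are assumptions made only for illustration: similarity is taken as 1 / (1 + Euclidean distance), and each centroid is kept as a running mean. The names `similarity`, `stream_cluster`, and `level_of_similarity` are hypothetical and are not taken from the paper.

```python
import math


def similarity(point, centroid):
    """ASSUMED similarity in (0, 1]: 1 / (1 + Euclidean distance).

    The paper's actual measure is not given in the abstract.
    """
    return 1.0 / (1.0 + math.dist(point, centroid))


def stream_cluster(points, level_of_similarity):
    """Cluster a stream of points without fixing the number of clusters.

    Returns (labels, centroids). A point joins its most similar cluster
    only if that similarity exceeds `level_of_similarity`; otherwise it
    starts a new cluster.
    """
    centroids = []  # one running-mean centroid per cluster (assumed update)
    counts = []     # number of points assigned to each cluster
    labels = []     # cluster index for each incoming point

    for p in points:
        # Find the most similar existing cluster, if any exist yet.
        best, best_sim = None, -1.0
        for i, c in enumerate(centroids):
            s = similarity(p, c)
            if s > best_sim:
                best, best_sim = i, s

        if best is not None and best_sim > level_of_similarity:
            # Similar enough: assign and update the centroid incrementally.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [
                ci + (xi - ci) / n for ci, xi in zip(centroids[best], p)
            ]
            labels.append(best)
        else:
            # Not similar enough to any cluster: create a new one.
            centroids.append(list(p))
            counts.append(1)
            labels.append(len(centroids) - 1)

    return labels, centroids


if __name__ == "__main__":
    stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
    labels, centroids = stream_cluster(stream, level_of_similarity=0.5)
    print(labels, len(centroids))
```

Under these assumptions, raising `level_of_similarity` makes clusters stricter and tends to produce more of them, which matches the abstract's description of the parameter as "cluster strictness". The result also depends on the order in which points arrive, as is typical for single-pass stream clustering.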