cs-means: Determining optimal number of clusters based on a level-of-similarity

Bibliographic details
Published in: SN Applied Sciences, 2020-11, Vol. 2 (11), p. 1774, Article 1774
Main authors: Lamsal, Rabindra; Katiyar, Shubham
Format: Article
Language: English
Online access: Full text
Description
Summary: This paper proposes a centroid-based clustering algorithm, cs-means, which can cluster data points with n features without requiring the number of clusters to be specified in advance. The core of the algorithm is a similarity measure that decides whether to assign an incoming data point to an existing cluster or to create a new cluster and assign the data point to it. The algorithm is application-specific: it applies when the task is to cluster a stream of data points such that the similarity between an incoming data point and the cluster it is assigned to is higher than a predefined level-of-similarity (cluster strictness). The algorithm was evaluated on 4 public datasets and 10 isotropic Gaussian blobs. The authors report that the cluster analysis strongly confirms the objectives of the proposed clustering algorithm.
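The assign-or-create rule described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not state which similarity measure cs-means uses, so the `similarity` function below (1 / (1 + Euclidean distance)), the running-mean centroid update, and the names `stream_cluster` and `strictness` are all assumptions chosen for illustration only.

```python
import math


def similarity(a, b):
    # Hypothetical similarity in (0, 1]; the paper's actual measure may differ.
    return 1.0 / (1.0 + math.dist(a, b))


def stream_cluster(points, strictness):
    """Assign each incoming point to its most similar cluster, or open a new one.

    `strictness` plays the role of the predefined level-of-similarity.
    Returns (labels, centroids).
    """
    centroids = []  # one centroid (list of floats) per cluster
    counts = []     # number of points assigned to each cluster
    labels = []     # cluster index for each point, in arrival order

    for p in points:
        # Find the most similar existing centroid, if any.
        best, best_sim = -1, -1.0
        for i, c in enumerate(centroids):
            s = similarity(p, c)
            if s > best_sim:
                best, best_sim = i, s

        if best >= 0 and best_sim > strictness:
            # Similar enough: join the cluster and update its centroid
            # as a running mean (an assumed update rule).
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (x - c) / n for c, x in zip(centroids[best], p)]
        else:
            # Not similar enough to any cluster: create a new one.
            centroids.append([float(x) for x in p])
            counts.append(1)
            best = len(centroids) - 1

        labels.append(best)

    return labels, centroids


if __name__ == "__main__":
    stream = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (0, 0.1)]
    labels, centroids = stream_cluster(stream, strictness=0.5)
    print(labels, len(centroids))
```

Under these assumptions, raising `strictness` toward 1 makes the rule create more, tighter clusters, while lowering it merges more points into existing clusters; the number of clusters is therefore an outcome of the threshold rather than an input.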
ISSN: 2523-3963, 2523-3971
DOI: 10.1007/s42452-020-03582-5