cs-means: Determining optimal number of clusters based on a level-of-similarity
Published in: | SN applied sciences 2020-11, Vol.2 (11), p.1774, Article 1774 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | This paper proposes cs-means, a centroid-based clustering algorithm that clusters data points with n features without requiring the number of clusters to be specified in advance. The core of the algorithm is a similarity measure that decides whether to assign an incoming data point to an existing cluster or to create a new cluster and assign the data point to it. The algorithm is application-specific: it applies when the task is to cluster a stream of data points such that the similarity between an incoming data point and the cluster it is assigned to is higher than a predefined level of similarity (cluster strictness). The algorithm was evaluated on 4 public datasets and 10 isotropic Gaussian blobs. The authors report that the cluster analysis strongly confirms the objectives of the proposed clustering algorithm. |
---|---|
ISSN: | 2523-3963 2523-3971 |
DOI: | 10.1007/s42452-020-03582-5 |
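
The summary describes the assignment rule only in words, so the following is a minimal sketch of that idea rather than the published cs-means procedure. The record does not state the similarity measure or how centroids are updated; the `similarity` function (1 / (1 + Euclidean distance)), the running-mean centroid update, and the names `stream_cluster` and `strictness` are all assumptions made for illustration.

```python
import math


def similarity(a, b):
    """Hypothetical similarity in (0, 1]; the paper's actual measure is not given here."""
    return 1.0 / (1.0 + math.dist(a, b))


def stream_cluster(points, strictness):
    """Assign each incoming point to the most similar existing cluster if the
    similarity exceeds `strictness`; otherwise open a new cluster for it."""
    centroids = []  # one centroid (list of floats) per cluster
    counts = []     # number of points assigned to each cluster
    labels = []     # cluster index chosen for each incoming point

    for p in points:
        # Find the most similar existing centroid, if any.
        best, best_sim = -1, -1.0
        for i, c in enumerate(centroids):
            s = similarity(p, c)
            if s > best_sim:
                best, best_sim = i, s

        if best >= 0 and best_sim > strictness:
            # Similar enough: join the cluster and update its running mean (assumed update rule).
            counts[best] += 1
            n = counts[best]
            centroids[best] = [ci + (pi - ci) / n for ci, pi in zip(centroids[best], p)]
            labels.append(best)
        else:
            # Not similar enough to any cluster: create a new one seeded at this point.
            centroids.append([float(x) for x in p])
            counts.append(1)
            labels.append(len(centroids) - 1)

    return labels, centroids


if __name__ == "__main__":
    stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    labels, centroids = stream_cluster(stream, strictness=0.5)
    print(labels, len(centroids))
```

In this sketch, a higher `strictness` makes it harder for a point to join an existing cluster, so the same stream tends to produce more, tighter clusters; the number of clusters is an outcome of the threshold rather than an input.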