DenMune: Density peak based clustering using mutual nearest neighbors

Bibliographic details
Published in: Pattern recognition 2021-01, Vol. 109, p. 107589, Article 107589
Main authors: Abbas, Mohamed; El-Zoghabi, Adel; Shoukry, Amin
Format: Article
Language: English
Description
Summary:
• We present a novel algorithm (pseudocode given) to find clusters of arbitrary number, shape and density in two dimensions. Higher-dimensional data are first reduced to 2-D using the t-SNE algorithm.
• The algorithm relies on a single parameter K (the number of nearest neighbors).
• The algorithm proposes a simple rule that classifies the data points into three types: those that certainly belong to clusters, those that certainly do not belong to any cluster (i.e. noise), and uncertain points (which either succeed in joining a cluster or are also considered noise).
• The performance of the proposed algorithm is compared to nine well-known algorithms using thirty-six real and synthetic data sets.
• The results show the superiority of the proposed algorithm.

Many clustering algorithms fail when clusters are of arbitrary shapes or varying densities, or when the data classes are unbalanced and close to each other, even in two dimensions. A novel clustering algorithm, “DenMune”, is presented to meet this challenge. It identifies dense regions using mutual nearest neighborhoods of size K, where K is the only parameter required from the user, and it obeys the mutual nearest neighbor consistency principle. The algorithm is stable for a wide range of values of K. Moreover, it is able to automatically detect and remove noise from the clustering process as well as detect the target clusters. It produces robust results on various low- and high-dimensional datasets relative to several known state-of-the-art clustering algorithms.
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2020.107589
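
The summary builds on mutual nearest neighborhoods of size K. As a rough illustration of that notion only, the following Python sketch computes, for each point, the neighbors that also list it among their own K nearest neighbors. It is not the DenMune algorithm from the article: the function name `mutual_knn`, the use of Euclidean distance, and the sample points are assumptions made for this example.

```python
import numpy as np

def mutual_knn(points, k):
    """Return, for each point i, the set of indices j such that
    j is among i's k nearest neighbors AND i is among j's.

    Illustrative helper (hypothetical name); Euclidean distance is assumed.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")

    # Pairwise Euclidean distance matrix, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor

    # Indices of the k nearest neighbors of every point.
    knn = np.argsort(dist, axis=1)[:, :k]
    knn_sets = [set(row.tolist()) for row in knn]

    # Keep only the neighbor relations that hold in both directions.
    return [{j for j in knn_sets[i] if i in knn_sets[j]} for i in range(n)]

# Two tight groups of three points each, plus one isolated point (index 6).
sample = [[0, 0], [0, 1], [1, 0],
          [10, 10], [10, 11], [11, 10],
          [5, 30]]
mutual = mutual_knn(sample, k=2)
```

In this sample, the points inside each tight group are mutual neighbors of one another, while the isolated point has no mutual neighbors, because none of its nearest neighbors list it in return. That kind of asymmetry is one plausible signal for separating dense regions from noise, though the article's actual classification rule may differ.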