Fast LDP-MST: An Efficient Density-Peak-Based Clustering Method for Large-Size Datasets

Bibliographic details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2023-05, Vol. 35 (5), p. 4767-4780
Main authors: Qiu, Teng; Li, Yong-Jie
Format: Article
Language: English
Description
Abstract: Recently, a new density-peak-based clustering method, called clustering with local density peaks-based minimum spanning tree (LDP-MST), was proposed, which has several attractive merits, e.g., being able to detect arbitrarily shaped clusters and being not very sensitive to noise and parameters. Nevertheless, we also found a limitation of LDP-MST in efficiency. Specifically, LDP-MST takes $O(N\log N+M^{2})$ time, where $N$ denotes the dataset size and $M$ is an intermediate variable denoting the number of local density peaks. As our experimental results reveal, when processing large-size datasets, the value of $M$ could be very large, and consequently those steps of LDP-MST involving the $O(M^{2})$ time term would be time-consuming. In the worst case, the value of $M$ could be very close to that of $N$, which means that the time complexity of LDP-MST could be $O(N^{2})$.
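The complexity argument in the abstract can be made concrete with a rough sketch. It only compares the two terms of N log N + M^2 with constant factors ignored and a base-2 logarithm assumed; the function names and the sample values of N and M are hypothetical illustrations and are not taken from the paper or its experiments.

```python
import math


def cost_terms(n: int, m: int) -> tuple[float, float]:
    """Return the two terms (N log N, M^2), ignoring constant factors.

    Assumes n >= 1; the base-2 logarithm is an arbitrary choice here.
    """
    return n * math.log2(n), float(m * m)


def quadratic_term_dominates(n: int, m: int) -> bool:
    """True when the M^2 term exceeds the N log N term."""
    nlogn, m_squared = cost_terms(n, m)
    return m_squared > nlogn


if __name__ == "__main__":
    n = 1_000_000  # hypothetical dataset size
    for m in (1_000, 10_000, n):  # hypothetical numbers of local density peaks
        print(m, cost_terms(n, m), quadratic_term_dominates(n, m))
```

Under these assumptions the M^2 term takes over once M exceeds roughly the square root of N log N, and with M close to N the total is of order N^2, which matches the worst case described in the abstract.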
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2022.3150403