Clustering algorithms in data science: Evaluating the time and space complexities of K-means, DBSCAN, and hierarchical methods

Bibliographic Details
Main Authors:	Vybhavi, G. Y.; Sriramya, G.; Bharadwaj, V. Y.; Ramesh, G.
Format:	Conference Proceeding
Language:	English
Subjects:	Algorithms; Cluster analysis; Clustering; Data science; Datasets; Performance measurement
Online Access:	Full text

Description
Summary:	In the expansive domain of data science, clustering algorithms play a pivotal role in segmenting datasets into meaningful groups without prior knowledge of their underlying patterns. This research provides an in-depth evaluation of the time and space complexities of three widely-used clustering algorithms: K-Means, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and Hierarchical Clustering. The study delves into each algorithm’s inherent strengths and limitations, factoring in real-world data application scenarios. Our results indicate varying performance metrics, with K-Means showcasing scalability for larger datasets, DBSCAN aptly handling datasets with arbitrary shapes and noise, and Hierarchical Clustering offering insights into intricate hierarchical structures. By offering a comprehensive comparison, this article aims to guide data scientists in selecting the most appropriate clustering technique based on specific problem requirements and dataset characteristics.
ISSN:	0094-243X, 1551-7616
DOI:	10.1063/5.0215042