Clustering algorithms in data science: Evaluating the time and space complexities of K-means, DBSCAN, and hierarchical methods
Main authors: | , , , |
---|---|
Format: | Conference proceedings |
Language: | eng |
Keywords: | |
Online access: | Full text |
Summary: | In the expansive domain of data science, clustering algorithms play a pivotal role in segmenting datasets into meaningful groups without prior knowledge of their underlying patterns. This research provides an in-depth evaluation of the time and space complexities of three widely used clustering algorithms: K-Means, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and Hierarchical Clustering. The study delves into each algorithm’s inherent strengths and limitations, factoring in real-world data application scenarios. Our results indicate varying performance metrics, with K-Means showcasing scalability for larger datasets, DBSCAN aptly handling datasets with arbitrary shapes and noise, and Hierarchical Clustering offering insights into intricate hierarchical structures. By offering a comprehensive comparison, this article aims to guide data scientists in selecting the most appropriate clustering technique based on specific problem requirements and dataset characteristics. |
---|---|
ISSN: | 0094-243X 1551-7616 |
DOI: | 10.1063/5.0215042 |