Clustering algorithms in data science: Evaluating the time and space complexities of K-means, DBSCAN, and hierarchical methods

In the expansive domain of data science, clustering algorithms play a pivotal role in segmenting datasets into meaningful groups without prior knowledge of their underlying patterns. This research provides an in-depth evaluation of the time and space complexities of three widely-used clustering algo...

Bibliographic details
Main authors: Vybhavi, G. Y., Sriramya, G., Bharadwaj, V. Y., Ramesh, G.
Format: Conference proceeding
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 1
container_start_page
container_title
container_volume 3101
creator Vybhavi, G. Y.
Sriramya, G.
Bharadwaj, V. Y.
Ramesh, G.
description In the expansive domain of data science, clustering algorithms play a pivotal role in segmenting datasets into meaningful groups without prior knowledge of their underlying patterns. This research provides an in-depth evaluation of the time and space complexities of three widely-used clustering algorithms: K-Means, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and Hierarchical Clustering. The study delves into each algorithm’s inherent strengths and limitations, factoring in real-world data application scenarios. Our results indicate varying performance metrics, with K-Means showcasing scalability for larger datasets, DBSCAN aptly handling datasets with arbitrary shapes and noise, and Hierarchical Clustering offering insights into intricate hierarchical structures. By offering a comprehensive comparison, this article aims to guide data scientists in selecting the most appropriate clustering technique based on specific problem requirements and dataset characteristics.
doi_str_mv 10.1063/5.0215042
format Conference Proceeding
fullrecord <record><control><sourceid>proquest_scita</sourceid><recordid>TN_cdi_proquest_journals_3074186120</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3074186120</sourcerecordid><originalsourceid>FETCH-LOGICAL-p632-70d94118d26a330eddf569e0d60d8299d5fc018ce8b637bec1e6c4c9c196de7f3</originalsourceid><addsrcrecordid>eNotkDtPwzAYRS0EEqUw8A8ssaGm-JE4MVsJ5SEqGOjAFrn2l8ZVXtgOgoXfTks73eXoXp2L0CUlU0oEv0mmhNGExOwIjWiS0CgVVByjESEyjljMP07RmfcbQphM02yEfvN68AGcbddY1evO2VA1HtsWGxUU9tpCq-EWz79UPaiww0IFONgGsGoN9r3SgHXX9DV822DB467EL1EDqvUTfH_3ns9eJ_9oZcEppyurVY0bCFVn_Dk6KVXt4eKQY7R8mC_zp2jx9viczxZRLziLUmJkTGlmmFCcEzCmTIQEYgQxGZPSJKUmNNOQrQRPV6ApCB1rqakUBtKSj9HVvrZ33ecAPhSbbnDtdrHgJI1pJigjW-p6T22tw9a1a4ve2Ua5n4KSYndvkRSHe_kfOVZs8A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>3074186120</pqid></control><display><type>conference_proceeding</type><title>Clustering algorithms in data science: Evaluating the time and space complexities of K-means, DBSCAN, and hierarchical methods</title><source>American Institute of Physics (AIP) Journals</source><creator>Vybhavi, G. Y. ; Sriramya, G. ; Bharadwaj, V. Y. ; Ramesh, G.</creator><contributor>Akinlabi, Esther ; Singh, Swadesh Kumar ; Kosaraju, Satyanarayana ; Muttil, Nitin ; Tanya</contributor><creatorcontrib>Vybhavi, G. Y. ; Sriramya, G. ; Bharadwaj, V. Y. ; Ramesh, G. ; Akinlabi, Esther ; Singh, Swadesh Kumar ; Kosaraju, Satyanarayana ; Muttil, Nitin ; Tanya</creatorcontrib><description>In the expansive domain of data science, clustering algorithms play a pivotal role in segmenting datasets into meaningful groups without prior knowledge of their underlying patterns. This research provides an in-depth evaluation of the time and space complexities of three widely-used clustering algorithms: K-Means, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and Hierarchical Clustering. 
The study delves into each algorithm’s inherent strengths and limitations, factoring in real-world data application scenarios. Our results indicate varying performance metrics, with K-Means showcasing scalability for larger datasets, DBSCAN aptly handling datasets with arbitrary shapes and noise, and Hierarchical Clustering offering insights into intricate hierarchical structures. By offering a comprehensive comparison, this article aims to guide data scientists in selecting the most appropriate clustering technique based on specific problem requirements and dataset characteristics.</description><identifier>ISSN: 0094-243X</identifier><identifier>EISSN: 1551-7616</identifier><identifier>DOI: 10.1063/5.0215042</identifier><identifier>CODEN: APCPCS</identifier><language>eng</language><publisher>Melville: American Institute of Physics</publisher><subject>Algorithms ; Cluster analysis ; Clustering ; Data science ; Datasets ; Performance measurement</subject><ispartof>AIP Conference Proceedings, 2024, Vol.3101 (1)</ispartof><rights>Author(s)</rights><rights>2024 Author(s). Published under an exclusive license by AIP Publishing.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://pubs.aip.org/acp/article-lookup/doi/10.1063/5.0215042$$EHTML$$P50$$Gscitation$$H</linktohtml><link.rule.ids>309,310,314,780,784,789,790,794,4512,23930,23931,25140,27924,27925,76384</link.rule.ids></links><search><contributor>Akinlabi, Esther</contributor><contributor>Singh, Swadesh Kumar</contributor><contributor>Kosaraju, Satyanarayana</contributor><contributor>Muttil, Nitin</contributor><contributor>Tanya</contributor><creatorcontrib>Vybhavi, G. Y.</creatorcontrib><creatorcontrib>Sriramya, G.</creatorcontrib><creatorcontrib>Bharadwaj, V. 
Y.</creatorcontrib><creatorcontrib>Ramesh, G.</creatorcontrib><title>Clustering algorithms in data science: Evaluating the time and space complexities of K-means, DBSCAN, and hierarchical methods</title><title>AIP Conference Proceedings</title><description>In the expansive domain of data science, clustering algorithms play a pivotal role in segmenting datasets into meaningful groups without prior knowledge of their underlying patterns. This research provides an in-depth evaluation of the time and space complexities of three widely-used clustering algorithms: K-Means, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and Hierarchical Clustering. The study delves into each algorithm’s inherent strengths and limitations, factoring in real-world data application scenarios. Our results indicate varying performance metrics, with K-Means showcasing scalability for larger datasets, DBSCAN aptly handling datasets with arbitrary shapes and noise, and Hierarchical Clustering offering insights into intricate hierarchical structures. 
By offering a comprehensive comparison, this article aims to guide data scientists in selecting the most appropriate clustering technique based on specific problem requirements and dataset characteristics.</description><subject>Algorithms</subject><subject>Cluster analysis</subject><subject>Clustering</subject><subject>Data science</subject><subject>Datasets</subject><subject>Performance measurement</subject><issn>0094-243X</issn><issn>1551-7616</issn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2024</creationdate><recordtype>conference_proceeding</recordtype><recordid>eNotkDtPwzAYRS0EEqUw8A8ssaGm-JE4MVsJ5SEqGOjAFrn2l8ZVXtgOgoXfTks73eXoXp2L0CUlU0oEv0mmhNGExOwIjWiS0CgVVByjESEyjljMP07RmfcbQphM02yEfvN68AGcbddY1evO2VA1HtsWGxUU9tpCq-EWz79UPaiww0IFONgGsGoN9r3SgHXX9DV822DB467EL1EDqvUTfH_3ns9eJ_9oZcEppyurVY0bCFVn_Dk6KVXt4eKQY7R8mC_zp2jx9viczxZRLziLUmJkTGlmmFCcEzCmTIQEYgQxGZPSJKUmNNOQrQRPV6ApCB1rqakUBtKSj9HVvrZ33ecAPhSbbnDtdrHgJI1pJigjW-p6T22tw9a1a4ve2Ua5n4KSYndvkRSHe_kfOVZs8A</recordid><startdate>20240701</startdate><enddate>20240701</enddate><creator>Vybhavi, G. Y.</creator><creator>Sriramya, G.</creator><creator>Bharadwaj, V. Y.</creator><creator>Ramesh, G.</creator><general>American Institute of Physics</general><scope>8FD</scope><scope>H8D</scope><scope>L7M</scope></search><sort><creationdate>20240701</creationdate><title>Clustering algorithms in data science: Evaluating the time and space complexities of K-means, DBSCAN, and hierarchical methods</title><author>Vybhavi, G. Y. ; Sriramya, G. ; Bharadwaj, V. Y. 
; Ramesh, G.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p632-70d94118d26a330eddf569e0d60d8299d5fc018ce8b637bec1e6c4c9c196de7f3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Cluster analysis</topic><topic>Clustering</topic><topic>Data science</topic><topic>Datasets</topic><topic>Performance measurement</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Vybhavi, G. Y.</creatorcontrib><creatorcontrib>Sriramya, G.</creatorcontrib><creatorcontrib>Bharadwaj, V. Y.</creatorcontrib><creatorcontrib>Ramesh, G.</creatorcontrib><collection>Technology Research Database</collection><collection>Aerospace Database</collection><collection>Advanced Technologies Database with Aerospace</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Vybhavi, G. Y.</au><au>Sriramya, G.</au><au>Bharadwaj, V. Y.</au><au>Ramesh, G.</au><au>Akinlabi, Esther</au><au>Singh, Swadesh Kumar</au><au>Kosaraju, Satyanarayana</au><au>Muttil, Nitin</au><au>Tanya</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Clustering algorithms in data science: Evaluating the time and space complexities of K-means, DBSCAN, and hierarchical methods</atitle><btitle>AIP Conference Proceedings</btitle><date>2024-07-01</date><risdate>2024</risdate><volume>3101</volume><issue>1</issue><issn>0094-243X</issn><eissn>1551-7616</eissn><coden>APCPCS</coden><abstract>In the expansive domain of data science, clustering algorithms play a pivotal role in segmenting datasets into meaningful groups without prior knowledge of their underlying patterns. 
This research provides an in-depth evaluation of the time and space complexities of three widely-used clustering algorithms: K-Means, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and Hierarchical Clustering. The study delves into each algorithm’s inherent strengths and limitations, factoring in real-world data application scenarios. Our results indicate varying performance metrics, with K-Means showcasing scalability for larger datasets, DBSCAN aptly handling datasets with arbitrary shapes and noise, and Hierarchical Clustering offering insights into intricate hierarchical structures. By offering a comprehensive comparison, this article aims to guide data scientists in selecting the most appropriate clustering technique based on specific problem requirements and dataset characteristics.</abstract><cop>Melville</cop><pub>American Institute of Physics</pub><doi>10.1063/5.0215042</doi><tpages>9</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0094-243X
ispartof AIP Conference Proceedings, 2024, Vol.3101 (1)
issn 0094-243X
1551-7616
language eng
recordid cdi_proquest_journals_3074186120
source American Institute of Physics (AIP) Journals
subjects Algorithms
Cluster analysis
Clustering
Data science
Datasets
Performance measurement
title Clustering algorithms in data science: Evaluating the time and space complexities of K-means, DBSCAN, and hierarchical methods
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T11%3A13%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Clustering%20algorithms%20in%20data%20science:%20Evaluating%20the%20time%20and%20space%20complexities%20of%20K-means,%20DBSCAN,%20and%20hierarchical%20methods&rft.btitle=AIP%20Conference%20Proceedings&rft.au=Vybhavi,%20G.%20Y.&rft.date=2024-07-01&rft.volume=3101&rft.issue=1&rft.issn=0094-243X&rft.eissn=1551-7616&rft.coden=APCPCS&rft_id=info:doi/10.1063/5.0215042&rft_dat=%3Cproquest_scita%3E3074186120%3C/proquest_scita%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3074186120&rft_id=info:pmid/&rfr_iscdi=true