Performance Evaluation of Simple K-Mean and Parallel K-Mean Clustering Algorithms: Big Data Business Process Management Concept

Data is the most valuable asset in any firm. As time passes, the data expands at a breakneck speed. A major research issue is the extraction of meaningful information from a complex and huge data source. Clustering is one of the data extraction methods. The basic K-Mean and Parallel K-Mean partition...

Bibliographic details
Published in:	Mobile information systems 2022-06, Vol.2022, p.1-15
Main authors:	Zada, Islam; Ali, Shaukat; Khan, Inayat; Hadjouni, Myriam; Elmannai, Hela; Zeeshan, Muhammad; Serat, Ali Mohammad; Jameel, Abid
Format:	Article
Language:	eng
Subjects:	Academic achievement; Algorithms; Big Data; Business process management; Centroids; Clustering; Data mining; Datasets; Employees; Learning; Performance evaluation; Research methodology; Similarity measures; Students
Online access:	Full text
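
The abstract of this record says that both algorithms start from randomly picked centroids and that the parallel variant splits each dataset into ten sub-datasets for ten client systems. The record does not say how the clients exchange results, so the following is only a minimal sketch of one common data-parallel formulation, not the authors' implementation: each chunk computes per-cluster sums and counts, and the partial results are merged into new centroids. The names `kmeans`, `assign`, `partial_sums`, `n_chunks`, and `seed` are hypothetical, and the chunks are processed one after another here rather than on separate machines.

```python
import random


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(
            range(len(centroids)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(pt, centroids[j])),
        )
        for pt in points
    ]


def partial_sums(points, labels, k, dim):
    """Per-cluster coordinate sums and counts for one chunk of the data."""
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for pt, lab in zip(points, labels):
        counts[lab] += 1
        for d in range(dim):
            sums[lab][d] += pt[d]
    return sums, counts


def kmeans(points, k, n_chunks=1, seed=0, max_iter=100):
    """K-means with the data split into n_chunks parts (assumed scheme).

    n_chunks=1 is the simple algorithm; n_chunks=10 mimics ten clients.
    Returns the final centroids and the number of iterations used.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Random starting centroids: a different seed can give a different run.
    centroids = [list(p) for p in rng.sample(points, k)]
    chunks = [points[i::n_chunks] for i in range(n_chunks)]
    for iteration in range(1, max_iter + 1):
        total = [[0.0] * dim for _ in range(k)]
        count = [0] * k
        for chunk in chunks:  # each chunk stands in for one client system
            sums, counts = partial_sums(chunk, assign(chunk, centroids), k, dim)
            for j in range(k):
                count[j] += counts[j]
                for d in range(dim):
                    total[j][d] += sums[j][d]
        # Merge step: an empty cluster keeps its previous centroid.
        new = [
            [total[j][d] / count[j] for d in range(dim)] if count[j] else centroids[j]
            for j in range(k)
        ]
        if new == centroids:
            return centroids, iteration
        centroids = new
    return centroids, max_iter
```

Calling `kmeans(points, k, n_chunks=10)` on a list of coordinate tuples follows the ten-way split described in the abstract, while varying `seed` changes the random starting centroids, which is the source of the run-to-run variation the abstract reports for the simple algorithm.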

container_end_page	15
container_issue
container_start_page	1
container_title	Mobile information systems
container_volume	2022
creator	Zada, Islam; Ali, Shaukat; Khan, Inayat; Hadjouni, Myriam; Elmannai, Hela; Zeeshan, Muhammad; Serat, Ali Mohammad; Jameel, Abid
description	Data is the most valuable asset in any firm, and it grows rapidly over time. Extracting meaningful information from a large, complex data source is a major research issue, and clustering is one of the methods used for it. The Simple K-Mean and Parallel K-Mean partition clustering algorithms both start by picking random initial centroids. This work investigates the two algorithms on two datasets of 10,000 and 5,000 items, respectively. According to the study, the results of the Simple K-Mean algorithm change from run to run, and the number of iterations also differs for each run. In some circumstances the outcomes are always different, and the study identifies the properties that distinguish the Simple K-Mean algorithm from the Parallel K-Mean algorithm. Differentiating these features is expected to improve cluster quality, elapsed time, and the number of iterations. Experiments are designed to show that the parallel algorithms considerably improve on the Simple K-Mean technique. The results of the parallel techniques are also consistent, whereas the Simple K-Mean algorithm’s results vary from run to run. Both the 10,000-item and the 5,000-item datasets are divided into ten sub-datasets for ten different client systems. Clusters are generated in two iterations, where the time of one iteration is the time it takes for all client systems to complete it (mentioned in chapter number 4). In the first execution, Client No. 5 has the longest elapsed time (8 ms), whereas the longest elapsed time in the following iterations is 6 ms, for a total elapsed time of 12 ms for the K-Mean clustering technique. In addition, the parallel algorithms reduce the number of executions and the time it takes to complete a task.
doi_str_mv	10.1155/2022/1277765
format	Article
contributor	Wahid, Fazli
date	2022-06-23
publisher	Amsterdam: Hindawi
eissn	1875-905X
genre	article
tpages	15
rights	Copyright © 2022 Islam Zada et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0
oa	free_for_read
toplevel	peer_reviewed; online_resources
orcidid	https://orcid.org/0000-0001-9070-6821; https://orcid.org/0000-0002-7441-3962; https://orcid.org/0000-0003-0895-9665; https://orcid.org/0000-0001-6472-8795; https://orcid.org/0000-0002-6959-3401
collection	Hindawi Publishing Complete; Hindawi Publishing Subscription Journals; Hindawi Publishing Open Access; CrossRef; Computer and Information Systems Abstracts; Electronics & Communications Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Academic; Computer and Information Systems Abstracts Professional
sourceid	proquest_cross
pqid	2683797787
fulltext	fulltext
identifier	ISSN: 1574-017X
ispartof	Mobile information systems, 2022-06, Vol.2022, p.1-15
issn	1574-017X; 1875-905X
language	eng
recordid	cdi_proquest_journals_2683797787
source	Wiley Open Access; Alma/SFX Local Collection; EZB Electronic Journals Library
subjects	Academic achievement; Algorithms; Big Data; Business process management; Centroids; Clustering; Data mining; Datasets; Employees; Learning; Performance evaluation; Research methodology; Similarity measures; Students
title	Performance Evaluation of Simple K-Mean and Parallel K-Mean Clustering Algorithms: Big Data Business Process Management Concept
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T14%3A21%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Performance%20Evaluation%20of%20Simple%20K-Mean%20and%20Parallel%20K-Mean%20Clustering%20Algorithms:%20Big%20Data%20Business%20Process%20Management%20Concept&rft.jtitle=Mobile%20information%20systems&rft.au=Zada,%20Islam&rft.date=2022-06-23&rft.volume=2022&rft.spage=1&rft.epage=15&rft.pages=1-15&rft.issn=1574-017X&rft.eissn=1875-905X&rft_id=info:doi/10.1155/2022/1277765&rft_dat=%3Cproquest_cross%3E2683797787%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2683797787&rft_id=info:pmid/&rfr_iscdi=true
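
The `url` field above is an OpenURL in key/value form, with the citation carried in the `rft.*` query parameters. As a minimal sketch of how those values can be read back out, the snippet below uses Python's standard `urllib.parse`; the helper name `openurl_fields` is hypothetical, and the example URL is a shortened copy of the one in this record, kept to a few parameters for readability.

```python
from urllib.parse import parse_qs, urlsplit


def openurl_fields(url):
    """Return the rft.* citation parameters of an OpenURL as a dict.

    Percent-encoded values are decoded; if a key repeats, the first
    value is kept.
    """
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items() if key.startswith("rft.")}


# Shortened copy of the record's resolver URL (assumption: fewer parameters).
example = (
    "https://sfx.bib-bvb.de/sfx_tum?url_ver=Z39.88-2004"
    "&rft.genre=article&rft.jtitle=Mobile%20information%20systems"
    "&rft.au=Zada,%20Islam&rft.volume=2022&rft.spage=1&rft.epage=15"
    "&rft.issn=1574-017X"
)
fields = openurl_fields(example)
```

Here `fields["rft.jtitle"]` holds the decoded journal title and `fields["rft.issn"]` the print ISSN, matching the `container_title` and `identifier` fields of the record.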