A Distributed Framework for Predictive Analytics Using Big Data and MapReduce Parallel Programming

With the advancement of Internet technologies and the rapid increase of World Wide Web applications, there has been tremendous growth in the volume of digital data. This takes the digital world into a new era of big data. Various existing data processing technologies are not consistent and scalable...

Detailed description

Bibliographic details
Published in:	Mathematical problems in engineering 2023, Vol.2023 (1)
Main authors:	Natesan, P.; Sathishkumar, V. E.; Mathivanan, Sandeep Kumar; Venkatasen, Maheshwari; Jayagopal, Prabhu; Allayear, Shaikh Muhammad
Format:	Article
Language:	eng
Subjects:	Algorithms; Applications programs; Big Data; Business metrics; Classification; Cloud computing; Data processing; Datasets; Digital data; Distributed processing (Computers); Engineering; Fault tolerance; Machine learning; Parallel programming; Predictive analytics; Regression analysis; Semantic web; Variables
Online access:	Full text

container_end_page
container_issue	1
container_start_page
container_title	Mathematical problems in engineering
container_volume	2023
creator	Natesan, P.; Sathishkumar, V. E.; Mathivanan, Sandeep Kumar; Venkatasen, Maheshwari; Jayagopal, Prabhu; Allayear, Shaikh Muhammad
description	With the advancement of Internet technologies and the rapid increase of World Wide Web applications, there has been tremendous growth in the volume of digital data, taking the digital world into a new era of big data. Many existing data processing technologies are neither consistent nor scalable in handling the complexity and size of large datasets. Recently, many distributed data processing and programming models have been proposed and implemented to handle big data applications. The open-source implementation of the MapReduce programming model in Apache Hadoop is the foremost model for data-intensive and computation-intensive applications due to its inherent scalability, fault tolerance, and simplicity. In this research article, a new approach for predicting target labels in big data applications, named MR-MLR, is developed using a multiple linear regression algorithm and the MapReduce programming model. This approach promises optimum values for MAE, RMSE, and the coefficient of determination (R²), and thus shows its effectiveness for prediction in big data applications.
doi_str_mv	10.1155/2023/6048891
format	Article
fullrecord	<record><control><sourceid>gale_proqu</sourceid><recordid>TN_cdi_proquest_journals_2775460491</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A736823440</galeid><sourcerecordid>A736823440</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2861-cc0cb69525e61161248c327e390b2cbfca76ff71f39ab40e1e053ba63f9b1c4f3</originalsourceid><addsrcrecordid>eNp90MlOwzAQBuAIgUQp3HgAS5wQhHq8JTmWlqUSCMQicbMcxw6GNCl2wvL2GLVnTjOHb36N_iQ5BHwGwPmEYEInArM8L2ArGQEXNOXAsu24Y8JSIPRlN9kL4Q1jAhzyUVJO0dyF3rty6E2FLr1amq_OvyPbeXTvTeV07z4Nmraq-emdDug5uLZG565Gc9UrpNoK3arVg6kGbdC98qppTBNPuzpmLaPdT3asaoI52Mxx8nx58TS7Tm_urhaz6U2qSS4g1RrrUhSccCMABBCWa0oyQwtcEl1arTJhbQaWFqpk2IDBnJZKUFuUoJml4-Ronbvy3cdgQi_fusHHv4MkWcZZLKaAqI7XqlaNka7VXdub775WQwhy8fggpxkVOaGM4WhP11b7LgRvrFx5t1T-RwKWf43Lv8blpvHIT9b81bWV-nL_619kLH7d</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2775460491</pqid></control><display><type>article</type><title>A Distributed Framework for Predictive Analytics Using Big Data and MapReduce Parallel Programming</title><source>Wiley Online Library Open Access</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Alma/SFX Local Collection</source><creator>Natesan, P. ; Sathishkumar, V. E. ; Mathivanan, Sandeep Kumar ; Venkatasen, Maheshwari ; Jayagopal, Prabhu ; Allayear, Shaikh Muhammad</creator><contributor>de Almeida-Filho, Adiel T.</contributor><creatorcontrib>Natesan, P. ; Sathishkumar, V. E. ; Mathivanan, Sandeep Kumar ; Venkatasen, Maheshwari ; Jayagopal, Prabhu ; Allayear, Shaikh Muhammad ; de Almeida-Filho, Adiel T.</creatorcontrib><description>With the advancement of Internet technologies and the rapid increase of World Wide Web applications, there has been tremendous growth in the volume of digital data. This takes the digital world into a new era of big data. 
Various existing data processing technologies are not consistent and scalable in handling the complexity as well as the large-size datasets. Recently, there are many distributed data processing, and programming models have been proposed and implemented to handle big data applications. The open-source-implemented MapReduce programming model in Apache Hadoop is the foremost model for data exhaustive and also computational-intensive applications due to its inherent characteristics of scalability, fault tolerance, and simplicity. In this research article, a new approach for the prediction of target labels in big data applications is developed using a multiple linear regression algorithm and MapReduce programming model, named as MR-MLR. This approach promises optimum values for MAE, RMSE, and determination coefficient (R2) and thus shows its effectiveness in predictions in big data applications.</description><identifier>ISSN: 1024-123X</identifier><identifier>EISSN: 1563-5147</identifier><identifier>DOI: 10.1155/2023/6048891</identifier><language>eng</language><publisher>New York: Hindawi</publisher><subject>Algorithms ; Applications programs ; Big Data ; Business metrics ; Classification ; Cloud computing ; Data processing ; Datasets ; Digital data ; Distributed processing (Computers) ; Engineering ; Fault tolerance ; Machine learning ; Parallel programming ; Predictive analytics ; Regression analysis ; Semantic web ; Variables</subject><ispartof>Mathematical problems in engineering, 2023, Vol.2023 (1)</ispartof><rights>Copyright © 2023 P. Natesan et al.</rights><rights>COPYRIGHT 2023 Hindawi Limited</rights><rights>Copyright © 2023 P. Natesan et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2861-cc0cb69525e61161248c327e390b2cbfca76ff71f39ab40e1e053ba63f9b1c4f3</citedby><cites>FETCH-LOGICAL-c2861-cc0cb69525e61161248c327e390b2cbfca76ff71f39ab40e1e053ba63f9b1c4f3</cites><orcidid>0000-0003-0567-7865</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,4010,27900,27901,27902</link.rule.ids></links><search><contributor>de Almeida-Filho, Adiel T.</contributor><creatorcontrib>Natesan, P.</creatorcontrib><creatorcontrib>Sathishkumar, V. E.</creatorcontrib><creatorcontrib>Mathivanan, Sandeep Kumar</creatorcontrib><creatorcontrib>Venkatasen, Maheshwari</creatorcontrib><creatorcontrib>Jayagopal, Prabhu</creatorcontrib><creatorcontrib>Allayear, Shaikh Muhammad</creatorcontrib><title>A Distributed Framework for Predictive Analytics Using Big Data and MapReduce Parallel Programming</title><title>Mathematical problems in engineering</title><description>With the advancement of Internet technologies and the rapid increase of World Wide Web applications, there has been tremendous growth in the volume of digital data. This takes the digital world into a new era of big data. Various existing data processing technologies are not consistent and scalable in handling the complexity as well as the large-size datasets. Recently, there are many distributed data processing, and programming models have been proposed and implemented to handle big data applications. 
The open-source-implemented MapReduce programming model in Apache Hadoop is the foremost model for data exhaustive and also computational-intensive applications due to its inherent characteristics of scalability, fault tolerance, and simplicity. In this research article, a new approach for the prediction of target labels in big data applications is developed using a multiple linear regression algorithm and MapReduce programming model, named as MR-MLR. This approach promises optimum values for MAE, RMSE, and determination coefficient (R2) and thus shows its effectiveness in predictions in big data applications.</description><subject>Algorithms</subject><subject>Applications programs</subject><subject>Big Data</subject><subject>Business metrics</subject><subject>Classification</subject><subject>Cloud computing</subject><subject>Data processing</subject><subject>Datasets</subject><subject>Digital data</subject><subject>Distributed processing (Computers)</subject><subject>Engineering</subject><subject>Fault tolerance</subject><subject>Machine learning</subject><subject>Parallel programming</subject><subject>Predictive analytics</subject><subject>Regression analysis</subject><subject>Semantic web</subject><subject>Variables</subject><issn>1024-123X</issn><issn>1563-5147</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RHX</sourceid><sourceid>BENPR</sourceid><recordid>eNp90MlOwzAQBuAIgUQp3HgAS5wQhHq8JTmWlqUSCMQicbMcxw6GNCl2wvL2GLVnTjOHb36N_iQ5BHwGwPmEYEInArM8L2ArGQEXNOXAsu24Y8JSIPRlN9kL4Q1jAhzyUVJO0dyF3rty6E2FLr1amq_OvyPbeXTvTeV07z4Nmraq-emdDug5uLZG565Gc9UrpNoK3arVg6kGbdC98qppTBNPuzpmLaPdT3asaoI52Mxx8nx58TS7Tm_urhaz6U2qSS4g1RrrUhSccCMABBCWa0oyQwtcEl1arTJhbQaWFqpk2IDBnJZKUFuUoJml4-Ronbvy3cdgQi_fusHHv4MkWcZZLKaAqI7XqlaNka7VXdub775WQwhy8fggpxkVOaGM4WhP11b7LgRvrFx5t1T-RwKWf43Lv8blpvHIT9b81bWV-nL_619kLH7d</recordid><startdate>2023</startdate><enddate>2023</enddate><creator>Natesan, 
P.</creator><creator>Sathishkumar, V. E.</creator><creator>Mathivanan, Sandeep Kumar</creator><creator>Venkatasen, Maheshwari</creator><creator>Jayagopal, Prabhu</creator><creator>Allayear, Shaikh Muhammad</creator><general>Hindawi</general><general>Hindawi Limited</general><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>7TB</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>CWDGH</scope><scope>DWQXO</scope><scope>FR3</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>KR7</scope><scope>L6V</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><orcidid>https://orcid.org/0000-0003-0567-7865</orcidid></search><sort><creationdate>2023</creationdate><title>A Distributed Framework for Predictive Analytics Using Big Data and MapReduce Parallel Programming</title><author>Natesan, P. ; Sathishkumar, V. E. 
; Mathivanan, Sandeep Kumar ; Venkatasen, Maheshwari ; Jayagopal, Prabhu ; Allayear, Shaikh Muhammad</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2861-cc0cb69525e61161248c327e390b2cbfca76ff71f39ab40e1e053ba63f9b1c4f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Algorithms</topic><topic>Applications programs</topic><topic>Big Data</topic><topic>Business metrics</topic><topic>Classification</topic><topic>Cloud computing</topic><topic>Data processing</topic><topic>Datasets</topic><topic>Digital data</topic><topic>Distributed processing (Computers)</topic><topic>Engineering</topic><topic>Fault tolerance</topic><topic>Machine learning</topic><topic>Parallel programming</topic><topic>Predictive analytics</topic><topic>Regression analysis</topic><topic>Semantic web</topic><topic>Variables</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Natesan, P.</creatorcontrib><creatorcontrib>Sathishkumar, V. 
E.</creatorcontrib><creatorcontrib>Mathivanan, Sandeep Kumar</creatorcontrib><creatorcontrib>Venkatasen, Maheshwari</creatorcontrib><creatorcontrib>Jayagopal, Prabhu</creatorcontrib><creatorcontrib>Allayear, Shaikh Muhammad</creatorcontrib><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access</collection><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>Middle East & Africa Database</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Civil Engineering Abstracts</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One 
Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><jtitle>Mathematical problems in engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Natesan, P.</au><au>Sathishkumar, V. E.</au><au>Mathivanan, Sandeep Kumar</au><au>Venkatasen, Maheshwari</au><au>Jayagopal, Prabhu</au><au>Allayear, Shaikh Muhammad</au><au>de Almeida-Filho, Adiel T.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Distributed Framework for Predictive Analytics Using Big Data and MapReduce Parallel Programming</atitle><jtitle>Mathematical problems in engineering</jtitle><date>2023</date><risdate>2023</risdate><volume>2023</volume><issue>1</issue><issn>1024-123X</issn><eissn>1563-5147</eissn><abstract>With the advancement of Internet technologies and the rapid increase of World Wide Web applications, there has been tremendous growth in the volume of digital data. This takes the digital world into a new era of big data. Various existing data processing technologies are not consistent and scalable in handling the complexity as well as the large-size datasets. Recently, there are many distributed data processing, and programming models have been proposed and implemented to handle big data applications. The open-source-implemented MapReduce programming model in Apache Hadoop is the foremost model for data exhaustive and also computational-intensive applications due to its inherent characteristics of scalability, fault tolerance, and simplicity. In this research article, a new approach for the prediction of target labels in big data applications is developed using a multiple linear regression algorithm and MapReduce programming model, named as MR-MLR. 
This approach promises optimum values for MAE, RMSE, and determination coefficient (R2) and thus shows its effectiveness in predictions in big data applications.</abstract><cop>New York</cop><pub>Hindawi</pub><doi>10.1155/2023/6048891</doi><orcidid>https://orcid.org/0000-0003-0567-7865</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1024-123X
ispartof	Mathematical problems in engineering, 2023, Vol.2023 (1)
issn	1024-123X 1563-5147
language	eng
recordid	cdi_proquest_journals_2775460491
source	Wiley Online Library Open Access; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Alma/SFX Local Collection
subjects	Algorithms; Applications programs; Big Data; Business metrics; Classification; Cloud computing; Data processing; Datasets; Digital data; Distributed processing (Computers); Engineering; Fault tolerance; Machine learning; Parallel programming; Predictive analytics; Regression analysis; Semantic web; Variables
title	A Distributed Framework for Predictive Analytics Using Big Data and MapReduce Parallel Programming
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T22%3A57%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Distributed%20Framework%20for%20Predictive%20Analytics%20Using%20Big%20Data%20and%20MapReduce%20Parallel%20Programming&rft.jtitle=Mathematical%20problems%20in%20engineering&rft.au=Natesan,%20P.&rft.date=2023&rft.volume=2023&rft.issue=1&rft.issn=1024-123X&rft.eissn=1563-5147&rft_id=info:doi/10.1155/2023/6048891&rft_dat=%3Cgale_proqu%3EA736823440%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2775460491&rft_id=info:pmid/&rft_galeid=A736823440&rfr_iscdi=true
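
The description field above names an approach, MR-MLR, that combines multiple linear regression with the MapReduce programming model and is evaluated with MAE, RMSE, and R². The record does not say how the article implements it. The following is only a minimal sketch of one common way to express linear regression as a map and a reduce step, assuming the normal-equations formulation: each input split contributes partial sums of XᵀX and Xᵀy, and the reducer adds them before solving for the coefficients. All function names (`map_partial`, `reduce_partials`, `fit`, `evaluate`) are hypothetical, and plain Python stands in for Hadoop.

```python
from functools import reduce
from math import sqrt


def map_partial(split):
    """Map step (assumed formulation): partial X^T X and X^T y for one split."""
    p = len(split[0][0]) + 1              # +1 for the intercept column
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, target in split:
        row = [1.0] + list(features)
        for i in range(p):
            xty[i] += row[i] * target
            for j in range(p):
                xtx[i][j] += row[i] * row[j]
    return xtx, xty


def reduce_partials(a, b):
    """Reduce step: element-wise sum of two partial results."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is non-singular; a singular system raises ZeroDivisionError.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(splits):
    """Run map over all splits, reduce the partials, solve the normal equations."""
    xtx, xty = reduce(reduce_partials, map(map_partial, splits))
    return solve(xtx, xty)                # [intercept, coef_1, ..., coef_k]


def predict(beta, features):
    return beta[0] + sum(b * x for b, x in zip(beta[1:], features))


def evaluate(beta, rows):
    """Return (MAE, RMSE, R^2) of the fitted model on the given rows."""
    y = [t for _, t in rows]
    y_hat = [predict(beta, f) for f, _ in rows]
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    sst = sum((a - mean_y) ** 2 for a in y)
    mae = sum(abs(a - b) for a, b in zip(y, y_hat)) / n
    return mae, sqrt(sse / n), 1.0 - sse / sst


if __name__ == "__main__":
    # Toy data generated from y = 1 + 2*x1 + 3*x2, spread over two splits.
    splits = [
        [((1, 2), 9), ((2, 1), 8)],
        [((0, 0), 1), ((3, 5), 22), ((4, 1), 12)],
    ]
    beta = fit(splits)
    print(beta)
    print(evaluate(beta, [row for split in splits for row in split]))
```

On the toy data the fitted coefficients should come out close to 1, 2, and 3. The reason this formulation suits MapReduce is that the partial sums are small (their size depends on the number of features, not the number of rows), so only compact matrices need to move between the map and reduce phases.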