Big Data Management Using Hadoop

Today, one of the key issues is the design of systems and software to deal with the storage, management and processing of large amounts of data as a result of the exponential rise in data. In unstructured forms, these data are found. Due to the large and complex data sizes, data management with trad...

Detailed description

Bibliographic details
Published in:	Journal of physics. Conference series 2021-02, Vol.1804 (1), p.12109
Main authors:	khalil, Majida yaseen; Hamad, Murtadha M.
Format:	Article
Language:	eng
Subjects:	Algorithms; Big Data; Computer networks; Data collection; Data management; Distributed processing; Fragmentation; Physics; Queries; Query processing; Response time; Response time (computers); Unstructured data
Online access:	Full text

container_end_page
container_issue	1
container_start_page	12109
container_title	Journal of physics. Conference series
container_volume	1804
creator	khalil, Majida yaseen ; Hamad, Murtadha M.
description	Today, one of the key issues is the design of systems and software to deal with the storage, management and processing of large amounts of data as a result of the exponential rise in data. In unstructured forms, these data are found. Due to the large and complex data sizes, data management with traditional approaches is unacceptable. Hadoop is an appropriate solution for the continuous growth of data sizes. We have suggested in this paper techniques and algorithms dealing with big data including data collection, preprocessing of data. The Fragmentation algorithm will take the function of a distributed implementation of the traditional file system time-sharing model, where various users share files and storage resources. Also, in this research we used a framework to improve the performance of a query and reduce the response time called the HADOOP. The Apache Hadoop project for safe, scalable and distributed computing. The results showed that Hadoop is the best way to deal with big data during calculating the rate of response time of a complex query for example at (00:00:01) per second and comparing it with the response time of the same queries on the fragmentation algorithm at (00: 01:11) per second and the standard database at (00:05:13) per second. We concluded that Total time Access for complex queries in distributed processing is faster than in non-distributed processing.
doi_str_mv	10.1088/1742-6596/1804/1/012109
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2512974344</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2512974344</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2499-e9d073ed41daae7efd4d0ab6f1aaff245c2e475bd566756b79db287f123cf2f23</originalsourceid><addsrcrecordid>eNo9kM1OwzAQhC0EEqXwDETiHOL1T2wfoVCKVMSFnq1NbEepaFLs9MDbkyioe9mVdjSj-Qi5B_oIVOsClGB5KU1ZgKaigIICA2ouyOL8uTzfWl-Tm5T2lPJx1IJkz22TveCA2Qd22PiD74Zsl9quyTbo-v54S64Cfid_97-XZLd-_Vpt8u3n2_vqaZvXTBiTe-Oo4t4JcIhe-eCEo1iVARBDYELWzAslKyfLUsmyUsZVTKsAjNeBBcaX5GH2Pcb-5-TTYPf9KXZjpGUSmFGCCzGq1KyqY59S9MEeY3vA-GuB2gmHnYraqbSdcFiwMw7-B0JSURU</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2512974344</pqid></control><display><type>article</type><title>Big Data Management Using Hadoop</title><source>IOP Publishing Free Content</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>IOPscience extra</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>khalil, Majida yaseen ; Hamad, Murtadha M.</creator><creatorcontrib>khalil, Majida yaseen ; Hamad, Murtadha M.</creatorcontrib><description>Today, one of the key issues is the design of systems and software to deal with the storage, management and processing of large amounts of data as a result of the exponential rise in data. In unstructured forms, these data are found. Due to the large and complex data sizes, data management with traditional approaches is unacceptable. Hadoop is an appropriate solution for the continuous growth of data sizes. We have suggested in this paper techniques and algorithms dealing with big data including data collection, preprocessing of data. 
The Fragmentation algorithm will take the function of a distributed implementation of the traditional file system time-sharing model, where various users share files and storage resources. Also, in this research we used a framework to improve the performance of a query and reduce the response time called the HADOOP. The Apache Hadoop project for safe, scalable and distributed computing. The results showed that Hadoop is the best way to deal with big data during calculating the rate of response time of a complex query for example at (00:00:01) per second and comparing it with the response time of the same queries on the fragmentation algorithm at (00: 01:11) per second and the standard database at (00:05:13) per second. We concluded that Total time Access for complex queries in distributed processing is faster than in non-distributed processing.</description><identifier>ISSN: 1742-6588</identifier><identifier>EISSN: 1742-6596</identifier><identifier>DOI: 10.1088/1742-6596/1804/1/012109</identifier><language>eng</language><publisher>Bristol: IOP Publishing</publisher><subject>Algorithms ; Big Data ; Computer networks ; Data collection ; Data management ; Distributed processing ; Fragmentation ; Physics ; Queries ; Query processing ; Response time ; Response time (computers) ; Unstructured data</subject><ispartof>Journal of physics. Conference series, 2021-02, Vol.1804 (1), p.12109</ispartof><rights>2021. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2499-e9d073ed41daae7efd4d0ab6f1aaff245c2e475bd566756b79db287f123cf2f23</citedby><cites>FETCH-LOGICAL-c2499-e9d073ed41daae7efd4d0ab6f1aaff245c2e475bd566756b79db287f123cf2f23</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>khalil, Majida yaseen</creatorcontrib><creatorcontrib>Hamad, Murtadha M.</creatorcontrib><title>Big Data Management Using Hadoop</title><title>Journal of physics. Conference series</title><description>Today, one of the key issues is the design of systems and software to deal with the storage, management and processing of large amounts of data as a result of the exponential rise in data. In unstructured forms, these data are found. Due to the large and complex data sizes, data management with traditional approaches is unacceptable. Hadoop is an appropriate solution for the continuous growth of data sizes. We have suggested in this paper techniques and algorithms dealing with big data including data collection, preprocessing of data. The Fragmentation algorithm will take the function of a distributed implementation of the traditional file system time-sharing model, where various users share files and storage resources. Also, in this research we used a framework to improve the performance of a query and reduce the response time called the HADOOP. The Apache Hadoop project for safe, scalable and distributed computing. 
The results showed that Hadoop is the best way to deal with big data during calculating the rate of response time of a complex query for example at (00:00:01) per second and comparing it with the response time of the same queries on the fragmentation algorithm at (00: 01:11) per second and the standard database at (00:05:13) per second. We concluded that Total time Access for complex queries in distributed processing is faster than in non-distributed processing.</description><subject>Algorithms</subject><subject>Big Data</subject><subject>Computer networks</subject><subject>Data collection</subject><subject>Data management</subject><subject>Distributed processing</subject><subject>Fragmentation</subject><subject>Physics</subject><subject>Queries</subject><subject>Query processing</subject><subject>Response time</subject><subject>Response time (computers)</subject><subject>Unstructured data</subject><issn>1742-6588</issn><issn>1742-6596</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNo9kM1OwzAQhC0EEqXwDETiHOL1T2wfoVCKVMSFnq1NbEepaFLs9MDbkyioe9mVdjSj-Qi5B_oIVOsClGB5KU1ZgKaigIICA2ouyOL8uTzfWl-Tm5T2lPJx1IJkz22TveCA2Qd22PiD74Zsl9quyTbo-v54S64Cfid_97-XZLd-_Vpt8u3n2_vqaZvXTBiTe-Oo4t4JcIhe-eCEo1iVARBDYELWzAslKyfLUsmyUsZVTKsAjNeBBcaX5GH2Pcb-5-TTYPf9KXZjpGUSmFGCCzGq1KyqY59S9MEeY3vA-GuB2gmHnYraqbSdcFiwMw7-B0JSURU</recordid><startdate>20210201</startdate><enddate>20210201</enddate><creator>khalil, Majida yaseen</creator><creator>Hamad, Murtadha M.</creator><general>IOP 
Publishing</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>H8D</scope><scope>HCIFZ</scope><scope>L7M</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope></search><sort><creationdate>20210201</creationdate><title>Big Data Management Using Hadoop</title><author>khalil, Majida yaseen ; Hamad, Murtadha M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2499-e9d073ed41daae7efd4d0ab6f1aaff245c2e475bd566756b79db287f123cf2f23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Big Data</topic><topic>Computer networks</topic><topic>Data collection</topic><topic>Data management</topic><topic>Distributed processing</topic><topic>Fragmentation</topic><topic>Physics</topic><topic>Queries</topic><topic>Query processing</topic><topic>Response time</topic><topic>Response time (computers)</topic><topic>Unstructured data</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>khalil, Majida yaseen</creatorcontrib><creatorcontrib>Hamad, Murtadha M.</creatorcontrib><collection>CrossRef</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community 
College</collection><collection>ProQuest Central Korea</collection><collection>Aerospace Database</collection><collection>SciTech Premium Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><jtitle>Journal of physics. Conference series</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>khalil, Majida yaseen</au><au>Hamad, Murtadha M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Big Data Management Using Hadoop</atitle><jtitle>Journal of physics. Conference series</jtitle><date>2021-02-01</date><risdate>2021</risdate><volume>1804</volume><issue>1</issue><spage>12109</spage><pages>12109-</pages><issn>1742-6588</issn><eissn>1742-6596</eissn><abstract>Today, one of the key issues is the design of systems and software to deal with the storage, management and processing of large amounts of data as a result of the exponential rise in data. In unstructured forms, these data are found. Due to the large and complex data sizes, data management with traditional approaches is unacceptable. Hadoop is an appropriate solution for the continuous growth of data sizes. We have suggested in this paper techniques and algorithms dealing with big data including data collection, preprocessing of data. The Fragmentation algorithm will take the function of a distributed implementation of the traditional file system time-sharing model, where various users share files and storage resources. 
Also, in this research we used a framework to improve the performance of a query and reduce the response time called the HADOOP. The Apache Hadoop project for safe, scalable and distributed computing. The results showed that Hadoop is the best way to deal with big data during calculating the rate of response time of a complex query for example at (00:00:01) per second and comparing it with the response time of the same queries on the fragmentation algorithm at (00: 01:11) per second and the standard database at (00:05:13) per second. We concluded that Total time Access for complex queries in distributed processing is faster than in non-distributed processing.</abstract><cop>Bristol</cop><pub>IOP Publishing</pub><doi>10.1088/1742-6596/1804/1/012109</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1742-6588
ispartof	Journal of physics. Conference series, 2021-02, Vol.1804 (1), p.12109
issn	1742-6588 1742-6596
language	eng
recordid	cdi_proquest_journals_2512974344
source	IOP Publishing Free Content; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; IOPscience extra; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects	Algorithms ; Big Data ; Computer networks ; Data collection ; Data management ; Distributed processing ; Fragmentation ; Physics ; Queries ; Query processing ; Response time ; Response time (computers) ; Unstructured data
title	Big Data Management Using Hadoop
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T19%3A24%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Big%20Data%20Management%20Using%20Hadoop&rft.jtitle=Journal%20of%20physics.%20Conference%20series&rft.au=khalil,%20Majida%20yaseen&rft.date=2021-02-01&rft.volume=1804&rft.issue=1&rft.spage=12109&rft.pages=12109-&rft.issn=1742-6588&rft.eissn=1742-6596&rft_id=info:doi/10.1088/1742-6596/1804/1/012109&rft_dat=%3Cproquest_cross%3E2512974344%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2512974344&rft_id=info:pmid/&rfr_iscdi=true