Big Data Management Using Hadoop

Today, one of the key issues is the design of systems and software for storing, managing and processing the large amounts of data produced by the exponential growth in data. These data are found in unstructured forms. Due to the large and complex data sizes, data management with trad...

Detailed description

Saved in:
Bibliographic details
Published in: Journal of physics. Conference series 2021-02, Vol.1804 (1), p.12109
Main authors: Khalil, Majida Yaseen, Hamad, Murtadha M.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 1
container_start_page 12109
container_title Journal of physics. Conference series
container_volume 1804
creator Khalil, Majida Yaseen
Hamad, Murtadha M.
description Today, one of the key issues is the design of systems and software for storing, managing and processing the large amounts of data produced by the exponential growth in data. These data are found in unstructured forms. Because the data are large and complex, managing them with traditional approaches is unacceptable. Hadoop is an appropriate solution for continuously growing data sizes. In this paper we propose techniques and algorithms for dealing with big data, including data collection and data preprocessing. The Fragmentation algorithm acts as a distributed implementation of the traditional time-sharing file system model, in which various users share files and storage resources. We also used Hadoop, the Apache project for safe, scalable and distributed computing, as a framework to improve query performance and reduce response time. The results showed that Hadoop is the best way to deal with big data: the response time of an example complex query was 00:00:01 on Hadoop, compared with 00:01:11 for the same query using the fragmentation algorithm and 00:05:13 on the standard database. We concluded that the total access time for complex queries is shorter in distributed processing than in non-distributed processing.
doi_str_mv 10.1088/1742-6596/1804/1/012109
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2512974344</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2512974344</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2499-e9d073ed41daae7efd4d0ab6f1aaff245c2e475bd566756b79db287f123cf2f23</originalsourceid><addsrcrecordid>eNo9kM1OwzAQhC0EEqXwDETiHOL1T2wfoVCKVMSFnq1NbEepaFLs9MDbkyioe9mVdjSj-Qi5B_oIVOsClGB5KU1ZgKaigIICA2ouyOL8uTzfWl-Tm5T2lPJx1IJkz22TveCA2Qd22PiD74Zsl9quyTbo-v54S64Cfid_97-XZLd-_Vpt8u3n2_vqaZvXTBiTe-Oo4t4JcIhe-eCEo1iVARBDYELWzAslKyfLUsmyUsZVTKsAjNeBBcaX5GH2Pcb-5-TTYPf9KXZjpGUSmFGCCzGq1KyqY59S9MEeY3vA-GuB2gmHnYraqbSdcFiwMw7-B0JSURU</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2512974344</pqid></control><display><type>article</type><title>Big Data Management Using Hadoop</title><source>IOP Publishing Free Content</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>IOPscience extra</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>khalil, Majida yaseen ; Hamad, Murtadha M.</creator><creatorcontrib>khalil, Majida yaseen ; Hamad, Murtadha M.</creatorcontrib><description>Today, one of the key issues is the design of systems and software to deal with the storage, management and processing of large amounts of data as a result of the exponential rise in data. In unstructured forms, these data are found. Due to the large and complex data sizes, data management with traditional approaches is unacceptable. Hadoop is an appropriate solution for the continuous growth of data sizes. We have suggested in this paper techniques and algorithms dealing with big data including data collection, preprocessing of data. 
The Fragmentation algorithm will take the function of a distributed implementation of the traditional file system time-sharing model, where various users share files and storage resources. Also, in this research we used a framework to improve the performance of a query and reduce the response time called the HADOOP. The Apache Hadoop project for safe, scalable and distributed computing. The results showed that Hadoop is the best way to deal with big data during calculating the rate of response time of a complex query for example at (00:00:01) per second and comparing it with the response time of the same queries on the fragmentation algorithm at (00: 01:11) per second and the standard database at (00:05:13) per second. We concluded that Total time Access for complex queries in distributed processing is faster than in non-distributed processing.</description><identifier>ISSN: 1742-6588</identifier><identifier>EISSN: 1742-6596</identifier><identifier>DOI: 10.1088/1742-6596/1804/1/012109</identifier><language>eng</language><publisher>Bristol: IOP Publishing</publisher><subject>Algorithms ; Big Data ; Computer networks ; Data collection ; Data management ; Distributed processing ; Fragmentation ; Physics ; Queries ; Query processing ; Response time ; Response time (computers) ; Unstructured data</subject><ispartof>Journal of physics. Conference series, 2021-02, Vol.1804 (1), p.12109</ispartof><rights>2021. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2499-e9d073ed41daae7efd4d0ab6f1aaff245c2e475bd566756b79db287f123cf2f23</citedby><cites>FETCH-LOGICAL-c2499-e9d073ed41daae7efd4d0ab6f1aaff245c2e475bd566756b79db287f123cf2f23</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>khalil, Majida yaseen</creatorcontrib><creatorcontrib>Hamad, Murtadha M.</creatorcontrib><title>Big Data Management Using Hadoop</title><title>Journal of physics. Conference series</title><description>Today, one of the key issues is the design of systems and software to deal with the storage, management and processing of large amounts of data as a result of the exponential rise in data. In unstructured forms, these data are found. Due to the large and complex data sizes, data management with traditional approaches is unacceptable. Hadoop is an appropriate solution for the continuous growth of data sizes. We have suggested in this paper techniques and algorithms dealing with big data including data collection, preprocessing of data. The Fragmentation algorithm will take the function of a distributed implementation of the traditional file system time-sharing model, where various users share files and storage resources. Also, in this research we used a framework to improve the performance of a query and reduce the response time called the HADOOP. The Apache Hadoop project for safe, scalable and distributed computing. 
The results showed that Hadoop is the best way to deal with big data during calculating the rate of response time of a complex query for example at (00:00:01) per second and comparing it with the response time of the same queries on the fragmentation algorithm at (00: 01:11) per second and the standard database at (00:05:13) per second. We concluded that Total time Access for complex queries in distributed processing is faster than in non-distributed processing.</description><subject>Algorithms</subject><subject>Big Data</subject><subject>Computer networks</subject><subject>Data collection</subject><subject>Data management</subject><subject>Distributed processing</subject><subject>Fragmentation</subject><subject>Physics</subject><subject>Queries</subject><subject>Query processing</subject><subject>Response time</subject><subject>Response time (computers)</subject><subject>Unstructured data</subject><issn>1742-6588</issn><issn>1742-6596</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNo9kM1OwzAQhC0EEqXwDETiHOL1T2wfoVCKVMSFnq1NbEepaFLs9MDbkyioe9mVdjSj-Qi5B_oIVOsClGB5KU1ZgKaigIICA2ouyOL8uTzfWl-Tm5T2lPJx1IJkz22TveCA2Qd22PiD74Zsl9quyTbo-v54S64Cfid_97-XZLd-_Vpt8u3n2_vqaZvXTBiTe-Oo4t4JcIhe-eCEo1iVARBDYELWzAslKyfLUsmyUsZVTKsAjNeBBcaX5GH2Pcb-5-TTYPf9KXZjpGUSmFGCCzGq1KyqY59S9MEeY3vA-GuB2gmHnYraqbSdcFiwMw7-B0JSURU</recordid><startdate>20210201</startdate><enddate>20210201</enddate><creator>khalil, Majida yaseen</creator><creator>Hamad, Murtadha M.</creator><general>IOP 
Publishing</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>H8D</scope><scope>HCIFZ</scope><scope>L7M</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope></search><sort><creationdate>20210201</creationdate><title>Big Data Management Using Hadoop</title><author>khalil, Majida yaseen ; Hamad, Murtadha M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2499-e9d073ed41daae7efd4d0ab6f1aaff245c2e475bd566756b79db287f123cf2f23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Big Data</topic><topic>Computer networks</topic><topic>Data collection</topic><topic>Data management</topic><topic>Distributed processing</topic><topic>Fragmentation</topic><topic>Physics</topic><topic>Queries</topic><topic>Query processing</topic><topic>Response time</topic><topic>Response time (computers)</topic><topic>Unstructured data</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>khalil, Majida yaseen</creatorcontrib><creatorcontrib>Hamad, Murtadha M.</creatorcontrib><collection>CrossRef</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One 
Community College</collection><collection>ProQuest Central Korea</collection><collection>Aerospace Database</collection><collection>SciTech Premium Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><jtitle>Journal of physics. Conference series</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>khalil, Majida yaseen</au><au>Hamad, Murtadha M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Big Data Management Using Hadoop</atitle><jtitle>Journal of physics. Conference series</jtitle><date>2021-02-01</date><risdate>2021</risdate><volume>1804</volume><issue>1</issue><spage>12109</spage><pages>12109-</pages><issn>1742-6588</issn><eissn>1742-6596</eissn><abstract>Today, one of the key issues is the design of systems and software to deal with the storage, management and processing of large amounts of data as a result of the exponential rise in data. In unstructured forms, these data are found. Due to the large and complex data sizes, data management with traditional approaches is unacceptable. Hadoop is an appropriate solution for the continuous growth of data sizes. We have suggested in this paper techniques and algorithms dealing with big data including data collection, preprocessing of data. The Fragmentation algorithm will take the function of a distributed implementation of the traditional file system time-sharing model, where various users share files and storage resources. 
Also, in this research we used a framework to improve the performance of a query and reduce the response time called the HADOOP. The Apache Hadoop project for safe, scalable and distributed computing. The results showed that Hadoop is the best way to deal with big data during calculating the rate of response time of a complex query for example at (00:00:01) per second and comparing it with the response time of the same queries on the fragmentation algorithm at (00: 01:11) per second and the standard database at (00:05:13) per second. We concluded that Total time Access for complex queries in distributed processing is faster than in non-distributed processing.</abstract><cop>Bristol</cop><pub>IOP Publishing</pub><doi>10.1088/1742-6596/1804/1/012109</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1742-6588
ispartof Journal of physics. Conference series, 2021-02, Vol.1804 (1), p.12109
issn 1742-6588
1742-6596
language eng
recordid cdi_proquest_journals_2512974344
source IOP Publishing Free Content; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; IOPscience extra; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects Algorithms
Big Data
Computer networks
Data collection
Data management
Distributed processing
Fragmentation
Physics
Queries
Query processing
Response time
Response time (computers)
Unstructured data
title Big Data Management Using Hadoop
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T19%3A24%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Big%20Data%20Management%20Using%20Hadoop&rft.jtitle=Journal%20of%20physics.%20Conference%20series&rft.au=khalil,%20Majida%20yaseen&rft.date=2021-02-01&rft.volume=1804&rft.issue=1&rft.spage=12109&rft.pages=12109-&rft.issn=1742-6588&rft.eissn=1742-6596&rft_id=info:doi/10.1088/1742-6596/1804/1/012109&rft_dat=%3Cproquest_cross%3E2512974344%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2512974344&rft_id=info:pmid/&rfr_iscdi=true