Storage-Heterogeneity Aware Task-based Programming Models to Optimize I/O Intensive Applications

Task-based programming models have enabled the optimized execution of the computation workloads of applications. These programming models can take advantage of large-scale distributed infrastructures by allowing the parallel and distributed execution of applications in high-level work components cal...

Detailed description

Bibliographic details
Published in:	IEEE transactions on parallel and distributed systems 2022-12, Vol.33 (12), p.3589-3599
Main authors:	Elshazly, Hatem; Ejarque, Jorge; Badia, Rosa M.
Format:	Article
Language:	eng
Subjects:	automatic data movement; Bandwidth; Big Data; checkpointing; Computational modeling; Heterogeneity; heterogeneity abstraction; Heterogeneous storage systems; I/O intensive applications; I/O scheduling; Optimization; Performance evaluation; Programming; Proposals; Random access memory; resource pooling; Storage systems; Task analysis; task scheduling; task-based programming models

container_end_page	3599
container_issue	12
container_start_page	3589
container_title	IEEE transactions on parallel and distributed systems
container_volume	33
creator	Elshazly, Hatem; Ejarque, Jorge; Badia, Rosa M.
description	Task-based programming models have enabled the optimized execution of the computation workloads of applications. These programming models can take advantage of large-scale distributed infrastructures by allowing the parallel and distributed execution of applications in high-level work components called tasks. Nevertheless, in the era of Big Data and Exascale, the amount of data produced by modern scientific applications has already surpassed terabytes and is rapidly increasing. Hence, I/O performance has become the bottleneck to overcome in order to improve overall performance further. New storage technologies offer higher bandwidth and faster solutions than traditional Parallel File Systems (PFS). Such storage devices are deployed in modern-day infrastructures to boost I/O performance by offering a fast layer that absorbs the generated data. Therefore, any programming model targeting higher performance needs to manage this heterogeneity and take advantage of it to improve the I/O performance of applications. Towards this goal, we propose in this article a set of programming model capabilities that we refer to as Storage-Heterogeneity Awareness. Such capabilities include: (i) abstracting the heterogeneity of storage systems, and (ii) optimizing I/O performance by supporting dedicated I/O schedulers and an automatic data flushing technique. The evaluation section of this article presents the performance results of different applications on the MareNostrum CTE-Power heterogeneous storage cluster. Our experiments demonstrate that a storage-heterogeneity aware programming model can achieve up to almost 5x I/O performance speedup and 48% total time improvement compared to the reference PFS-based usage of the execution infrastructure.
doi_str_mv	10.1109/TPDS.2022.3161123
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_9739916</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9739916</ieee_id><sourcerecordid>2688684838</sourcerecordid><originalsourceid>FETCH-LOGICAL-c336t-7c04bbdbc22a8987af521900598d851973b2141ea8614e2a46b4cbac7a7c9f943</originalsourceid><addsrcrecordid>eNo9kF9PwjAUxRejiYh-AONLE58HvW23tY8E_0CCgQR8nt12R4qwzrZo8NM7AvHpnuSec-_JL4rugQ4AqBquFk_LAaOMDTikAIxfRD1IEhkzkPyy01QksWKgrqMb7zeUgkio6EUfy2CdXmM8wYDOrrFBEw5k9KMdkpX2n3GhPVZk0e2c3u1MsyZvtsKtJ8GSeRvMzvwimQ7nZNoEbLz5RjJq260pdTC28bfRVa23Hu_Osx-9vzyvxpN4Nn-djkezuOQ8DXFWUlEUVVEypqWSma6TriyliZKVTEBlvGAgALVMQSDTIi1EWegy01mpaiV4P3o83W2d_dqjD_nG7l3TvcxZKmUqheSyc8HJVTrrvcM6b53ZaXfIgeZHkPkRZH4EmZ9BdpmHU8Yg4r-_a6QUpPwP8lhvKg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2688684838</pqid></control><display><type>article</type><title>Storage-Heterogeneity Aware Task-based Programming Models to Optimize I/O Intensive Applications</title><source>IEEE Electronic Library (IEL)</source><creator>Elshazly, Hatem ; Ejarque, Jorge ; Badia, Rosa M.</creator><creatorcontrib>Elshazly, Hatem ; Ejarque, Jorge ; Badia, Rosa M.</creatorcontrib><description>Task-based programming models have enabled the optimized execution of the computation workloads of applications. These programming models can take advantage of large-scale distributed infrastructures by allowing the parallel and distributed execution of applications in high-level work components called tasks . Nevertheless, in the era of Big Data and Exascale, the amount of data produced by modern scientific applications has already surpassed terabytes and is rapidly increasing. Hence, I/O performance became the bottleneck to overcome in order to achieve more total performance improvement. New storage technologies offer higher bandwidth and faster solutions than traditional Parallel File Systems (PFS). 
Such storage devices are deployed in modern day infrastructures to boost I/O performance by offering a fast layer that absorbs the generated data. Therefore, it is necessary for any programming model targeting more performance to manage this heterogeneity and take advantage of it to improve the I/O performance of applications. Towards this goal, we propose in this article a set of programming model capabilities that we refer to as Storage-Heterogeneity Awareness . Such capabilities include: (i) abstracting the heterogeneity of storage systems, and (ii) optimizing I/O performance by supporting dedicated I/O schedulers and an automatic data flushing technique. The evaluation section of this article presents the performance results of different applications on the MareNostrum CTE-Power heterogeneous storage cluster. Our experiments demonstrate that a storage-heterogeneity aware programming model can achieve up to almost 5x I/O performance speedup and 48% total time improvement compared to the reference PFS-based usage of the execution infrastructure.</description><identifier>ISSN: 1045-9219</identifier><identifier>EISSN: 1558-2183</identifier><identifier>DOI: 10.1109/TPDS.2022.3161123</identifier><identifier>CODEN: ITDSEO</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>automatic data movement ; Bandwidth ; Big Data ; checkpointing ; Computational modeling ; Heterogeneity ; heterogeneity abstraction ; Heterogeneous storage systems ; I/O intensive applications ; I/O scheduling ; Optimization ; Performance evaluation ; Programming ; Proposals ; Random access memory ; resource pooling ; Storage systems ; Task analysis ; task scheduling ; task-based programming models</subject><ispartof>IEEE transactions on parallel and distributed systems, 2022-12, Vol.33 (12), p.3589-3599</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c336t-7c04bbdbc22a8987af521900598d851973b2141ea8614e2a46b4cbac7a7c9f943</citedby><cites>FETCH-LOGICAL-c336t-7c04bbdbc22a8987af521900598d851973b2141ea8614e2a46b4cbac7a7c9f943</cites><orcidid>0000-0003-4725-5097 ; 0000-0003-2941-5499 ; 0000-0002-8591-1502</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9739916$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9739916$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Elshazly, Hatem</creatorcontrib><creatorcontrib>Ejarque, Jorge</creatorcontrib><creatorcontrib>Badia, Rosa M.</creatorcontrib><title>Storage-Heterogeneity Aware Task-based Programming Models to Optimize I/O Intensive Applications</title><title>IEEE transactions on parallel and distributed systems</title><addtitle>TPDS</addtitle><description>Task-based programming models have enabled the optimized execution of the computation workloads of applications. These programming models can take advantage of large-scale distributed infrastructures by allowing the parallel and distributed execution of applications in high-level work components called tasks . Nevertheless, in the era of Big Data and Exascale, the amount of data produced by modern scientific applications has already surpassed terabytes and is rapidly increasing. Hence, I/O performance became the bottleneck to overcome in order to achieve more total performance improvement. New storage technologies offer higher bandwidth and faster solutions than traditional Parallel File Systems (PFS). 
Such storage devices are deployed in modern day infrastructures to boost I/O performance by offering a fast layer that absorbs the generated data. Therefore, it is necessary for any programming model targeting more performance to manage this heterogeneity and take advantage of it to improve the I/O performance of applications. Towards this goal, we propose in this article a set of programming model capabilities that we refer to as Storage-Heterogeneity Awareness . Such capabilities include: (i) abstracting the heterogeneity of storage systems, and (ii) optimizing I/O performance by supporting dedicated I/O schedulers and an automatic data flushing technique. The evaluation section of this article presents the performance results of different applications on the MareNostrum CTE-Power heterogeneous storage cluster. Our experiments demonstrate that a storage-heterogeneity aware programming model can achieve up to almost 5x I/O performance speedup and 48% total time improvement compared to the reference PFS-based usage of the execution infrastructure.</description><subject>automatic data movement</subject><subject>Bandwidth</subject><subject>Big Data</subject><subject>checkpointing</subject><subject>Computational modeling</subject><subject>Heterogeneity</subject><subject>heterogeneity abstraction</subject><subject>Heterogeneous storage systems</subject><subject>I/O intensive applications</subject><subject>I/O scheduling</subject><subject>Optimization</subject><subject>Performance evaluation</subject><subject>Programming</subject><subject>Proposals</subject><subject>Random access memory</subject><subject>resource pooling</subject><subject>Storage systems</subject><subject>Task analysis</subject><subject>task scheduling</subject><subject>task-based programming 
models</subject><issn>1045-9219</issn><issn>1558-2183</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kF9PwjAUxRejiYh-AONLE58HvW23tY8E_0CCgQR8nt12R4qwzrZo8NM7AvHpnuSec-_JL4rugQ4AqBquFk_LAaOMDTikAIxfRD1IEhkzkPyy01QksWKgrqMb7zeUgkio6EUfy2CdXmM8wYDOrrFBEw5k9KMdkpX2n3GhPVZk0e2c3u1MsyZvtsKtJ8GSeRvMzvwimQ7nZNoEbLz5RjJq260pdTC28bfRVa23Hu_Osx-9vzyvxpN4Nn-djkezuOQ8DXFWUlEUVVEypqWSma6TriyliZKVTEBlvGAgALVMQSDTIi1EWegy01mpaiV4P3o83W2d_dqjD_nG7l3TvcxZKmUqheSyc8HJVTrrvcM6b53ZaXfIgeZHkPkRZH4EmZ9BdpmHU8Yg4r-_a6QUpPwP8lhvKg</recordid><startdate>20221201</startdate><enddate>20221201</enddate><creator>Elshazly, Hatem</creator><creator>Ejarque, Jorge</creator><creator>Badia, Rosa M.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-4725-5097</orcidid><orcidid>https://orcid.org/0000-0003-2941-5499</orcidid><orcidid>https://orcid.org/0000-0002-8591-1502</orcidid></search><sort><creationdate>20221201</creationdate><title>Storage-Heterogeneity Aware Task-based Programming Models to Optimize I/O Intensive Applications</title><author>Elshazly, Hatem ; Ejarque, Jorge ; Badia, Rosa M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c336t-7c04bbdbc22a8987af521900598d851973b2141ea8614e2a46b4cbac7a7c9f943</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>automatic data movement</topic><topic>Bandwidth</topic><topic>Big Data</topic><topic>checkpointing</topic><topic>Computational modeling</topic><topic>Heterogeneity</topic><topic>heterogeneity 
abstraction</topic><topic>Heterogeneous storage systems</topic><topic>I/O intensive applications</topic><topic>I/O scheduling</topic><topic>Optimization</topic><topic>Performance evaluation</topic><topic>Programming</topic><topic>Proposals</topic><topic>Random access memory</topic><topic>resource pooling</topic><topic>Storage systems</topic><topic>Task analysis</topic><topic>task scheduling</topic><topic>task-based programming models</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Elshazly, Hatem</creatorcontrib><creatorcontrib>Ejarque, Jorge</creatorcontrib><creatorcontrib>Badia, Rosa M.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on parallel and distributed systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Elshazly, Hatem</au><au>Ejarque, Jorge</au><au>Badia, Rosa M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Storage-Heterogeneity Aware Task-based Programming Models to Optimize I/O Intensive Applications</atitle><jtitle>IEEE transactions on parallel and distributed 
systems</jtitle><stitle>TPDS</stitle><date>2022-12-01</date><risdate>2022</risdate><volume>33</volume><issue>12</issue><spage>3589</spage><epage>3599</epage><pages>3589-3599</pages><issn>1045-9219</issn><eissn>1558-2183</eissn><coden>ITDSEO</coden><abstract>Task-based programming models have enabled the optimized execution of the computation workloads of applications. These programming models can take advantage of large-scale distributed infrastructures by allowing the parallel and distributed execution of applications in high-level work components called tasks . Nevertheless, in the era of Big Data and Exascale, the amount of data produced by modern scientific applications has already surpassed terabytes and is rapidly increasing. Hence, I/O performance became the bottleneck to overcome in order to achieve more total performance improvement. New storage technologies offer higher bandwidth and faster solutions than traditional Parallel File Systems (PFS). Such storage devices are deployed in modern day infrastructures to boost I/O performance by offering a fast layer that absorbs the generated data. Therefore, it is necessary for any programming model targeting more performance to manage this heterogeneity and take advantage of it to improve the I/O performance of applications. Towards this goal, we propose in this article a set of programming model capabilities that we refer to as Storage-Heterogeneity Awareness . Such capabilities include: (i) abstracting the heterogeneity of storage systems, and (ii) optimizing I/O performance by supporting dedicated I/O schedulers and an automatic data flushing technique. The evaluation section of this article presents the performance results of different applications on the MareNostrum CTE-Power heterogeneous storage cluster. 
Our experiments demonstrate that a storage-heterogeneity aware programming model can achieve up to almost 5x I/O performance speedup and 48% total time improvement compared to the reference PFS-based usage of the execution infrastructure.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TPDS.2022.3161123</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0003-4725-5097</orcidid><orcidid>https://orcid.org/0000-0003-2941-5499</orcidid><orcidid>https://orcid.org/0000-0002-8591-1502</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1045-9219
ispartof	IEEE transactions on parallel and distributed systems, 2022-12, Vol.33 (12), p.3589-3599
issn	1045-9219 1558-2183
language	eng
recordid	cdi_ieee_primary_9739916
source	IEEE Electronic Library (IEL)
subjects	automatic data movement; Bandwidth; Big Data; checkpointing; Computational modeling; Heterogeneity; heterogeneity abstraction; Heterogeneous storage systems; I/O intensive applications; I/O scheduling; Optimization; Performance evaluation; Programming; Proposals; Random access memory; resource pooling; Storage systems; Task analysis; task scheduling; task-based programming models
title	Storage-Heterogeneity Aware Task-based Programming Models to Optimize I/O Intensive Applications
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T11%3A59%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Storage-Heterogeneity%20Aware%20Task-based%20Programming%20Models%20to%20Optimize%20I/O%20Intensive%20Applications&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Elshazly,%20Hatem&rft.date=2022-12-01&rft.volume=33&rft.issue=12&rft.spage=3589&rft.epage=3599&rft.pages=3589-3599&rft.issn=1045-9219&rft.eissn=1558-2183&rft.coden=ITDSEO&rft_id=info:doi/10.1109/TPDS.2022.3161123&rft_dat=%3Cproquest_RIE%3E2688684838%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2688684838&rft_id=info:pmid/&rft_ieee_id=9739916&rfr_iscdi=true