A new thread-level speculative automatic parallelization model and library based on duplicate code execution

Loop-efficient automatic parallelization has become increasingly relevant due to the growing number of cores in current processors and the programming effort needed to parallelize codes in these systems efficiently. However, automatic tools fail to extract all the available parallelism in irregular...

Bibliographic Details
Published in:	The Journal of supercomputing 2024, Vol.80 (10), p.13714-13737
Main Authors:	Martínez, Millán A.; Fraguela, Basilio B.; Cabaleiro, José C.; Rivera, Francisco F.
Format:	Article
Language:	eng
Subjects:	Compilers; Computer Science; Interpreters; Libraries; Processor Architectures; Programming Languages
Online Access:	Full text

container_end_page	13737
container_issue	10
container_start_page	13714
container_title	The Journal of supercomputing
container_volume	80
creator	Martínez, Millán A. ; Fraguela, Basilio B. ; Cabaleiro, José C. ; Rivera, Francisco F.
description	Loop-efficient automatic parallelization has become increasingly relevant due to the growing number of cores in current processors and the programming effort needed to parallelize codes in these systems efficiently. However, automatic tools fail to extract all the available parallelism in irregular loops with indirections, race conditions or potential data dependency violations, among many other possible causes. One of the successful ways to automatically parallelize these loops is the use of speculative parallelization techniques. This paper presents a new model and the corresponding C++ library that supports the speculative automatic parallelization of loops in shared memory systems, seeking competitive performance and scalability while keeping user effort to a minimum. The primary speculative strategy consists of redundantly executing chunks of loop iterations in a duplicate fashion. Namely, each chunk is executed speculatively in parallel to obtain results as soon as possible and sequentially in a different thread to validate the speculative results. The implementation uses C++11 threads and it makes intensive use of templates and advanced multithreading techniques. An evaluation based on various benchmarks confirms that our proposal provides a competitive level of performance and scalability.
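The duplicate-execution strategy described in the abstract can be illustrated with a minimal C++11 sketch. This is not the paper's library or its API: the names `run_chunk`, `speculative_loop` and `SpecReport`, the running-maximum loop body, and the use of `std::async` are all assumptions made for illustration only. Each chunk is started speculatively in parallel from a guessed value of the loop-carried state, while one additional thread re-executes the chunks in order and checks whether each speculative result matches the sequential one. The chunk size is assumed to be greater than zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical report type (not from the paper).
struct SpecReport {
    int result;                // validated final value of the loop-carried state
    std::size_t validated;     // chunks whose speculative result matched
    std::size_t mispredicted;  // chunks whose speculative result was wrong
};

// Illustrative loop body: a running maximum over iterations [begin, end).
int run_chunk(const std::vector<int>& in, std::size_t begin, std::size_t end, int state) {
    for (std::size_t i = begin; i < end; ++i) state = std::max(state, in[i]);
    return state;
}

// Assumes chunk > 0. Every chunk is executed twice: once speculatively and
// once by the sequential validator.
SpecReport speculative_loop(const std::vector<int>& in, std::size_t chunk, int init) {
    const std::size_t n = in.size();
    const std::size_t nchunks = (n + chunk - 1) / chunk;

    // Speculative copies: every chunk starts from the guessed live-in `init`.
    std::vector<std::future<int>> spec;
    for (std::size_t c = 0; c < nchunks; ++c) {
        const std::size_t b = c * chunk, e = std::min(n, b + chunk);
        spec.push_back(std::async(std::launch::async,
                                  [&in, b, e, init] { return run_chunk(in, b, e, init); }));
    }

    // Validating copy: a separate thread re-executes the chunks in order,
    // carrying the true state from one chunk to the next.
    auto validator = std::async(std::launch::async, [&]() -> SpecReport {
        SpecReport r{init, 0, 0};
        for (std::size_t c = 0; c < nchunks; ++c) {
            const std::size_t b = c * chunk, e = std::min(n, b + chunk);
            r.result = run_chunk(in, b, e, r.result);  // trusted sequential result
            if (spec[c].get() == r.result) ++r.validated;
            else ++r.mispredicted;                     // speculative value is discarded
        }
        return r;
    });
    return validator.get();
}
```

Calling `speculative_loop` on a small vector returns the sequentially validated result together with counts of matching and mismatching chunks. A real system would let consumers use speculative results before validation finishes and would roll back on a mismatch; this sketch only shows the duplicate execution and the comparison.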
doi_str_mv	10.1007/s11227-024-05987-0
format	Article
publisher	New York: Springer US
addtitle	J Supercomput
tpages	24
creationdate	2024
rights	The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
lds50	peer_reviewed
oa	free_for_read
orcidid	0000-0002-5674-5162 ; 0000-0002-1442-7668 ; 0000-0002-6728-9350 ; 0000-0002-3438-5960
collection	Springer Nature OA/Free Journals ; CrossRef
linktopdf	https://link.springer.com/content/pdf/10.1007/s11227-024-05987-0
linktohtml	https://link.springer.com/10.1007/s11227-024-05987-0
fulltext	fulltext
identifier	ISSN: 0920-8542
ispartof	The Journal of supercomputing, 2024, Vol.80 (10), p.13714-13737
issn	0920-8542 1573-0484
language	eng
recordid	cdi_proquest_journals_3066440797
source	SpringerNature Journals
subjects	Compilers ; Computer Science ; Interpreters ; Libraries ; Processor Architectures ; Programming Languages
title	A new thread-level speculative automatic parallelization model and library based on duplicate code execution
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T15%3A56%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20new%20thread-level%20speculative%20automatic%20parallelization%20model%20and%20library%20based%20on%20duplicate%20code%20execution&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Mart%C3%ADnez,%20Mill%C3%A1n%20A.&rft.date=2024&rft.volume=80&rft.issue=10&rft.spage=13714&rft.epage=13737&rft.pages=13714-13737&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-024-05987-0&rft_dat=%3Cproquest_cross%3E3066440797%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3066440797&rft_id=info:pmid/&rfr_iscdi=true