Using hardware operations to reduce the synchronization overhead of task pools

We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. Thus, the performance of...

Bibliographic details
Main authors: Hoffmann, R., Korch, M., Rauber, T.
Format: Conference proceeding
Language: eng
container_end_page 249 vol.1
container_issue
container_start_page 241
container_title
container_volume
creator Hoffmann, R.
Korch, M.
Rauber, T.
description We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. Thus, the performance of a task-based irregular application is tightly coupled to the scalability and the overhead of the task pool used to execute it. In order to reduce this overhead this article considers the use of the hardware-specific synchronization operations compare & swap and load & reserve/store conditional. We present several different realizations of task pools using these operations. Runtime experiments on two shared-memory machines, a SunFire 6800 and an IBM p690, show that the new implementations obtain a significantly higher performance than implementations relying on the POSIX thread library for synchronization.
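As a rough illustration of the idea in the description, the following is a minimal sketch of a central task pool whose head pointer is updated with compare & swap instead of a POSIX mutex. It is not the authors' implementation: the names `task_t`, `task_pool_t`, `pool_put`, `pool_get`, and `pool_demo` are hypothetical, the pool is a simple LIFO list in the style of a Treiber stack, and C11 `<stdatomic.h>` stands in for the hardware-specific operations. The sketch also ignores the ABA problem, which a real pool that recycles task nodes would have to handle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical task record; the paper's task layout is not given here. */
typedef struct task {
    struct task *next;
    int payload;
} task_t;

/* Shared pool: a LIFO list whose head changes only via compare & swap. */
typedef struct {
    _Atomic(task_t *) head;
} task_pool_t;

static void pool_init(task_pool_t *p) {
    atomic_init(&p->head, NULL);
}

/* Insert a task: retry until the head is swapped from `old` to `t`. */
static void pool_put(task_pool_t *p, task_t *t) {
    task_t *old = atomic_load(&p->head);
    do {
        t->next = old; /* on CAS failure, `old` holds the current head */
    } while (!atomic_compare_exchange_weak(&p->head, &old, t));
}

/* Remove a task, or return NULL if the pool is empty.
 * Assumption: nodes are not freed or reused while other threads may
 * still hold a pointer to them (no ABA protection in this sketch). */
static task_t *pool_get(task_pool_t *p) {
    task_t *old = atomic_load(&p->head);
    while (old != NULL &&
           !atomic_compare_exchange_weak(&p->head, &old, old->next)) {
        /* `old` was refreshed by the failed CAS; try again */
    }
    return old;
}

/* Single-threaded demo: put three tasks, drain the pool, sum payloads. */
int pool_demo(void) {
    task_pool_t pool;
    task_t tasks[3] = {{NULL, 1}, {NULL, 2}, {NULL, 3}};
    int sum = 0;

    pool_init(&pool);
    for (int i = 0; i < 3; i++) {
        pool_put(&pool, &tasks[i]);
    }
    for (task_t *t; (t = pool_get(&pool)) != NULL;) {
        sum += t->payload;
    }
    assert(pool_get(&pool) == NULL); /* pool is empty after draining */
    return sum;
}
```

Calling `pool_demo()` exercises both operations without any lock. On architectures that provide load & reserve/store conditional rather than a native compare & swap instruction, compilers typically implement `atomic_compare_exchange_weak` as a short load-reserve/store-conditional loop, which is why the weak variant is used inside the retry loops.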
doi_str_mv 10.1109/ICPP.2004.1327927
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_1327927</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1327927</ieee_id><sourcerecordid>1327927</sourcerecordid><originalsourceid>FETCH-LOGICAL-i241t-5c8086d0686172bc53a073fa871e61b73b841b8347adc739f747b5bc26f25b463</originalsourceid><addsrcrecordid>eNotkMtKAzEYRoMXsNY-gLjJC8z4554sZfBSKNqFXZckk3GidTIko1KfXtGuvgMHzuJD6JJATQiY62WzXtcUgNeEUWWoOkIzyhithDRwjBZGaVDSCEp-6QTNgBiomCH6DJ2X8gpAgQk-Q4-bEocX3NvcftkccBpDtlNMQ8FTwjm0Hz7gqQ-47Aff5zTE7z-N02fIfbAtTh2ebHnDY0q7coFOO7srYXHYOdrc3T43D9Xq6X7Z3KyqSDmZKuE1aNmC1JIo6rxgFhTrrFYkSOIUc5oTpxlXtvWKmU5x5YTzVHZUOC7ZHF39d2MIYTvm-G7zfnu4gv0AOgRQkA</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Using hardware operations to reduce the synchronization overhead of task pools</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Hoffmann, R. ; Korch, M. ; Rauber, T.</creator><creatorcontrib>Hoffmann, R. ; Korch, M. ; Rauber, T.</creatorcontrib><description>We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. Thus, the performance of a task-based irregular application is tightly coupled to the scalability and the overhead of the task pool used to execute it. In order to reduce this overhead this article considers the use of the hardware-specific synchronization operations compare &amp; swap and load &amp; reserve/store conditional. We present several different realizations of task pools using these operations. 
Runtime experiments on two shared-memory machines, a SunFire 6800 and an IBM p690, show that the new implementations obtain a significantly higher performance than implementations relying on the POSIX thread library for synchronization.</description><identifier>ISSN: 0190-3918</identifier><identifier>ISBN: 9780769521978</identifier><identifier>ISBN: 0769521975</identifier><identifier>EISSN: 2332-5690</identifier><identifier>DOI: 10.1109/ICPP.2004.1327927</identifier><language>eng</language><publisher>IEEE</publisher><subject>Application software ; Computer science ; Concurrent computing ; Content addressable storage ; Hardware ; Load management ; Registers ; Runtime library ; Scalability ; Yarn</subject><ispartof>International Conference on Parallel Processing, 2004. ICPP 2004, 2004, p.241-249 vol.1</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1327927$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>310,311,782,786,791,792,2060,4052,4053,27932,54927</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1327927$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Hoffmann, R.</creatorcontrib><creatorcontrib>Korch, M.</creatorcontrib><creatorcontrib>Rauber, T.</creatorcontrib><title>Using hardware operations to reduce the synchronization overhead of task pools</title><title>International Conference on Parallel Processing, 2004. ICPP 2004</title><addtitle>ICPP</addtitle><description>We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. 
Thus, the performance of a task-based irregular application is tightly coupled to the scalability and the overhead of the task pool used to execute it. In order to reduce this overhead this article considers the use of the hardware-specific synchronization operations compare &amp; swap and load &amp; reserve/store conditional. We present several different realizations of task pools using these operations. Runtime experiments on two shared-memory machines, a SunFire 6800 and an IBM p690, show that the new implementations obtain a significantly higher performance than implementations relying on the POSIX thread library for synchronization.</description><subject>Application software</subject><subject>Computer science</subject><subject>Concurrent computing</subject><subject>Content addressable storage</subject><subject>Hardware</subject><subject>Load management</subject><subject>Registers</subject><subject>Runtime library</subject><subject>Scalability</subject><subject>Yarn</subject><issn>0190-3918</issn><issn>2332-5690</issn><isbn>9780769521978</isbn><isbn>0769521975</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2004</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotkMtKAzEYRoMXsNY-gLjJC8z4554sZfBSKNqFXZckk3GidTIko1KfXtGuvgMHzuJD6JJATQiY62WzXtcUgNeEUWWoOkIzyhithDRwjBZGaVDSCEp-6QTNgBiomCH6DJ2X8gpAgQk-Q4-bEocX3NvcftkccBpDtlNMQ8FTwjm0Hz7gqQ-47Aff5zTE7z-N02fIfbAtTh2ebHnDY0q7coFOO7srYXHYOdrc3T43D9Xq6X7Z3KyqSDmZKuE1aNmC1JIo6rxgFhTrrFYkSOIUc5oTpxlXtvWKmU5x5YTzVHZUOC7ZHF39d2MIYTvm-G7zfnu4gv0AOgRQkA</recordid><startdate>2004</startdate><enddate>2004</enddate><creator>Hoffmann, R.</creator><creator>Korch, M.</creator><creator>Rauber, T.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>2004</creationdate><title>Using hardware operations to reduce the synchronization 
overhead of task pools</title><author>Hoffmann, R. ; Korch, M. ; Rauber, T.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i241t-5c8086d0686172bc53a073fa871e61b73b841b8347adc739f747b5bc26f25b463</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Application software</topic><topic>Computer science</topic><topic>Concurrent computing</topic><topic>Content addressable storage</topic><topic>Hardware</topic><topic>Load management</topic><topic>Registers</topic><topic>Runtime library</topic><topic>Scalability</topic><topic>Yarn</topic><toplevel>online_resources</toplevel><creatorcontrib>Hoffmann, R.</creatorcontrib><creatorcontrib>Korch, M.</creatorcontrib><creatorcontrib>Rauber, T.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hoffmann, R.</au><au>Korch, M.</au><au>Rauber, T.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Using hardware operations to reduce the synchronization overhead of task pools</atitle><btitle>International Conference on Parallel Processing, 2004. 
ICPP 2004</btitle><stitle>ICPP</stitle><date>2004</date><risdate>2004</risdate><spage>241</spage><epage>249 vol.1</epage><pages>241-249 vol.1</pages><issn>0190-3918</issn><eissn>2332-5690</eissn><isbn>9780769521978</isbn><isbn>0769521975</isbn><abstract>We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. Thus, the performance of a task-based irregular application is tightly coupled to the scalability and the overhead of the task pool used to execute it. In order to reduce this overhead this article considers the use of the hardware-specific synchronization operations compare &amp; swap and load &amp; reserve/store conditional. We present several different realizations of task pools using these operations. Runtime experiments on two shared-memory machines, a SunFire 6800 and an IBM p690, show that the new implementations obtain a significantly higher performance than implementations relying on the POSIX thread library for synchronization.</abstract><pub>IEEE</pub><doi>10.1109/ICPP.2004.1327927</doi></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0190-3918
ispartof International Conference on Parallel Processing, 2004. ICPP 2004, 2004, p.241-249 vol.1
issn 0190-3918
2332-5690
language eng
recordid cdi_ieee_primary_1327927
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Application software
Computer science
Concurrent computing
Content addressable storage
Hardware
Load management
Registers
Runtime library
Scalability
Yarn
title Using hardware operations to reduce the synchronization overhead of task pools
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-04T14%3A17%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Using%20hardware%20operations%20to%20reduce%20the%20synchronization%20overhead%20of%20task%20pools&rft.btitle=International%20Conference%20on%20Parallel%20Processing,%202004.%20ICPP%202004&rft.au=Hoffmann,%20R.&rft.date=2004&rft.spage=241&rft.epage=249%20vol.1&rft.pages=241-249%20vol.1&rft.issn=0190-3918&rft.eissn=2332-5690&rft.isbn=9780769521978&rft.isbn_list=0769521975&rft_id=info:doi/10.1109/ICPP.2004.1327927&rft_dat=%3Cieee_6IE%3E1327927%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1327927&rfr_iscdi=true