Using hardware operations to reduce the synchronization overhead of task pools
We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. Thus, the performance of...
Saved in:
Main authors: | Hoffmann, R., Korch, M., Rauber, T. |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 249 vol.1 |
---|---|
container_issue | |
container_start_page | 241 |
container_title | |
container_volume | |
creator | Hoffmann, R. ; Korch, M. ; Rauber, T. |
description | We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. Thus, the performance of a task-based irregular application is tightly coupled to the scalability and the overhead of the task pool used to execute it. In order to reduce this overhead this article considers the use of the hardware-specific synchronization operations compare & swap and load & reserve/store conditional. We present several different realizations of task pools using these operations. Runtime experiments on two shared-memory machines, a SunFire 6800 and an IBM p690, show that the new implementations obtain a significantly higher performance than implementations relying on the POSIX thread library for synchronization. |
doi_str_mv | 10.1109/ICPP.2004.1327927 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_1327927</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1327927</ieee_id><sourcerecordid>1327927</sourcerecordid><originalsourceid>FETCH-LOGICAL-i241t-5c8086d0686172bc53a073fa871e61b73b841b8347adc739f747b5bc26f25b463</originalsourceid><addsrcrecordid>eNotkMtKAzEYRoMXsNY-gLjJC8z4554sZfBSKNqFXZckk3GidTIko1KfXtGuvgMHzuJD6JJATQiY62WzXtcUgNeEUWWoOkIzyhithDRwjBZGaVDSCEp-6QTNgBiomCH6DJ2X8gpAgQk-Q4-bEocX3NvcftkccBpDtlNMQ8FTwjm0Hz7gqQ-47Aff5zTE7z-N02fIfbAtTh2ebHnDY0q7coFOO7srYXHYOdrc3T43D9Xq6X7Z3KyqSDmZKuE1aNmC1JIo6rxgFhTrrFYkSOIUc5oTpxlXtvWKmU5x5YTzVHZUOC7ZHF39d2MIYTvm-G7zfnu4gv0AOgRQkA</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Using hardware operations to reduce the synchronization overhead of task pools</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Hoffmann, R. ; Korch, M. ; Rauber, T.</creator><creatorcontrib>Hoffmann, R. ; Korch, M. ; Rauber, T.</creatorcontrib><description>We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. Thus, the performance of a task-based irregular application is tightly coupled to the scalability and the overhead of the task pool used to execute it. In order to reduce this overhead this article considers the use of the hardware-specific synchronization operations compare & swap and load & reserve/store conditional. We present several different realizations of task pools using these operations. 
Runtime experiments on two shared-memory machines, a SunFire 6800 and an IBM p690, show that the new implementations obtain a significantly higher performance than implementations relying on the POSIX thread library for synchronization.</description><identifier>ISSN: 0190-3918</identifier><identifier>ISBN: 9780769521978</identifier><identifier>ISBN: 0769521975</identifier><identifier>EISSN: 2332-5690</identifier><identifier>DOI: 10.1109/ICPP.2004.1327927</identifier><language>eng</language><publisher>IEEE</publisher><subject>Application software ; Computer science ; Concurrent computing ; Content addressable storage ; Hardware ; Load management ; Registers ; Runtime library ; Scalability ; Yarn</subject><ispartof>International Conference on Parallel Processing, 2004. ICPP 2004, 2004, p.241-249 vol.1</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1327927$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>310,311,782,786,791,792,2060,4052,4053,27932,54927</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1327927$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Hoffmann, R.</creatorcontrib><creatorcontrib>Korch, M.</creatorcontrib><creatorcontrib>Rauber, T.</creatorcontrib><title>Using hardware operations to reduce the synchronization overhead of task pools</title><title>International Conference on Parallel Processing, 2004. ICPP 2004</title><addtitle>ICPP</addtitle><description>We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. 
Thus, the performance of a task-based irregular application is tightly coupled to the scalability and the overhead of the task pool used to execute it. In order to reduce this overhead this article considers the use of the hardware-specific synchronization operations compare & swap and load & reserve/store conditional. We present several different realizations of task pools using these operations. Runtime experiments on two shared-memory machines, a SunFire 6800 and an IBM p690, show that the new implementations obtain a significantly higher performance than implementations relying on the POSIX thread library for synchronization.</description><subject>Application software</subject><subject>Computer science</subject><subject>Concurrent computing</subject><subject>Content addressable storage</subject><subject>Hardware</subject><subject>Load management</subject><subject>Registers</subject><subject>Runtime library</subject><subject>Scalability</subject><subject>Yarn</subject><issn>0190-3918</issn><issn>2332-5690</issn><isbn>9780769521978</isbn><isbn>0769521975</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2004</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotkMtKAzEYRoMXsNY-gLjJC8z4554sZfBSKNqFXZckk3GidTIko1KfXtGuvgMHzuJD6JJATQiY62WzXtcUgNeEUWWoOkIzyhithDRwjBZGaVDSCEp-6QTNgBiomCH6DJ2X8gpAgQk-Q4-bEocX3NvcftkccBpDtlNMQ8FTwjm0Hz7gqQ-47Aff5zTE7z-N02fIfbAtTh2ebHnDY0q7coFOO7srYXHYOdrc3T43D9Xq6X7Z3KyqSDmZKuE1aNmC1JIo6rxgFhTrrFYkSOIUc5oTpxlXtvWKmU5x5YTzVHZUOC7ZHF39d2MIYTvm-G7zfnu4gv0AOgRQkA</recordid><startdate>2004</startdate><enddate>2004</enddate><creator>Hoffmann, R.</creator><creator>Korch, M.</creator><creator>Rauber, T.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>2004</creationdate><title>Using hardware operations to reduce the synchronization 
overhead of task pools</title><author>Hoffmann, R. ; Korch, M. ; Rauber, T.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i241t-5c8086d0686172bc53a073fa871e61b73b841b8347adc739f747b5bc26f25b463</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Application software</topic><topic>Computer science</topic><topic>Concurrent computing</topic><topic>Content addressable storage</topic><topic>Hardware</topic><topic>Load management</topic><topic>Registers</topic><topic>Runtime library</topic><topic>Scalability</topic><topic>Yarn</topic><toplevel>online_resources</toplevel><creatorcontrib>Hoffmann, R.</creatorcontrib><creatorcontrib>Korch, M.</creatorcontrib><creatorcontrib>Rauber, T.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hoffmann, R.</au><au>Korch, M.</au><au>Rauber, T.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Using hardware operations to reduce the synchronization overhead of task pools</atitle><btitle>International Conference on Parallel Processing, 2004. 
ICPP 2004</btitle><stitle>ICPP</stitle><date>2004</date><risdate>2004</risdate><spage>241</spage><epage>249 vol.1</epage><pages>241-249 vol.1</pages><issn>0190-3918</issn><eissn>2332-5690</eissn><isbn>9780769521978</isbn><isbn>0769521975</isbn><abstract>We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. Thus, the performance of a task-based irregular application is tightly coupled to the scalability and the overhead of the task pool used to execute it. In order to reduce this overhead this article considers the use of the hardware-specific synchronization operations compare & swap and load & reserve/store conditional. We present several different realizations of task pools using these operations. Runtime experiments on two shared-memory machines, a SunFire 6800 and an IBM p690, show that the new implementations obtain a significantly higher performance than implementations relying on the POSIX thread library for synchronization.</abstract><pub>IEEE</pub><doi>10.1109/ICPP.2004.1327927</doi></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0190-3918 |
ispartof | International Conference on Parallel Processing, 2004. ICPP 2004, 2004, p.241-249 vol.1 |
issn | 0190-3918 ; 2332-5690 |
language | eng |
recordid | cdi_ieee_primary_1327927 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Application software ; Computer science ; Concurrent computing ; Content addressable storage ; Hardware ; Load management ; Registers ; Runtime library ; Scalability ; Yarn |
title | Using hardware operations to reduce the synchronization overhead of task pools |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-04T14%3A17%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Using%20hardware%20operations%20to%20reduce%20the%20synchronization%20overhead%20of%20task%20pools&rft.btitle=International%20Conference%20on%20Parallel%20Processing,%202004.%20ICPP%202004&rft.au=Hoffmann,%20R.&rft.date=2004&rft.spage=241&rft.epage=249%20vol.1&rft.pages=241-249%20vol.1&rft.issn=0190-3918&rft.eissn=2332-5690&rft.isbn=9780769521978&rft.isbn_list=0769521975&rft_id=info:doi/10.1109/ICPP.2004.1327927&rft_dat=%3Cieee_6IE%3E1327927%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1327927&rfr_iscdi=true |