Software Pipelined Execution of Stream Programs on GPUs

The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them....
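
The filter-and-channel structure described above can be illustrated with a minimal Python sketch. It is not StreamIt syntax and not the paper's GPU technique: the names `Channel`, `Filter`, `run_pipeline`, `scale` and `pair_sum` are hypothetical, and the sketch assumes a linear pipeline in which each filter consumes a fixed number of items per firing.

```python
from collections import deque
from typing import Callable, Deque, List


class Channel:
    """FIFO communication channel between two filters (hypothetical helper)."""

    def __init__(self) -> None:
        self.items: Deque[int] = deque()

    def push(self, value: int) -> None:
        self.items.append(value)

    def pop(self) -> int:
        return self.items.popleft()

    def __len__(self) -> int:
        return len(self.items)


class Filter:
    """A filter that consumes `pop_rate` items per firing and emits a list of outputs."""

    def __init__(self, name: str, pop_rate: int,
                 work: Callable[[List[int]], List[int]]) -> None:
        self.name = name
        self.pop_rate = pop_rate
        self.work = work

    def can_fire(self, inp: Channel) -> bool:
        # A filter may fire only when enough input items are buffered.
        return len(inp) >= self.pop_rate

    def fire(self, inp: Channel, out: Channel) -> None:
        args = [inp.pop() for _ in range(self.pop_rate)]
        for value in self.work(args):
            out.push(value)


def run_pipeline(filters: List[Filter], source_data: List[int]) -> List[int]:
    """Run a linear chain of filters until no filter can fire any more."""
    # channels[i] feeds filters[i]; channels[-1] collects the final output.
    channels = [Channel() for _ in range(len(filters) + 1)]
    for value in source_data:
        channels[0].push(value)

    progress = True
    while progress:
        progress = False
        for i, flt in enumerate(filters):
            if flt.can_fire(channels[i]):
                flt.fire(channels[i], channels[i + 1])
                progress = True
    return list(channels[-1].items)


# Two example filters: one doubles each item, one sums consecutive pairs.
scale = Filter("scale", pop_rate=1, work=lambda xs: [2 * xs[0]])
pair_sum = Filter("pair_sum", pop_rate=2, work=lambda xs: [xs[0] + xs[1]])

result = run_pipeline([scale, pair_sum], [1, 2, 3, 4])
```

Because each filter declares how many items it consumes per firing, a scheduler can decide ahead of time which filters are ready to run. That property is what makes such graphs amenable to static scheduling.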

Detailed description

Bibliographic details
Main authors:	Udupa, Abhishek; Govindarajan, R.; Thazhuthaveetil, Matthew J.
Format:	Conference proceeding
Language:	English
Subjects:	Application software; Bandwidth; Communication channels; Computer architecture; Computing methodologies > Computer graphics > Graphics systems and interfaces > Graphics processors; CUDA; Filters; GPU Programming; Graphics; Parallel programming; Pipelines; Processor scheduling; Programming profession; Software and its engineering > Software notations and tools > Compilers; Software Pipelining; Stream Programming

container_end_page	209
container_issue
container_start_page	200
container_title
container_volume
creator	Udupa, Abhishek ; Govindarajan, R. ; Thazhuthaveetil, Matthew J.
description	The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general-purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them. The StreamIt graphs describe task, data, and pipeline parallelism, which can be exploited on modern Graphics Processing Units (GPUs), as they support abundant parallelism in hardware. In this paper, we describe the challenges in mapping StreamIt to GPUs and propose an efficient technique to software-pipeline the execution of stream programs on GPUs. We formulate this problem (both scheduling and assignment of filters to processors) as an efficient Integer Linear Program (ILP), which is then solved using ILP solvers. We also describe a novel buffer layout technique for GPUs that facilitates exploiting the high memory bandwidth available in GPUs. The proposed scheduling utilizes both the scalar units in the GPU, to exploit data parallelism, and the multiprocessors, to exploit task and pipeline parallelism. Further, it takes into consideration the synchronization and bandwidth limitations of GPUs, and yields speedups between 1.87X and 36.83X over a single-threaded CPU.
doi_str_mv	10.1109/CGO.2009.20
format	Conference Proceeding
fullrecord	<record><control><sourceid>acm_6IE</sourceid><recordid>TN_cdi_ieee_primary_4907664</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4907664</ieee_id><sourcerecordid>acm_books_10_1109_CGO_2009_20</sourcerecordid><originalsourceid>FETCH-LOGICAL-a305t-af2fe87e51c3088ad61baed7fe78326db520b84ce5183471d109e57f9a3ffb5e3</originalsourceid><addsrcrecordid>eNqNkMFLwzAUxgMyUGdPHr3k4EnofEmaJjlKmVUYrDB3Dmn7ItV1HWlF_e-XMf8A3-F78L6PB9-PkFsGC8bAPBblesEBTJQLkhilQeVGCqlymJHr6GiT8UzzS5KM4wfEkVKDYFdEbQY_fbuAtOoOuOv22NLlDzZfUzfs6eDpZgroelqF4T24fqTxWlbb8YbMvNuNmPztOdk-L9-Kl3S1Ll-Lp1XqBMgpdZ571AolawRo7dqc1Q5b5VFpwfO2lhxqnTUxoEWmWBvboFTeOOF9LVHMyd35b4eI9hC63oVfm5lYMM-i-3B2XdPbehg-R8vAnpDYiMSekESxdejQx_D9P8LiCB3JXZM</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Software Pipelined Execution of Stream Programs on GPUs</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Udupa, Abhishek ; Govindarajan, R. ; Thazhuthaveetil, Matthew J.</creator><creatorcontrib>Udupa, Abhishek ; Govindarajan, R. ; Thazhuthaveetil, Matthew J.</creatorcontrib><description>The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them. The StreamIt graphs describe task, data and pipeline parallelism which can be exploited on modern Graphics Processing Units (GPUs), as they support abundant parallelism in hardware. In this paper, we describe the challenges in mapping StreamIt to GPUs and propose an efficient technique to software pipeline the execution of stream programs on GPUs. 
We formulate this problem --- both scheduling and assignment of filters to processors --- as an efficient Integer Linear Program (ILP), which is then solved using ILP solvers. We also describe a novel buffer layout technique for GPUs which facilitates exploiting the high memory bandwidth available in GPUs. The proposed scheduling utilizes both the scalar units in GPU, to exploit data parallelism, and multiprocessors, to exploit task and pipeline parallelism. Further it takes into consideration the synchronization and bandwidth limitations of GPUs, and yields speedups between 1.87X and 36.83X over a single threaded CPU.</description><identifier>ISBN: 9780769535760</identifier><identifier>ISBN: 0769535763</identifier><identifier>DOI: 10.1109/CGO.2009.20</identifier><identifier>LCCN: 2008942482</identifier><language>eng</language><publisher>Washington, DC, USA: IEEE Computer Society</publisher><subject>Application software ; Bandwidth ; Communication channels ; Computer architecture ; Computing methodologies -- Computer graphics -- Graphics systems and interfaces -- Graphics processors ; CUDA ; Filters ; GPU Programming ; Graphics ; Parallel programming ; Pipelines ; Processor scheduling ; Programming profession ; Software and its engineering -- Software notations and tools -- Compilers ; Software Pipelining ; Stream Programming</subject><ispartof>2009 International Symposium on Code Generation and Optimization, 2009, 
p.200-209</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4907664$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4907664$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Udupa, Abhishek</creatorcontrib><creatorcontrib>Govindarajan, R.</creatorcontrib><creatorcontrib>Thazhuthaveetil, Matthew J.</creatorcontrib><title>Software Pipelined Execution of Stream Programs on GPUs</title><title>2009 International Symposium on Code Generation and Optimization</title><addtitle>CGO</addtitle><description>The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them. The StreamIt graphs describe task, data and pipeline parallelism which can be exploited on modern Graphics Processing Units (GPUs), as they support abundant parallelism in hardware. In this paper, we describe the challenges in mapping StreamIt to GPUs and propose an efficient technique to software pipeline the execution of stream programs on GPUs. We formulate this problem --- both scheduling and assignment of filters to processors --- as an efficient Integer Linear Program (ILP), which is then solved using ILP solvers. We also describe a novel buffer layout technique for GPUs which facilitates exploiting the high memory bandwidth available in GPUs. The proposed scheduling utilizes both the scalar units in GPU, to exploit data parallelism, and multiprocessors, to exploit task and pipeline parallelism. 
Further it takes into consideration the synchronization and bandwidth limitations of GPUs, and yields speedups between 1.87X and 36.83X over a single threaded CPU.</description><subject>Application software</subject><subject>Bandwidth</subject><subject>Communication channels</subject><subject>Computer architecture</subject><subject>Computing methodologies -- Computer graphics -- Graphics systems and interfaces -- Graphics processors</subject><subject>CUDA</subject><subject>Filters</subject><subject>GPU Programming</subject><subject>Graphics</subject><subject>Parallel programming</subject><subject>Pipelines</subject><subject>Processor scheduling</subject><subject>Programming profession</subject><subject>Software and its engineering -- Software notations and tools -- Compilers</subject><subject>Software Pipelining</subject><subject>Stream Programming</subject><isbn>9780769535760</isbn><isbn>0769535763</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2009</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNqNkMFLwzAUxgMyUGdPHr3k4EnofEmaJjlKmVUYrDB3Dmn7ItV1HWlF_e-XMf8A3-F78L6PB9-PkFsGC8bAPBblesEBTJQLkhilQeVGCqlymJHr6GiT8UzzS5KM4wfEkVKDYFdEbQY_fbuAtOoOuOv22NLlDzZfUzfs6eDpZgroelqF4T24fqTxWlbb8YbMvNuNmPztOdk-L9-Kl3S1Ll-Lp1XqBMgpdZ571AolawRo7dqc1Q5b5VFpwfO2lhxqnTUxoEWmWBvboFTeOOF9LVHMyd35b4eI9hC63oVfm5lYMM-i-3B2XdPbehg-R8vAnpDYiMSekESxdejQx_D9P8LiCB3JXZM</recordid><startdate>200903</startdate><enddate>200903</enddate><creator>Udupa, Abhishek</creator><creator>Govindarajan, R.</creator><creator>Thazhuthaveetil, Matthew J.</creator><general>IEEE Computer Society</general><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200903</creationdate><title>Software Pipelined Execution of Stream Programs on GPUs</title><author>Udupa, Abhishek ; Govindarajan, R. 
; Thazhuthaveetil, Matthew J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a305t-af2fe87e51c3088ad61baed7fe78326db520b84ce5183471d109e57f9a3ffb5e3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Application software</topic><topic>Bandwidth</topic><topic>Communication channels</topic><topic>Computer architecture</topic><topic>Computing methodologies -- Computer graphics -- Graphics systems and interfaces -- Graphics processors</topic><topic>CUDA</topic><topic>Filters</topic><topic>GPU Programming</topic><topic>Graphics</topic><topic>Parallel programming</topic><topic>Pipelines</topic><topic>Processor scheduling</topic><topic>Programming profession</topic><topic>Software and its engineering -- Software notations and tools -- Compilers</topic><topic>Software Pipelining</topic><topic>Stream Programming</topic><toplevel>online_resources</toplevel><creatorcontrib>Udupa, Abhishek</creatorcontrib><creatorcontrib>Govindarajan, R.</creatorcontrib><creatorcontrib>Thazhuthaveetil, Matthew J.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Udupa, Abhishek</au><au>Govindarajan, R.</au><au>Thazhuthaveetil, Matthew J.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Software Pipelined Execution of Stream Programs on GPUs</atitle><btitle>2009 International Symposium on Code Generation and 
Optimization</btitle><stitle>CGO</stitle><date>2009-03</date><risdate>2009</risdate><spage>200</spage><epage>209</epage><pages>200-209</pages><isbn>9780769535760</isbn><isbn>0769535763</isbn><abstract>The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them. The StreamIt graphs describe task, data and pipeline parallelism which can be exploited on modern Graphics Processing Units (GPUs), as they support abundant parallelism in hardware. In this paper, we describe the challenges in mapping StreamIt to GPUs and propose an efficient technique to software pipeline the execution of stream programs on GPUs. We formulate this problem --- both scheduling and assignment of filters to processors --- as an efficient Integer Linear Program (ILP), which is then solved using ILP solvers. We also describe a novel buffer layout technique for GPUs which facilitates exploiting the high memory bandwidth available in GPUs. The proposed scheduling utilizes both the scalar units in GPU, to exploit data parallelism, and multiprocessors, to exploit task and pipeline parallelism. Further it takes into consideration the synchronization and bandwidth limitations of GPUs, and yields speedups between 1.87X and 36.83X over a single threaded CPU.</abstract><cop>Washington, DC, USA</cop><pub>IEEE Computer Society</pub><doi>10.1109/CGO.2009.20</doi><tpages>10</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISBN: 9780769535760
ispartof	2009 International Symposium on Code Generation and Optimization, 2009, p.200-209
issn
language	eng
recordid	cdi_ieee_primary_4907664
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Application software ; Bandwidth ; Communication channels ; Computer architecture ; Computing methodologies -- Computer graphics -- Graphics systems and interfaces -- Graphics processors ; CUDA ; Filters ; GPU Programming ; Graphics ; Parallel programming ; Pipelines ; Processor scheduling ; Programming profession ; Software and its engineering -- Software notations and tools -- Compilers ; Software Pipelining ; Stream Programming
title	Software Pipelined Execution of Stream Programs on GPUs
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T07%3A21%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Software%20Pipelined%20Execution%20of%20Stream%20Programs%20on%20GPUs&rft.btitle=2009%20International%20Symposium%20on%20Code%20Generation%20and%20Optimization&rft.au=Udupa,%20Abhishek&rft.date=2009-03&rft.spage=200&rft.epage=209&rft.pages=200-209&rft.isbn=9780769535760&rft.isbn_list=0769535763&rft_id=info:doi/10.1109/CGO.2009.20&rft_dat=%3Cacm_6IE%3Eacm_books_10_1109_CGO_2009_20%3C/acm_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4907664&rfr_iscdi=true