Heterogeneous tasks and conduits framework for rapid application portability and deployment

Emerging heterogeneous and homogeneous processing architectures demonstrate significant increases in throughput for scientific applications over traditional single-core processors. Each of these processing architectures varies widely in its processing capabilities, memory hierarchies, and programmin...

Bibliographic details
Main authors:	Brock, J.; Leeser, M.; Niedre, M.
Format:	Conference proceeding
Language:	eng
Subjects:	Abstracts; Joining processes; Structural beams; US Department of Transportation
Online access:	Order full text

container_end_page	9
container_issue
container_start_page	1
container_title
container_volume
creator	Brock, J.; Leeser, M.; Niedre, M.
description	Emerging heterogeneous and homogeneous processing architectures demonstrate significant increases in throughput for scientific applications over traditional single-core processors. Each of these processing architectures varies widely in its processing capabilities, memory hierarchies, and programming models. Determining the system architecture best suited to an application or deploying an application that is portable across a number of different platforms is increasingly complex and error-prone within this rapidly increasing and evolving design space. Quickly and easily designing portable, high-performance applications that function and remain correct across these widely varied systems has become paramount. To deal with these programming challenges, there is a great need for new models and tools to be developed. One example is MIT Lincoln Laboratory's Parallel Vector Tile Optimizing Library (PVTOL), which simplifies the task of developing software in C++ for these complex systems. This work extends the Tasks and Conduits framework in PVTOL to support GPU architectures and other heterogeneous platforms supported by the NVIDIA CUDA and OpenCL programming models. This allows the rapid portability of applications to a very wide range of architectures and clusters. Using this framework, porting applications from a single CPU core to a GPU requires a change of only 5 source lines of code (SLOC) in addition to the CUDA or OpenCL kernel. Using GPU-PVTOL we have achieved a 22x speedup in an application of Monte Carlo simulations of photon propagation through a biological medium, and a 60x speedup of a 3D cone beam computed tomography (CT) image reconstruction algorithm.
doi_str_mv	10.1109/InPar.2012.6339588
format	Conference Proceeding
fulltext	fulltext_linktorsrc
identifier	ISBN: 9781467326322
ispartof	2012 Innovative Parallel Computing (InPar), 2012, p.1-9
issn
language	eng
recordid	cdi_ieee_primary_6339588
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Abstracts; Joining processes; Structural beams; US Department of Transportation
title	Heterogeneous tasks and conduits framework for rapid application portability and deployment
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-15T10%3A45%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Heterogeneous%20tasks%20and%20conduits%20framework%20for%20rapid%20application%20portability%20and%20deployment&rft.btitle=2012%20Innovative%20Parallel%20Computing%20(InPar)&rft.au=Brock,%20J.&rft.date=2012-05&rft.spage=1&rft.epage=9&rft.pages=1-9&rft.isbn=9781467326322&rft.isbn_list=1467326321&rft_id=info:doi/10.1109/InPar.2012.6339588&rft_dat=%3Cieee_6IE%3E6339588%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=146732633X&rft.eisbn_list=9781467326339&rft.eisbn_list=9781467326315&rft.eisbn_list=1467326313&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6339588&rfr_iscdi=true