PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems

In the past decade, the high-performance compute capabilities of heterogeneous GPGPU platforms have led to the popularity of data-parallel programming languages such as CUDA and OpenCL. Developing high-performance parallel programming solutions in such languages involves a steep learning cur...

Bibliographic Details
Published in:	IEEE transactions on computers 2022-09, Vol.71 (9), p.2234-2247
Main Authors:	Ghose, Anirban; Singh, Siddharth; Kulaharia, Vivek; Dokara, Lokesh; Maity, Srijeeta; Dey, Soumyajit
Format:	Article
Language:	eng
Subjects:	Central Processing Unit; coarse-grained scheduling; Concurrency; Decisions; Deep learning; Engines; fine-grained scheduling; GPGPU; Graphics processing units; Kernel; Learning curves; OpenCL; Parallel programming; Platforms; Processor scheduling; Programming languages; Schedules; Scheduling; Task analysis
Online Access:	Order full text

container_end_page	2247
container_issue	9
container_start_page	2234
container_title	IEEE transactions on computers
container_volume	71
creator	Ghose, Anirban; Singh, Siddharth; Kulaharia, Vivek; Dokara, Lokesh; Maity, Srijeeta; Dey, Soumyajit
description	In the past decade, the high-performance compute capabilities of heterogeneous GPGPU platforms have led to the popularity of data-parallel programming languages such as CUDA and OpenCL. Developing high-performance parallel programming solutions in such languages involves a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks that provide high-level abstractions to ease the development of data-parallel applications on heterogeneous platforms. However, the scheduling decisions made by such frameworks exploit only coarse-grained concurrency in data-parallel applications. In this paper, we propose PySchedCL, a framework that explores fine-grained, concurrency-aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes through extensive experimental evaluations on a diverse set of popular Deep Learning benchmarks.
doi_str_mv	10.1109/TC.2021.3125792
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TC_2021_3125792</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9606595</ieee_id><sourcerecordid>2700412667</sourcerecordid><originalsourceid>FETCH-LOGICAL-c284t-8fa1743a5b4a583537e703d488e69274a02b8b3f87188afc6fa195cee60518ce3</originalsourceid><addsrcrecordid>eNo9kEtLw0AUhQdRsFbXLtwEXKe9M5N5uZP4qBCw0LoeptObmpImdSYV8u9NaXF1Nt85Bz5C7ilMKAUzXeYTBoxOOGVCGXZBRlQIlRoj5CUZAVCdGp7BNbmJcQsAkoEZkdm8X_hvXOfFU1LgLwa3qZpNkreNP4SAje-Tqklm2GFoN9hge4jJi-tcOnfB1TXWyaKPHe7iLbkqXR3x7pxj8vX2usxnafH5_pE_F6lnOutSXTqqMu7EKnNCc8EVKuDrTGuUhqnMAVvpFS-1olq70suBN8IjShBUe-Rj8nja3Yf254Cxs9v2EJrh0jIFkFEmpRqo6YnyoY0xYGn3odq50FsK9qjLLnN71GXPuobGw6lRIeI_bSRIYQT_AyxcZMw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2700412667</pqid></control><display><type>article</type><title>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</title><source>IEEE Electronic Library (IEL)</source><creator>Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit</creator><creatorcontrib>Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit</creatorcontrib><description>In the past decade, high performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data parallel programming languages such as CUDA and OpenCL. Developing high performance parallel programming solutions using such languages involve a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks which provide high-level abstractions for easing the development of data-parallel applications on heterogeneous platforms. 
However, the scheduling decisions undertaken by such frameworks only exploit coarse-grained concurrency in data parallel applications. In this paper, we propose PySchedCL , a framework which explores fine-grained concurrency aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes by conducting extensive experimental evaluations for a diverse set of popular Deep Learning benchmarks.</description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/TC.2021.3125792</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Central Processing Unit ; coarse-grained scheduling ; Concurrency ; Decisions ; Deep learning ; Engines ; fine-grained scheduling ; GPGPU ; Graphics processing units ; Kernel ; Learning curves ; OpenCL ; Parallel programming ; Platforms ; Processor scheduling ; Programming languages ; Schedules ; Scheduling ; Task analysis</subject><ispartof>IEEE transactions on computers, 2022-09, Vol.71 (9), p.2234-2247</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c284t-8fa1743a5b4a583537e703d488e69274a02b8b3f87188afc6fa195cee60518ce3</cites><orcidid>0000-0003-1108-4572 ; 0000-0002-2756-4290 ; 0000-0001-9329-6389</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9606595$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9606595$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Ghose, Anirban</creatorcontrib><creatorcontrib>Singh, Siddharth</creatorcontrib><creatorcontrib>Kulaharia, Vivek</creatorcontrib><creatorcontrib>Dokara, Lokesh</creatorcontrib><creatorcontrib>Maity, Srijeeta</creatorcontrib><creatorcontrib>Dey, Soumyajit</creatorcontrib><title>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description>In the past decade, high performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data parallel programming languages such as CUDA and OpenCL. Developing high performance parallel programming solutions using such languages involve a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks which provide high-level abstractions for easing the development of data-parallel applications on heterogeneous platforms. However, the scheduling decisions undertaken by such frameworks only exploit coarse-grained concurrency in data parallel applications. 
In this paper, we propose PySchedCL , a framework which explores fine-grained concurrency aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes by conducting extensive experimental evaluations for a diverse set of popular Deep Learning benchmarks.</description><subject>Central Processing Unit</subject><subject>coarse-grained scheduling</subject><subject>Concurrency</subject><subject>Decisions</subject><subject>Deep learning</subject><subject>Engines</subject><subject>fine-grained scheduling</subject><subject>GPGPU</subject><subject>Graphics processing units</subject><subject>Kernel</subject><subject>Learning curves</subject><subject>OpenCL</subject><subject>Parallel programming</subject><subject>Platforms</subject><subject>Processor scheduling</subject><subject>Programming languages</subject><subject>Schedules</subject><subject>Scheduling</subject><subject>Task analysis</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kEtLw0AUhQdRsFbXLtwEXKe9M5N5uZP4qBCw0LoeptObmpImdSYV8u9NaXF1Nt85Bz5C7ilMKAUzXeYTBoxOOGVCGXZBRlQIlRoj5CUZAVCdGp7BNbmJcQsAkoEZkdm8X_hvXOfFU1LgLwa3qZpNkreNP4SAje-Tqklm2GFoN9hge4jJi-tcOnfB1TXWyaKPHe7iLbkqXR3x7pxj8vX2usxnafH5_pE_F6lnOutSXTqqMu7EKnNCc8EVKuDrTGuUhqnMAVvpFS-1olq70suBN8IjShBUe-Rj8nja3Yf254Cxs9v2EJrh0jIFkFEmpRqo6YnyoY0xYGn3odq50FsK9qjLLnN71GXPuobGw6lRIeI_bSRIYQT_AyxcZMw</recordid><startdate>20220901</startdate><enddate>20220901</enddate><creator>Ghose, Anirban</creator><creator>Singh, Siddharth</creator><creator>Kulaharia, Vivek</creator><creator>Dokara, Lokesh</creator><creator>Maity, Srijeeta</creator><creator>Dey, Soumyajit</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-1108-4572</orcidid><orcidid>https://orcid.org/0000-0002-2756-4290</orcidid><orcidid>https://orcid.org/0000-0001-9329-6389</orcidid></search><sort><creationdate>20220901</creationdate><title>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</title><author>Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c284t-8fa1743a5b4a583537e703d488e69274a02b8b3f87188afc6fa195cee60518ce3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Central Processing Unit</topic><topic>coarse-grained scheduling</topic><topic>Concurrency</topic><topic>Decisions</topic><topic>Deep learning</topic><topic>Engines</topic><topic>fine-grained scheduling</topic><topic>GPGPU</topic><topic>Graphics processing units</topic><topic>Kernel</topic><topic>Learning curves</topic><topic>OpenCL</topic><topic>Parallel programming</topic><topic>Platforms</topic><topic>Processor scheduling</topic><topic>Programming languages</topic><topic>Schedules</topic><topic>Scheduling</topic><topic>Task analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ghose, Anirban</creatorcontrib><creatorcontrib>Singh, Siddharth</creatorcontrib><creatorcontrib>Kulaharia, Vivek</creatorcontrib><creatorcontrib>Dokara, Lokesh</creatorcontrib><creatorcontrib>Maity, Srijeeta</creatorcontrib><creatorcontrib>Dey, Soumyajit</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 
1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ghose, Anirban</au><au>Singh, Siddharth</au><au>Kulaharia, Vivek</au><au>Dokara, Lokesh</au><au>Maity, Srijeeta</au><au>Dey, Soumyajit</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</atitle><jtitle>IEEE transactions on computers</jtitle><stitle>TC</stitle><date>2022-09-01</date><risdate>2022</risdate><volume>71</volume><issue>9</issue><spage>2234</spage><epage>2247</epage><pages>2234-2247</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract>In the past decade, high performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data parallel programming languages such as CUDA and OpenCL. Developing high performance parallel programming solutions using such languages involve a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks which provide high-level abstractions for easing the development of data-parallel applications on heterogeneous platforms. 
However, the scheduling decisions undertaken by such frameworks only exploit coarse-grained concurrency in data parallel applications. In this paper, we propose PySchedCL , a framework which explores fine-grained concurrency aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes by conducting extensive experimental evaluations for a diverse set of popular Deep Learning benchmarks.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TC.2021.3125792</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0003-1108-4572</orcidid><orcidid>https://orcid.org/0000-0002-2756-4290</orcidid><orcidid>https://orcid.org/0000-0001-9329-6389</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0018-9340
ispartof	IEEE transactions on computers, 2022-09, Vol.71 (9), p.2234-2247
issn	0018-9340 1557-9956
language	eng
recordid	cdi_crossref_primary_10_1109_TC_2021_3125792
source	IEEE Electronic Library (IEL)
subjects	Central Processing Unit; coarse-grained scheduling; Concurrency; Decisions; Deep learning; Engines; fine-grained scheduling; GPGPU; Graphics processing units; Kernel; Learning curves; OpenCL; Parallel programming; Platforms; Processor scheduling; Programming languages; Schedules; Scheduling; Task analysis
title	PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T12%3A03%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PySchedCL:%20Leveraging%20Concurrency%20in%20Heterogeneous%20Data-Parallel%20Systems&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Ghose,%20Anirban&rft.date=2022-09-01&rft.volume=71&rft.issue=9&rft.spage=2234&rft.epage=2247&rft.pages=2234-2247&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/TC.2021.3125792&rft_dat=%3Cproquest_RIE%3E2700412667%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2700412667&rft_id=info:pmid/&rft_ieee_id=9606595&rfr_iscdi=true