PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems

In the past decade, high-performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data-parallel programming languages such as CUDA and OpenCL. Developing high-performance parallel programming solutions using such languages involves a steep learning cur...

Bibliographic details
Published in: IEEE transactions on computers 2022-09, Vol.71 (9), p.2234-2247
Main authors: Ghose, Anirban, Singh, Siddharth, Kulaharia, Vivek, Dokara, Lokesh, Maity, Srijeeta, Dey, Soumyajit
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 2247
container_issue 9
container_start_page 2234
container_title IEEE transactions on computers
container_volume 71
creator Ghose, Anirban
Singh, Siddharth
Kulaharia, Vivek
Dokara, Lokesh
Maity, Srijeeta
Dey, Soumyajit
description In the past decade, high-performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data-parallel programming languages such as CUDA and OpenCL. Developing high-performance parallel programming solutions using such languages involves a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks which provide high-level abstractions for easing the development of data-parallel applications on heterogeneous platforms. However, the scheduling decisions undertaken by such frameworks only exploit coarse-grained concurrency in data-parallel applications. In this paper, we propose PySchedCL, a framework which explores fine-grained concurrency-aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes by conducting extensive experimental evaluations for a diverse set of popular Deep Learning benchmarks.
doi_str_mv 10.1109/TC.2021.3125792
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TC_2021_3125792</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9606595</ieee_id><sourcerecordid>2700412667</sourcerecordid><originalsourceid>FETCH-LOGICAL-c284t-8fa1743a5b4a583537e703d488e69274a02b8b3f87188afc6fa195cee60518ce3</originalsourceid><addsrcrecordid>eNo9kEtLw0AUhQdRsFbXLtwEXKe9M5N5uZP4qBCw0LoeptObmpImdSYV8u9NaXF1Nt85Bz5C7ilMKAUzXeYTBoxOOGVCGXZBRlQIlRoj5CUZAVCdGp7BNbmJcQsAkoEZkdm8X_hvXOfFU1LgLwa3qZpNkreNP4SAje-Tqklm2GFoN9hge4jJi-tcOnfB1TXWyaKPHe7iLbkqXR3x7pxj8vX2usxnafH5_pE_F6lnOutSXTqqMu7EKnNCc8EVKuDrTGuUhqnMAVvpFS-1olq70suBN8IjShBUe-Rj8nja3Yf254Cxs9v2EJrh0jIFkFEmpRqo6YnyoY0xYGn3odq50FsK9qjLLnN71GXPuobGw6lRIeI_bSRIYQT_AyxcZMw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2700412667</pqid></control><display><type>article</type><title>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</title><source>IEEE Electronic Library (IEL)</source><creator>Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit</creator><creatorcontrib>Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit</creatorcontrib><description>In the past decade, high performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data parallel programming languages such as CUDA and OpenCL. Developing high performance parallel programming solutions using such languages involve a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks which provide high-level abstractions for easing the development of data-parallel applications on heterogeneous platforms. 
However, the scheduling decisions undertaken by such frameworks only exploit coarse-grained concurrency in data parallel applications. In this paper, we propose PySchedCL , a framework which explores fine-grained concurrency aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes by conducting extensive experimental evaluations for a diverse set of popular Deep Learning benchmarks.</description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/TC.2021.3125792</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Central Processing Unit ; coarse-grained scheduling ; Concurrency ; Decisions ; Deep learning ; Engines ; fine-grained scheduling ; GPGPU ; Graphics processing units ; Kernel ; Learning curves ; OpenCL ; Parallel programming ; Platforms ; Processor scheduling ; Programming languages ; Schedules ; Scheduling ; Task analysis</subject><ispartof>IEEE transactions on computers, 2022-09, Vol.71 (9), p.2234-2247</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c284t-8fa1743a5b4a583537e703d488e69274a02b8b3f87188afc6fa195cee60518ce3</cites><orcidid>0000-0003-1108-4572 ; 0000-0002-2756-4290 ; 0000-0001-9329-6389</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9606595$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9606595$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Ghose, Anirban</creatorcontrib><creatorcontrib>Singh, Siddharth</creatorcontrib><creatorcontrib>Kulaharia, Vivek</creatorcontrib><creatorcontrib>Dokara, Lokesh</creatorcontrib><creatorcontrib>Maity, Srijeeta</creatorcontrib><creatorcontrib>Dey, Soumyajit</creatorcontrib><title>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description>In the past decade, high performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data parallel programming languages such as CUDA and OpenCL. Developing high performance parallel programming solutions using such languages involve a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks which provide high-level abstractions for easing the development of data-parallel applications on heterogeneous platforms. However, the scheduling decisions undertaken by such frameworks only exploit coarse-grained concurrency in data parallel applications. 
In this paper, we propose PySchedCL , a framework which explores fine-grained concurrency aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes by conducting extensive experimental evaluations for a diverse set of popular Deep Learning benchmarks.</description><subject>Central Processing Unit</subject><subject>coarse-grained scheduling</subject><subject>Concurrency</subject><subject>Decisions</subject><subject>Deep learning</subject><subject>Engines</subject><subject>fine-grained scheduling</subject><subject>GPGPU</subject><subject>Graphics processing units</subject><subject>Kernel</subject><subject>Learning curves</subject><subject>OpenCL</subject><subject>Parallel programming</subject><subject>Platforms</subject><subject>Processor scheduling</subject><subject>Programming languages</subject><subject>Schedules</subject><subject>Scheduling</subject><subject>Task analysis</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kEtLw0AUhQdRsFbXLtwEXKe9M5N5uZP4qBCw0LoeptObmpImdSYV8u9NaXF1Nt85Bz5C7ilMKAUzXeYTBoxOOGVCGXZBRlQIlRoj5CUZAVCdGp7BNbmJcQsAkoEZkdm8X_hvXOfFU1LgLwa3qZpNkreNP4SAje-Tqklm2GFoN9hge4jJi-tcOnfB1TXWyaKPHe7iLbkqXR3x7pxj8vX2usxnafH5_pE_F6lnOutSXTqqMu7EKnNCc8EVKuDrTGuUhqnMAVvpFS-1olq70suBN8IjShBUe-Rj8nja3Yf254Cxs9v2EJrh0jIFkFEmpRqo6YnyoY0xYGn3odq50FsK9qjLLnN71GXPuobGw6lRIeI_bSRIYQT_AyxcZMw</recordid><startdate>20220901</startdate><enddate>20220901</enddate><creator>Ghose, Anirban</creator><creator>Singh, Siddharth</creator><creator>Kulaharia, Vivek</creator><creator>Dokara, Lokesh</creator><creator>Maity, Srijeeta</creator><creator>Dey, Soumyajit</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-1108-4572</orcidid><orcidid>https://orcid.org/0000-0002-2756-4290</orcidid><orcidid>https://orcid.org/0000-0001-9329-6389</orcidid></search><sort><creationdate>20220901</creationdate><title>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</title><author>Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c284t-8fa1743a5b4a583537e703d488e69274a02b8b3f87188afc6fa195cee60518ce3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Central Processing Unit</topic><topic>coarse-grained scheduling</topic><topic>Concurrency</topic><topic>Decisions</topic><topic>Deep learning</topic><topic>Engines</topic><topic>fine-grained scheduling</topic><topic>GPGPU</topic><topic>Graphics processing units</topic><topic>Kernel</topic><topic>Learning curves</topic><topic>OpenCL</topic><topic>Parallel programming</topic><topic>Platforms</topic><topic>Processor scheduling</topic><topic>Programming languages</topic><topic>Schedules</topic><topic>Scheduling</topic><topic>Task analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ghose, Anirban</creatorcontrib><creatorcontrib>Singh, Siddharth</creatorcontrib><creatorcontrib>Kulaharia, Vivek</creatorcontrib><creatorcontrib>Dokara, Lokesh</creatorcontrib><creatorcontrib>Maity, Srijeeta</creatorcontrib><creatorcontrib>Dey, Soumyajit</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 
1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ghose, Anirban</au><au>Singh, Siddharth</au><au>Kulaharia, Vivek</au><au>Dokara, Lokesh</au><au>Maity, Srijeeta</au><au>Dey, Soumyajit</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</atitle><jtitle>IEEE transactions on computers</jtitle><stitle>TC</stitle><date>2022-09-01</date><risdate>2022</risdate><volume>71</volume><issue>9</issue><spage>2234</spage><epage>2247</epage><pages>2234-2247</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract>In the past decade, high performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data parallel programming languages such as CUDA and OpenCL. Developing high performance parallel programming solutions using such languages involve a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks which provide high-level abstractions for easing the development of data-parallel applications on heterogeneous platforms. 
However, the scheduling decisions undertaken by such frameworks only exploit coarse-grained concurrency in data parallel applications. In this paper, we propose PySchedCL , a framework which explores fine-grained concurrency aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes by conducting extensive experimental evaluations for a diverse set of popular Deep Learning benchmarks.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TC.2021.3125792</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0003-1108-4572</orcidid><orcidid>https://orcid.org/0000-0002-2756-4290</orcidid><orcidid>https://orcid.org/0000-0001-9329-6389</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0018-9340
ispartof IEEE transactions on computers, 2022-09, Vol.71 (9), p.2234-2247
issn 0018-9340
1557-9956
language eng
recordid cdi_crossref_primary_10_1109_TC_2021_3125792
source IEEE Electronic Library (IEL)
subjects Central Processing Unit
coarse-grained scheduling
Concurrency
Decisions
Deep learning
Engines
fine-grained scheduling
GPGPU
Graphics processing units
Kernel
Learning curves
OpenCL
Parallel programming
Platforms
Processor scheduling
Programming languages
Schedules
Scheduling
Task analysis
title PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T12%3A03%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PySchedCL:%20Leveraging%20Concurrency%20in%20Heterogeneous%20Data-Parallel%20Systems&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Ghose,%20Anirban&rft.date=2022-09-01&rft.volume=71&rft.issue=9&rft.spage=2234&rft.epage=2247&rft.pages=2234-2247&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/TC.2021.3125792&rft_dat=%3Cproquest_RIE%3E2700412667%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2700412667&rft_id=info:pmid/&rft_ieee_id=9606595&rfr_iscdi=true