PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems
In the past decade, high-performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data-parallel programming languages such as CUDA and OpenCL. Developing high-performance parallel programming solutions using such languages involves a steep learning cur...
Published in: | IEEE transactions on computers 2022-09, Vol.71 (9), p.2234-2247 |
---|---|
Main authors: | Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit |
Format: | Article |
Language: | eng |
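Subjects: | Central Processing Unit ; coarse-grained scheduling ; Concurrency ; Decisions ; Deep learning ; Engines ; fine-grained scheduling ; GPGPU ; Graphics processing units ; Kernel ; Learning curves ; OpenCL ; Parallel programming ; Platforms ; Processor scheduling ; Programming languages ; Schedules ; Scheduling ; Task analysis |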
Online access: | Order full text |
container_end_page | 2247 |
---|---|
container_issue | 9 |
container_start_page | 2234 |
container_title | IEEE transactions on computers |
container_volume | 71 |
creator | Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit |
description | In the past decade, the high-performance compute capabilities of heterogeneous GPGPU platforms have led to the popularity of data-parallel programming languages such as CUDA and OpenCL. Developing high-performance parallel programs in such languages involves a steep learning curve because of the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several high-performance computing frameworks that provide high-level abstractions to ease the development of data-parallel applications on heterogeneous platforms. However, the scheduling decisions made by such frameworks exploit only coarse-grained concurrency in data-parallel applications. In this paper, we propose PySchedCL, a framework that explores fine-grained, concurrency-aware scheduling decisions to harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes through extensive experimental evaluations on a diverse set of popular deep learning benchmarks. |
doi_str_mv | 10.1109/TC.2021.3125792 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TC_2021_3125792</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9606595</ieee_id><sourcerecordid>2700412667</sourcerecordid><originalsourceid>FETCH-LOGICAL-c284t-8fa1743a5b4a583537e703d488e69274a02b8b3f87188afc6fa195cee60518ce3</originalsourceid><addsrcrecordid>eNo9kEtLw0AUhQdRsFbXLtwEXKe9M5N5uZP4qBCw0LoeptObmpImdSYV8u9NaXF1Nt85Bz5C7ilMKAUzXeYTBoxOOGVCGXZBRlQIlRoj5CUZAVCdGp7BNbmJcQsAkoEZkdm8X_hvXOfFU1LgLwa3qZpNkreNP4SAje-Tqklm2GFoN9hge4jJi-tcOnfB1TXWyaKPHe7iLbkqXR3x7pxj8vX2usxnafH5_pE_F6lnOutSXTqqMu7EKnNCc8EVKuDrTGuUhqnMAVvpFS-1olq70suBN8IjShBUe-Rj8nja3Yf254Cxs9v2EJrh0jIFkFEmpRqo6YnyoY0xYGn3odq50FsK9qjLLnN71GXPuobGw6lRIeI_bSRIYQT_AyxcZMw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2700412667</pqid></control><display><type>article</type><title>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</title><source>IEEE Electronic Library (IEL)</source><creator>Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit</creator><creatorcontrib>Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit</creatorcontrib><description>In the past decade, high performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data parallel programming languages such as CUDA and OpenCL. Developing high performance parallel programming solutions using such languages involve a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks which provide high-level abstractions for easing the development of data-parallel applications on heterogeneous platforms. 
However, the scheduling decisions undertaken by such frameworks only exploit coarse-grained concurrency in data parallel applications. In this paper, we propose PySchedCL , a framework which explores fine-grained concurrency aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes by conducting extensive experimental evaluations for a diverse set of popular Deep Learning benchmarks.</description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/TC.2021.3125792</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Central Processing Unit ; coarse-grained scheduling ; Concurrency ; Decisions ; Deep learning ; Engines ; fine-grained scheduling ; GPGPU ; Graphics processing units ; Kernel ; Learning curves ; OpenCL ; Parallel programming ; Platforms ; Processor scheduling ; Programming languages ; Schedules ; Scheduling ; Task analysis</subject><ispartof>IEEE transactions on computers, 2022-09, Vol.71 (9), p.2234-2247</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c284t-8fa1743a5b4a583537e703d488e69274a02b8b3f87188afc6fa195cee60518ce3</cites><orcidid>0000-0003-1108-4572 ; 0000-0002-2756-4290 ; 0000-0001-9329-6389</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9606595$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9606595$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Ghose, Anirban</creatorcontrib><creatorcontrib>Singh, Siddharth</creatorcontrib><creatorcontrib>Kulaharia, Vivek</creatorcontrib><creatorcontrib>Dokara, Lokesh</creatorcontrib><creatorcontrib>Maity, Srijeeta</creatorcontrib><creatorcontrib>Dey, Soumyajit</creatorcontrib><title>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description>In the past decade, high performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data parallel programming languages such as CUDA and OpenCL. Developing high performance parallel programming solutions using such languages involve a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks which provide high-level abstractions for easing the development of data-parallel applications on heterogeneous platforms. However, the scheduling decisions undertaken by such frameworks only exploit coarse-grained concurrency in data parallel applications. 
In this paper, we propose PySchedCL , a framework which explores fine-grained concurrency aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes by conducting extensive experimental evaluations for a diverse set of popular Deep Learning benchmarks.</description><subject>Central Processing Unit</subject><subject>coarse-grained scheduling</subject><subject>Concurrency</subject><subject>Decisions</subject><subject>Deep learning</subject><subject>Engines</subject><subject>fine-grained scheduling</subject><subject>GPGPU</subject><subject>Graphics processing units</subject><subject>Kernel</subject><subject>Learning curves</subject><subject>OpenCL</subject><subject>Parallel programming</subject><subject>Platforms</subject><subject>Processor scheduling</subject><subject>Programming languages</subject><subject>Schedules</subject><subject>Scheduling</subject><subject>Task analysis</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kEtLw0AUhQdRsFbXLtwEXKe9M5N5uZP4qBCw0LoeptObmpImdSYV8u9NaXF1Nt85Bz5C7ilMKAUzXeYTBoxOOGVCGXZBRlQIlRoj5CUZAVCdGp7BNbmJcQsAkoEZkdm8X_hvXOfFU1LgLwa3qZpNkreNP4SAje-Tqklm2GFoN9hge4jJi-tcOnfB1TXWyaKPHe7iLbkqXR3x7pxj8vX2usxnafH5_pE_F6lnOutSXTqqMu7EKnNCc8EVKuDrTGuUhqnMAVvpFS-1olq70suBN8IjShBUe-Rj8nja3Yf254Cxs9v2EJrh0jIFkFEmpRqo6YnyoY0xYGn3odq50FsK9qjLLnN71GXPuobGw6lRIeI_bSRIYQT_AyxcZMw</recordid><startdate>20220901</startdate><enddate>20220901</enddate><creator>Ghose, Anirban</creator><creator>Singh, Siddharth</creator><creator>Kulaharia, Vivek</creator><creator>Dokara, Lokesh</creator><creator>Maity, Srijeeta</creator><creator>Dey, Soumyajit</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-1108-4572</orcidid><orcidid>https://orcid.org/0000-0002-2756-4290</orcidid><orcidid>https://orcid.org/0000-0001-9329-6389</orcidid></search><sort><creationdate>20220901</creationdate><title>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</title><author>Ghose, Anirban ; Singh, Siddharth ; Kulaharia, Vivek ; Dokara, Lokesh ; Maity, Srijeeta ; Dey, Soumyajit</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c284t-8fa1743a5b4a583537e703d488e69274a02b8b3f87188afc6fa195cee60518ce3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Central Processing Unit</topic><topic>coarse-grained scheduling</topic><topic>Concurrency</topic><topic>Decisions</topic><topic>Deep learning</topic><topic>Engines</topic><topic>fine-grained scheduling</topic><topic>GPGPU</topic><topic>Graphics processing units</topic><topic>Kernel</topic><topic>Learning curves</topic><topic>OpenCL</topic><topic>Parallel programming</topic><topic>Platforms</topic><topic>Processor scheduling</topic><topic>Programming languages</topic><topic>Schedules</topic><topic>Scheduling</topic><topic>Task analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ghose, Anirban</creatorcontrib><creatorcontrib>Singh, Siddharth</creatorcontrib><creatorcontrib>Kulaharia, Vivek</creatorcontrib><creatorcontrib>Dokara, Lokesh</creatorcontrib><creatorcontrib>Maity, Srijeeta</creatorcontrib><creatorcontrib>Dey, Soumyajit</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 
1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ghose, Anirban</au><au>Singh, Siddharth</au><au>Kulaharia, Vivek</au><au>Dokara, Lokesh</au><au>Maity, Srijeeta</au><au>Dey, Soumyajit</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems</atitle><jtitle>IEEE transactions on computers</jtitle><stitle>TC</stitle><date>2022-09-01</date><risdate>2022</risdate><volume>71</volume><issue>9</issue><spage>2234</spage><epage>2247</epage><pages>2234-2247</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract>In the past decade, high performance compute capabilities exhibited by heterogeneous GPGPU platforms have led to the popularity of data parallel programming languages such as CUDA and OpenCL. Developing high performance parallel programming solutions using such languages involve a steep learning curve due to the complexity of the underlying heterogeneous compute devices and their impact on performance. This has led to the emergence of several High Performance Computing frameworks which provide high-level abstractions for easing the development of data-parallel applications on heterogeneous platforms. 
However, the scheduling decisions undertaken by such frameworks only exploit coarse-grained concurrency in data parallel applications. In this paper, we propose PySchedCL , a framework which explores fine-grained concurrency aware scheduling decisions that harness the power of heterogeneous CPU/GPU architectures efficiently. We showcase the efficacy of such scheduling mechanisms over existing coarse-grained dynamic scheduling schemes by conducting extensive experimental evaluations for a diverse set of popular Deep Learning benchmarks.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TC.2021.3125792</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0003-1108-4572</orcidid><orcidid>https://orcid.org/0000-0002-2756-4290</orcidid><orcidid>https://orcid.org/0000-0001-9329-6389</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0018-9340 |
ispartof | IEEE transactions on computers, 2022-09, Vol.71 (9), p.2234-2247 |
issn | 0018-9340 ; 1557-9956 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TC_2021_3125792 |
source | IEEE Electronic Library (IEL) |
subjects | Central Processing Unit ; coarse-grained scheduling ; Concurrency ; Decisions ; Deep learning ; Engines ; fine-grained scheduling ; GPGPU ; Graphics processing units ; Kernel ; Learning curves ; OpenCL ; Parallel programming ; Platforms ; Processor scheduling ; Programming languages ; Schedules ; Scheduling ; Task analysis |
title | PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T12%3A03%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PySchedCL:%20Leveraging%20Concurrency%20in%20Heterogeneous%20Data-Parallel%20Systems&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Ghose,%20Anirban&rft.date=2022-09-01&rft.volume=71&rft.issue=9&rft.spage=2234&rft.epage=2247&rft.pages=2234-2247&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/TC.2021.3125792&rft_dat=%3Cproquest_RIE%3E2700412667%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2700412667&rft_id=info:pmid/&rft_ieee_id=9606595&rfr_iscdi=true |