Runtime Task Scheduling Using Imitation Learning for Heterogeneous Many-Core Systems

Domain-specific systems-on-chip, a class of heterogeneous many-core systems, is recognized as a key approach to narrow down the performance and energy-efficiency gap between custom hardware accelerators and programmable processors. Reaching the full potential of these architectures depends criticall...

Bibliographic details
Published in: IEEE transactions on computer-aided design of integrated circuits and systems 2020-11, Vol.39 (11), p.4064-4077
Main authors: Krishnakumar, Anish, Arda, Samet E., Goksoy, A. Alper, Mandal, Sumit K., Ogras, Umit Y., Sartor, Anderson L., Marculescu, Radu
Format: Article
Language: English
container_end_page 4077
container_issue 11
container_start_page 4064
container_title IEEE transactions on computer-aided design of integrated circuits and systems
container_volume 39
creator Krishnakumar, Anish
Arda, Samet E.
Goksoy, A. Alper
Mandal, Sumit K.
Ogras, Umit Y.
Sartor, Anderson L.
Marculescu, Radu
description Domain-specific systems-on-chip, a class of heterogeneous many-core systems, is recognized as a key approach to narrow down the performance and energy-efficiency gap between custom hardware accelerators and programmable processors. Reaching the full potential of these architectures depends critically on optimally scheduling the applications to available resources at runtime. Existing optimization-based techniques cannot achieve this objective at runtime due to the combinatorial nature of the task scheduling problem. As the main theoretical contribution, this article poses scheduling as a classification problem and proposes a hierarchical imitation learning (IL)-based scheduler that learns from an Oracle to maximize the performance of multiple domain-specific applications. Extensive evaluations with six streaming applications from wireless communications and radar domains show that the proposed IL-based scheduler approximates an offline Oracle policy with more than 99% accuracy for performance- and energy-based optimization objectives. Furthermore, it achieves almost identical performance to the Oracle with a low runtime overhead and successfully adapts to new applications, many-core system configurations, and runtime variations in application characteristics.
doi_str_mv 10.1109/TCAD.2020.3012861
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_9211494</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9211494</ieee_id><sourcerecordid>2458744956</sourcerecordid><originalsourceid>FETCH-LOGICAL-c293t-ae04b9cb4346b11dd168ff8929d02ef257a24330bd2e3cee989ee9cd4d5bc3b53</originalsourceid><addsrcrecordid>eNo9kE1Lw0AQhhdRsFZ_gHgJeE6d2d187LHUjwoVwabnZZNMamqTrbvJof_ehBYv78DwvDPwMHaPMEME9ZQt5s8zDhxmApCnMV6wCSqRhBIjvGQT4EkaAiRwzW683wGgjLiasOyrb7u6oSAz_idYF99U9vu63QYbP-Z7U3emq20brMi4dlxV1gVL6sjZLbVkex98mPYYLqyjYH30HTX-ll1VZu_p7jynbPP6ki2W4erz7X0xX4UFV6ILDYHMVZFLIeMcsSwxTqsqVVyVwKniUWK4FALykpMoiFSqhihKWUZ5IfJITNnj6e7B2d-efKd3tnft8FJzGaWJlCqKBwpPVOGs944qfXB1Y9xRI-hRnh7l6VGePssbOg-nTk1E_7ziiFJJ8QdbeGt-</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2458744956</pqid></control><display><type>article</type><title>Runtime Task Scheduling Using Imitation Learning for Heterogeneous Many-Core Systems</title><source>IEEE Electronic Library (IEL)</source><creator>Krishnakumar, Anish ; Arda, Samet E. ; Goksoy, A. Alper ; Mandal, Sumit K. ; Ogras, Umit Y. ; Sartor, Anderson L. ; Marculescu, Radu</creator><creatorcontrib>Krishnakumar, Anish ; Arda, Samet E. ; Goksoy, A. Alper ; Mandal, Sumit K. ; Ogras, Umit Y. ; Sartor, Anderson L. ; Marculescu, Radu</creatorcontrib><description>Domain-specific systems-on-chip, a class of heterogeneous many-core systems, is recognized as a key approach to narrow down the performance and energy-efficiency gap between custom hardware accelerators and programmable processors. Reaching the full potential of these architectures depends critically on optimally scheduling the applications to available resources at runtime. Existing optimization-based techniques cannot achieve this objective at runtime due to the combinatorial nature of the task scheduling problem. 
As the main theoretical contribution, this article poses scheduling as a classification problem and proposes a hierarchical imitation learning (IL)-based scheduler that learns from an Oracle to maximize the performance of multiple domain-specific applications. Extensive evaluations with six streaming applications from wireless communications and radar domains show that the proposed IL-based scheduler approximates an offline Oracle policy with more than 99% accuracy for performance- and energy-based optimization objectives. Furthermore, it achieves almost identical performance to the Oracle with a low runtime overhead and successfully adapts to new applications, many-core system configurations, and runtime variations in application characteristics.</description><identifier>ISSN: 0278-0070</identifier><identifier>EISSN: 1937-4151</identifier><identifier>DOI: 10.1109/TCAD.2020.3012861</identifier><identifier>CODEN: ITCSDI</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Accelerators ; Combinatorial analysis ; Computer architecture ; Digital media ; Domain-specific SoC (DSSoC) ; heterogeneous computing ; imitation learning (IL) ; Learning ; many-core architectures ; Optimal scheduling ; Optimization ; Processor scheduling ; Run time (computers) ; Runtime ; Scheduling ; System on chip ; Task analysis ; Task scheduling ; Wireless communications</subject><ispartof>IEEE transactions on computer-aided design of integrated circuits and systems, 2020-11, Vol.39 (11), p.4064-4077</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c293t-ae04b9cb4346b11dd168ff8929d02ef257a24330bd2e3cee989ee9cd4d5bc3b53</citedby><cites>FETCH-LOGICAL-c293t-ae04b9cb4346b11dd168ff8929d02ef257a24330bd2e3cee989ee9cd4d5bc3b53</cites><orcidid>0000-0002-5045-5535 ; 0000-0003-1759-2762 ; 0000-0003-2419-1860 ; 0000-0002-9294-1603 ; 0000-0001-8679-9842 ; 0000-0003-1826-7646 ; 0000-0003-3269-7095</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9211494$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9211494$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Krishnakumar, Anish</creatorcontrib><creatorcontrib>Arda, Samet E.</creatorcontrib><creatorcontrib>Goksoy, A. Alper</creatorcontrib><creatorcontrib>Mandal, Sumit K.</creatorcontrib><creatorcontrib>Ogras, Umit Y.</creatorcontrib><creatorcontrib>Sartor, Anderson L.</creatorcontrib><creatorcontrib>Marculescu, Radu</creatorcontrib><title>Runtime Task Scheduling Using Imitation Learning for Heterogeneous Many-Core Systems</title><title>IEEE transactions on computer-aided design of integrated circuits and systems</title><addtitle>TCAD</addtitle><description>Domain-specific systems-on-chip, a class of heterogeneous many-core systems, is recognized as a key approach to narrow down the performance and energy-efficiency gap between custom hardware accelerators and programmable processors. Reaching the full potential of these architectures depends critically on optimally scheduling the applications to available resources at runtime. 
Existing optimization-based techniques cannot achieve this objective at runtime due to the combinatorial nature of the task scheduling problem. As the main theoretical contribution, this article poses scheduling as a classification problem and proposes a hierarchical imitation learning (IL)-based scheduler that learns from an Oracle to maximize the performance of multiple domain-specific applications. Extensive evaluations with six streaming applications from wireless communications and radar domains show that the proposed IL-based scheduler approximates an offline Oracle policy with more than 99% accuracy for performance- and energy-based optimization objectives. Furthermore, it achieves almost identical performance to the Oracle with a low runtime overhead and successfully adapts to new applications, many-core system configurations, and runtime variations in application characteristics.</description><subject>Accelerators</subject><subject>Combinatorial analysis</subject><subject>Computer architecture</subject><subject>Digital media</subject><subject>Domain-specific SoC (DSSoC)</subject><subject>heterogeneous computing</subject><subject>imitation learning (IL)</subject><subject>Learning</subject><subject>many-core architectures</subject><subject>Optimal scheduling</subject><subject>Optimization</subject><subject>Processor scheduling</subject><subject>Run time (computers)</subject><subject>Runtime</subject><subject>Scheduling</subject><subject>System on chip</subject><subject>Task analysis</subject><subject>Task scheduling</subject><subject>Wireless 
communications</subject><issn>0278-0070</issn><issn>1937-4151</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1Lw0AQhhdRsFZ_gHgJeE6d2d187LHUjwoVwabnZZNMamqTrbvJof_ehBYv78DwvDPwMHaPMEME9ZQt5s8zDhxmApCnMV6wCSqRhBIjvGQT4EkaAiRwzW683wGgjLiasOyrb7u6oSAz_idYF99U9vu63QYbP-Z7U3emq20brMi4dlxV1gVL6sjZLbVkex98mPYYLqyjYH30HTX-ll1VZu_p7jynbPP6ki2W4erz7X0xX4UFV6ILDYHMVZFLIeMcsSwxTqsqVVyVwKniUWK4FALykpMoiFSqhihKWUZ5IfJITNnj6e7B2d-efKd3tnft8FJzGaWJlCqKBwpPVOGs944qfXB1Y9xRI-hRnh7l6VGePssbOg-nTk1E_7ziiFJJ8QdbeGt-</recordid><startdate>20201101</startdate><enddate>20201101</enddate><creator>Krishnakumar, Anish</creator><creator>Arda, Samet E.</creator><creator>Goksoy, A. Alper</creator><creator>Mandal, Sumit K.</creator><creator>Ogras, Umit Y.</creator><creator>Sartor, Anderson L.</creator><creator>Marculescu, Radu</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-5045-5535</orcidid><orcidid>https://orcid.org/0000-0003-1759-2762</orcidid><orcidid>https://orcid.org/0000-0003-2419-1860</orcidid><orcidid>https://orcid.org/0000-0002-9294-1603</orcidid><orcidid>https://orcid.org/0000-0001-8679-9842</orcidid><orcidid>https://orcid.org/0000-0003-1826-7646</orcidid><orcidid>https://orcid.org/0000-0003-3269-7095</orcidid></search><sort><creationdate>20201101</creationdate><title>Runtime Task Scheduling Using Imitation Learning for Heterogeneous Many-Core Systems</title><author>Krishnakumar, Anish ; Arda, Samet E. ; Goksoy, A. Alper ; Mandal, Sumit K. ; Ogras, Umit Y. ; Sartor, Anderson L. 
; Marculescu, Radu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c293t-ae04b9cb4346b11dd168ff8929d02ef257a24330bd2e3cee989ee9cd4d5bc3b53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Accelerators</topic><topic>Combinatorial analysis</topic><topic>Computer architecture</topic><topic>Digital media</topic><topic>Domain-specific SoC (DSSoC)</topic><topic>heterogeneous computing</topic><topic>imitation learning (IL)</topic><topic>Learning</topic><topic>many-core architectures</topic><topic>Optimal scheduling</topic><topic>Optimization</topic><topic>Processor scheduling</topic><topic>Run time (computers)</topic><topic>Runtime</topic><topic>Scheduling</topic><topic>System on chip</topic><topic>Task analysis</topic><topic>Task scheduling</topic><topic>Wireless communications</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Krishnakumar, Anish</creatorcontrib><creatorcontrib>Arda, Samet E.</creatorcontrib><creatorcontrib>Goksoy, A. 
Alper</creatorcontrib><creatorcontrib>Mandal, Sumit K.</creatorcontrib><creatorcontrib>Ogras, Umit Y.</creatorcontrib><creatorcontrib>Sartor, Anderson L.</creatorcontrib><creatorcontrib>Marculescu, Radu</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on computer-aided design of integrated circuits and systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Krishnakumar, Anish</au><au>Arda, Samet E.</au><au>Goksoy, A. 
Alper</au><au>Mandal, Sumit K.</au><au>Ogras, Umit Y.</au><au>Sartor, Anderson L.</au><au>Marculescu, Radu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Runtime Task Scheduling Using Imitation Learning for Heterogeneous Many-Core Systems</atitle><jtitle>IEEE transactions on computer-aided design of integrated circuits and systems</jtitle><stitle>TCAD</stitle><date>2020-11-01</date><risdate>2020</risdate><volume>39</volume><issue>11</issue><spage>4064</spage><epage>4077</epage><pages>4064-4077</pages><issn>0278-0070</issn><eissn>1937-4151</eissn><coden>ITCSDI</coden><abstract>Domain-specific systems-on-chip, a class of heterogeneous many-core systems, is recognized as a key approach to narrow down the performance and energy-efficiency gap between custom hardware accelerators and programmable processors. Reaching the full potential of these architectures depends critically on optimally scheduling the applications to available resources at runtime. Existing optimization-based techniques cannot achieve this objective at runtime due to the combinatorial nature of the task scheduling problem. As the main theoretical contribution, this article poses scheduling as a classification problem and proposes a hierarchical imitation learning (IL)-based scheduler that learns from an Oracle to maximize the performance of multiple domain-specific applications. Extensive evaluations with six streaming applications from wireless communications and radar domains show that the proposed IL-based scheduler approximates an offline Oracle policy with more than 99% accuracy for performance- and energy-based optimization objectives. 
Furthermore, it achieves almost identical performance to the Oracle with a low runtime overhead and successfully adapts to new applications, many-core system configurations, and runtime variations in application characteristics.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TCAD.2020.3012861</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-5045-5535</orcidid><orcidid>https://orcid.org/0000-0003-1759-2762</orcidid><orcidid>https://orcid.org/0000-0003-2419-1860</orcidid><orcidid>https://orcid.org/0000-0002-9294-1603</orcidid><orcidid>https://orcid.org/0000-0001-8679-9842</orcidid><orcidid>https://orcid.org/0000-0003-1826-7646</orcidid><orcidid>https://orcid.org/0000-0003-3269-7095</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0278-0070
ispartof IEEE transactions on computer-aided design of integrated circuits and systems, 2020-11, Vol.39 (11), p.4064-4077
issn 0278-0070
1937-4151
language eng
recordid cdi_ieee_primary_9211494
source IEEE Electronic Library (IEL)
subjects Accelerators
Combinatorial analysis
Computer architecture
Digital media
Domain-specific SoC (DSSoC)
heterogeneous computing
imitation learning (IL)
Learning
many-core architectures
Optimal scheduling
Optimization
Processor scheduling
Run time (computers)
Runtime
Scheduling
System on chip
Task analysis
Task scheduling
Wireless communications
title Runtime Task Scheduling Using Imitation Learning for Heterogeneous Many-Core Systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T22%3A11%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Runtime%20Task%20Scheduling%20Using%20Imitation%20Learning%20for%20Heterogeneous%20Many-Core%20Systems&rft.jtitle=IEEE%20transactions%20on%20computer-aided%20design%20of%20integrated%20circuits%20and%20systems&rft.au=Krishnakumar,%20Anish&rft.date=2020-11-01&rft.volume=39&rft.issue=11&rft.spage=4064&rft.epage=4077&rft.pages=4064-4077&rft.issn=0278-0070&rft.eissn=1937-4151&rft.coden=ITCSDI&rft_id=info:doi/10.1109/TCAD.2020.3012861&rft_dat=%3Cproquest_RIE%3E2458744956%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2458744956&rft_id=info:pmid/&rft_ieee_id=9211494&rfr_iscdi=true