URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators

Network-on-chip (NoC) offers high communication efficiency, scalability and reliability. In recent years, NoC-based deep neural network (DNN) accelerators have been proposed. Although existing NoC research solutions can solve the problem of the existence of one-to-one traffic...

Detailed description

Bibliographic details
Published in: The Journal of supercomputing 2023-09, Vol.79 (13), p.14827-14847
Main authors: Ouyang, Yiming, Wang, Jiaxin, Sun, Chenglong, Wang, Qi, Liang, Huaguo
Format: Article
Language: eng
Online access: Full text
container_end_page 14847
container_issue 13
container_start_page 14827
container_title The Journal of supercomputing
container_volume 79
creator Ouyang, Yiming
Wang, Jiaxin
Sun, Chenglong
Wang, Qi
Liang, Huaguo
description Network-on-chip (NoC) offers high communication efficiency, scalability and reliability. In recent years, NoC-based deep neural network (DNN) accelerators have been proposed. Existing NoC solutions handle one-to-one traffic in the network and transmit unicast traffic efficiently. However, due to the traffic characteristics of neural networks, there is a large amount of one-to-many traffic, and if unicast is used to transmit multicast traffic, it may rapidly exhaust the network bandwidth and greatly degrade the performance of the platform. To handle this one-to-many multicast traffic, we propose a path-based multicast mechanism that exploits the traffic characteristics of neural networks and scales well. We also design a router architecture that can efficiently replicate multicast packets and provide single-cycle per-hop transmission for multicast packets. Detailed simulation results indicate that the proposed scheme can effectively reduce the classification delay, the average packet delay and the number of packets transmitted by the network.
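The bandwidth argument in the description can be illustrated with a rough link-traversal count. The sketch below is not the URMP mechanism from the article. It assumes a 2D mesh with dimension-ordered (XY) routing, and the function names `xy_hops`, `unicast_link_traversals` and `path_multicast_link_traversals` are hypothetical. It only compares sending one copy per destination against sending a single packet along a path that visits the destinations in a given order.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (x, y) coordinate of a router in an assumed 2D mesh


def xy_hops(src: Node, dst: Node) -> int:
    """Link traversals between two routers under assumed XY routing (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def unicast_link_traversals(src: Node, dests: List[Node]) -> int:
    """Total link traversals when the source sends a separate unicast packet to each destination."""
    return sum(xy_hops(src, d) for d in dests)


def path_multicast_link_traversals(src: Node, dests: List[Node]) -> int:
    """Total link traversals when one packet visits the destinations in list order,
    with a copy delivered locally at each destination router."""
    total = 0
    current = src
    for d in dests:
        total += xy_hops(current, d)
        current = d
    return total


if __name__ == "__main__":
    # Illustrative example: one neuron output sent to three routers in the same row.
    source = (0, 0)
    destinations = [(1, 0), (2, 0), (3, 0)]
    print("unicast:", unicast_link_traversals(source, destinations))
    print("path multicast:", path_multicast_link_traversals(source, destinations))
```

In this toy case repeated unicast crosses 1 + 2 + 3 = 6 links, while the single path crosses 3. The gap grows with the number of destinations, which is the intuition behind the description's concern about bandwidth. The visiting order matters for a path-based scheme, and this sketch simply takes the order as given.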
doi_str_mv 10.1007/s11227-023-05255-7
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2830306181</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2830306181</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-93d565a530a1d6eb5150a028ede8f48862c93d18bd4d46dec66f5c434c5a5193</originalsourceid><addsrcrecordid>eNp9kEtLxDAUhYMoOI7-AVcB19GbV5txJ4MvGB_ICO5CmqRjx05Tkxbx3xsdwZ2rw4XvOxcOQscUTilAeZYoZawkwDgByaQk5Q6aUFnmUyixiyYwY0CUFGwfHaS0BgDBSz5BL89Pd4_neExNt8LR29DVzWqMpmo93ozt0FiTBtyb4RXXIeL7MCeVSd5h532PO5_RNsfwEeIbNtb61kczhJgO0V5t2uSPfnOKlleXy_kNWTxc384vFsSyEgYy404W0kgOhrrCV5JKMMCUd17VQqmC2YxQVTnhROG8LYpaWsGFzRKd8Sk62db2MbyPPg16HcbY5Y-aKQ4cCqpoptiWsjGkFH2t-9hsTPzUFPT3gHo7oM4D6p8BdZklvpVShruVj3_V_1hfeAFzRw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2830306181</pqid></control><display><type>article</type><title>URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators</title><source>SpringerLink (Online service)</source><creator>Ouyang, Yiming ; Wang, Jiaxin ; Sun, Chenglong ; Wang, Qi ; Liang, Huaguo</creator><creatorcontrib>Ouyang, Yiming ; Wang, Jiaxin ; Sun, Chenglong ; Wang, Qi ; Liang, Huaguo</creatorcontrib><description>Network-on-chip (NoC) exists with the advantages of high communication efficiency, scalability and reliability. In recent years, NoC-based deep neural network (DNN) accelerators have been proposed. Although existing NoC research solutions can solve the problem of the existence of one-to-one traffic in the network and transmit unicast traffic efficiently. However, due to the traffic characteristics of neural networks, there exists a large amount of one-to-many traffic, and if unicast is used to transmit multicast traffic, it may rapidly exhaust the network bandwidth and greatly degrade the performance of the platform. 
To solve the problem of a large amount of one-to-many multicast traffic existing in the network, we propose a path-based multicast mechanism that greatly exploits the traffic characteristics of neural networks and has excellent scalability. Also a router architecture that can efficiently replicate multicast packets and provide single-cycle per-hop transmission for multicast packets was designed. Detailed simulation results indicate that our proposed scheme can effectively reduce the classification delay, the average packet delay and the number of packets transmitted by the network.</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-023-05255-7</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Accelerators ; Artificial neural networks ; Communications traffic ; Compilers ; Computer Science ; Interpreters ; Multicasting ; Neural networks ; Packet transmission ; Performance degradation ; Processor Architectures ; Programming Languages ; System on chip</subject><ispartof>The Journal of supercomputing, 2023-09, Vol.79 (13), p.14827-14847</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-93d565a530a1d6eb5150a028ede8f48862c93d18bd4d46dec66f5c434c5a5193</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-023-05255-7$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-023-05255-7$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,41488,42557,51319</link.rule.ids></links><search><creatorcontrib>Ouyang, Yiming</creatorcontrib><creatorcontrib>Wang, Jiaxin</creatorcontrib><creatorcontrib>Sun, Chenglong</creatorcontrib><creatorcontrib>Wang, Qi</creatorcontrib><creatorcontrib>Liang, Huaguo</creatorcontrib><title>URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>Network-on-chip (NoC) exists with the advantages of high communication efficiency, scalability and reliability. In recent years, NoC-based deep neural network (DNN) accelerators have been proposed. Although existing NoC research solutions can solve the problem of the existence of one-to-one traffic in the network and transmit unicast traffic efficiently. 
However, due to the traffic characteristics of neural networks, there exists a large amount of one-to-many traffic, and if unicast is used to transmit multicast traffic, it may rapidly exhaust the network bandwidth and greatly degrade the performance of the platform. To solve the problem of a large amount of one-to-many multicast traffic existing in the network, we propose a path-based multicast mechanism that greatly exploits the traffic characteristics of neural networks and has excellent scalability. Also a router architecture that can efficiently replicate multicast packets and provide single-cycle per-hop transmission for multicast packets was designed. Detailed simulation results indicate that our proposed scheme can effectively reduce the classification delay, the average packet delay and the number of packets transmitted by the network.</description><subject>Accelerators</subject><subject>Artificial neural networks</subject><subject>Communications traffic</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Interpreters</subject><subject>Multicasting</subject><subject>Neural networks</subject><subject>Packet transmission</subject><subject>Performance degradation</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><subject>System on chip</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9kEtLxDAUhYMoOI7-AVcB19GbV5txJ4MvGB_ICO5CmqRjx05Tkxbx3xsdwZ2rw4XvOxcOQscUTilAeZYoZawkwDgByaQk5Q6aUFnmUyixiyYwY0CUFGwfHaS0BgDBSz5BL89Pd4_neExNt8LR29DVzWqMpmo93ozt0FiTBtyb4RXXIeL7MCeVSd5h532PO5_RNsfwEeIbNtb61kczhJgO0V5t2uSPfnOKlleXy_kNWTxc384vFsSyEgYy404W0kgOhrrCV5JKMMCUd17VQqmC2YxQVTnhROG8LYpaWsGFzRKd8Sk62db2MbyPPg16HcbY5Y-aKQ4cCqpoptiWsjGkFH2t-9hsTPzUFPT3gHo7oM4D6p8BdZklvpVShruVj3_V_1hfeAFzRw</recordid><startdate>20230901</startdate><enddate>20230901</enddate><creator>Ouyang, 
Yiming</creator><creator>Wang, Jiaxin</creator><creator>Sun, Chenglong</creator><creator>Wang, Qi</creator><creator>Liang, Huaguo</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20230901</creationdate><title>URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators</title><author>Ouyang, Yiming ; Wang, Jiaxin ; Sun, Chenglong ; Wang, Qi ; Liang, Huaguo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-93d565a530a1d6eb5150a028ede8f48862c93d18bd4d46dec66f5c434c5a5193</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Accelerators</topic><topic>Artificial neural networks</topic><topic>Communications traffic</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Interpreters</topic><topic>Multicasting</topic><topic>Neural networks</topic><topic>Packet transmission</topic><topic>Performance degradation</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><topic>System on chip</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ouyang, Yiming</creatorcontrib><creatorcontrib>Wang, Jiaxin</creatorcontrib><creatorcontrib>Sun, Chenglong</creatorcontrib><creatorcontrib>Wang, Qi</creatorcontrib><creatorcontrib>Liang, Huaguo</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ouyang, Yiming</au><au>Wang, Jiaxin</au><au>Sun, Chenglong</au><au>Wang, Qi</au><au>Liang, Huaguo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators</atitle><jtitle>The Journal of 
supercomputing</jtitle><stitle>J Supercomput</stitle><date>2023-09-01</date><risdate>2023</risdate><volume>79</volume><issue>13</issue><spage>14827</spage><epage>14847</epage><pages>14827-14847</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>Network-on-chip (NoC) exists with the advantages of high communication efficiency, scalability and reliability. In recent years, NoC-based deep neural network (DNN) accelerators have been proposed. Although existing NoC research solutions can solve the problem of the existence of one-to-one traffic in the network and transmit unicast traffic efficiently. However, due to the traffic characteristics of neural networks, there exists a large amount of one-to-many traffic, and if unicast is used to transmit multicast traffic, it may rapidly exhaust the network bandwidth and greatly degrade the performance of the platform. To solve the problem of a large amount of one-to-many multicast traffic existing in the network, we propose a path-based multicast mechanism that greatly exploits the traffic characteristics of neural networks and has excellent scalability. Also a router architecture that can efficiently replicate multicast packets and provide single-cycle per-hop transmission for multicast packets was designed. Detailed simulation results indicate that our proposed scheme can effectively reduce the classification delay, the average packet delay and the number of packets transmitted by the network.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-023-05255-7</doi><tpages>21</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0920-8542
ispartof The Journal of supercomputing, 2023-09, Vol.79 (13), p.14827-14847
issn 0920-8542
1573-0484
language eng
recordid cdi_proquest_journals_2830306181
source SpringerLink (Online service)
subjects Accelerators
Artificial neural networks
Communications traffic
Compilers
Computer Science
Interpreters
Multicasting
Neural networks
Packet transmission
Performance degradation
Processor Architectures
Programming Languages
System on chip
title URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T09%3A55%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=URMP:%20using%20reconfigurable%20multicast%20path%20for%20NoC-based%20deep%20neural%20network%20accelerators&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Ouyang,%20Yiming&rft.date=2023-09-01&rft.volume=79&rft.issue=13&rft.spage=14827&rft.epage=14847&rft.pages=14827-14847&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-023-05255-7&rft_dat=%3Cproquest_cross%3E2830306181%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2830306181&rft_id=info:pmid/&rfr_iscdi=true