URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators
Network-on-chip (NoC) offers high communication efficiency, scalability and reliability. In recent years, NoC-based deep neural network (DNN) accelerators have been proposed. Although existing NoC research solutions can solve the problem of the existence of one-to-one traffic...
Published in: | The Journal of supercomputing 2023-09, Vol.79 (13), p.14827-14847 |
---|---|
Main authors: | Ouyang, Yiming ; Wang, Jiaxin ; Sun, Chenglong ; Wang, Qi ; Liang, Huaguo |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 14847 |
---|---|
container_issue | 13 |
container_start_page | 14827 |
container_title | The Journal of supercomputing |
container_volume | 79 |
creator | Ouyang, Yiming ; Wang, Jiaxin ; Sun, Chenglong ; Wang, Qi ; Liang, Huaguo |
description | Network-on-chip (NoC) offers high communication efficiency, scalability and reliability. In recent years, NoC-based deep neural network (DNN) accelerators have been proposed. Existing NoC research solutions handle one-to-one traffic well and transmit unicast traffic efficiently. However, the traffic characteristics of neural networks produce a large amount of one-to-many traffic, and if unicast is used to transmit multicast traffic, it may rapidly exhaust the network bandwidth and greatly degrade the performance of the platform. To handle this large amount of one-to-many multicast traffic, we propose a path-based multicast mechanism that exploits the traffic characteristics of neural networks and has excellent scalability. We also design a router architecture that efficiently replicates multicast packets and provides single-cycle per-hop transmission for them. Detailed simulation results indicate that the proposed scheme effectively reduces the classification delay, the average packet delay and the number of packets transmitted by the network. |
doi_str_mv | 10.1007/s11227-023-05255-7 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2830306181</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2830306181</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-93d565a530a1d6eb5150a028ede8f48862c93d18bd4d46dec66f5c434c5a5193</originalsourceid><addsrcrecordid>eNp9kEtLxDAUhYMoOI7-AVcB19GbV5txJ4MvGB_ICO5CmqRjx05Tkxbx3xsdwZ2rw4XvOxcOQscUTilAeZYoZawkwDgByaQk5Q6aUFnmUyixiyYwY0CUFGwfHaS0BgDBSz5BL89Pd4_neExNt8LR29DVzWqMpmo93ozt0FiTBtyb4RXXIeL7MCeVSd5h532PO5_RNsfwEeIbNtb61kczhJgO0V5t2uSPfnOKlleXy_kNWTxc384vFsSyEgYy404W0kgOhrrCV5JKMMCUd17VQqmC2YxQVTnhROG8LYpaWsGFzRKd8Sk62db2MbyPPg16HcbY5Y-aKQ4cCqpoptiWsjGkFH2t-9hsTPzUFPT3gHo7oM4D6p8BdZklvpVShruVj3_V_1hfeAFzRw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2830306181</pqid></control><display><type>article</type><title>URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators</title><source>SpringerLink (Online service)</source><creator>Ouyang, Yiming ; Wang, Jiaxin ; Sun, Chenglong ; Wang, Qi ; Liang, Huaguo</creator><creatorcontrib>Ouyang, Yiming ; Wang, Jiaxin ; Sun, Chenglong ; Wang, Qi ; Liang, Huaguo</creatorcontrib><description>Network-on-chip (NoC) exists with the advantages of high communication efficiency, scalability and reliability. In recent years, NoC-based deep neural network (DNN) accelerators have been proposed. Although existing NoC research solutions can solve the problem of the existence of one-to-one traffic in the network and transmit unicast traffic efficiently. However, due to the traffic characteristics of neural networks, there exists a large amount of one-to-many traffic, and if unicast is used to transmit multicast traffic, it may rapidly exhaust the network bandwidth and greatly degrade the performance of the platform. 
To solve the problem of a large amount of one-to-many multicast traffic existing in the network, we propose a path-based multicast mechanism that greatly exploits the traffic characteristics of neural networks and has excellent scalability. Also a router architecture that can efficiently replicate multicast packets and provide single-cycle per-hop transmission for multicast packets was designed. Detailed simulation results indicate that our proposed scheme can effectively reduce the classification delay, the average packet delay and the number of packets transmitted by the network.</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-023-05255-7</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Accelerators ; Artificial neural networks ; Communications traffic ; Compilers ; Computer Science ; Interpreters ; Multicasting ; Neural networks ; Packet transmission ; Performance degradation ; Processor Architectures ; Programming Languages ; System on chip</subject><ispartof>The Journal of supercomputing, 2023-09, Vol.79 (13), p.14827-14847</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-93d565a530a1d6eb5150a028ede8f48862c93d18bd4d46dec66f5c434c5a5193</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-023-05255-7$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-023-05255-7$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,41488,42557,51319</link.rule.ids></links><search><creatorcontrib>Ouyang, Yiming</creatorcontrib><creatorcontrib>Wang, Jiaxin</creatorcontrib><creatorcontrib>Sun, Chenglong</creatorcontrib><creatorcontrib>Wang, Qi</creatorcontrib><creatorcontrib>Liang, Huaguo</creatorcontrib><title>URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>Network-on-chip (NoC) exists with the advantages of high communication efficiency, scalability and reliability. In recent years, NoC-based deep neural network (DNN) accelerators have been proposed. Although existing NoC research solutions can solve the problem of the existence of one-to-one traffic in the network and transmit unicast traffic efficiently. 
However, due to the traffic characteristics of neural networks, there exists a large amount of one-to-many traffic, and if unicast is used to transmit multicast traffic, it may rapidly exhaust the network bandwidth and greatly degrade the performance of the platform. To solve the problem of a large amount of one-to-many multicast traffic existing in the network, we propose a path-based multicast mechanism that greatly exploits the traffic characteristics of neural networks and has excellent scalability. Also a router architecture that can efficiently replicate multicast packets and provide single-cycle per-hop transmission for multicast packets was designed. Detailed simulation results indicate that our proposed scheme can effectively reduce the classification delay, the average packet delay and the number of packets transmitted by the network.</description><subject>Accelerators</subject><subject>Artificial neural networks</subject><subject>Communications traffic</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Interpreters</subject><subject>Multicasting</subject><subject>Neural networks</subject><subject>Packet transmission</subject><subject>Performance degradation</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><subject>System on chip</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9kEtLxDAUhYMoOI7-AVcB19GbV5txJ4MvGB_ICO5CmqRjx05Tkxbx3xsdwZ2rw4XvOxcOQscUTilAeZYoZawkwDgByaQk5Q6aUFnmUyixiyYwY0CUFGwfHaS0BgDBSz5BL89Pd4_neExNt8LR29DVzWqMpmo93ozt0FiTBtyb4RXXIeL7MCeVSd5h532PO5_RNsfwEeIbNtb61kczhJgO0V5t2uSPfnOKlleXy_kNWTxc384vFsSyEgYy404W0kgOhrrCV5JKMMCUd17VQqmC2YxQVTnhROG8LYpaWsGFzRKd8Sk62db2MbyPPg16HcbY5Y-aKQ4cCqpoptiWsjGkFH2t-9hsTPzUFPT3gHo7oM4D6p8BdZklvpVShruVj3_V_1hfeAFzRw</recordid><startdate>20230901</startdate><enddate>20230901</enddate><creator>Ouyang, 
Yiming</creator><creator>Wang, Jiaxin</creator><creator>Sun, Chenglong</creator><creator>Wang, Qi</creator><creator>Liang, Huaguo</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20230901</creationdate><title>URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators</title><author>Ouyang, Yiming ; Wang, Jiaxin ; Sun, Chenglong ; Wang, Qi ; Liang, Huaguo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-93d565a530a1d6eb5150a028ede8f48862c93d18bd4d46dec66f5c434c5a5193</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Accelerators</topic><topic>Artificial neural networks</topic><topic>Communications traffic</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Interpreters</topic><topic>Multicasting</topic><topic>Neural networks</topic><topic>Packet transmission</topic><topic>Performance degradation</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><topic>System on chip</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ouyang, Yiming</creatorcontrib><creatorcontrib>Wang, Jiaxin</creatorcontrib><creatorcontrib>Sun, Chenglong</creatorcontrib><creatorcontrib>Wang, Qi</creatorcontrib><creatorcontrib>Liang, Huaguo</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ouyang, Yiming</au><au>Wang, Jiaxin</au><au>Sun, Chenglong</au><au>Wang, Qi</au><au>Liang, Huaguo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators</atitle><jtitle>The Journal of 
supercomputing</jtitle><stitle>J Supercomput</stitle><date>2023-09-01</date><risdate>2023</risdate><volume>79</volume><issue>13</issue><spage>14827</spage><epage>14847</epage><pages>14827-14847</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>Network-on-chip (NoC) exists with the advantages of high communication efficiency, scalability and reliability. In recent years, NoC-based deep neural network (DNN) accelerators have been proposed. Although existing NoC research solutions can solve the problem of the existence of one-to-one traffic in the network and transmit unicast traffic efficiently. However, due to the traffic characteristics of neural networks, there exists a large amount of one-to-many traffic, and if unicast is used to transmit multicast traffic, it may rapidly exhaust the network bandwidth and greatly degrade the performance of the platform. To solve the problem of a large amount of one-to-many multicast traffic existing in the network, we propose a path-based multicast mechanism that greatly exploits the traffic characteristics of neural networks and has excellent scalability. Also a router architecture that can efficiently replicate multicast packets and provide single-cycle per-hop transmission for multicast packets was designed. Detailed simulation results indicate that our proposed scheme can effectively reduce the classification delay, the average packet delay and the number of packets transmitted by the network.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-023-05255-7</doi><tpages>21</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0920-8542 |
ispartof | The Journal of supercomputing, 2023-09, Vol.79 (13), p.14827-14847 |
issn | 0920-8542 ; 1573-0484 |
language | eng |
recordid | cdi_proquest_journals_2830306181 |
source | SpringerLink (Online service) |
subjects | Accelerators ; Artificial neural networks ; Communications traffic ; Compilers ; Computer Science ; Interpreters ; Multicasting ; Neural networks ; Packet transmission ; Performance degradation ; Processor Architectures ; Programming Languages ; System on chip |
title | URMP: using reconfigurable multicast path for NoC-based deep neural network accelerators |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T09%3A55%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=URMP:%20using%20reconfigurable%20multicast%20path%20for%20NoC-based%20deep%20neural%20network%20accelerators&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Ouyang,%20Yiming&rft.date=2023-09-01&rft.volume=79&rft.issue=13&rft.spage=14827&rft.epage=14847&rft.pages=14827-14847&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-023-05255-7&rft_dat=%3Cproquest_cross%3E2830306181%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2830306181&rft_id=info:pmid/&rfr_iscdi=true |
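
The description field above argues that sending one-to-many traffic as repeated unicast consumes far more network bandwidth than a path-based multicast. The following is a minimal sketch of that argument only, not the URMP mechanism from the article: it assumes a 2D mesh with minimal (Manhattan-distance) routing, counts link traversals as a rough proxy for bandwidth, and takes the multicast visiting order as given. All function names, the mesh size and the destination set are hypothetical.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (x, y) coordinate of a router in a 2D mesh (assumed topology)


def hops(a: Node, b: Node) -> int:
    """Hop count between two routers under minimal routing (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def unicast_link_traversals(src: Node, dests: List[Node]) -> int:
    """One separate packet per destination, each travelling from the source."""
    return sum(hops(src, d) for d in dests)


def path_multicast_link_traversals(src: Node, dests: List[Node]) -> int:
    """A single packet that visits the destinations one after another.

    The visiting order is simply the order of `dests`; choosing a good order
    is a separate problem that this sketch does not address.
    """
    total, current = 0, src
    for d in dests:
        total += hops(current, d)
        current = d
    return total


if __name__ == "__main__":
    source = (0, 0)
    # Hypothetical example: four consumers in one column of a 4x4 mesh.
    destinations = [(0, 3), (1, 3), (2, 3), (3, 3)]
    print("unicast:", unicast_link_traversals(source, destinations))
    print("path multicast:", path_multicast_link_traversals(source, destinations))
```

In this toy case the repeated unicast needs 18 link traversals while the single path-based packet needs 6. The gap depends entirely on where the destinations sit and on the visiting order, so the numbers illustrate the trend rather than any result reported in the article.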