1.63 pJ/SOP Neuromorphic Processor With Integrated Partial Sum Routers for In-Network Computing
Neuromorphic computing is promising to achieve unprecedented energy efficiency by emulating the human brain's mechanism. Conventional neuromorphic accelerators employ split-and-merge method to map spiking neural networks' inputs to surpass the fan-in capabilities of a single neuron core. H...
Published in: | IEEE transactions on very large scale integration (VLSI) systems, 2024-11, Vol.32 (11), p.2085-2092 |
---|---|
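Main authors: | Li, Dongrui ; Wong, Ming Ming ; Chong, Yi Sheng ; Zhou, Jun ; Upadhyay, Mohit ; Balaji, Ananta ; Mani, Aarthy ; Wong, Weng Fai ; Peh, Li Shiuan ; Do, Anh Tuan ; Wang, Bo |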
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 2092 |
---|---|
container_issue | 11 |
container_start_page | 2085 |
container_title | IEEE transactions on very large scale integration (VLSI) systems |
container_volume | 32 |
creator | Li, Dongrui ; Wong, Ming Ming ; Chong, Yi Sheng ; Zhou, Jun ; Upadhyay, Mohit ; Balaji, Ananta ; Mani, Aarthy ; Wong, Weng Fai ; Peh, Li Shiuan ; Do, Anh Tuan ; Wang, Bo |
description | Neuromorphic computing is promising to achieve unprecedented energy efficiency by emulating the human brain's mechanism. Conventional neuromorphic accelerators employ split-and-merge method to map spiking neural networks' inputs to surpass the fan-in capabilities of a single neuron core. However, this approach gives rise to the risk of accuracy compromise and extra core usage for the merging process. Moreover, it requires excessive data movement and clock cycles to aggregate spikes generated by partial sums instead of total sums obtained from different cores with substantial power and energy overhead. This work presents a novel approach to addressing the challenges imposed by the split-and-merge method. We propose an energy-efficient, reconfigurable neuromorphic processor that leverages several key techniques to mitigate the above issues. First, we introduce a partial sum router circuitry that enables in-network computing (INC), eliminating the need for extra merge cores. Second, we adopt software-defined Networks-on-Chip (NoCs) by leveraging predefined, efficient routing, eliminating power-hungry routing computation. At last, we incorporate fine-grained power gating and clock gating techniques for further power reduction. Experimental results from our test chip demonstrate the lossless mapping of the algorithm and exceptional energy efficiency, achieving an energy consumption of 1.63 pJ/SOP at 0.48 V. This energy efficiency represents a 22.4% improvement compared to the state-of-the-art results. Our proposed neuromorphic processor provides an efficient and flexible solution for neural network processing, mitigating the limitations of the traditional split-and-merge approach while delivering superior energy efficiency. |
doi_str_mv | 10.1109/TVLSI.2024.3409652 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_10570234</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10570234</ieee_id><sourcerecordid>3120655976</sourcerecordid><originalsourceid>FETCH-LOGICAL-c177t-474cfec03b09e104734502ee65193268134104038890e5baaaf33a7ea4cf0f2d3</originalsourceid><addsrcrecordid>eNpNkE1Lw0AQhoMoWKt_QDwseE46-5WPoxQ_IqUttupx2aaTNtVk4-4G8d-b2h6cywzD-8zAEwTXFCJKIRst3yaLPGLARMQFZLFkJ8GASpmEWV-n_QwxD1NG4Ty4cG4HQIXIYBAoGsWctM-jxWxOpthZUxvbbquCzK0p0DljyXvltyRvPG6s9rgmc219pT_JoqvJi-k8WkfKPpc34RT9t7EfZGzqtvNVs7kMzkr96fDq2IfB68P9cvwUTmaP-fhuEhY0SXwoElGUWABfQYYURMKFBIYYS5pxFqeUi34LPE0zQLnSWpec6wR1j0HJ1nwY3B7uttZ8dei82pnONv1LxSmDWMosifsUO6QKa5yzWKrWVrW2P4qC2otUfyLVXqQ6iuyhmwNUIeI_QCbAuOC_JD1uDQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3120655976</pqid></control><display><type>article</type><title>1.63 pJ/SOP Neuromorphic Processor With Integrated Partial Sum Routers for In-Network Computing</title><source>IEEE Electronic Library (IEL)</source><creator>Li, Dongrui ; Wong, Ming Ming ; Chong, Yi Sheng ; Zhou, Jun ; Upadhyay, Mohit ; Balaji, Ananta ; Mani, Aarthy ; Wong, Weng Fai ; Peh, Li Shiuan ; Do, Anh Tuan ; Wang, Bo</creator><creatorcontrib>Li, Dongrui ; Wong, Ming Ming ; Chong, Yi Sheng ; Zhou, Jun ; Upadhyay, Mohit ; Balaji, Ananta ; Mani, Aarthy ; Wong, Weng Fai ; Peh, Li Shiuan ; Do, Anh Tuan ; Wang, Bo</creatorcontrib><description>Neuromorphic computing is promising to achieve unprecedented energy efficiency by emulating the human brain's mechanism. Conventional neuromorphic accelerators employ split-and-merge method to map spiking neural networks' inputs to surpass the fan-in capabilities of a single neuron core. However, this approach gives rise to the risk of accuracy compromise and extra core usage for the merging process. 
Moreover, it requires excessive data movement and clock cycles to aggregate spikes generated by partial sums instead of total sums obtained from different cores with substantial power and energy overhead. This work presents a novel approach to addressing the challenges imposed by the split-and-merge method. We propose an energy-efficient, reconfigurable neuromorphic processor that leverages several key techniques to mitigate the above issues. First, we introduce a partial sum router circuitry that enables in-network computing (INC), eliminating the need for extra merge cores. Second, we adopt software-defined Networks-on-Chip (NoCs) by leveraging predefined, efficient routing, eliminating power-hungry routing computation. At last, we incorporate fine-grained power gating and clock gating techniques for further power reduction. Experimental results from our test chip demonstrate the lossless mapping of the algorithm and exceptional energy efficiency, achieving an energy consumption of 1.63 pJ/SOP at 0.48 V. This energy efficiency represents a 22.4% improvement compared to the state-of-the-art results. 
Our proposed neuromorphic processor provides an efficient and flexible solution for neural network processing, mitigating the limitations of the traditional split-and-merge approach while delivering superior energy efficiency.</description><identifier>ISSN: 1063-8210</identifier><identifier>EISSN: 1557-9999</identifier><identifier>DOI: 10.1109/TVLSI.2024.3409652</identifier><identifier>CODEN: IEVSE9</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Accuracy ; Aggregates ; Algorithms ; Circuits ; Computer architecture ; Energy consumption ; Energy efficiency ; Energy efficient ; Human motion ; in-network computing (INC) ; Microprocessors ; network on chip (NoC) ; Neural networks ; Neuromorphic computing ; neuromorphic processor ; Neuromorphics ; Neurons ; Power management ; Routers ; Routing (telecommunications) ; Sums ; System on chip ; Task analysis</subject><ispartof>IEEE transactions on very large scale integration (VLSI) systems, 2024-11, Vol.32 (11), p.2085-2092</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2024</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c177t-474cfec03b09e104734502ee65193268134104038890e5baaaf33a7ea4cf0f2d3</cites><orcidid>0000-0001-9010-6519 ; 0000-0002-9836-3023 ; 0000-0002-8320-6818 ; 0000-0002-4281-2053 ; 0000-0001-9199-0799</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10570234$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10570234$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Li, Dongrui</creatorcontrib><creatorcontrib>Wong, Ming Ming</creatorcontrib><creatorcontrib>Chong, Yi Sheng</creatorcontrib><creatorcontrib>Zhou, Jun</creatorcontrib><creatorcontrib>Upadhyay, Mohit</creatorcontrib><creatorcontrib>Balaji, Ananta</creatorcontrib><creatorcontrib>Mani, Aarthy</creatorcontrib><creatorcontrib>Wong, Weng Fai</creatorcontrib><creatorcontrib>Peh, Li Shiuan</creatorcontrib><creatorcontrib>Do, Anh Tuan</creatorcontrib><creatorcontrib>Wang, Bo</creatorcontrib><title>1.63 pJ/SOP Neuromorphic Processor With Integrated Partial Sum Routers for In-Network Computing</title><title>IEEE transactions on very large scale integration (VLSI) systems</title><addtitle>TVLSI</addtitle><description>Neuromorphic computing is promising to achieve unprecedented energy efficiency by emulating the human brain's mechanism. Conventional neuromorphic accelerators employ split-and-merge method to map spiking neural networks' inputs to surpass the fan-in capabilities of a single neuron core. However, this approach gives rise to the risk of accuracy compromise and extra core usage for the merging process. 
Moreover, it requires excessive data movement and clock cycles to aggregate spikes generated by partial sums instead of total sums obtained from different cores with substantial power and energy overhead. This work presents a novel approach to addressing the challenges imposed by the split-and-merge method. We propose an energy-efficient, reconfigurable neuromorphic processor that leverages several key techniques to mitigate the above issues. First, we introduce a partial sum router circuitry that enables in-network computing (INC), eliminating the need for extra merge cores. Second, we adopt software-defined Networks-on-Chip (NoCs) by leveraging predefined, efficient routing, eliminating power-hungry routing computation. At last, we incorporate fine-grained power gating and clock gating techniques for further power reduction. Experimental results from our test chip demonstrate the lossless mapping of the algorithm and exceptional energy efficiency, achieving an energy consumption of 1.63 pJ/SOP at 0.48 V. This energy efficiency represents a 22.4% improvement compared to the state-of-the-art results. 
Our proposed neuromorphic processor provides an efficient and flexible solution for neural network processing, mitigating the limitations of the traditional split-and-merge approach while delivering superior energy efficiency.</description><subject>Accuracy</subject><subject>Aggregates</subject><subject>Algorithms</subject><subject>Circuits</subject><subject>Computer architecture</subject><subject>Energy consumption</subject><subject>Energy efficiency</subject><subject>Energy efficient</subject><subject>Human motion</subject><subject>in-network computing (INC)</subject><subject>Microprocessors</subject><subject>network on chip (NoC)</subject><subject>Neural networks</subject><subject>Neuromorphic computing</subject><subject>neuromorphic processor</subject><subject>Neuromorphics</subject><subject>Neurons</subject><subject>Power management</subject><subject>Routers</subject><subject>Routing (telecommunications)</subject><subject>Sums</subject><subject>System on chip</subject><subject>Task analysis</subject><issn>1063-8210</issn><issn>1557-9999</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkE1Lw0AQhoMoWKt_QDwseE46-5WPoxQ_IqUttupx2aaTNtVk4-4G8d-b2h6cywzD-8zAEwTXFCJKIRst3yaLPGLARMQFZLFkJ8GASpmEWV-n_QwxD1NG4Ty4cG4HQIXIYBAoGsWctM-jxWxOpthZUxvbbquCzK0p0DljyXvltyRvPG6s9rgmc219pT_JoqvJi-k8WkfKPpc34RT9t7EfZGzqtvNVs7kMzkr96fDq2IfB68P9cvwUTmaP-fhuEhY0SXwoElGUWABfQYYURMKFBIYYS5pxFqeUi34LPE0zQLnSWpec6wR1j0HJ1nwY3B7uttZ8dei82pnONv1LxSmDWMosifsUO6QKa5yzWKrWVrW2P4qC2otUfyLVXqQ6iuyhmwNUIeI_QCbAuOC_JD1uDQ</recordid><startdate>20241101</startdate><enddate>20241101</enddate><creator>Li, Dongrui</creator><creator>Wong, Ming Ming</creator><creator>Chong, Yi Sheng</creator><creator>Zhou, Jun</creator><creator>Upadhyay, Mohit</creator><creator>Balaji, Ananta</creator><creator>Mani, Aarthy</creator><creator>Wong, Weng Fai</creator><creator>Peh, Li Shiuan</creator><creator>Do, 
Anh Tuan</creator><creator>Wang, Bo</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0001-9010-6519</orcidid><orcidid>https://orcid.org/0000-0002-9836-3023</orcidid><orcidid>https://orcid.org/0000-0002-8320-6818</orcidid><orcidid>https://orcid.org/0000-0002-4281-2053</orcidid><orcidid>https://orcid.org/0000-0001-9199-0799</orcidid></search><sort><creationdate>20241101</creationdate><title>1.63 pJ/SOP Neuromorphic Processor With Integrated Partial Sum Routers for In-Network Computing</title><author>Li, Dongrui ; Wong, Ming Ming ; Chong, Yi Sheng ; Zhou, Jun ; Upadhyay, Mohit ; Balaji, Ananta ; Mani, Aarthy ; Wong, Weng Fai ; Peh, Li Shiuan ; Do, Anh Tuan ; Wang, Bo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c177t-474cfec03b09e104734502ee65193268134104038890e5baaaf33a7ea4cf0f2d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Accuracy</topic><topic>Aggregates</topic><topic>Algorithms</topic><topic>Circuits</topic><topic>Computer architecture</topic><topic>Energy consumption</topic><topic>Energy efficiency</topic><topic>Energy efficient</topic><topic>Human motion</topic><topic>in-network computing (INC)</topic><topic>Microprocessors</topic><topic>network on chip (NoC)</topic><topic>Neural networks</topic><topic>Neuromorphic computing</topic><topic>neuromorphic processor</topic><topic>Neuromorphics</topic><topic>Neurons</topic><topic>Power management</topic><topic>Routers</topic><topic>Routing (telecommunications)</topic><topic>Sums</topic><topic>System on chip</topic><topic>Task analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, 
Dongrui</creatorcontrib><creatorcontrib>Wong, Ming Ming</creatorcontrib><creatorcontrib>Chong, Yi Sheng</creatorcontrib><creatorcontrib>Zhou, Jun</creatorcontrib><creatorcontrib>Upadhyay, Mohit</creatorcontrib><creatorcontrib>Balaji, Ananta</creatorcontrib><creatorcontrib>Mani, Aarthy</creatorcontrib><creatorcontrib>Wong, Weng Fai</creatorcontrib><creatorcontrib>Peh, Li Shiuan</creatorcontrib><creatorcontrib>Do, Anh Tuan</creatorcontrib><creatorcontrib>Wang, Bo</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on very large scale integration (VLSI) systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Li, Dongrui</au><au>Wong, Ming Ming</au><au>Chong, Yi Sheng</au><au>Zhou, Jun</au><au>Upadhyay, Mohit</au><au>Balaji, Ananta</au><au>Mani, Aarthy</au><au>Wong, Weng Fai</au><au>Peh, Li Shiuan</au><au>Do, Anh Tuan</au><au>Wang, Bo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>1.63 pJ/SOP Neuromorphic Processor With Integrated Partial Sum Routers for In-Network Computing</atitle><jtitle>IEEE transactions on very large scale integration (VLSI) systems</jtitle><stitle>TVLSI</stitle><date>2024-11-01</date><risdate>2024</risdate><volume>32</volume><issue>11</issue><spage>2085</spage><epage>2092</epage><pages>2085-2092</pages><issn>1063-8210</issn><eissn>1557-9999</eissn><coden>IEVSE9</coden><abstract>Neuromorphic computing is promising to achieve unprecedented energy efficiency by emulating the human brain's mechanism. 
Conventional neuromorphic accelerators employ split-and-merge method to map spiking neural networks' inputs to surpass the fan-in capabilities of a single neuron core. However, this approach gives rise to the risk of accuracy compromise and extra core usage for the merging process. Moreover, it requires excessive data movement and clock cycles to aggregate spikes generated by partial sums instead of total sums obtained from different cores with substantial power and energy overhead. This work presents a novel approach to addressing the challenges imposed by the split-and-merge method. We propose an energy-efficient, reconfigurable neuromorphic processor that leverages several key techniques to mitigate the above issues. First, we introduce a partial sum router circuitry that enables in-network computing (INC), eliminating the need for extra merge cores. Second, we adopt software-defined Networks-on-Chip (NoCs) by leveraging predefined, efficient routing, eliminating power-hungry routing computation. At last, we incorporate fine-grained power gating and clock gating techniques for further power reduction. Experimental results from our test chip demonstrate the lossless mapping of the algorithm and exceptional energy efficiency, achieving an energy consumption of 1.63 pJ/SOP at 0.48 V. This energy efficiency represents a 22.4% improvement compared to the state-of-the-art results. 
Our proposed neuromorphic processor provides an efficient and flexible solution for neural network processing, mitigating the limitations of the traditional split-and-merge approach while delivering superior energy efficiency.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TVLSI.2024.3409652</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0001-9010-6519</orcidid><orcidid>https://orcid.org/0000-0002-9836-3023</orcidid><orcidid>https://orcid.org/0000-0002-8320-6818</orcidid><orcidid>https://orcid.org/0000-0002-4281-2053</orcidid><orcidid>https://orcid.org/0000-0001-9199-0799</orcidid></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1063-8210 |
ispartof | IEEE transactions on very large scale integration (VLSI) systems, 2024-11, Vol.32 (11), p.2085-2092 |
issn | 1063-8210 ; 1557-9999 |
language | eng |
recordid | cdi_ieee_primary_10570234 |
source | IEEE Electronic Library (IEL) |
subjects | Accuracy ; Aggregates ; Algorithms ; Circuits ; Computer architecture ; Energy consumption ; Energy efficiency ; Energy efficient ; Human motion ; in-network computing (INC) ; Microprocessors ; network on chip (NoC) ; Neural networks ; Neuromorphic computing ; neuromorphic processor ; Neuromorphics ; Neurons ; Power management ; Routers ; Routing (telecommunications) ; Sums ; System on chip ; Task analysis |
title | 1.63 pJ/SOP Neuromorphic Processor With Integrated Partial Sum Routers for In-Network Computing |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T06%3A01%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=1.63%20pJ/SOP%20Neuromorphic%20Processor%20With%20Integrated%20Partial%20Sum%20Routers%20for%20In-Network%20Computing&rft.jtitle=IEEE%20transactions%20on%20very%20large%20scale%20integration%20(VLSI)%20systems&rft.au=Li,%20Dongrui&rft.date=2024-11-01&rft.volume=32&rft.issue=11&rft.spage=2085&rft.epage=2092&rft.pages=2085-2092&rft.issn=1063-8210&rft.eissn=1557-9999&rft.coden=IEVSE9&rft_id=info:doi/10.1109/TVLSI.2024.3409652&rft_dat=%3Cproquest_RIE%3E3120655976%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3120655976&rft_id=info:pmid/&rft_ieee_id=10570234&rfr_iscdi=true |
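
The description field in this record contrasts aggregating spikes generated from partial sums (split-and-merge) with accumulating the partial sums themselves before a neuron fires. The Python sketch below is only a toy illustration of why the two can disagree; it is not the circuit or the algorithm from the article. The integrate-and-fire model, the function names, the weights, the thresholds, and the fan-in limit are all hypothetical values chosen for the example.

```python
# Toy illustration (assumed model, not the article's implementation):
# a neuron fires when its weighted input sum reaches a threshold.

def fires(total, threshold):
    """Return 1 (spike) if the accumulated sum reaches the threshold, else 0."""
    return 1 if total >= threshold else 0


def core_partial_sums(inputs, weights, fan_in):
    """Split the inputs into chunks of at most `fan_in` and return each chunk's weighted sum."""
    sums = []
    for start in range(0, len(inputs), fan_in):
        chunk = zip(inputs[start:start + fan_in], weights[start:start + fan_in])
        sums.append(sum(x * w for x, w in chunk))
    return sums


def split_and_merge(inputs, weights, fan_in, threshold, merge_threshold=1):
    """Each core thresholds its own partial sum; an extra merge core aggregates the spikes.

    `merge_threshold` is a hypothetical setting for the merge core.
    """
    spikes = [fires(p, threshold) for p in core_partial_sums(inputs, weights, fan_in)]
    return fires(sum(spikes), merge_threshold)


def partial_sum_accumulation(inputs, weights, fan_in, threshold):
    """Partial sums are added together first; the threshold is applied once to the total."""
    return fires(sum(core_partial_sums(inputs, weights, fan_in)), threshold)


if __name__ == "__main__":
    inputs = [1, 1, 1, 1]    # four input spikes
    weights = [2, 2, 2, 2]   # hypothetical synaptic weights
    fan_in = 2               # hypothetical per-core fan-in limit
    threshold = 5            # hypothetical firing threshold

    print("split-and-merge output:", split_and_merge(inputs, weights, fan_in, threshold))
    print("partial-sum output:    ", partial_sum_accumulation(inputs, weights, fan_in, threshold))
```

With these assumed values each core's partial sum (4) stays below the threshold (5), so the split-and-merge variant emits no spike, while the accumulated total (8) crosses the threshold. This is the kind of mismatch the description refers to as the risk of accuracy compromise.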