A 1-8b Reconfigurable Digital SRAM Compute-in-Memory Macro for Processing Neural Networks
This work presents a 1-8b reconfigurable digital SRAM compute-in-memory (CIM) macro, which significantly improves array utilization and energy efficiency under different input and weight configurations compared to previous works. To maintain array utilization under different configurations, a row-based bitwise-summation-first digital CIM architecture is proposed.
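The description combines several techniques: bit-serial (bitwise-summation-first) accumulation, a complete 2's complement encoding so that sign bits are processed exactly like magnitude bits, and a bitwise multiplier that can switch between AND (multi-bit MAC) and XNOR (binarized networks). The behavioral Python sketch below is not the authors' circuit; the function names, bit ordering, and recombination scheme are illustrative assumptions chosen to show why per-bit-plane partial sums recombine into a correct signed MAC result.

```python
def to_bits(value, nbits):
    """Two's complement bits of `value` (LSB first); value must fit in nbits."""
    return [(value >> i) & 1 for i in range(nbits)]


def bitwise_mult(a_bit, w_bit, mode="AND"):
    """Configurable 1b multiplier cell: AND for multi-bit MAC, XNOR for BNNs."""
    if mode == "AND":
        return a_bit & w_bit
    return 1 ^ (a_bit ^ w_bit)  # XNOR: with +1/-1 encoded as 1/0, output 1 means "signs match"


def bit_serial_mac(acts, wts, abits, wbits, signed=True):
    """Row-based, bitwise-summation-first MAC (behavioral model).

    For each (input-bit, weight-bit) plane, the 1b products of all rows are
    summed first (the adder tree's job in hardware); the per-plane sums are
    then recombined with power-of-two weights.  Under two's complement the MSB
    plane simply carries a negative weight, so sign bits take the same bitwise
    path as magnitude bits.
    """
    acc = 0
    for i in range(abits):
        for j in range(wbits):
            plane = sum(bitwise_mult(to_bits(a, abits)[i], to_bits(w, wbits)[j])
                        for a, w in zip(acts, wts))
            # The MSB has weight -2**(n-1) for signed operands, so the plane's sign
            # flips when exactly one of the two bit indices is an MSB.
            sign = -1 if signed and ((i == abits - 1) != (j == wbits - 1)) else 1
            acc += sign * (plane << (i + j))
    return acc


def bnn_dot(a_bits, w_bits):
    """1b/1b dot product with the multiplier in XNOR mode (+1 -> 1, -1 -> 0)."""
    matches = sum(bitwise_mult(a, w, mode="XNOR") for a, w in zip(a_bits, w_bits))
    return 2 * matches - len(a_bits)


if __name__ == "__main__":
    acts, wts = [3, -2, 7, -8], [-5, 1, 6, 2]          # 4b signed operands
    assert bit_serial_mac(acts, wts, abits=4, wbits=4) == sum(a * w for a, w in zip(acts, wts))
    assert bnn_dot([1, 0, 1, 1], [1, 1, 0, 1]) == 0    # (+1,-1,+1,+1)·(+1,+1,-1,+1)
```

Summing each bit plane across all rows before any shifting is what lets every row of the array, including the row holding the sign bit, use the same 1b multiplier and adder-tree path. A companion sketch of the 4-2 compressor based adder tree mentioned in the full description appears after the record fields at the end of this page.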
Published in: | IEEE transactions on circuits and systems. I, Regular papers, 2024-04, Vol.71 (4), p.1602-1614 |
---|---|
Main authors: | You, Heng; Li, Weijun; Shang, Delong; Zhou, Yumei; Qiao, Shushan |
Format: | Article |
Language: | eng |
Subjects: | Adders; Adding circuits; array utilization; Arrays; compute-in-memory; Configurations; Energy efficiency; Memory management; Neural networks; Power consumption; Random access memory; reconfigurable; Reconfiguration; SRAM; Static random access memory |
Online access: | Order full text |
container_end_page | 1614 |
---|---|
container_issue | 4 |
container_start_page | 1602 |
container_title | IEEE transactions on circuits and systems. I, Regular papers |
container_volume | 71 |
creator | You, Heng; Li, Weijun; Shang, Delong; Zhou, Yumei; Qiao, Shushan |
description | This work presents a 1-8b reconfigurable digital SRAM compute-in-memory (CIM) macro, which significantly improves array utilization and energy efficiency under different input and weight configurations compared to previous works. To maintain array utilization under different configurations, a row-based bitwise-summation-first digital CIM architecture is proposed. In addition, to realize flexible switching between signed and unsigned operations, a complete 2's complement encoding method is adopted, which makes the computation of the sign bits consistent with that of the magnitude bits when performing signed operations, thus ensuring that each row of the CIM array can store the sign of the weight. Owing to its reconfigurable bit width, the proposed CIM macro can be widely used in various neural networks for optimal efficiency. In order to better apply the CIM macro to binarized neural networks, a configurable bitwise multiplier is presented, which supports both AND and XNOR operations. Moreover, since the adder tree accounts for a major part of the digital CIM macro's power consumption, a 4-2 compressor based adder tree is presented to further improve the energy efficiency. Measurement results based on a 55nm CMOS process show that the proposed CIM macro achieves an energy efficiency of up to 2238 TOPS/W at 1b/1b and 44.82 TOPS/W at 4b/4b MAC operations. |
doi_str_mv | 10.1109/TCSI.2024.3355944 |
format | Article |
publisher | New York: IEEE |
coden | ITCSCH |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
orcid | 0009-0007-5667-3094; 0000-0002-9102-2111; 0000-0002-9386-8030 |
identifier | ISSN: 1549-8328 |
ispartof | IEEE transactions on circuits and systems. I, Regular papers, 2024-04, Vol.71 (4), p.1602-1614 |
issn | 1549-8328 (print); 1558-0806 (electronic) |
language | eng |
recordid | cdi_proquest_journals_3015025728 |
source | IEEE Electronic Library (IEL) |
subjects | Adders; Adding circuits; array utilization; Arrays; Common Information Model (computing); compute-in-memory; Configurations; Energy efficiency; Memory management; Neural networks; Power consumption; Random access memory; reconfigurable; Reconfiguration; SRAM; Static random access memory; Termination of employment |
title | A 1-8b Reconfigurable Digital SRAM Compute-in-Memory Macro for Processing Neural Networks |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T08%3A26%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%201-8b%20Reconfigurable%20Digital%20SRAM%20Compute-in-Memory%20Macro%20for%20Processing%20Neural%20Networks&rft.jtitle=IEEE%20transactions%20on%20circuits%20and%20systems.%20I,%20Regular%20papers&rft.au=You,%20Heng&rft.date=2024-04-01&rft.volume=71&rft.issue=4&rft.spage=1602&rft.epage=1614&rft.pages=1602-1614&rft.issn=1549-8328&rft.eissn=1558-0806&rft.coden=ITCSCH&rft_id=info:doi/10.1109/TCSI.2024.3355944&rft_dat=%3Cproquest_RIE%3E3015025728%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3015025728&rft_id=info:pmid/&rft_ieee_id=10417860&rfr_iscdi=true |
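The description also notes that the adder tree dominates the power of a digital CIM macro and that a 4-2 compressor based tree is used to reduce it. The sketch below is a behavioral illustration, not the paper's circuit: it builds a 4:2 compressor from two full adders and uses it to reduce a column of 1b products, with the compressor carry-in tied low for simplicity (in hardware it would typically be chained to a neighbouring compressor). All names and the reduction schedule are assumptions.

```python
def full_adder(a, b, c):
    """1b full adder: a + b + c == s + 2*carry."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def compressor_4_2(x1, x2, x3, x4, cin):
    """4:2 compressor built from two chained full adders.

    Guarantees x1 + x2 + x3 + x4 + cin == s + 2*(carry + cout); cout does not
    depend on cin, so a row of compressors does not ripple.
    """
    t, cout = full_adder(x1, x2, x3)
    s, carry = full_adder(t, x4, cin)
    return s, carry, cout


def compressor_tree_sum(product_bits):
    """Sum N 1b products with a 4:2-compressor reduction tree (behavioral).

    cols[w] holds the not-yet-combined bits of weight 2**w.  Each pass replaces
    four same-weight bits with one sum bit and two next-weight bits; once every
    column holds at most two bits, a final carry-propagate addition (plain '+'
    here) produces the result.
    """
    cols = {0: list(product_bits)}
    while any(len(bits) > 2 for bits in cols.values()):
        nxt = {}
        for w, bits in sorted(cols.items()):
            keep = nxt.setdefault(w, [])
            up = nxt.setdefault(w + 1, [])
            while len(bits) >= 4:
                s, carry, cout = compressor_4_2(*bits[:4], 0)  # cin tied low in this sketch
                bits = bits[4:]
                keep.append(s)
                up.extend([carry, cout])
            if len(bits) == 3:
                s, carry = full_adder(*bits)
                keep.append(s)
                up.append(carry)
            else:
                keep.extend(bits)  # zero, one, or two leftover bits pass straight through
        cols = nxt
    return sum(sum(bits) << w for w, bits in cols.items())


if __name__ == "__main__":
    import random
    bits = [random.randint(0, 1) for _ in range(256)]  # e.g. 256 1b products from one column
    assert compressor_tree_sum(bits) == sum(bits)
```

Each 4:2 compressor retires four same-weight bits per stage instead of the three retired by a full adder, which shortens the reduction tree; that is the usual reason compressor-based adder trees save energy over plain full-adder trees.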