A 1-8b Reconfigurable Digital SRAM Compute-in-Memory Macro for Processing Neural Networks

This work presents a 1-8b reconfigurable digital SRAM compute-in-memory (CIM) macro that significantly improves array utilization and energy efficiency under different input and weight configurations compared to previous works. To maintain high array utilization across configurations, a row-based bitwise-summation-first digital CIM architecture is proposed. In addition, to allow flexible switching between signed and unsigned operations, a complete 2's complement encoding method is adopted, which makes the computation of the sign bits consistent with that of the magnitude bits during signed operations and thus lets each row of the CIM array store the sign of its weight. Thanks to the reconfigurable bit width, the proposed CIM macro can serve a wide range of neural networks at optimal efficiency. To better support binarized neural networks, a configurable bitwise multiplier is presented that supports both AND and XNOR operations. Moreover, since the adder tree accounts for a major share of the digital CIM macro's power consumption, a 4-2 compressor-based adder tree is introduced to further improve energy efficiency. Measurement results in a 55 nm CMOS process show that the proposed CIM macro achieves an energy efficiency of up to 2238 TOPS/W for 1b/1b and 44.82 TOPS/W for 4b/4b MAC operations.
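As a rough behavioral sketch (not the authors' circuit, and using illustrative names such as bit_serial_mac that do not come from the paper), the Python model below shows how a bit-serial digital CIM multiply-accumulate can be decomposed into 1-bit AND products, column-wise sums (the adder tree), and shift-accumulation, and why treating the two's-complement sign bits exactly like the magnitude bits, only with a negative scale factor, still yields correct signed results:

# Behavioral sketch only: NOT the paper's circuit, just a minimal model of a
# bit-serial signed MAC built from 1-bit AND products, column sums, and
# shift-accumulation. All function and variable names are illustrative assumptions.

def to_bits(value, width):
    """Return the two's-complement bits of value, LSB first (arithmetic shift keeps the sign)."""
    return [(value >> i) & 1 for i in range(width)]

def bit_serial_mac(inputs, weights, in_bits=4, w_bits=4):
    """Accumulate sum(x * w) using only 1-bit products and scaled additions.

    In two's complement the MSB of an n-bit word carries weight -2^(n-1), so the
    sign bits can be processed exactly like magnitude bits as long as their
    partial sums are accumulated with a negative scale factor.
    """
    assert len(inputs) == len(weights)
    acc = 0
    for i in range(in_bits):                      # stream input bits serially, LSB first
        x_scale = -(1 << i) if i == in_bits - 1 else (1 << i)
        for j in range(w_bits):                   # one stored weight-bit plane per step
            w_scale = -(1 << j) if j == w_bits - 1 else (1 << j)
            # Row-wise 1-bit products summed across the array; in hardware this is
            # the AND-mode bitwise multiplier feeding the adder tree.
            column_sum = sum(to_bits(x, in_bits)[i] & to_bits(w, w_bits)[j]
                             for x, w in zip(inputs, weights))
            acc += x_scale * w_scale * column_sum
    return acc

# Self-check against ordinary signed arithmetic (4-bit operands, range -8..7).
xs = [3, -2, 7, -8]
ws = [-5, 4, 1, -1]
assert bit_serial_mac(xs, ws) == sum(x * w for x, w in zip(xs, ws))

In XNOR (binarized-network) mode the 1-bit AND product would simply be replaced by an XNOR of the two operand bits under a +1/-1 interpretation; that variant is omitted here to keep the sketch short. The 4-2 compressor named in the abstract is a standard arithmetic cell; as a point of reference (again a generic model, not the paper's gate-level design), it reduces four partial-product bits plus a carry-in to one sum bit and two carry-class bits of weight 2:

def compressor_4_2(a, b, c, d, cin):
    """Generic 4-2 compressor: a + b + c + d + cin == total + 2 * (carry + cout)."""
    s1 = a ^ b ^ c
    cout = (a & b) | (b & c) | (a & c)            # majority of a, b, c
    total = s1 ^ d ^ cin
    carry = (s1 & d) | (d & cin) | (s1 & cin)
    return total, carry, cout

for bits in range(32):                            # exhaustive check of all input patterns
    a, b, c, d, cin = [(bits >> k) & 1 for k in range(5)]
    s, ca, co = compressor_4_2(a, b, c, d, cin)
    assert a + b + c + d + cin == s + 2 * (ca + co)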

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems I: Regular Papers, 2024-04, Vol. 71 (4), pp. 1602-1614
Main authors: You, Heng; Li, Weijun; Shang, Delong; Zhou, Yumei; Qiao, Shushan
Format: Article
Language: English
DOI: 10.1109/TCSI.2024.3355944
ISSN: 1549-8328
EISSN: 1558-0806
Source: IEEE Electronic Library (IEL)
Subjects: Adders; Adding circuits; array utilization; Arrays; Common Information Model (computing); compute-in-memory; Configurations; Energy efficiency; Memory management; Neural networks; Power consumption; Random access memory; reconfigurable; Reconfiguration; SRAM; Static random access memory; Termination of employment