Increasing Throughput of In-Memory DNN Accelerators by Flexible Layerwise DNN Approximation

Approximate computing and mixed-signal in-memory accelerators are promising paradigms to significantly reduce computational requirements of deep neural network (DNN) inference without accuracy loss. In this work, we present a novel in-memory design for layerwise approximate computation at different approximation levels.

Bibliographic Details
Published in: IEEE MICRO 2022-11, Vol.42 (6), p.17-24
Main Authors: Parra, Cecilia De la, Soliman, Taha, Guntoro, Andre, Kumar, Akash, Wehn, Norbert
Format: Article
Language: English
Subjects:
Online Access: Order full text
container_end_page 24
container_issue 6
container_start_page 17
container_title IEEE MICRO
container_volume 42
creator Parra, Cecilia De la
Soliman, Taha
Guntoro, Andre
Kumar, Akash
Wehn, Norbert
description Approximate computing and mixed-signal in-memory accelerators are promising paradigms to significantly reduce computational requirements of deep neural network (DNN) inference without accuracy loss. In this work, we present a novel in-memory design for layerwise approximate computation at different approximation levels. A sensitivity-based high-dimensional search is performed to explore the optimal approximation level for each DNN layer. Our new methodology offers high flexibility and optimal tradeoff between accuracy and throughput, which we demonstrate by an extensive evaluation on various DNN benchmarks for medium- and large-scale image classification with CIFAR10, CIFAR100, and ImageNet. With our novel approach, we reach an average of 5× and up to 8× speedup without accuracy loss.
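The sensitivity-based layerwise search described in the abstract can be illustrated with a small sketch. The following Python snippet is not the authors' implementation; it assumes hypothetical per-layer profiles (a LayerProfile with pre-measured accuracy-drop and speedup values for each approximation level) and treats the total accuracy loss as roughly additive across layers, then greedily raises each layer's approximation level within a given accuracy budget.

# Minimal sketch (not the paper's implementation) of a sensitivity-guided
# layerwise search for approximation levels. Assumptions: each layer offers
# discrete approximation levels 0..L-1, higher levels are faster but cost
# more accuracy; per-layer sensitivity has been profiled in advance; the
# network-level accuracy drop is approximated as the sum of per-layer drops.

from dataclasses import dataclass
from typing import List


@dataclass
class LayerProfile:
    name: str
    # accuracy_drop[l]: estimated accuracy loss (percentage points) at level l
    accuracy_drop: List[float]
    # speedup[l]: relative throughput gain of this layer at level l
    speedup: List[float]


def search_approximation_levels(layers: List[LayerProfile],
                                accuracy_budget: float) -> List[int]:
    """Greedily raise the approximation level of the layer with the best
    speedup-per-accuracy-cost ratio until the accuracy budget is spent."""
    levels = [0] * len(layers)          # start from the most accurate setting
    remaining = accuracy_budget
    while True:
        best_idx, best_ratio, best_cost = None, 0.0, 0.0
        for i, layer in enumerate(layers):
            nxt = levels[i] + 1
            if nxt >= len(layer.accuracy_drop):
                continue                # layer already at its most approximate level
            cost = layer.accuracy_drop[nxt] - layer.accuracy_drop[levels[i]]
            gain = layer.speedup[nxt] - layer.speedup[levels[i]]
            if cost > remaining:
                continue                # this step would exceed the budget
            ratio = gain / max(cost, 1e-9)
            if ratio > best_ratio:
                best_idx, best_ratio, best_cost = i, ratio, cost
        if best_idx is None:
            break                       # no affordable step left
        levels[best_idx] += 1
        remaining -= best_cost
    return levels


if __name__ == "__main__":
    # Hypothetical profiles for a three-layer network.
    layers = [
        LayerProfile("conv1", accuracy_drop=[0.0, 0.2, 1.5], speedup=[1.0, 2.0, 4.0]),
        LayerProfile("conv2", accuracy_drop=[0.0, 0.05, 0.3], speedup=[1.0, 2.0, 4.0]),
        LayerProfile("fc",    accuracy_drop=[0.0, 0.5, 2.0], speedup=[1.0, 1.5, 2.0]),
    ]
    print(search_approximation_levels(layers, accuracy_budget=0.5))

A greedy, budget-driven heuristic like this is only one way to navigate the high-dimensional search space the abstract refers to; the exploration strategy used in the paper may differ.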
doi_str_mv 10.1109/MM.2022.3196865
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 0272-1732
ispartof IEEE MICRO, 2022-11, Vol.42 (6), p.17-24
issn 0272-1732
1937-4143
language eng
recordid cdi_ieee_primary_9851664
source IEEE Electronic Library (IEL)
subjects Accelerators
Accuracy
Approximation
Artificial neural networks
Computational modeling
Computer memory
Deep learning
Hardware
Image classification
In-memory computing
Mathematical analysis
Neural networks
Optimization
Quantization (signal)
Space exploration
Throughput
title Increasing Throughput of In-Memory DNN Accelerators by Flexible Layerwise DNN Approximation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T13%3A45%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Increasing%20Throughput%20of%20In-Memory%20DNN%20Accelerators%20by%20Flexible%20Layerwise%20DNN%20Approximation&rft.jtitle=IEEE%20MICRO&rft.au=Parra,%20Cecilia%20De%20la&rft.date=2022-11-01&rft.volume=42&rft.issue=6&rft.spage=17&rft.epage=24&rft.pages=17-24&rft.issn=0272-1732&rft.eissn=1937-4143&rft.coden=IEMIDZ&rft_id=info:doi/10.1109/MM.2022.3196865&rft_dat=%3Cproquest_RIE%3E2731238657%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2731238657&rft_id=info:pmid/&rft_ieee_id=9851664&rfr_iscdi=true