Increasing Throughput of In-Memory DNN Accelerators by Flexible Layerwise DNN Approximation
Approximate computing and mixed-signal in-memory accelerators are promising paradigms to significantly reduce computational requirements of deep neural network (DNN) inference without accuracy loss. In this work, we present a novel in-memory design for layerwise approximate computation at different approximation levels. A sensitivity-based high-dimensional search is performed to explore the optimal approximation level for each DNN layer. Our new methodology offers high flexibility and an optimal tradeoff between accuracy and throughput, which we demonstrate by an extensive evaluation on various DNN benchmarks for medium- and large-scale image classification with CIFAR10, CIFAR100, and ImageNet. With our novel approach, we reach an average of 5× and up to 8× speedup without accuracy loss.
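The abstract only sketches the search procedure at a high level. As a rough illustration of what a sensitivity-guided, layerwise selection of approximation levels could look like, the following Python sketch greedily assigns the most aggressive level each layer tolerates. The function names, the level encoding, and the greedy ordering are assumptions made here for illustration; they are not taken from the paper.

```python
# Illustrative sketch (not the authors' published algorithm): pick an
# approximation level per layer, guided by each layer's measured sensitivity.
# `evaluate_accuracy(config)` is a hypothetical callback that runs inference
# with the given per-layer approximation levels (layers missing from `config`
# run exactly) and returns validation accuracy.

def layer_sensitivities(evaluate_accuracy, num_layers, levels, baseline_acc):
    """Accuracy drop when a single layer runs at the most aggressive level."""
    drops = {}
    for layer in range(num_layers):
        acc = evaluate_accuracy({layer: levels[-1]})  # approximate only this layer
        drops[layer] = baseline_acc - acc
    return drops

def greedy_level_search(evaluate_accuracy, num_layers, levels, baseline_acc,
                        tolerance=0.0):
    """Assign levels layer by layer, least sensitive (most robust) layers first."""
    drops = layer_sensitivities(evaluate_accuracy, num_layers, levels, baseline_acc)
    config = {layer: levels[0] for layer in range(num_layers)}  # levels[0] = exact
    for layer in sorted(range(num_layers), key=drops.get):
        for level in reversed(levels):  # try the most aggressive level first
            trial = {**config, layer: level}
            if baseline_acc - evaluate_accuracy(trial) <= tolerance:
                config = trial
                break
    return config
```

Under these assumptions, calling `greedy_level_search(evaluate_accuracy, num_layers=20, levels=[0, 1, 2, 3], baseline_acc=0.93)` would return a per-layer level map that keeps accuracy within `tolerance` of the baseline while pushing robust layers to more aggressive, higher-throughput approximation.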
Saved in:
Published in: | IEEE MICRO 2022-11, Vol.42 (6), p.17-24 |
---|---|
Main authors: | Parra, Cecilia De la; Soliman, Taha; Guntoro, Andre; Kumar, Akash; Wehn, Norbert |
Format: | Article |
Language: | eng |
Subjects: | Accelerators; Accuracy; Approximation; Artificial neural networks; Computational modeling; Computer memory; Deep learning; Hardware; Image classification; In-memory computing; Mathematical analysis; Neural networks; Optimization; Quantization (signal); Space exploration; Throughput |
Online access: | Order full text |
container_end_page | 24 |
---|---|
container_issue | 6 |
container_start_page | 17 |
container_title | IEEE MICRO |
container_volume | 42 |
creator | Parra, Cecilia De la; Soliman, Taha; Guntoro, Andre; Kumar, Akash; Wehn, Norbert |
description | Approximate computing and mixed-signal in-memory accelerators are promising paradigms to significantly reduce computational requirements of deep neural network (DNN) inference without accuracy loss. In this work, we present a novel in-memory design for layerwise approximate computation at different approximation levels. A sensitivity-based high-dimensional search is performed to explore the optimal approximation level for each DNN layer. Our new methodology offers high flexibility and optimal tradeoff between accuracy and throughput, which we demonstrate by an extensive evaluation on various DNN benchmarks for medium- and large-scale image classification with CIFAR10, CIFAR100, and ImageNet. With our novel approach, we reach an average of 5× and up to 8× speedup without accuracy loss. |
doi_str_mv | 10.1109/MM.2022.3196865 |
format | Article |
identifier | ISSN: 0272-1732 |
ispartof | IEEE MICRO, 2022-11, Vol.42 (6), p.17-24 |
issn | 0272-1732; 1937-4143 |
language | eng |
recordid | cdi_ieee_primary_9851664 |
source | IEEE Electronic Library (IEL) |
subjects | Accelerators; Accuracy; Approximation; Artificial neural networks; Computational modeling; Computer memory; Deep learning; Hardware; Image classification; In-memory computing; Mathematical analysis; Neural networks; Optimization; Quantization (signal); Space exploration; Throughput |
title | Increasing Throughput of In-Memory DNN Accelerators by Flexible Layerwise DNN Approximation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T13%3A45%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Increasing%20Throughput%20of%20In-Memory%20DNN%20Accelerators%20by%20Flexible%20Layerwise%20DNN%20Approximation&rft.jtitle=IEEE%20MICRO&rft.au=Parra,%20Cecilia%20De%20la&rft.date=2022-11-01&rft.volume=42&rft.issue=6&rft.spage=17&rft.epage=24&rft.pages=17-24&rft.issn=0272-1732&rft.eissn=1937-4143&rft.coden=IEMIDZ&rft_id=info:doi/10.1109/MM.2022.3196865&rft_dat=%3Cproquest_RIE%3E2731238657%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2731238657&rft_id=info:pmid/&rft_ieee_id=9851664&rfr_iscdi=true |