Increasing Throughput of In-Memory DNN Accelerators by Flexible Layerwise DNN Approximation
Approximate computing and mixed-signal in-memory accelerators are promising paradigms to significantly reduce computational requirements of deep neural network (DNN) inference without accuracy loss. In this work, we present a novel in-memory design for layerwise approximate computation at different approximation levels. A sensitivity-based high-dimensional search is performed to explore the optimal approximation level for each DNN layer. Our new methodology offers high flexibility and an optimal tradeoff between accuracy and throughput, which we demonstrate by an extensive evaluation on various DNN benchmarks for medium- and large-scale image classification with CIFAR10, CIFAR100, and ImageNet. With our novel approach, we reach an average of 5× and up to 8× speedup without accuracy loss.
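The abstract only sketches the search procedure at a high level. As a rough illustration of what a sensitivity-guided, layerwise selection of approximation levels could look like, the following Python sketch greedily assigns the most aggressive level each layer tolerates. The function names, the level encoding, and the greedy ordering are assumptions made here for illustration; they are not taken from the paper.

```python
# Illustrative sketch (not the authors' published algorithm): pick an
# approximation level per layer, guided by each layer's measured sensitivity.
# `evaluate_accuracy(config)` is a hypothetical callback that runs inference
# with the given per-layer approximation levels (layers missing from `config`
# run exactly) and returns validation accuracy.

def layer_sensitivities(evaluate_accuracy, num_layers, levels, baseline_acc):
    """Accuracy drop when a single layer runs at the most aggressive level."""
    drops = {}
    for layer in range(num_layers):
        acc = evaluate_accuracy({layer: levels[-1]})  # approximate only this layer
        drops[layer] = baseline_acc - acc
    return drops

def greedy_level_search(evaluate_accuracy, num_layers, levels, baseline_acc,
                        tolerance=0.0):
    """Assign levels layer by layer, least sensitive (most robust) layers first."""
    drops = layer_sensitivities(evaluate_accuracy, num_layers, levels, baseline_acc)
    config = {layer: levels[0] for layer in range(num_layers)}  # levels[0] = exact
    for layer in sorted(range(num_layers), key=drops.get):
        for level in reversed(levels):  # try the most aggressive level first
            trial = {**config, layer: level}
            if baseline_acc - evaluate_accuracy(trial) <= tolerance:
                config = trial
                break
    return config
```

Under these assumptions, calling `greedy_level_search(evaluate_accuracy, num_layers=20, levels=[0, 1, 2, 3], baseline_acc=0.93)` would return a per-layer level map that keeps accuracy within `tolerance` of the baseline while pushing robust layers to more aggressive, higher-throughput approximation.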
Saved in:
Published in: | IEEE MICRO 2022-11, Vol.42 (6), p.17-24 |
---|---|
Main authors: | Parra, Cecilia De la; Soliman, Taha; Guntoro, Andre; Kumar, Akash; Wehn, Norbert |
Format: | Article |
Language: | eng |
Subjects: | Accelerators; Accuracy; Approximation; Artificial neural networks; Computational modeling; Computer memory; Deep learning; Hardware; Image classification; In-memory computing; Mathematical analysis; Neural networks; Optimization; Quantization (signal); Space exploration; Throughput |
Online access: | Order full text |
container_end_page | 24 |
---|---|
container_issue | 6 |
container_start_page | 17 |
container_title | IEEE MICRO |
container_volume | 42 |
creator | Parra, Cecilia De la; Soliman, Taha; Guntoro, Andre; Kumar, Akash; Wehn, Norbert |
description | Approximate computing and mixed-signal in-memory accelerators are promising paradigms to significantly reduce computational requirements of deep neural network (DNN) inference without accuracy loss. In this work, we present a novel in-memory design for layerwise approximate computation at different approximation levels. A sensitivity-based high-dimensional search is performed to explore the optimal approximation level for each DNN layer. Our new methodology offers high flexibility and optimal tradeoff between accuracy and throughput, which we demonstrate by an extensive evaluation on various DNN benchmarks for medium- and large-scale image classification with CIFAR10, CIFAR100, and ImageNet. With our novel approach, we reach an average of 5× and up to 8× speedup without accuracy loss. |
doi_str_mv | 10.1109/MM.2022.3196865 |
format | Article |
identifier | ISSN: 0272-1732 |
ispartof | IEEE MICRO, 2022-11, Vol.42 (6), p.17-24 |
issn | 0272-1732; 1937-4143 |
language | eng |
recordid | cdi_ieee_primary_9851664 |
source | IEEE Electronic Library (IEL) |
subjects | Accelerators; Accuracy; Approximation; Artificial neural networks; Computational modeling; Computer memory; Deep learning; Hardware; Image classification; In-memory computing; Mathematical analysis; Neural networks; Optimization; Quantization (signal); Space exploration; Throughput |
title | Increasing Throughput of In-Memory DNN Accelerators by Flexible Layerwise DNN Approximation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T13%3A45%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Increasing%20Throughput%20of%20In-Memory%20DNN%20Accelerators%20by%20Flexible%20Layerwise%20DNN%20Approximation&rft.jtitle=IEEE%20MICRO&rft.au=Parra,%20Cecilia%20De%20la&rft.date=2022-11-01&rft.volume=42&rft.issue=6&rft.spage=17&rft.epage=24&rft.pages=17-24&rft.issn=0272-1732&rft.eissn=1937-4143&rft.coden=IEMIDZ&rft_id=info:doi/10.1109/MM.2022.3196865&rft_dat=%3Cproquest_RIE%3E2731238657%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2731238657&rft_id=info:pmid/&rft_ieee_id=9851664&rfr_iscdi=true |