CGPA: Coarse-Grained Pruning of Activations for Energy-Efficient RNN Inference
Recurrent neural networks (RNNs) perform element-wise multiplications across the activations of gates. We show that a significant percentage of activations are saturated and propose coarse-grained pruning of activations (CGPA) to avoid the computation of entire neurons, based on the activation values of the gates. We show that CGPA can be easily implemented on top of a TPU-like architecture with negligible area overhead, resulting in 12% speedup and 12% energy savings on average for a set of widely used RNNs.
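The abstract above describes skipping the computation of entire neurons whose gates are saturated. A minimal numpy sketch of that idea for one LSTM step follows; the function name `lstm_step_cgpa`, the gate layout, and the saturation threshold `tau` are illustrative assumptions, not details from the paper, and a software mask like this only marks the work a TPU-like accelerator would actually skip.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step_cgpa(x, h, c, W, U, b, tau=0.99):
    # Pre-activations for the four LSTM gates, stacked as
    # [input, forget, candidate, output] along the first axis.
    z = W @ x + U @ h + b
    i, f, g, o = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    # CGPA-style mask: a neuron whose input gate saturates near 0 and whose
    # forget gate saturates near 1 leaves its cell state essentially
    # unchanged, so its candidate computation can be skipped entirely.
    skip = (i < 1.0 - tau) & (f > tau)
    g_act = np.where(skip, 0.0, np.tanh(g))
    c_new = f * c + i * g_act
    h_new = o * np.tanh(c_new)
    return h_new, c_new, skip
```

In hardware, the `skip` mask would gate off the candidate multiply-accumulate for those neurons; here it only shows which neurons qualify.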
Saved in:
Published in: | IEEE MICRO 2019-09, Vol.39 (5), p.36-45 |
---|---|
Main authors: | Riera, Marc ; Arnau, Jose-Maria ; Gonzalez, Antonio |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 45 |
---|---|
container_issue | 5 |
container_start_page | 36 |
container_title | IEEE MICRO |
container_volume | 39 |
creator | Riera, Marc ; Arnau, Jose-Maria ; Gonzalez, Antonio |
description | Recurrent neural networks (RNNs) perform element-wise multiplications across the activations of gates. We show that a significant percentage of activations are saturated and propose coarse-grained pruning of activations (CGPA) to avoid the computation of entire neurons, based on the activation values of the gates. We show that CGPA can be easily implemented on top of a TPU-like architecture with negligible area overhead, resulting in 12% speedup and 12% energy savings on average for a set of widely used RNNs. |
doi_str_mv | 10.1109/MM.2019.2929742 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0272-1732 ; EISSN: 1937-4143 |
ispartof | IEEE MICRO, 2019-09, Vol.39 (5), p.36-45 |
issn | 0272-1732 ; 1937-4143 |
language | eng |
recordid | cdi_csuc_recercat_oai_recercat_cat_2072_364080 |
source | IEEE Electronic Library (IEL) |
subjects | Accelerators ; Aprenentatge automàtic ; Deep learning ; Energy efficiency ; Histograms ; Informàtica ; Intel·ligència artificial ; Logic gates ; Low Energy ; Machine learning ; Mathematical model ; Neurons ; Pruning ; Recurrent neural networks ; RNN ; Àrees temàtiques de la UPC |
title | CGPA: Coarse-Grained Pruning of Activations for Energy-Efficient RNN Inference |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-18T21%3A49%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=CGPA:%20Coarse-Grained%20Pruning%20of%20Activations%20for%20Energy-Efficient%20RNN%20Inference&rft.jtitle=IEEE%20MICRO&rft.au=Riera,%20Marc&rft.date=2019-09-01&rft.volume=39&rft.issue=5&rft.spage=36&rft.epage=45&rft.pages=36-45&rft.issn=0272-1732&rft.eissn=1937-4143&rft.coden=IEMIDZ&rft_id=info:doi/10.1109/MM.2019.2929742&rft_dat=%3Cproquest_RIE%3E2289260292%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2289260292&rft_id=info:pmid/&rft_ieee_id=8771118&rfr_iscdi=true |