Neuron-Level Fuzzy Memoization in RNNs

Recurrent Neural Networks (RNNs) are a key technology for applications such as automatic speech recognition or machine translation. Unlike conventional feed-forward DNNs, RNNs remember past information to improve the accuracy of future predictions and, therefore, they are very effective for sequence...

Detailed Description

Saved in:
Bibliographic Details
Main Authors: Silfa, Franyell; Dot, Gem; Arnau, Jose-Maria; Gonzàlez, Antonio
Format: Conference Proceeding
Language: eng
Subjects:
Online Access: Order full text
container_end_page 793
container_issue
container_start_page 782
container_title
container_volume
creator Silfa, Franyell
Dot, Gem
Arnau, Jose-Maria
Gonzàlez, Antonio
description Recurrent Neural Networks (RNNs) are a key technology for applications such as automatic speech recognition or machine translation. Unlike conventional feed-forward DNNs, RNNs remember past information to improve the accuracy of future predictions and, therefore, they are very effective for sequence processing problems. For each application run, each recurrent layer is executed many times for processing a potentially large sequence of inputs (words, images, audio frames, etc.). In this paper, we make the observation that the output of a neuron exhibits small changes in consecutive invocations. We exploit this property to build a neuron-level fuzzy memoization scheme, which dynamically caches the output of each neuron and reuses it whenever it is predicted that the current output will be similar to a previously computed result, avoiding in this way the output computations. The main challenge in this scheme is determining whether the new neuron's output for the current input in the sequence will be similar to a recently computed result. To this end, we extend the recurrent layer with a much simpler Bitwise Neural Network (BNN), and show that the BNN and RNN outputs are highly correlated: if two BNN outputs are very similar, the corresponding outputs in the original RNN layer are likely to exhibit negligible changes. The BNN provides a low-cost and effective mechanism for deciding when fuzzy memoization can be applied with a small impact on accuracy. We evaluate our memoization scheme on top of a state-of-the-art accelerator for RNNs, for a variety of different neural networks from multiple application domains. We show that our technique avoids more than 24.2% of computations, resulting in 18.5% energy savings and 1.35x speedup on average.
doi_str_mv 10.1145/3352460.3358309
format Conference Proceeding
fulltext fulltext_linktorsrc
identifier ISBN: 9781450369381
ispartof Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019, p.782-793
issn
language eng
recordid cdi_csuc_recercat_oai_recercat_cat_2072_370097
source Recercat
subjects Automatic speech recognition
Binary networks
Computer systems organization -- Architectures -- Other architectures -- Neural networks
Computing methodologies -- Machine learning
Enginyeria de la telecomunicació
Informàtica
Long short term memory
Machine learning
Memoization
Neural networks (Computer science)
Processament de la parla i del senyal acústic
Processament del senyal
Reconeixement automàtic de la parla
Recurrent neural networks
Xarxes neuronals (Informàtica)
Àrees temàtiques de la UPC
title Neuron-Level Fuzzy Memoization in RNNs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T01%3A31%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-csuc_XX2&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Neuron-Level%20Fuzzy%20Memoization%20in%20RNNs&rft.btitle=Proceedings%20of%20the%2052nd%20Annual%20IEEE/ACM%20International%20Symposium%20on%20Microarchitecture&rft.au=Silfa,%20Franyell&rft.date=2019-01-01&rft.spage=782&rft.epage=793&rft.pages=782-793&rft.isbn=9781450369381&rft.isbn_list=1450369383&rft_id=info:doi/10.1145/3352460.3358309&rft_dat=%3Ccsuc_XX2%3Eoai_recercat_cat_2072_370097%3C/csuc_XX2%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
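
To make the scheme summarized in the description above concrete, the sketch below shows one way a neuron-level fuzzy memoization cache could work: each neuron keeps its last full-precision output together with the output of a cheap binarized (BNN) proxy, and reuses the cached result whenever the new BNN output is close to the cached one. This is a minimal illustrative sketch; the class name, the sign-based binarization, and the similarity threshold are assumptions, not the authors' actual accelerator implementation.

```python
import numpy as np

class FuzzyMemoNeuron:
    """Minimal sketch of neuron-level fuzzy memoization (assumed details,
    not the paper's implementation). A binarized dot product serves as the
    low-cost BNN proxy that predicts whether the full-precision output
    would change noticeably."""

    def __init__(self, weights, threshold=2.0):
        self.weights = weights                # full-precision weights
        self.bin_weights = np.sign(weights)   # +1/-1 weights for the BNN proxy
        self.threshold = threshold            # max BNN-output difference allowing reuse (assumed)
        self.cached_bnn = None                # BNN output of the memoized result
        self.cached_out = None                # memoized full-precision output

    def forward(self, x):
        # Cheap BNN proxy: dot product of sign-binarized input and weights.
        bnn_out = float(np.dot(np.sign(x), self.bin_weights))
        if (self.cached_bnn is not None
                and abs(bnn_out - self.cached_bnn) <= self.threshold):
            # BNN outputs are very similar -> reuse the cached output and
            # skip the expensive full-precision computation.
            return self.cached_out
        # Otherwise compute the full-precision output and refresh the cache.
        out = float(np.tanh(np.dot(x, self.weights)))
        self.cached_bnn = bnn_out
        self.cached_out = out
        return out

# Illustrative use: feed a slowly varying input sequence through one neuron,
# mimicking consecutive recurrent-layer invocations with similar inputs.
rng = np.random.default_rng(0)
neuron = FuzzyMemoNeuron(weights=rng.standard_normal(64))
base = rng.standard_normal(64)
outputs = [neuron.forward(base + 0.01 * rng.standard_normal(64)) for _ in range(10)]
```

In the paper the decision is made per neuron inside an RNN accelerator; the threshold governs the trade-off between the fraction of computations skipped and the (small) accuracy impact reported in the abstract.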