A Novel, Efficient Implementation of a Local Binary Convolutional Neural Network

In order to reduce the computational complexity of convolutional neural networks (CNNs), the local binary convolutional neural network (LBCNN) has been proposed. In the LBCNN, a convolutional layer is divided into two sublayers: Sublayer 1 is a sparse ternary-weighted convolutional layer, and Sublayer 2 is a 1×1 convolutional layer. With these two sublayers, the LBCNN has lower computational complexity and uses less memory than a conventional CNN. In this brief, we propose a platform that includes a weight preprocessor and a layer accelerator for the LBCNN. The weight preprocessor takes advantage of the sparsity in the LBCNN and encodes the weights offline. The layer accelerator uses the encoded data to reduce the computational complexity and the number of memory accesses per inference. Compared to the state-of-the-art design, the experimental results show that the number of clock cycles is reduced by 76.32% and memory usage by 39.41%. The synthesized results show that the clock period is reduced by 4.76%, the cell area by 46.48%, and the power consumption by 40.87%. The inference accuracy is the same as that of the state-of-the-art design.
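
To make the two-sublayer structure described in the abstract concrete, the following is a minimal sketch of one LBCNN layer in PyTorch. The class name LBCLayer, the sparsity level, the ReLU nonlinearity between the sublayers, and the random ternary initialization are illustrative assumptions; they are not taken from the brief, which concerns the hardware weight preprocessor and layer accelerator rather than the network definition.

```python
# Minimal LBCNN layer sketch (assumed PyTorch formulation, not the brief's design).
import torch
import torch.nn as nn


class LBCLayer(nn.Module):
    """Sublayer 1: fixed sparse ternary conv; Sublayer 2: learnable 1x1 conv."""

    def __init__(self, in_ch, mid_ch, out_ch, kernel_size=3, sparsity=0.1):
        super().__init__()
        # Sublayer 1: convolution whose weights are frozen ternary values
        # {-1, 0, +1}; only a small fraction of the entries are non-zero.
        self.ternary_conv = nn.Conv2d(in_ch, mid_ch, kernel_size,
                                      padding=kernel_size // 2, bias=False)
        with torch.no_grad():
            w = torch.zeros_like(self.ternary_conv.weight)
            active = torch.rand_like(w) < sparsity                  # sparse support
            signs = (torch.randint(0, 2, w.shape) * 2 - 1).to(w.dtype)
            w[active] = signs[active]                               # +1 or -1 where active
            self.ternary_conv.weight.copy_(w)
        self.ternary_conv.weight.requires_grad_(False)              # never trained

        # Sublayer 2: learnable 1x1 convolution that linearly recombines the
        # ternary feature maps; this is where all training happens.
        self.pointwise = nn.Conv2d(mid_ch, out_ch, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.pointwise(self.act(self.ternary_conv(x)))


if __name__ == "__main__":
    layer = LBCLayer(in_ch=3, mid_ch=64, out_ch=16)
    y = layer(torch.randn(1, 3, 32, 32))
    print(y.shape)  # torch.Size([1, 16, 32, 32])
```

Because the ternary weights of Sublayer 1 are fixed and mostly zero, only the 1×1 sublayer carries trainable parameters; this sparsity is the property that the proposed weight preprocessor exploits when it encodes the weights offline.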

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems II: Express Briefs, 2021-04, Vol. 68 (4), p. 1413-1417
Main authors: Lin, Ing-Chao; Tang, Chi-Huan; Ni, Chi-Ting; Hu, Xing; Shen, Yu-Tong; Chen, Pei-Yin; Xie, Yuan
Format: Article
Language: English
Online access: Order full text
doi 10.1109/TCSII.2020.3036012
issn 1549-7747
eissn 1558-3791
recordid cdi_ieee_primary_9249011
source IEEE Electronic Library (IEL)
subjects Artificial neural networks
Clocks
Complexity
Computational complexity
Convolutional neural networks
Convolutional neural networks (CNNs)
Hardware
Inference
local binary CNN (LBCNN)
Memory management
Neural networks
Power consumption
Quantization (signal)
VLSI
Weight
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T22%3A31%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Novel,%20Efficient%20Implementation%20of%20a%20Local%20Binary%20Convolutional%20Neural%20Network&rft.jtitle=IEEE%20transactions%20on%20circuits%20and%20systems.%20II,%20Express%20briefs&rft.au=Lin,%20Ing-Chao&rft.date=2021-04-01&rft.volume=68&rft.issue=4&rft.spage=1413&rft.epage=1417&rft.pages=1413-1417&rft.issn=1549-7747&rft.eissn=1558-3791&rft.coden=ICSPE5&rft_id=info:doi/10.1109/TCSII.2020.3036012&rft_dat=%3Cproquest_RIE%3E2506596059%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2506596059&rft_id=info:pmid/&rft_ieee_id=9249011&rfr_iscdi=true