RNA: An Accurate Residual Network Accelerator for Quantized and Reconstructed Deep Neural Networks

With the continuous refinement of Deep Neural Networks (DNNs), a series of deep and complex networks such as Residual Networks (ResNets) show impressive prediction accuracy in image classification tasks. Unfortunately, the structural complexity and computational cost of residual networks make hardware implementation difficult. In this paper, we present the quantized and reconstructed deep neural network (QR-DNN) technique, which first inserts batch normalization (BN) layers in the network during training, and later removes them to facilitate efficient hardware implementation. Moreover, an accurate and efficient residual network accelerator (RNA) is presented based on QR-DNN, with batch-normalization-free structures and weights represented in a logarithmic number system. RNA employs a systolic array architecture to perform shift-and-accumulate operations instead of multiplication operations. QR-DNN is shown to achieve a 1 to 2% improvement in accuracy over existing techniques, and RNA likewise improves on the best previous fixed-point accelerators. An FPGA implementation on a Xilinx Zynq XC7Z045 device achieves 804.03 GOPS, 104.15 FPS, and 91.41% top-5 accuracy on the ResNet-50 benchmark; state-of-the-art results are also reported for AlexNet and VGG.
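The abstract names two concrete transformations, and a short sketch helps make them precise. The NumPy code below is an illustration, not the authors' implementation: `fold_bn` absorbs a trained batch-normalization layer into the preceding convolution's weights and bias, which is the standard way a BN layer can be removed after training, and `log2_quantize` plus `shift_accumulate` show how weights restricted to signed powers of two let every multiplication in a dot product become a bit shift, the idea behind RNA's shift-and-accumulate datapath. All function names, tensor shapes, and the 4-bit exponent range are illustrative assumptions, not values from the paper.

```python
# Hedged sketch, not the paper's code: (1) fold a trained BN layer into the
# preceding convolution so the BN layer can be dropped at inference, and
# (2) quantize weights to signed powers of two so multiplies become shifts.
import numpy as np

def fold_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN(y) = gamma*(y - mean)/sqrt(var + eps) + beta into conv
    weights W of shape (out_ch, in_ch, kh, kw) and bias b of shape (out_ch,)."""
    scale = gamma / np.sqrt(var + eps)          # per-output-channel scale
    W_folded = W * scale[:, None, None, None]   # absorb the scale into weights
    b_folded = (b - mean) * scale + beta        # absorb mean/shift into bias
    return W_folded, b_folded

def log2_quantize(w, min_exp=-7, max_exp=0):
    """Approximate each weight as sign * 2^e with an integer exponent e
    (the exponent range here is an assumed 4-bit choice, for illustration)."""
    sign = np.sign(w)
    e = np.clip(np.round(np.log2(np.abs(w) + 1e-12)), min_exp, max_exp)
    return sign.astype(np.int8), e.astype(np.int8)

def shift_accumulate(x_int, sign, e, min_exp=-7):
    """Dot product against log-domain weights: each 'multiply' is a left
    shift by (e - min_exp); the common factor 2^min_exp is applied once."""
    acc = 0
    for xi, si, ei in zip(x_int, sign, e):
        acc += int(si) * (int(xi) << int(ei - min_exp))  # shift, no multiply
    return acc * (2.0 ** min_exp)                        # rescale once

# Tiny usage check: shift-and-accumulate matches a float dot product
# against the dequantized weights sign * 2^e.
w = np.array([0.30, -0.12, 0.05])
s, e = log2_quantize(w)
x = np.array([3, 1, 4])
assert np.isclose(shift_accumulate(x, s, e), np.dot(x, s * 2.0 ** e))
```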

Bibliographic Details
Published in: IEICE Transactions on Information and Systems, 2019/05/01, Vol. E102.D(5), pp. 1037-1045
Main Authors: LUO, Cheng; CAO, Wei; WANG, Lingli; LEONG, Philip H. W.
Format: Article
Language: English
Subjects: Accelerators; Accuracy; Artificial neural networks; batch-normalization layers; deep learning; FPGA; Hardware; Image classification; Inserts; Multiplication; Neural networks; residual networks; software-hardware co-design; Task complexity
Online Access: Full text
DOI: 10.1587/transinf.2018RCP0008
Publisher: The Institute of Electronics, Information and Communication Engineers (Tokyo)
ISSN: 0916-8532
EISSN: 1745-1361
Source: J-STAGE (Japan Science & Technology Information Aggregator, Electronic) Freely Available Titles - Japanese; EZB-FREE-00999 freely available EZB journals