A CNN Inference Accelerator on FPGA With Compression and Layer-Chaining Techniques for Style Transfer Applications

Recently, convolutional neural networks (CNNs) have been actively applied to computer vision applications such as style transfer, which changes the style of a content image into that of a style image. As style transfer CNNs are based on an encoder-decoder network architecture and must deal with the high-resolution images that have become mainstream these days, the computational complexity and the feature map size are very large, preventing the CNNs from being implemented on an FPGA. This paper proposes a CNN inference accelerator for style transfer applications, which employs network compression and layer-chaining techniques. The network compression technique makes a style transfer CNN have low computational complexity and a small number of parameters, and an efficient data compression method is proposed to reduce the feature map size. In addition, the layer-chaining technique is proposed to reduce the off-chip memory traffic and thus increase the throughput at the cost of a small amount of hardware resources. In the proposed hardware architecture, a neural processing unit is designed by taking into account the proposed data compression and layer-chaining techniques. A prototype accelerator implemented on an FPGA board achieves a throughput comparable to that of state-of-the-art accelerators developed for encoder-decoder CNNs.
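The title above names a compression technique for reducing feature map size. The record does not detail the actual scheme the authors use, but ReLU feature maps are typically sparse, so zero run-length encoding is a common illustrative approach; the function names and pair format below are assumptions for the sketch, not the paper's method:

```python
import numpy as np

def zrle_encode(fmap):
    """Zero run-length encoding: store (zero_run_length, value) pairs.
    Exploits the sparsity that ReLU introduces into CNN feature maps."""
    pairs, run = [], 0
    for v in fmap.ravel():
        if v == 0:
            run += 1          # extend the current run of zeros
        else:
            pairs.append((run, float(v)))
            run = 0
    return pairs, run         # trailing zeros kept only as a count

def zrle_decode(pairs, trailing, shape):
    """Inverse of zrle_encode: expand runs back into a dense array."""
    out = []
    for run, v in pairs:
        out.extend([0.0] * run)
        out.append(v)
    out.extend([0.0] * trailing)
    return np.array(out).reshape(shape)

# A toy feature map after ReLU: mostly zeros, so the encoding is compact.
fmap = np.maximum(np.array([[-1.0, 2.0, 0.0, 0.0],
                            [0.0, 0.0, 3.0, -4.0]]), 0)
pairs, trailing = zrle_encode(fmap)
assert np.array_equal(zrle_decode(pairs, trailing, fmap.shape), fmap)
```

Because only the nonzero values and run counts cross the memory interface, a scheme of this kind shrinks the feature map traffic roughly in proportion to the map's sparsity.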

Full description

Saved in:
Bibliographic Details
Published in: IEEE transactions on circuits and systems. I, Regular papers, 2023-04, Vol.70 (4), p.1-14
Main authors: Kim, Suchang, Jang, Boseon, Lee, Jaeyoung, Bae, Hyungjoon, Jang, Hyejung, Park, In-Cheol
Format: Article
Language: English
Subjects:
Online access: Order full text
container_end_page 14
container_issue 4
container_start_page 1
container_title IEEE transactions on circuits and systems. I, Regular papers
container_volume 70
creator Kim, Suchang
Jang, Boseon
Lee, Jaeyoung
Bae, Hyungjoon
Jang, Hyejung
Park, In-Cheol
description Recently, convolutional neural networks (CNNs) have been actively applied to computer vision applications such as style transfer, which changes the style of a content image into that of a style image. As style transfer CNNs are based on an encoder-decoder network architecture and must deal with the high-resolution images that have become mainstream these days, the computational complexity and the feature map size are very large, preventing the CNNs from being implemented on an FPGA. This paper proposes a CNN inference accelerator for style transfer applications, which employs network compression and layer-chaining techniques. The network compression technique makes a style transfer CNN have low computational complexity and a small number of parameters, and an efficient data compression method is proposed to reduce the feature map size. In addition, the layer-chaining technique is proposed to reduce the off-chip memory traffic and thus increase the throughput at the cost of a small amount of hardware resources. In the proposed hardware architecture, a neural processing unit is designed by taking into account the proposed data compression and layer-chaining techniques. A prototype accelerator implemented on an FPGA board achieves a throughput comparable to that of state-of-the-art accelerators developed for encoder-decoder CNNs.
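The layer-chaining idea in the abstract, keeping intermediate feature maps on chip and writing only the final result to external memory, can be sketched as follows. This is a minimal illustration: the paper's actual tiling, dataflow, and buffer sizes are not given in this record, and all names and shapes here are hypothetical.

```python
import numpy as np

def conv3x3(x, w):
    """Naive single-channel 3x3 'same' convolution (illustrative only)."""
    h, width = x.shape
    xp = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(width):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * w)
    return out

def run_unchained(x, weights):
    """Baseline: every layer's output makes a round trip to off-chip memory."""
    traffic = 0
    for w in weights:
        x = np.maximum(conv3x3(x, w), 0)  # conv + ReLU
        traffic += x.size                 # write the full feature map off-chip
    return x, traffic

def run_chained(x, weights):
    """Layer chaining: intermediates stay on-chip; only the last map is written."""
    for w in weights:
        x = np.maximum(conv3x3(x, w), 0)
    return x, x.size                      # only the final map leaves the chip

weights = [np.full((3, 3), 0.1) for _ in range(3)]
x = np.ones((8, 8))
y1, t1 = run_unchained(x, weights)
y2, t2 = run_chained(x, weights)
assert np.allclose(y1, y2)   # identical results either way
assert t2 == t1 / 3          # 3x fewer off-chip writes for 3 chained layers
```

In real hardware the chained layers operate on tiles with halo regions rather than whole images, which is where the "small hardware resources" cost from the abstract comes in: extra line buffers hold the intermediate rows between chained layers.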
doi_str_mv 10.1109/TCSI.2023.3234640
format Article
coden ITCSCH
eissn 1558-0806
publisher New York: IEEE
fulltext fulltext_linktorsrc
identifier ISSN: 1549-8328
ispartof IEEE transactions on circuits and systems. I, Regular papers, 2023-04, Vol.70 (4), p.1-14
issn 1549-8328
1558-0806
language eng
recordid cdi_crossref_primary_10_1109_TCSI_2023_3234640
source IEEE Electronic Library (IEL)
subjects Artificial neural networks
Chaining
Chips (memory devices)
Coders
Complexity
compression
Computer architecture
Computer vision
Convolution
Convolutional neural network (CNN)
Convolutional neural networks
Data compression
Encoders-Decoders
Feature maps
field programmable gate array (FPGA)
Field programmable gate arrays
Hardware
Image resolution
Inference
Labeling
neural processing unit (NPU)
style transfer application
Superresolution
Throughput
title A CNN Inference Accelerator on FPGA With Compression and Layer-Chaining Techniques for Style Transfer Applications