A CNN Inference Accelerator on FPGA With Compression and Layer-Chaining Techniques for Style Transfer Applications

Recently, convolutional neural networks (CNNs) have been actively applied to computer vision applications such as style transfer, which changes the style of a content image into that of a style image. As style transfer CNNs are based on an encoder-decoder network architecture and must deal with the high-resolution images that have become mainstream these days, the computational complexity and the feature map size are very large, preventing the CNNs from being implemented on an FPGA. This paper proposes a CNN inference accelerator for style transfer applications, which employs network compression and layer-chaining techniques. The network compression technique makes a style transfer CNN have low computational complexity and a small number of parameters, and an efficient data compression method is proposed to reduce the feature map size. In addition, the layer-chaining technique is proposed to reduce the off-chip memory traffic and thus increase the throughput at the cost of a small amount of hardware resources. In the proposed hardware architecture, a neural processing unit is designed by taking into account the proposed data compression and layer-chaining techniques. A prototype accelerator implemented on an FPGA board achieves a throughput comparable to that of state-of-the-art accelerators developed for encoder-decoder CNNs.
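The title above names a compression technique for reducing feature map size. The record does not detail the actual scheme the authors use, but ReLU feature maps are typically sparse, so zero run-length encoding is a common illustrative approach; the function names and pair format below are assumptions for the sketch, not the paper's method:

```python
import numpy as np

def zrle_encode(fmap):
    """Zero run-length encoding: store (zero_run_length, value) pairs.
    Exploits the sparsity that ReLU introduces into CNN feature maps."""
    pairs, run = [], 0
    for v in fmap.ravel():
        if v == 0:
            run += 1          # extend the current run of zeros
        else:
            pairs.append((run, float(v)))
            run = 0
    return pairs, run         # trailing zeros kept only as a count

def zrle_decode(pairs, trailing, shape):
    """Inverse of zrle_encode: expand runs back into a dense array."""
    out = []
    for run, v in pairs:
        out.extend([0.0] * run)
        out.append(v)
    out.extend([0.0] * trailing)
    return np.array(out).reshape(shape)

# A toy feature map after ReLU: mostly zeros, so the encoding is compact.
fmap = np.maximum(np.array([[-1.0, 2.0, 0.0, 0.0],
                            [0.0, 0.0, 3.0, -4.0]]), 0)
pairs, trailing = zrle_encode(fmap)
assert np.array_equal(zrle_decode(pairs, trailing, fmap.shape), fmap)
```

Because only the nonzero values and run counts cross the memory interface, a scheme of this kind shrinks the feature map traffic roughly in proportion to the map's sparsity.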

Full description

Saved in:
Bibliographic Details
Published in: IEEE transactions on circuits and systems. I, Regular papers, 2023-04, Vol.70 (4), p.1-14
Main authors: Kim, Suchang, Jang, Boseon, Lee, Jaeyoung, Bae, Hyungjoon, Jang, Hyejung, Park, In-Cheol
Format: Article
Language: English
Subjects:
Online access: Order full text
container_end_page 14
container_issue 4
container_start_page 1
container_title IEEE transactions on circuits and systems. I, Regular papers
container_volume 70
creator Kim, Suchang
Jang, Boseon
Lee, Jaeyoung
Bae, Hyungjoon
Jang, Hyejung
Park, In-Cheol
description Recently, convolutional neural networks (CNNs) have been actively applied to computer vision applications such as style transfer, which changes the style of a content image into that of a style image. As style transfer CNNs are based on an encoder-decoder network architecture and must deal with the high-resolution images that have become mainstream these days, the computational complexity and the feature map size are very large, preventing the CNNs from being implemented on an FPGA. This paper proposes a CNN inference accelerator for style transfer applications, which employs network compression and layer-chaining techniques. The network compression technique makes a style transfer CNN have low computational complexity and a small number of parameters, and an efficient data compression method is proposed to reduce the feature map size. In addition, the layer-chaining technique is proposed to reduce the off-chip memory traffic and thus increase the throughput at the cost of a small amount of hardware resources. In the proposed hardware architecture, a neural processing unit is designed by taking into account the proposed data compression and layer-chaining techniques. A prototype accelerator implemented on an FPGA board achieves a throughput comparable to that of state-of-the-art accelerators developed for encoder-decoder CNNs.
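The layer-chaining idea in the abstract, keeping intermediate feature maps on chip and writing only the final result to external memory, can be sketched as follows. This is a minimal illustration: the paper's actual tiling, dataflow, and buffer sizes are not given in this record, and all names and shapes here are hypothetical.

```python
import numpy as np

def conv3x3(x, w):
    """Naive single-channel 3x3 'same' convolution (illustrative only)."""
    h, width = x.shape
    xp = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(width):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * w)
    return out

def run_unchained(x, weights):
    """Baseline: every layer's output makes a round trip to off-chip memory."""
    traffic = 0
    for w in weights:
        x = np.maximum(conv3x3(x, w), 0)  # conv + ReLU
        traffic += x.size                 # write the full feature map off-chip
    return x, traffic

def run_chained(x, weights):
    """Layer chaining: intermediates stay on-chip; only the last map is written."""
    for w in weights:
        x = np.maximum(conv3x3(x, w), 0)
    return x, x.size                      # only the final map leaves the chip

weights = [np.full((3, 3), 0.1) for _ in range(3)]
x = np.ones((8, 8))
y1, t1 = run_unchained(x, weights)
y2, t2 = run_chained(x, weights)
assert np.allclose(y1, y2)   # identical results either way
assert t2 == t1 / 3          # 3x fewer off-chip writes for 3 chained layers
```

In real hardware the chained layers operate on tiles with halo regions rather than whole images, which is where the "small hardware resources" cost from the abstract comes in: extra line buffers hold the intermediate rows between chained layers.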
doi_str_mv 10.1109/TCSI.2023.3234640
format Article
coden ITCSCH
eissn 1558-0806
publisher New York: IEEE
fulltext fulltext_linktorsrc
identifier ISSN: 1549-8328
ispartof IEEE transactions on circuits and systems. I, Regular papers, 2023-04, Vol.70 (4), p.1-14
issn 1549-8328
1558-0806
language eng
recordid cdi_crossref_primary_10_1109_TCSI_2023_3234640
source IEEE Electronic Library (IEL)
subjects Artificial neural networks
Chaining
Chips (memory devices)
Coders
Complexity
compression
Computer architecture
Computer vision
Convolution
Convolutional neural network (CNN)
Convolutional neural networks
Data compression
Encoders-Decoders
Feature maps
field programmable gate array (FPGA)
Field programmable gate arrays
Hardware
Image resolution
Inference
Labeling
neural processing unit (NPU)
style transfer application
Superresolution
Throughput
title A CNN Inference Accelerator on FPGA With Compression and Layer-Chaining Techniques for Style Transfer Applications