A CNN Inference Accelerator on FPGA With Compression and Layer-Chaining Techniques for Style Transfer Applications
Convolutional neural networks (CNNs) have recently been applied to computer vision applications such as style transfer, which changes the style of a content image into that of a style image. Because style transfer CNNs are based on an encoder-decoder network architecture and must deal with the high-resolution images that have become mainstream, their computational complexity and feature map sizes are very large, preventing such CNNs from being implemented on an FPGA.
Saved in:
Published in: | IEEE transactions on circuits and systems. I, Regular papers, 2023-04, Vol.70 (4), p.1-14 |
---|---|
Main Authors: | Kim, Suchang; Jang, Boseon; Lee, Jaeyoung; Bae, Hyungjoon; Jang, Hyejung; Park, In-Cheol |
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
container_end_page | 14 |
---|---|
container_issue | 4 |
container_start_page | 1 |
container_title | IEEE transactions on circuits and systems. I, Regular papers |
container_volume | 70 |
creator | Kim, Suchang; Jang, Boseon; Lee, Jaeyoung; Bae, Hyungjoon; Jang, Hyejung; Park, In-Cheol |
description | Convolutional neural networks (CNNs) have recently been applied to computer vision applications such as style transfer, which changes the style of a content image into that of a style image. Because style transfer CNNs are based on an encoder-decoder network architecture and must deal with the high-resolution images that have become mainstream, their computational complexity and feature map sizes are very large, preventing such CNNs from being implemented on an FPGA. This paper proposes a CNN inference accelerator for style transfer applications that employs network compression and layer-chaining techniques. The network compression technique gives a style transfer CNN low computational complexity and a small number of parameters, and an efficient data compression method is proposed to reduce the feature map size. In addition, the layer-chaining technique is proposed to reduce off-chip memory traffic and thus increase throughput at the cost of a small amount of hardware resources. In the proposed hardware architecture, a neural processing unit is designed with the proposed data compression and layer-chaining techniques taken into account. A prototype accelerator implemented on an FPGA board achieves a throughput comparable to that of state-of-the-art accelerators developed for encoder-decoder CNNs. |
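The layer-chaining idea described in the abstract — evaluating consecutive layers tile by tile on chip so that the intermediate feature map is never written to and read back from off-chip memory — can be sketched conceptually as follows. This is a minimal single-channel NumPy model, not the paper's hardware design; the tile size, the traffic counter, and all function names are illustrative assumptions.

```python
import numpy as np

def conv3x3(x, w):
    """Valid (no-padding) 3x3 convolution on a 2-D single-channel feature map."""
    h, wd = x.shape
    out = np.zeros((h - 2, wd - 2))
    for i in range(h - 2):
        for j in range(wd - 2):
            out[i, j] = np.sum(x[i:i + 3, j:j + 3] * w)
    return out

def layer_by_layer(x, w1, w2):
    """Baseline: the whole intermediate map is spilled to off-chip memory."""
    f1 = conv3x3(x, w1)
    traffic = f1.size            # elements of the intermediate map written off-chip
    return conv3x3(f1, w2), traffic

def chained(x, w1, w2, tile_rows=4):
    """Layer-chaining: two layers run back-to-back per tile; the intermediate
    tile stays in an on-chip buffer, so intermediate off-chip traffic is zero."""
    h, wd = x.shape
    out = np.zeros((h - 4, wd - 4))  # two valid 3x3 layers shrink each side by 4
    traffic = 0
    for r in range(0, h - 4, tile_rows):
        rows = min(tile_rows, h - 4 - r)
        tile = x[r:r + rows + 4, :]        # input tile plus halo for both layers
        mid = conv3x3(tile, w1)            # intermediate tile, kept on chip
        out[r:r + rows, :] = conv3x3(mid, w2)
    return out, traffic

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))
w1, w2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

ref, t_ref = layer_by_layer(x, w1, w2)
fused, t_fused = chained(x, w1, w2)
assert np.allclose(ref, fused)   # identical result, no intermediate off-chip traffic
```

The sketch shows the trade-off the abstract names: chaining costs extra on-chip buffering (the halo rows recomputed per tile) in exchange for removing the intermediate feature-map traffic, which is how the accelerator gains throughput.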
doi_str_mv | 10.1109/TCSI.2023.3234640 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1549-8328 |
ispartof | IEEE transactions on circuits and systems. I, Regular papers, 2023-04, Vol.70 (4), p.1-14 |
issn | 1549-8328; 1558-0806 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TCSI_2023_3234640 |
source | IEEE Electronic Library (IEL) |
subjects | Artificial neural networks; Chaining; Chips (memory devices); Coders; Complexity; compression; Computer architecture; Computer vision; Convolution; Convolutional neural network (CNN); Convolutional neural networks; Data compression; Encoders-Decoders; Feature maps; field programmable gate array (FPGA); Field programmable gate arrays; Hardware; Image resolution; Inference; Labeling; neural processing unit (NPU); style transfer application; Superresolution; Throughput |
title | A CNN Inference Accelerator on FPGA With Compression and Layer-Chaining Techniques for Style Transfer Applications |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T15%3A15%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20CNN%20Inference%20Accelerator%20on%20FPGA%20With%20Compression%20and%20Layer-Chaining%20Techniques%20for%20Style%20Transfer%20Applications&rft.jtitle=IEEE%20transactions%20on%20circuits%20and%20systems.%20I,%20Regular%20papers&rft.au=Kim,%20Suchang&rft.date=2023-04-01&rft.volume=70&rft.issue=4&rft.spage=1&rft.epage=14&rft.pages=1-14&rft.issn=1549-8328&rft.eissn=1558-0806&rft.coden=ITCSCH&rft_id=info:doi/10.1109/TCSI.2023.3234640&rft_dat=%3Cproquest_RIE%3E2793209142%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2793209142&rft_id=info:pmid/&rft_ieee_id=10013941&rfr_iscdi=true |