An FPGA-Based Residual Recurrent Neural Network for Real-Time Video Super-Resolution

In this paper, we propose a hardware-efficient residual recurrent neural network for real-time video super-resolution (VSR) based on field programmable gate array (FPGA). Although recent learning-based VSR methods have achieved remarkable performance, the large computational complexity prohibits the...

Detailed description

Bibliographic details
Published in:	IEEE transactions on circuits and systems for video technology 2022-04, Vol.32 (4), p.1739-1750
Main authors:	Sun, Kaicong; Koch, Maurice; Wang, Zhe; Jovanovic, Slavisa; Rabah, Hassan; Simon, Sven
Format:	Article
Language:	eng
Subjects:	4K UHD; Complexity; Computer Science; Convolution; Field programmable gate arrays; FPGA; Hardware; hardware-efficient; Image quality; Image reconstruction; Image resolution; Neural networks; Real time; Real-time systems; Recurrent neural networks; residual recurrent neural network; Streaming media; Superresolution; UHDTV; Video super-resolution

container_end_page	1750
container_issue	4
container_start_page	1739
container_title	IEEE transactions on circuits and systems for video technology
container_volume	32
creator	Sun, Kaicong; Koch, Maurice; Wang, Zhe; Jovanovic, Slavisa; Rabah, Hassan; Simon, Sven
description	In this paper, we propose a hardware-efficient residual recurrent neural network for real-time video super-resolution (VSR) on a field-programmable gate array (FPGA). Although recent learning-based VSR methods have achieved remarkable performance, their high computational complexity prohibits deploying such sophisticated models on FPGAs for real-time applications. Limited by hardware resources, state-of-the-art FPGA-based VSR methods perform single-image super-resolution over the video sequence and suffer from temporal inconsistency. To exploit the inter-frame temporal correlation for real-time VSR on low-complexity hardware, we introduce a hardware-efficient recurrent neural network, ERVSR. Specifically, the proposed ERVSR leverages the input frame and the temporal information contained in the hidden state to reconstruct the high-resolution counterpart. To reduce the number of network parameters, the low-resolution input branch and the hidden state branch are convolved individually, and a channel modulation coefficient is proposed to explicitly guide the network in allocating the number of output feature channels to each branch. Additionally, to reduce memory consumption, we perform a dedicated lightweight compression of the hidden state by introducing a statistical normalization scheme followed by fixed-point quantization. We also adopt group convolution and depthwise separable convolution to further compact the network. We evaluated the proposed ERVSR on multiple public datasets from different aspects. Experimental results demonstrate that ERVSR outperforms the existing state-of-the-art FPGA-based VSR methods in both image quality and data throughput.
doi_str_mv	10.1109/TCSVT.2021.3080241
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2647425573</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9430531</ieee_id><sourcerecordid>2647425573</sourcerecordid><originalsourceid>FETCH-LOGICAL-c329t-3ae52086b20c9d197b751e9fe4b392e1341fc11bede8b034b386ba3cbc9dfef23</originalsourceid><addsrcrecordid>eNo9kE9PwkAQxTdGExH9Anpp4slDcWf_0O2xEgETgkYq1822ncZiYXHbavz2LpZ4mpc3vzeZPEKugY4AaHyfTlbrdMQogxGnijIBJ2QAUqqQMSpPvaYSQsVAnpOLptlQCkKJaEDSZBdMX2ZJ-GAaLIJXbKqiM7UXeecc7tpgiZ3zxhLbb-s-gtI6vzR1mFZbDNZVgTZYdXt0oc_aumsru7skZ6WpG7w6ziF5mz6mk3m4eJ49TZJFmHMWtyE3KBlV44zRPC4gjrJIAsYliozHDIELKHOADAtUGeXe9azheebpEkvGh-Suv_tuar131da4H21NpefJQh88KhRjfKy-wLO3Pbt39rPDptUb27mdf0-zsYgEkzLinmI9lTvbNA7L_7NA9aFp_de0PjStj0370E0fqhDxPxALTiUH_gvNNXiw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2647425573</pqid></control><display><type>article</type><title>An FPGA-Based Residual Recurrent Neural Network for Real-Time Video Super-Resolution</title><source>IEEE Electronic Library (IEL)</source><creator>Sun, Kaicong ; Koch, Maurice ; Wang, Zhe ; Jovanovic, Slavisa ; Rabah, Hassan ; Simon, Sven</creator><creatorcontrib>Sun, Kaicong ; Koch, Maurice ; Wang, Zhe ; Jovanovic, Slavisa ; Rabah, Hassan ; Simon, Sven</creatorcontrib><description>In this paper, we propose a hardware-efficient residual recurrent neural network for real-time video super-resolution (VSR) based on field programmable gate array (FPGA). Although recent learning-based VSR methods have achieved remarkable performance, the large computational complexity prohibits the deployment of the sophisticated VSR models on FPGA for real-time applications. Limited by the hardware resources, state-of-the-art FPGA-based VSR methods perform single-image super-resolution over the video sequence and suffer from temporal inconsistency. 
In order to exploit the inter-frame temporal correlation for real-time VSR on low-complexity hardware, we introduce a hardware-efficient recurrent neural network ERVSR. Specially, the proposed ERVSR leverages the input frame and the temporal information entailed in the hidden state to reconstruct the high-resolution counterpart. To reduce the network parameters, the low-resolution input branch and the hidden state branch are convolved individually and a channel modulation coefficient is proposed to explicitly guide the network to allocate the amount of output feature channels to each branch. Additionally, in order to reduce the memory consumption, we perform a dedicated lightweight compression of the hidden state by introducing a statistical normalization scheme followed by a fixed-point quantization. Besides, we adopt group convolution and depthwise separable convolution to further compact the network. We evaluated the proposed ERVSR on multiple public datasets from different aspects. Experimental results demonstrate that ERVSR performs better than the existing state-of-the-art FPGA-based VSR methods in both image quality and data throughput.</description><identifier>ISSN: 1051-8215</identifier><identifier>EISSN: 1558-2205</identifier><identifier>DOI: 10.1109/TCSVT.2021.3080241</identifier><identifier>CODEN: ITCTEM</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>4K UHD ; Complexity ; Computer Science ; Convolution ; Field programmable gate arrays ; FPGA ; Hardware ; hardware-efficient ; Image quality ; Image reconstruction ; Image resolution ; Neural networks ; Real time ; Real-time systems ; Recurrent neural networks ; residual recurrent neural network ; Streaming media ; Superresolution ; UHDTV ; Video super-resolution</subject><ispartof>IEEE transactions on circuits and systems for video technology, 2022-04, Vol.32 (4), p.1739-1750</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c329t-3ae52086b20c9d197b751e9fe4b392e1341fc11bede8b034b386ba3cbc9dfef23</citedby><cites>FETCH-LOGICAL-c329t-3ae52086b20c9d197b751e9fe4b392e1341fc11bede8b034b386ba3cbc9dfef23</cites><orcidid>0000-0003-3552-7139 ; 0000-0001-6459-7043 ; 0000-0002-9999-2542 ; 0000-0001-6334-3084</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9430531$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>230,314,780,784,796,885,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9430531$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://hal.science/hal-04822368$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Sun, Kaicong</creatorcontrib><creatorcontrib>Koch, Maurice</creatorcontrib><creatorcontrib>Wang, Zhe</creatorcontrib><creatorcontrib>Jovanovic, Slavisa</creatorcontrib><creatorcontrib>Rabah, Hassan</creatorcontrib><creatorcontrib>Simon, Sven</creatorcontrib><title>An FPGA-Based Residual Recurrent Neural Network for Real-Time Video Super-Resolution</title><title>IEEE transactions on circuits and systems for video technology</title><addtitle>TCSVT</addtitle><description>In this paper, we propose a hardware-efficient residual recurrent neural network for real-time video super-resolution (VSR) based on field programmable gate array (FPGA). Although recent learning-based VSR methods have achieved remarkable performance, the large computational complexity prohibits the deployment of the sophisticated VSR models on FPGA for real-time applications. 
Limited by the hardware resources, state-of-the-art FPGA-based VSR methods perform single-image super-resolution over the video sequence and suffer from temporal inconsistency. In order to exploit the inter-frame temporal correlation for real-time VSR on low-complexity hardware, we introduce a hardware-efficient recurrent neural network ERVSR. Specially, the proposed ERVSR leverages the input frame and the temporal information entailed in the hidden state to reconstruct the high-resolution counterpart. To reduce the network parameters, the low-resolution input branch and the hidden state branch are convolved individually and a channel modulation coefficient is proposed to explicitly guide the network to allocate the amount of output feature channels to each branch. Additionally, in order to reduce the memory consumption, we perform a dedicated lightweight compression of the hidden state by introducing a statistical normalization scheme followed by a fixed-point quantization. Besides, we adopt group convolution and depthwise separable convolution to further compact the network. We evaluated the proposed ERVSR on multiple public datasets from different aspects. 
Experimental results demonstrate that ERVSR performs better than the existing state-of-the-art FPGA-based VSR methods in both image quality and data throughput.</description><subject>4K UHD</subject><subject>Complexity</subject><subject>Computer Science</subject><subject>Convolution</subject><subject>Field programmable gate arrays</subject><subject>FPGA</subject><subject>Hardware</subject><subject>hardware-efficient</subject><subject>Image quality</subject><subject>Image reconstruction</subject><subject>Image resolution</subject><subject>Neural networks</subject><subject>Real time</subject><subject>Real-time systems</subject><subject>Recurrent neural networks</subject><subject>residual recurrent neural network</subject><subject>Streaming media</subject><subject>Superresolution</subject><subject>UHDTV</subject><subject>Video super-resolution</subject><issn>1051-8215</issn><issn>1558-2205</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE9PwkAQxTdGExH9Anpp4slDcWf_0O2xEgETgkYq1822ncZiYXHbavz2LpZ4mpc3vzeZPEKugY4AaHyfTlbrdMQogxGnijIBJ2QAUqqQMSpPvaYSQsVAnpOLptlQCkKJaEDSZBdMX2ZJ-GAaLIJXbKqiM7UXeecc7tpgiZ3zxhLbb-s-gtI6vzR1mFZbDNZVgTZYdXt0oc_aumsru7skZ6WpG7w6ziF5mz6mk3m4eJ49TZJFmHMWtyE3KBlV44zRPC4gjrJIAsYliozHDIELKHOADAtUGeXe9azheebpEkvGh-Suv_tuar131da4H21NpefJQh88KhRjfKy-wLO3Pbt39rPDptUb27mdf0-zsYgEkzLinmI9lTvbNA7L_7NA9aFp_de0PjStj0370E0fqhDxPxALTiUH_gvNNXiw</recordid><startdate>20220401</startdate><enddate>20220401</enddate><creator>Sun, Kaicong</creator><creator>Koch, Maurice</creator><creator>Wang, Zhe</creator><creator>Jovanovic, Slavisa</creator><creator>Rabah, Hassan</creator><creator>Simon, Sven</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><general>Institute of Electrical and Electronics Engineers</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>1XC</scope><orcidid>https://orcid.org/0000-0003-3552-7139</orcidid><orcidid>https://orcid.org/0000-0001-6459-7043</orcidid><orcidid>https://orcid.org/0000-0002-9999-2542</orcidid><orcidid>https://orcid.org/0000-0001-6334-3084</orcidid></search><sort><creationdate>20220401</creationdate><title>An FPGA-Based Residual Recurrent Neural Network for Real-Time Video Super-Resolution</title><author>Sun, Kaicong ; Koch, Maurice ; Wang, Zhe ; Jovanovic, Slavisa ; Rabah, Hassan ; Simon, Sven</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c329t-3ae52086b20c9d197b751e9fe4b392e1341fc11bede8b034b386ba3cbc9dfef23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>4K UHD</topic><topic>Complexity</topic><topic>Computer Science</topic><topic>Convolution</topic><topic>Field programmable gate arrays</topic><topic>FPGA</topic><topic>Hardware</topic><topic>hardware-efficient</topic><topic>Image quality</topic><topic>Image reconstruction</topic><topic>Image resolution</topic><topic>Neural networks</topic><topic>Real time</topic><topic>Real-time systems</topic><topic>Recurrent neural networks</topic><topic>residual recurrent neural network</topic><topic>Streaming media</topic><topic>Superresolution</topic><topic>UHDTV</topic><topic>Video super-resolution</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sun, Kaicong</creatorcontrib><creatorcontrib>Koch, Maurice</creatorcontrib><creatorcontrib>Wang, Zhe</creatorcontrib><creatorcontrib>Jovanovic, Slavisa</creatorcontrib><creatorcontrib>Rabah, 
Hassan</creatorcontrib><creatorcontrib>Simon, Sven</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Hyper Article en Ligne (HAL)</collection><jtitle>IEEE transactions on circuits and systems for video technology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Sun, Kaicong</au><au>Koch, Maurice</au><au>Wang, Zhe</au><au>Jovanovic, Slavisa</au><au>Rabah, Hassan</au><au>Simon, Sven</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An FPGA-Based Residual Recurrent Neural Network for Real-Time Video Super-Resolution</atitle><jtitle>IEEE transactions on circuits and systems for video technology</jtitle><stitle>TCSVT</stitle><date>2022-04-01</date><risdate>2022</risdate><volume>32</volume><issue>4</issue><spage>1739</spage><epage>1750</epage><pages>1739-1750</pages><issn>1051-8215</issn><eissn>1558-2205</eissn><coden>ITCTEM</coden><abstract>In this paper, we propose a hardware-efficient residual recurrent neural network for real-time video super-resolution (VSR) based on field programmable gate array (FPGA). 
Although recent learning-based VSR methods have achieved remarkable performance, the large computational complexity prohibits the deployment of the sophisticated VSR models on FPGA for real-time applications. Limited by the hardware resources, state-of-the-art FPGA-based VSR methods perform single-image super-resolution over the video sequence and suffer from temporal inconsistency. In order to exploit the inter-frame temporal correlation for real-time VSR on low-complexity hardware, we introduce a hardware-efficient recurrent neural network ERVSR. Specially, the proposed ERVSR leverages the input frame and the temporal information entailed in the hidden state to reconstruct the high-resolution counterpart. To reduce the network parameters, the low-resolution input branch and the hidden state branch are convolved individually and a channel modulation coefficient is proposed to explicitly guide the network to allocate the amount of output feature channels to each branch. Additionally, in order to reduce the memory consumption, we perform a dedicated lightweight compression of the hidden state by introducing a statistical normalization scheme followed by a fixed-point quantization. Besides, we adopt group convolution and depthwise separable convolution to further compact the network. We evaluated the proposed ERVSR on multiple public datasets from different aspects. Experimental results demonstrate that ERVSR performs better than the existing state-of-the-art FPGA-based VSR methods in both image quality and data throughput.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TCSVT.2021.3080241</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0003-3552-7139</orcidid><orcidid>https://orcid.org/0000-0001-6459-7043</orcidid><orcidid>https://orcid.org/0000-0002-9999-2542</orcidid><orcidid>https://orcid.org/0000-0001-6334-3084</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1051-8215
ispartof	IEEE transactions on circuits and systems for video technology, 2022-04, Vol.32 (4), p.1739-1750
issn	1051-8215 1558-2205
language	eng
recordid	cdi_proquest_journals_2647425573
source	IEEE Electronic Library (IEL)
subjects	4K UHD; Complexity; Computer Science; Convolution; Field programmable gate arrays; FPGA; Hardware; hardware-efficient; Image quality; Image reconstruction; Image resolution; Neural networks; Real time; Real-time systems; Recurrent neural networks; residual recurrent neural network; Streaming media; Superresolution; UHDTV; Video super-resolution
title	An FPGA-Based Residual Recurrent Neural Network for Real-Time Video Super-Resolution
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T18%3A33%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20FPGA-Based%20Residual%20Recurrent%20Neural%20Network%20for%20Real-Time%20Video%20Super-Resolution&rft.jtitle=IEEE%20transactions%20on%20circuits%20and%20systems%20for%20video%20technology&rft.au=Sun,%20Kaicong&rft.date=2022-04-01&rft.volume=32&rft.issue=4&rft.spage=1739&rft.epage=1750&rft.pages=1739-1750&rft.issn=1051-8215&rft.eissn=1558-2205&rft.coden=ITCTEM&rft_id=info:doi/10.1109/TCSVT.2021.3080241&rft_dat=%3Cproquest_RIE%3E2647425573%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2647425573&rft_id=info:pmid/&rft_ieee_id=9430531&rfr_iscdi=true