Fast H.264 to HEVC Transcoding: A Deep Learning Method

With the development of video coding technology, high-efficiency video coding (HEVC) has become a promising alternative, compared with the previous coding standards, for example, H.264. In general, H.264 to HEVC transcoding can be accomplished by fully H.264 decoding and fully HEVC encoding, which s...

Bibliographic details
Published in:	IEEE transactions on multimedia 2019-07, Vol.21 (7), p.1633-1645
Main authors:	Xu, Jingyao; Xu, Mai; Wei, Yanan; Wang, Zulin; Guan, Zhenyu
Format:	Article
Language:	eng
Subjects:	Coding; Coding standards; Computing time; Decoding; Deep learning; Feature extraction; H.264; HEVC; LSTM; Machine learning; Optimization; Partitions; Searching; State of the art; Streaming media; Teaching methods; Transcoding; Video coding; Video compression

container_end_page	1645
container_issue	7
container_start_page	1633
container_title	IEEE transactions on multimedia
container_volume	21
creator	Xu, Jingyao; Xu, Mai; Wei, Yanan; Wang, Zulin; Guan, Zhenyu
description	With the development of video coding technology, high-efficiency video coding (HEVC) has become a promising alternative, compared with the previous coding standards, for example, H.264. In general, H.264 to HEVC transcoding can be accomplished by fully H.264 decoding and fully HEVC encoding, which suffers from considerable time consumption on the brute-force search of the HEVC coding tree unit (CTU) partition for rate-distortion optimization (RDO). In this paper, we propose a deep learning method to predict the HEVC CTU partition, instead of the brute-force RDO search, for H.264 to HEVC transcoding. First, we build a large-scale H.264 to HEVC transcoding database. Second, we investigate the correlation between the HEVC CTU partition and H.264 features, and analyze both temporal and spatial-temporal similarities of the CTU partition across video frames. Third, we propose a deep learning architecture of a hierarchical long short-term memory (H-LSTM) network to predict the CTU partition of HEVC. Then, the brute-force RDO search of the CTU partition is replaced by the H-LSTM prediction such that the computational time can be significantly reduced for fast H.264 to HEVC transcoding. Finally, the experimental results verify that the proposed H-LSTM method can achieve a better tradeoff between coding efficiency and complexity, compared to the state-of-the-art H.264 to HEVC transcoding methods.
doi_str_mv	10.1109/TMM.2018.2885921
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_8570845</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8570845</ieee_id><sourcerecordid>2247925633</sourcerecordid><originalsourceid>FETCH-LOGICAL-c357t-7a853c42e8767431bc41ebb5d8c4879805e903fdab5c02bdda9ad3c39501bf053</originalsourceid><addsrcrecordid>eNo9kE1LAzEQhoMoWKt3wUvA866Tr03irdTWCl28rF5DNslqi-7WZHvw35vS4mmGl-edgQehWwIlIaAfmrouKRBVUqWEpuQMTYjmpACQ8jzvgkKRY7hEVyltAQgXICeoWto04lVJK47HAa8W73PcRNsnN_hN__GIZ_gphB1eBxv7HOA6jJ-Dv0YXnf1K4eY0p-htuWjmq2L9-vwyn60Lx4QcC2mVYI7ToGQlOSOt4yS0rfDKcSW1AhE0sM7bVjigrfdWW88c0wJI24FgU3R_vLuLw88-pNFsh33s80tDKZeaioqxTMGRcnFIKYbO7OLm28ZfQ8Ac7JhsxxzsmJOdXLk7VjYhhH9cCQmKC_YHhPtczw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2247925633</pqid></control><display><type>article</type><title>Fast H.264 to HEVC Transcoding: A Deep Learning Method</title><source>IEEE/IET Electronic Library (IEL)</source><creator>Xu, Jingyao ; Xu, Mai ; Wei, Yanan ; Wang, Zulin ; Guan, Zhenyu</creator><creatorcontrib>Xu, Jingyao ; Xu, Mai ; Wei, Yanan ; Wang, Zulin ; Guan, Zhenyu</creatorcontrib><description>With the development of video coding technology, high-efficiency video coding (HEVC) has become a promising alternative, compared with the previous coding standards, for example, H.264. In general, H.264 to HEVC transcoding can be accomplished by fully H.264 decoding and fully HEVC encoding, which suffers from considerable time consumption on the brute-force search of the HEVC coding tree unit (CTU) partition for rate-distortion optimization (RDO). In this paper, we propose a deep learning method to predict the HEVC CTU partition, instead of the brute-force RDO search, for H.264 to HEVC transcoding. First, we build a large-scale H.264 to HEVC transcoding database. 
Second, we investigate the correlation between the HEVC CTU partition and H.264 features, and analyze both temporal and spatial-temporal similarities of the CTU partition across video frames. Third, we propose a deep learning architecture of a hierarchical long short-term memory (H-LSTM) network to predict the CTU partition of HEVC. Then, the brute-force RDO search of the CTU partition is replaced by the H-LSTM prediction such that the computational time can be significantly reduced for fast H.264 to HEVC transcoding. Finally, the experimental results verify that the proposed H-LSTM method can achieve a better tradeoff between coding efficiency and complexity, compared to the state-of-the-art H.264 to HEVC transcoding methods.</description><identifier>ISSN: 1520-9210</identifier><identifier>EISSN: 1941-0077</identifier><identifier>DOI: 10.1109/TMM.2018.2885921</identifier><identifier>CODEN: ITMUF8</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Coding ; Coding standards ; Computing time ; Decoding ; Deep learning ; Feature extraction ; H.264 ; HEVC ; LSTM ; Machine learning ; Optimization ; Partitions ; Searching ; State of the art ; Streaming media ; Teaching methods ; Transcoding ; Video coding ; Video compression</subject><ispartof>IEEE transactions on multimedia, 2019-07, Vol.21 (7), p.1633-1645</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c357t-7a853c42e8767431bc41ebb5d8c4879805e903fdab5c02bdda9ad3c39501bf053</citedby><cites>FETCH-LOGICAL-c357t-7a853c42e8767431bc41ebb5d8c4879805e903fdab5c02bdda9ad3c39501bf053</cites><orcidid>0000-0002-0277-3301 ; 0000-0002-3959-338X ; 0000-0002-0305-8625</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8570845$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8570845$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Xu, Jingyao</creatorcontrib><creatorcontrib>Xu, Mai</creatorcontrib><creatorcontrib>Wei, Yanan</creatorcontrib><creatorcontrib>Wang, Zulin</creatorcontrib><creatorcontrib>Guan, Zhenyu</creatorcontrib><title>Fast H.264 to HEVC Transcoding: A Deep Learning Method</title><title>IEEE transactions on multimedia</title><addtitle>TMM</addtitle><description>With the development of video coding technology, high-efficiency video coding (HEVC) has become a promising alternative, compared with the previous coding standards, for example, H.264. In general, H.264 to HEVC transcoding can be accomplished by fully H.264 decoding and fully HEVC encoding, which suffers from considerable time consumption on the brute-force search of the HEVC coding tree unit (CTU) partition for rate-distortion optimization (RDO). In this paper, we propose a deep learning method to predict the HEVC CTU partition, instead of the brute-force RDO search, for H.264 to HEVC transcoding. First, we build a large-scale H.264 to HEVC transcoding database. 
Second, we investigate the correlation between the HEVC CTU partition and H.264 features, and analyze both temporal and spatial-temporal similarities of the CTU partition across video frames. Third, we propose a deep learning architecture of a hierarchical long short-term memory (H-LSTM) network to predict the CTU partition of HEVC. Then, the brute-force RDO search of the CTU partition is replaced by the H-LSTM prediction such that the computational time can be significantly reduced for fast H.264 to HEVC transcoding. Finally, the experimental results verify that the proposed H-LSTM method can achieve a better tradeoff between coding efficiency and complexity, compared to the state-of-the-art H.264 to HEVC transcoding methods.</description><subject>Coding</subject><subject>Coding standards</subject><subject>Computing time</subject><subject>Decoding</subject><subject>Deep learning</subject><subject>Feature extraction</subject><subject>H.264</subject><subject>HEVC</subject><subject>LSTM</subject><subject>Machine learning</subject><subject>Optimization</subject><subject>Partitions</subject><subject>Searching</subject><subject>State of the art</subject><subject>Streaming media</subject><subject>Teaching methods</subject><subject>Transcoding</subject><subject>Video coding</subject><subject>Video compression</subject><issn>1520-9210</issn><issn>1941-0077</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1LAzEQhoMoWKt3wUvA866Tr03irdTWCl28rF5DNslqi-7WZHvw35vS4mmGl-edgQehWwIlIaAfmrouKRBVUqWEpuQMTYjmpACQ8jzvgkKRY7hEVyltAQgXICeoWto04lVJK47HAa8W73PcRNsnN_hN__GIZ_gphB1eBxv7HOA6jJ-Dv0YXnf1K4eY0p-htuWjmq2L9-vwyn60Lx4QcC2mVYI7ToGQlOSOt4yS0rfDKcSW1AhE0sM7bVjigrfdWW88c0wJI24FgU3R_vLuLw88-pNFsh33s80tDKZeaioqxTMGRcnFIKYbO7OLm28ZfQ8Ac7JhsxxzsmJOdXLk7VjYhhH9cCQmKC_YHhPtczw</recordid><startdate>20190701</startdate><enddate>20190701</enddate><creator>Xu, 
Jingyao</creator><creator>Xu, Mai</creator><creator>Wei, Yanan</creator><creator>Wang, Zulin</creator><creator>Guan, Zhenyu</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-0277-3301</orcidid><orcidid>https://orcid.org/0000-0002-3959-338X</orcidid><orcidid>https://orcid.org/0000-0002-0305-8625</orcidid></search><sort><creationdate>20190701</creationdate><title>Fast H.264 to HEVC Transcoding: A Deep Learning Method</title><author>Xu, Jingyao ; Xu, Mai ; Wei, Yanan ; Wang, Zulin ; Guan, Zhenyu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c357t-7a853c42e8767431bc41ebb5d8c4879805e903fdab5c02bdda9ad3c39501bf053</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Coding</topic><topic>Coding standards</topic><topic>Computing time</topic><topic>Decoding</topic><topic>Deep learning</topic><topic>Feature extraction</topic><topic>H.264</topic><topic>HEVC</topic><topic>LSTM</topic><topic>Machine learning</topic><topic>Optimization</topic><topic>Partitions</topic><topic>Searching</topic><topic>State of the art</topic><topic>Streaming media</topic><topic>Teaching methods</topic><topic>Transcoding</topic><topic>Video coding</topic><topic>Video compression</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Xu, Jingyao</creatorcontrib><creatorcontrib>Xu, Mai</creatorcontrib><creatorcontrib>Wei, Yanan</creatorcontrib><creatorcontrib>Wang, Zulin</creatorcontrib><creatorcontrib>Guan, Zhenyu</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE 
All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE/IET Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on multimedia</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Xu, Jingyao</au><au>Xu, Mai</au><au>Wei, Yanan</au><au>Wang, Zulin</au><au>Guan, Zhenyu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Fast H.264 to HEVC Transcoding: A Deep Learning Method</atitle><jtitle>IEEE transactions on multimedia</jtitle><stitle>TMM</stitle><date>2019-07-01</date><risdate>2019</risdate><volume>21</volume><issue>7</issue><spage>1633</spage><epage>1645</epage><pages>1633-1645</pages><issn>1520-9210</issn><eissn>1941-0077</eissn><coden>ITMUF8</coden><abstract>With the development of video coding technology, high-efficiency video coding (HEVC) has become a promising alternative, compared with the previous coding standards, for example, H.264. In general, H.264 to HEVC transcoding can be accomplished by fully H.264 decoding and fully HEVC encoding, which suffers from considerable time consumption on the brute-force search of the HEVC coding tree unit (CTU) partition for rate-distortion optimization (RDO). In this paper, we propose a deep learning method to predict the HEVC CTU partition, instead of the brute-force RDO search, for H.264 to HEVC transcoding. First, we build a large-scale H.264 to HEVC transcoding database. 
Second, we investigate the correlation between the HEVC CTU partition and H.264 features, and analyze both temporal and spatial-temporal similarities of the CTU partition across video frames. Third, we propose a deep learning architecture of a hierarchical long short-term memory (H-LSTM) network to predict the CTU partition of HEVC. Then, the brute-force RDO search of the CTU partition is replaced by the H-LSTM prediction such that the computational time can be significantly reduced for fast H.264 to HEVC transcoding. Finally, the experimental results verify that the proposed H-LSTM method can achieve a better tradeoff between coding efficiency and complexity, compared to the state-of-the-art H.264 to HEVC transcoding methods.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/TMM.2018.2885921</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-0277-3301</orcidid><orcidid>https://orcid.org/0000-0002-3959-338X</orcidid><orcidid>https://orcid.org/0000-0002-0305-8625</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1520-9210
ispartof	IEEE transactions on multimedia, 2019-07, Vol.21 (7), p.1633-1645
issn	1520-9210 1941-0077
language	eng
recordid	cdi_ieee_primary_8570845
source	IEEE/IET Electronic Library (IEL)
subjects	Coding; Coding standards; Computing time; Decoding; Deep learning; Feature extraction; H.264; HEVC; LSTM; Machine learning; Optimization; Partitions; Searching; State of the art; Streaming media; Teaching methods; Transcoding; Video coding; Video compression
title	Fast H.264 to HEVC Transcoding: A Deep Learning Method
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T17%3A40%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Fast%20H.264%20to%20HEVC%20Transcoding:%20A%20Deep%20Learning%20Method&rft.jtitle=IEEE%20transactions%20on%20multimedia&rft.au=Xu,%20Jingyao&rft.date=2019-07-01&rft.volume=21&rft.issue=7&rft.spage=1633&rft.epage=1645&rft.pages=1633-1645&rft.issn=1520-9210&rft.eissn=1941-0077&rft.coden=ITMUF8&rft_id=info:doi/10.1109/TMM.2018.2885921&rft_dat=%3Cproquest_RIE%3E2247925633%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2247925633&rft_id=info:pmid/&rft_ieee_id=8570845&rfr_iscdi=true