Adaptive multithreaded H.264/AVC decoding

The current trend towards multi-core processors imposes the necessity of finding viable strategies to exploit the additional computational resources in media processing. Among the challenges for video decoding are the appropriate partitioning of decoder steps, efficient tracking of dependencies and...

Bibliographic details
Main authors: Richter, Henryk, Stabernack, Benno, Muller, Erika
Format: Conference proceedings
Language: eng
container_end_page 890
container_issue
container_start_page 886
container_title
container_volume
creator Richter, Henryk
Stabernack, Benno
Muller, Erika
description The current trend towards multi-core processors imposes the necessity of finding viable strategies to exploit the additional computational resources in media processing. Among the challenges for video decoding are the appropriate partitioning of decoder steps, efficient tracking of dependencies, and resource allocation/synchronization for multiple threads with respect to the resulting overhead. In this paper, we propose two variants of multithreading with distributed synchronization. The first method is optimized for minimum-latency decoding, necessary for conversational applications. The second method aims to maximize the total throughput at the cost of a higher latency. In addition, we propose a method of dynamic core usage in order to reduce the total allocated processing resources due to inter-process communication overhead. This method is based on a coarse-grained complexity estimation. To implicitly adapt to different combinations of processor architectures, associated memory interfaces and power-saving states, the scheme is feedback-assisted. By correlating the initial estimate with the actual required processing time, a sufficiently accurate prediction of the required number of cores for the image processing part can be obtained. Experimental results demonstrate scaling by up to a factor of 3.5 on a quad-core machine, as well as the limits of the proposed approach regarding the complexity of sequential bitstream processing. We demonstrate that real-time 4k resolution decoding is feasible on current mid-range PC hardware. For less demanding streams, the adaptive mode reduces the total required CPU resources by up to 10% compared to the greedy approach.
doi_str_mv 10.1109/ACSSC.2009.5469999
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5469999</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5469999</ieee_id><sourcerecordid>5469999</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-4bd653261c008a6f327bcdfaeee96cc2ae0ad92c6b9aaaaad27a56a9d152d603</originalsourceid><addsrcrecordid>eNo1j0tLw0AUhccXGGv-gG6ydZH0zp1HcpchqBUKLiJuy83MREdaLUkU_PdGrN_mLA7ngyPElYRCSqBl3bRtUyAAFUZbmjkSKZWV1Ki1qdDqY5GgKW2OCtSJuPgvDJyKRIKpcqtInYt0HN9gRhskhYm4qT3vp_gVst3ndorT6xDYB5-titm5rJ-bzAf34eP7y6U463k7hvSQC9He3T41q3z9eP_Q1Os8Eky57rw1Cq10ABXbXmHZOd9zCIGsc8gB2BM62xH_4rFkY5m8NOgtqIW4_rPGebHZD3HHw_fm8Fn9AJI7RWg</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Adaptive multithreaded H.264/AVC decoding</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Richter, Henryk ; Stabernack, Benno ; Muller, Erika</creator><creatorcontrib>Richter, Henryk ; Stabernack, Benno ; Muller, Erika</creatorcontrib><description>The current trend towards multi-core processors imposes the necessity of finding viable strategies to exploit the additional computational resources in media processing. Among the challenges for video decoding are the appropriate partitioning of decoder steps, efficient tracking of dependencies and resource allocation/synchronization for multiple threads with respect to the resulting overhead. In this paper, we propose two variants of multithreading with distributed synchronization. The first method is optimized for minimum latency decoding, necessary for conversational applications. The second method aims to maximize the total throughput at the cost of a higher latency. In addition, we propose a method of dynamic core usage in order to reduce the total allocated processing resources due to inter-process communication overhead. 
This method is based on a coarse grained complexity estimation. To implicitly adapt to different combinations of processor architectures, associated memory interfaces and power-saving states, the scheme is feedback assisted. By correlating the initial estimate with the actual required processing time, a sufficiently accurate prediction of the required number of cores for the image processing part can be obtained. Experimental results demonstrate the scaling abilities of up to factor 3.5 on a quad-core machine, as well as the limits of the proposed approach regarding the complexity of sequential bitstream processing. We demonstrate that real-time 4k resolution decoding is feasible on current mid-range PC hardware. For less demanding streams, the adaptive mode reduces the total required CPU resources by up to 10% compared to the greedy approach.</description><identifier>ISSN: 1058-6393</identifier><identifier>ISBN: 1424458250</identifier><identifier>ISBN: 9781424458257</identifier><identifier>EISSN: 2576-2303</identifier><identifier>EISBN: 9781424458264</identifier><identifier>EISBN: 1424458277</identifier><identifier>EISBN: 1424458269</identifier><identifier>EISBN: 9781424458271</identifier><identifier>DOI: 10.1109/ACSSC.2009.5469999</identifier><language>eng</language><publisher>IEEE</publisher><subject>Automatic voltage control ; Costs ; Decoding ; Delay ; Multicore processing ; Multithreading ; Optimization methods ; Resource management ; Throughput</subject><ispartof>2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers, 2009, 
p.886-890</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5469999$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2052,27902,54895</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5469999$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Richter, Henryk</creatorcontrib><creatorcontrib>Stabernack, Benno</creatorcontrib><creatorcontrib>Muller, Erika</creatorcontrib><title>Adaptive multithreaded H.264/AVC decoding</title><title>2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers</title><addtitle>ACSSC</addtitle><description>The current trend towards multi-core processors imposes the necessity of finding viable strategies to exploit the additional computational resources in media processing. Among the challenges for video decoding are the appropriate partitioning of decoder steps, efficient tracking of dependencies and resource allocation/synchronization for multiple threads with respect to the resulting overhead. In this paper, we propose two variants of multithreading with distributed synchronization. The first method is optimized for minimum latency decoding, necessary for conversational applications. The second method aims to maximize the total throughput at the cost of a higher latency. In addition, we propose a method of dynamic core usage in order to reduce the total allocated processing resources due to inter-process communication overhead. This method is based on a coarse grained complexity estimation. To implicitly adapt to different combinations of processor architectures, associated memory interfaces and power-saving states, the scheme is feedback assisted. 
By correlating the initial estimate with the actual required processing time, a sufficiently accurate prediction of the required number of cores for the image processing part can be obtained. Experimental results demonstrate the scaling abilities of up to factor 3.5 on a quad-core machine, as well as the limits of the proposed approach regarding the complexity of sequential bitstream processing. We demonstrate that real-time 4k resolution decoding is feasible on current mid-range PC hardware. For less demanding streams, the adaptive mode reduces the total required CPU resources by up to 10% compared to the greedy approach.</description><subject>Automatic voltage control</subject><subject>Costs</subject><subject>Decoding</subject><subject>Delay</subject><subject>Multicore processing</subject><subject>Multithreading</subject><subject>Optimization methods</subject><subject>Resource management</subject><subject>Throughput</subject><issn>1058-6393</issn><issn>2576-2303</issn><isbn>1424458250</isbn><isbn>9781424458257</isbn><isbn>9781424458264</isbn><isbn>1424458277</isbn><isbn>1424458269</isbn><isbn>9781424458271</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2009</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1j0tLw0AUhccXGGv-gG6ydZH0zp1HcpchqBUKLiJuy83MREdaLUkU_PdGrN_mLA7ngyPElYRCSqBl3bRtUyAAFUZbmjkSKZWV1Ki1qdDqY5GgKW2OCtSJuPgvDJyKRIKpcqtInYt0HN9gRhskhYm4qT3vp_gVst3ndorT6xDYB5-titm5rJ-bzAf34eP7y6U463k7hvSQC9He3T41q3z9eP_Q1Os8Eky57rw1Cq10ABXbXmHZOd9zCIGsc8gB2BM62xH_4rFkY5m8NOgtqIW4_rPGebHZD3HHw_fm8Fn9AJI7RWg</recordid><startdate>200911</startdate><enddate>200911</enddate><creator>Richter, Henryk</creator><creator>Stabernack, Benno</creator><creator>Muller, Erika</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>200911</creationdate><title>Adaptive 
multithreaded H.264/AVC decoding</title><author>Richter, Henryk ; Stabernack, Benno ; Muller, Erika</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-4bd653261c008a6f327bcdfaeee96cc2ae0ad92c6b9aaaaad27a56a9d152d603</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Automatic voltage control</topic><topic>Costs</topic><topic>Decoding</topic><topic>Delay</topic><topic>Multicore processing</topic><topic>Multithreading</topic><topic>Optimization methods</topic><topic>Resource management</topic><topic>Throughput</topic><toplevel>online_resources</toplevel><creatorcontrib>Richter, Henryk</creatorcontrib><creatorcontrib>Stabernack, Benno</creatorcontrib><creatorcontrib>Muller, Erika</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Richter, Henryk</au><au>Stabernack, Benno</au><au>Muller, Erika</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Adaptive multithreaded H.264/AVC decoding</atitle><btitle>2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers</btitle><stitle>ACSSC</stitle><date>2009-11</date><risdate>2009</risdate><spage>886</spage><epage>890</epage><pages>886-890</pages><issn>1058-6393</issn><eissn>2576-2303</eissn><isbn>1424458250</isbn><isbn>9781424458257</isbn><eisbn>9781424458264</eisbn><eisbn>1424458277</eisbn><eisbn>1424458269</eisbn><eisbn>9781424458271</eisbn><abstract>The current trend towards 
multi-core processors imposes the necessity of finding viable strategies to exploit the additional computational resources in media processing. Among the challenges for video decoding are the appropriate partitioning of decoder steps, efficient tracking of dependencies and resource allocation/synchronization for multiple threads with respect to the resulting overhead. In this paper, we propose two variants of multithreading with distributed synchronization. The first method is optimized for minimum latency decoding, necessary for conversational applications. The second method aims to maximize the total throughput at the cost of a higher latency. In addition, we propose a method of dynamic core usage in order to reduce the total allocated processing resources due to inter-process communication overhead. This method is based on a coarse grained complexity estimation. To implicitly adapt to different combinations of processor architectures, associated memory interfaces and power-saving states, the scheme is feedback assisted. By correlating the initial estimate with the actual required processing time, a sufficiently accurate prediction of the required number of cores for the image processing part can be obtained. Experimental results demonstrate the scaling abilities of up to factor 3.5 on a quad-core machine, as well as the limits of the proposed approach regarding the complexity of sequential bitstream processing. We demonstrate that real-time 4k resolution decoding is feasible on current mid-range PC hardware. For less demanding streams, the adaptive mode reduces the total required CPU resources by up to 10% compared to the greedy approach.</abstract><pub>IEEE</pub><doi>10.1109/ACSSC.2009.5469999</doi><tpages>5</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1058-6393
ispartof 2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers, 2009, p.886-890
issn 1058-6393
2576-2303
language eng
recordid cdi_ieee_primary_5469999
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Automatic voltage control
Costs
Decoding
Delay
Multicore processing
Multithreading
Optimization methods
Resource management
Throughput
title Adaptive multithreaded H.264/AVC decoding
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T18%3A35%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Adaptive%20multithreaded%20H.264/AVC%20decoding&rft.btitle=2009%20Conference%20Record%20of%20the%20Forty-Third%20Asilomar%20Conference%20on%20Signals,%20Systems%20and%20Computers&rft.au=Richter,%20Henryk&rft.date=2009-11&rft.spage=886&rft.epage=890&rft.pages=886-890&rft.issn=1058-6393&rft.eissn=2576-2303&rft.isbn=1424458250&rft.isbn_list=9781424458257&rft_id=info:doi/10.1109/ACSSC.2009.5469999&rft_dat=%3Cieee_6IE%3E5469999%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781424458264&rft.eisbn_list=1424458277&rft.eisbn_list=1424458269&rft.eisbn_list=9781424458271&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5469999&rfr_iscdi=true
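
Illustrative note: the abstract describes a feedback-assisted scheme in which a coarse-grained complexity estimate is correlated with the measured processing time to predict how many cores the image processing part needs. The record does not give the paper's actual algorithm, so the following is only a minimal sketch of that general idea under stated assumptions. The class name CoreEstimator, the abstract "complexity units", the per-frame time budget, and the exponential smoothing are all hypothetical choices made for illustration, not the authors' method.

```python
import math


class CoreEstimator:
    """Hypothetical feedback-assisted core-count predictor (illustration only)."""

    def __init__(self, max_cores, frame_budget_s, smoothing=0.25):
        self.max_cores = max_cores            # cores available on the machine
        self.frame_budget_s = frame_budget_s  # time allowed per frame, in seconds
        self.smoothing = smoothing            # weight given to each new measurement
        self.seconds_per_unit = None          # learned cost of one complexity unit

    def predict_cores(self, complexity):
        """Return the number of cores to allocate for a frame."""
        if self.seconds_per_unit is None:
            # No feedback yet: fall back to the greedy choice of all cores.
            return self.max_cores
        predicted_work_s = complexity * self.seconds_per_unit
        cores = math.ceil(predicted_work_s / self.frame_budget_s)
        return max(1, min(self.max_cores, cores))

    def update(self, complexity, measured_core_seconds):
        """Correlate the initial estimate with the measured processing time."""
        if complexity <= 0:
            return
        observed = measured_core_seconds / complexity
        if self.seconds_per_unit is None:
            self.seconds_per_unit = observed
        else:
            a = self.smoothing
            self.seconds_per_unit = (1 - a) * self.seconds_per_unit + a * observed


estimator = CoreEstimator(max_cores=4, frame_budget_s=0.04)
first_guess = estimator.predict_cores(100)   # greedy before any feedback
estimator.update(100, 0.06)                  # frame took 0.06 core-seconds
adapted_guess = estimator.predict_cores(100)
```

Because the cost per complexity unit is learned from measurements rather than fixed in advance, a predictor of this kind adapts implicitly to the processor, memory interface, and power-saving state it happens to run on, which is the property the abstract attributes to the feedback loop.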