Analysis and architecture design of scalable fractional motion estimation for H.264 encoding
Fractional Motion Estimation (FME) is an important part of the H.264/AVC video encoding standard. The algorithm can significantly increase the compression ratio of video encoders while improving video quality. However, it is computationally expensive and can account for over 45% of the total motion e...
Published in: | Integration (Amsterdam) 2012-09, Vol.45 (4), p.427-438 |
---|---|
Main authors: | Vasiljevic, Jasmina ; Ye, Andy |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 438 |
---|---|
container_issue | 4 |
container_start_page | 427 |
container_title | Integration (Amsterdam) |
container_volume | 45 |
creator | Vasiljevic, Jasmina ; Ye, Andy |
description | Fractional Motion Estimation (FME) is an important part of the H.264/AVC video encoding standard. The algorithm can significantly increase the compression ratio of video encoders while improving video quality. However, it is computationally expensive and can account for over 45% of the total motion estimation runtime. To maximize the performance and utilization of FME implementations on Field-Programmable Gate Arrays (FPGAs), one needs to exploit the inherent parallelism in the algorithm. In this work, we explore two approaches to parallelizing the FME algorithm to increase the processing power of the computing hardware. We call the first method vertical scaling and the second horizontal scaling. We implemented six scaled FME designs on a Xilinx XC5VLX85T (Virtex-5) FPGA. We found that scaling vertically within a 4×4 sub-block is more efficient than scaling horizontally across several sub-blocks. As a result, we were able to achieve higher video resolutions at lower hardware resource cost. In particular, the best vertically scaled design achieves 30 fps of QSXGA video with 4 reference frames using only 25.5K LUTs and 28.7K registers.
► Explored Fractional Motion Estimation (FME) algorithm parallelization. ► Implemented six scaled designs on a Xilinx XC5VLX85T (Virtex-5) FPGA. ► Found that scaling vertically within a 4×4 sub-block is more efficient. ► Scaling horizontally across several sub-blocks is less efficient. ► Best vertically scaled design achieves 30 fps at QSXGA with 4 reference frames. |
doi_str_mv | 10.1016/j.vlsi.2011.11.017 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1136563409</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0167926011001064</els_id><sourcerecordid>1136563409</sourcerecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1136563409</pqid></control><display><type>article</type><title>Analysis and architecture design of scalable fractional motion estimation for H.264 encoding</title><source>Elsevier ScienceDirect Journals</source><creator>Vasiljevic, Jasmina ; Ye, Andy</creator><identifier>ISSN: 0167-9260</identifier><identifier>EISSN: 1872-7522</identifier><identifier>DOI: 10.1016/j.vlsi.2011.11.017</identifier><identifier>CODEN: IVJODL</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><ispartof>Integration (Amsterdam), 2012-09, Vol.45 (4), p.427-438</ispartof><rights>2011 Elsevier B.V.</rights><rights>2015 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa></display><addata><au>Vasiljevic, Jasmina</au><au>Ye, Andy</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Analysis and architecture design of scalable fractional motion estimation for H.264 encoding</atitle><jtitle>Integration (Amsterdam)</jtitle><date>2012-09-01</date><risdate>2012</risdate><volume>45</volume><issue>4</issue><spage>427</spage><epage>438</epage><pages>427-438</pages><issn>0167-9260</issn><eissn>1872-7522</eissn><coden>IVJODL</coden><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/j.vlsi.2011.11.017</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0167-9260 |
ispartof | Integration (Amsterdam), 2012-09, Vol.45 (4), p.427-438 |
issn | 0167-9260 ; 1872-7522 |
language | eng |
recordid | cdi_proquest_miscellaneous_1136563409 |
source | Elsevier ScienceDirect Journals |
subjects | Algorithms ; Applied sciences ; Architecture ; Circuit properties ; Coders ; Computation ; Design engineering ; Digital circuits ; Electric, optical and optoelectronic circuits ; Electronic circuits ; Electronics ; Encoders ; Encoding ; Exact sciences and technology ; Field-programmable gate arrays ; Fractional motion estimation ; H.264 ; Hardware ; Scalability ; Signal convertors |
title | Analysis and architecture design of scalable fractional motion estimation for H.264 encoding |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T01%3A13%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Analysis%20and%20architecture%20design%20of%20scalable%20fractional%20motion%20estimation%20for%20H.264%20encoding&rft.jtitle=Integration%20(Amsterdam)&rft.au=Vasiljevic,%20Jasmina&rft.date=2012-09-01&rft.volume=45&rft.issue=4&rft.spage=427&rft.epage=438&rft.pages=427-438&rft.issn=0167-9260&rft.eissn=1872-7522&rft.coden=IVJODL&rft_id=info:doi/10.1016/j.vlsi.2011.11.017&rft_dat=%3Cproquest_cross%3E1136563409%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1136563409&rft_id=info:pmid/&rft_els_id=S0167926011001064&rfr_iscdi=true |
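
The headline result in the description (30 fps of QSXGA video with 4 reference frames) can be put into perspective with a rough workload estimate. The sketch below is not taken from the article. It assumes that QSXGA means 2560×2048 pixels, that each frame is split into 16×16 macroblocks holding sixteen 4×4 sub-blocks each, and that every macroblock is refined against every reference frame. The function name `fme_workload` and the dictionary keys are hypothetical, chosen only for illustration.

```python
# Rough FME workload estimate. All constants are assumptions for
# illustration; none of them are stated in the catalog record itself.
QSXGA_WIDTH = 2560       # assumed QSXGA width in pixels
QSXGA_HEIGHT = 2048      # assumed QSXGA height in pixels
MACROBLOCK_SIZE = 16     # H.264 macroblocks cover 16x16 luma samples
SUB_BLOCK_SIZE = 4       # the 4x4 sub-block mentioned in the description


def fme_workload(width, height, fps, reference_frames):
    """Count the blocks an FME engine would have to process.

    Assumes (hypothetically) that every macroblock is refined once
    against every reference frame, with no early termination.
    """
    mbs_per_frame = (width // MACROBLOCK_SIZE) * (height // MACROBLOCK_SIZE)
    sub_blocks_per_mb = (MACROBLOCK_SIZE // SUB_BLOCK_SIZE) ** 2
    mb_refinements_per_second = mbs_per_frame * fps * reference_frames
    return {
        "macroblocks_per_frame": mbs_per_frame,
        "sub_blocks_per_macroblock": sub_blocks_per_mb,
        "macroblock_refinements_per_second": mb_refinements_per_second,
        "sub_block_refinements_per_second": mb_refinements_per_second
        * sub_blocks_per_mb,
    }


if __name__ == "__main__":
    workload = fme_workload(QSXGA_WIDTH, QSXGA_HEIGHT, fps=30, reference_frames=4)
    for name, value in workload.items():
        print(f"{name}: {value:,}")
```

Under these assumptions a QSXGA frame holds 20,480 macroblocks, so sustaining 30 fps with 4 reference frames means roughly 2.46 million macroblock refinements per second. That gives a sense of why the amount of parallelism per 4×4 sub-block matters for the hardware cost.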