Variable block size motion estimation implementation on compute unified device architecture (CUDA)

This paper proposes a highly parallel variable block size full search motion estimation algorithm with concurrent parallel reduction (CPR) on a graphics processing unit (GPU) using the compute unified device architecture (CUDA). This approach minimizes memory access latency by using high-speed on-chip mem...

Bibliographic details
Main authors: Lee, Dong-Kyu; Oh, Seoung-Jun
Format: Conference proceeding
Language: eng
Subjects:
Online access: Order full text
container_end_page 634
container_issue
container_start_page 633
container_title
container_volume
creator Lee, Dong-Kyu
Oh, Seoung-Jun
description This paper proposes a highly parallel variable block size full search motion estimation algorithm with concurrent parallel reduction (CPR) on a graphics processing unit (GPU) using the compute unified device architecture (CUDA). The approach minimizes memory access latency by using the GPU's high-speed on-chip memory. By applying parallel reductions concurrently, depending on the amount of data and the data dependency, the proposed approach increases thread utilization and decreases the number of synchronization points, which cause latency. Experimental results show that the proposed approach achieves a substantial improvement, up to 92 times, over the central processing unit (CPU)-only counterpart.
doi_str_mv 10.1109/ICCE.2013.6487048
format Conference Proceeding
fullrecord <record><control><sourceid>proquest_6IE</sourceid><recordid>TN_cdi_proquest_miscellaneous_1786154967</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6487048</ieee_id><sourcerecordid>1786154967</sourcerecordid><originalsourceid>FETCH-LOGICAL-i123t-181110b7716197ae27f0532dde14388e01adc991b7c2259892a57faa94ffd5883</originalsourceid><addsrcrecordid>eNpFUE1LxDAUjF_g7uoPEC85roeueUnaJEepqy4seHG9ljR9xWi_bFpBf73FXRAevDfM8JgZQq6ArQCYud2k6XrFGYhVIrViUh-ROchECRAJN8dkxiHWkWQMTv4JYKcHQhgjz8k8hPdJYUxsZiR_tb23eYU0r1r3QYP_QVq3g28bimHwtf07fd1VWGMz7OE0rq27cUA6Nr70WNACv7xDanv35gd0w9gjXaa7-7ubC3JW2irg5WEvyO5h_ZI-Rdvnx016t408cDFEoGHKmCsFCRhlkauSxYIXBYIUWiMDWzhjIFeO89how22sSmuNLMsi1losyHL_t-vbz3Eyn9U-OKwq22A7hgyUTiCWZmplQa73Uo-IWddPMfvv7NCp-AWO2WaF</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>1786154967</pqid></control><display><type>conference_proceeding</type><title>Variable block size motion estimation implementation on compute unified device architecture (CUDA)</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Lee, Dong-Kyu ; Oh, Seoung-Jun</creator><creatorcontrib>Lee, Dong-Kyu ; Oh, Seoung-Jun</creatorcontrib><description>This paper proposes a highly parallel variable block size full search motion estimation algorithm with concurrent parallel reduction (CPR) on graphics processing unit (GPU) using compute unified device architecture (CUDA). This approach minimizes memory access latency by using high-speed on-chip memory of GPU. By applying parallel reductions concurrently depending on the amount of data and the data dependency, the proposed approach increases thread utilization and decreases the number of synchronization points which cause latency. 
Experimental results show that the proposed approach achieves substantial improvement up to 92 times than the central processing unit (CPU) only counterpart.</description><identifier>ISSN: 2158-3994</identifier><identifier>ISBN: 1467313610</identifier><identifier>ISBN: 9781467313612</identifier><identifier>EISSN: 2158-4001</identifier><identifier>EISBN: 1467313629</identifier><identifier>EISBN: 9781467313636</identifier><identifier>EISBN: 1467313637</identifier><identifier>EISBN: 9781467313629</identifier><identifier>DOI: 10.1109/ICCE.2013.6487048</identifier><language>eng</language><publisher>IEEE</publisher><subject>Architecture ; Blocking ; Central processing units ; Computer architecture ; Consumption ; Devices ; Graphics processing units ; High definition video ; Instruction sets ; Mathematical models ; Motion estimation ; Motion simulation ; Reduction ; Synchronization ; Video coding</subject><ispartof>2013 IEEE International Conference on Consumer Electronics (ICCE), 2013, p.633-634</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6487048$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,314,780,784,789,790,2058,27924,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6487048$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Lee, Dong-Kyu</creatorcontrib><creatorcontrib>Oh, Seoung-Jun</creatorcontrib><title>Variable block size motion estimation implementation on compute unified device architecture (CUDA)</title><title>2013 IEEE International Conference on Consumer Electronics (ICCE)</title><addtitle>ICCE</addtitle><description>This paper proposes a highly parallel variable block size full search motion estimation algorithm 
with concurrent parallel reduction (CPR) on graphics processing unit (GPU) using compute unified device architecture (CUDA). This approach minimizes memory access latency by using high-speed on-chip memory of GPU. By applying parallel reductions concurrently depending on the amount of data and the data dependency, the proposed approach increases thread utilization and decreases the number of synchronization points which cause latency. Experimental results show that the proposed approach achieves substantial improvement up to 92 times than the central processing unit (CPU) only counterpart.</description><subject>Architecture</subject><subject>Blocking</subject><subject>Central processing units</subject><subject>Computer architecture</subject><subject>Consumption</subject><subject>Devices</subject><subject>Graphics processing units</subject><subject>High definition video</subject><subject>Instruction sets</subject><subject>Mathematical models</subject><subject>Motion estimation</subject><subject>Motion simulation</subject><subject>Reduction</subject><subject>Synchronization</subject><subject>Video coding</subject><issn>2158-3994</issn><issn>2158-4001</issn><isbn>1467313610</isbn><isbn>9781467313612</isbn><isbn>1467313629</isbn><isbn>9781467313636</isbn><isbn>1467313637</isbn><isbn>9781467313629</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2013</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNpFUE1LxDAUjF_g7uoPEC85roeueUnaJEepqy4seHG9ljR9xWi_bFpBf73FXRAevDfM8JgZQq6ArQCYud2k6XrFGYhVIrViUh-ROchECRAJN8dkxiHWkWQMTv4JYKcHQhgjz8k8hPdJYUxsZiR_tb23eYU0r1r3QYP_QVq3g28bimHwtf07fd1VWGMz7OE0rq27cUA6Nr70WNACv7xDanv35gd0w9gjXaa7-7ubC3JW2irg5WEvyO5h_ZI-Rdvnx016t408cDFEoGHKmCsFCRhlkauSxYIXBYIUWiMDWzhjIFeO89how22sSmuNLMsi1losyHL_t-vbz3Eyn9U-OKwq22A7hgyUTiCWZmplQa73Uo-IWddPMfvv7NCp-AWO2WaF</recordid><startdate>201301</startdate><enddate>201301</enddate><creator>Lee, 
Dong-Kyu</creator><creator>Oh, Seoung-Jun</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope></search><sort><creationdate>201301</creationdate><title>Variable block size motion estimation implementation on compute unified device architecture (CUDA)</title><author>Lee, Dong-Kyu ; Oh, Seoung-Jun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i123t-181110b7716197ae27f0532dde14388e01adc991b7c2259892a57faa94ffd5883</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Architecture</topic><topic>Blocking</topic><topic>Central processing units</topic><topic>Computer architecture</topic><topic>Consumption</topic><topic>Devices</topic><topic>Graphics processing units</topic><topic>High definition video</topic><topic>Instruction sets</topic><topic>Mathematical models</topic><topic>Motion estimation</topic><topic>Motion simulation</topic><topic>Reduction</topic><topic>Synchronization</topic><topic>Video coding</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lee, Dong-Kyu</creatorcontrib><creatorcontrib>Oh, Seoung-Jun</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Lee, Dong-Kyu</au><au>Oh, Seoung-Jun</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Variable block size motion estimation implementation on compute unified device architecture (CUDA)</atitle><btitle>2013 IEEE International Conference on Consumer Electronics (ICCE)</btitle><stitle>ICCE</stitle><date>2013-01</date><risdate>2013</risdate><spage>633</spage><epage>634</epage><pages>633-634</pages><issn>2158-3994</issn><eissn>2158-4001</eissn><isbn>1467313610</isbn><isbn>9781467313612</isbn><eisbn>1467313629</eisbn><eisbn>9781467313636</eisbn><eisbn>1467313637</eisbn><eisbn>9781467313629</eisbn><abstract>This paper proposes a highly parallel variable block size full search motion estimation algorithm with concurrent parallel reduction (CPR) on graphics processing unit (GPU) using compute unified device architecture (CUDA). This approach minimizes memory access latency by using high-speed on-chip memory of GPU. By applying parallel reductions concurrently depending on the amount of data and the data dependency, the proposed approach increases thread utilization and decreases the number of synchronization points which cause latency. Experimental results show that the proposed approach achieves substantial improvement up to 92 times than the central processing unit (CPU) only counterpart.</abstract><pub>IEEE</pub><doi>10.1109/ICCE.2013.6487048</doi><tpages>2</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2158-3994
ispartof 2013 IEEE International Conference on Consumer Electronics (ICCE), 2013, p.633-634
issn 2158-3994
2158-4001
language eng
recordid cdi_proquest_miscellaneous_1786154967
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Architecture
Blocking
Central processing units
Computer architecture
Consumption
Devices
Graphics processing units
High definition video
Instruction sets
Mathematical models
Motion estimation
Motion simulation
Reduction
Synchronization
Video coding
title Variable block size motion estimation implementation on compute unified device architecture (CUDA)
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T08%3A18%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Variable%20block%20size%20motion%20estimation%20implementation%20on%20compute%20unified%20device%20architecture%20(CUDA)&rft.btitle=2013%20IEEE%20International%20Conference%20on%20Consumer%20Electronics%20(ICCE)&rft.au=Lee,%20Dong-Kyu&rft.date=2013-01&rft.spage=633&rft.epage=634&rft.pages=633-634&rft.issn=2158-3994&rft.eissn=2158-4001&rft.isbn=1467313610&rft.isbn_list=9781467313612&rft_id=info:doi/10.1109/ICCE.2013.6487048&rft_dat=%3Cproquest_6IE%3E1786154967%3C/proquest_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1467313629&rft.eisbn_list=9781467313636&rft.eisbn_list=1467313637&rft.eisbn_list=9781467313629&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1786154967&rft_id=info:pmid/&rft_ieee_id=6487048&rfr_iscdi=true
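
The abstract above names parallel reduction and variable block sizes but gives no algorithmic detail. The following is a minimal sketch of the general idea only, not the authors' CPR method. It assumes a sum of absolute differences (SAD) matching cost and block sizes of 4x4, 8x8 and 16x16, none of which are stated in this record, and every function name is hypothetical. It runs sequentially in plain Python: each pass of the `while` loop stands in for one synchronization step in which a GPU would perform all additions in parallel.

```python
def tree_reduce_sum(values):
    """Sum a list using the pairwise (tree) pattern of a parallel reduction.

    Each pass of the while loop models one synchronization step; the
    additions inside a pass are independent of one another.
    """
    data = list(values)
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    stride = n // 2
    while stride > 0:
        for i in range(stride):  # independent work items within one pass
            data[i] += data[i + stride]
        stride //= 2
    return data[0]


def sad_4x4_grid(cur, ref, size=16):
    """Return a grid of 4x4 SADs between two size x size pixel blocks."""
    cells = size // 4
    grid = [[0] * cells for _ in range(cells)]
    for by in range(cells):
        for bx in range(cells):
            diffs = [
                abs(cur[by * 4 + y][bx * 4 + x] - ref[by * 4 + y][bx * 4 + x])
                for y in range(4)
                for x in range(4)
            ]
            grid[by][bx] = tree_reduce_sum(diffs)
    return grid


def merge_sads(grid):
    """Build 8x8 and 16x16 SADs from a 4x4 grid of 4x4 SADs (assumed sizes)."""
    sad8x8 = [
        [
            grid[2 * y][2 * x] + grid[2 * y][2 * x + 1]
            + grid[2 * y + 1][2 * x] + grid[2 * y + 1][2 * x + 1]
            for x in range(2)
        ]
        for y in range(2)
    ]
    sad16x16 = tree_reduce_sum(
        [sad8x8[0][0], sad8x8[0][1], sad8x8[1][0], sad8x8[1][1]]
    )
    return sad8x8, sad16x16
```

In this sketch the larger block costs reuse the 4x4 partial sums instead of re-reading pixels, and the number of `while` passes is the number of synchronization points a single reduction would need. A full search would repeat this for every candidate displacement in the search window and keep the minimum cost per block size.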