Reducing operand communication overhead using instruction clustering for multimedia applications
As technology trends yield shorter cycle times and larger, wider datapaths in architectures for multimedia systems, global broadcast networks for operand communication are becoming a major bottleneck in processor performance. New low latency operand transport techniques are needed. This paper propos...
Main authors: | Hongkyu Kim, D.S. Wills, L.M. Wills |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | 8 pp. |
container_title | |
container_volume | |
creator | Hongkyu Kim ; Wills, D.S. ; Wills, L.M. |
description | As technology trends yield shorter cycle times and larger, wider datapaths in architectures for multimedia systems, global broadcast networks for operand communication are becoming a major bottleneck in processor performance. New low latency operand transport techniques are needed. This paper proposes and evaluates lower cost mechanisms than traditional bypass networks, exploiting regular operand distribution patterns in multimedia applications. To reduce latency associated with operand movement within a datapath, our mechanism, called dynamic instruction clustering, groups chains of dependent instructions within a basic block at runtime, identifies intermediate value transportation, and schedules it on networked ALUs which are connected by a local dedicated network. By converting global communication into local, the transport latency can be minimized and the critical path of the application code can be executed in consecutive, shortened cycles, resulting in improved performance. We demonstrated that 28% and 30% of total dependence edges residing in the instruction window can be localized on 8 and 16-way machines, respectively. Our results show that the overall performance gains over a wide range of multimedia applications are 16% for 8-way and 35% for 16-way on average. |
doi_str_mv | 10.1109/ISM.2005.95 |
format | Conference Proceeding |
fullrecord | recordid: TN_cdi_ieee_primary_1565852 ; ieee_id: 1565852 ; recordtype: conference_proceeding ; identifier: ISBN: 0769524893 ; ISBN: 9780769524894 ; DOI: 10.1109/ISM.2005.95 ; publisher: IEEE ; creationdate: 2005 ; pages: 8 pp. ; linktohtml: https://ieeexplore.ieee.org/document/1565852 ; collection: IEEE Electronic Library (IEL) Conference Proceedings ; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume ; IEEE Xplore All Conference Proceedings ; IEEE Electronic Library (IEL) ; IEEE Proceedings Order Plans (POP All) 1998-Present |
fulltext | fulltext_linktorsrc |
identifier | ISBN: 0769524893 |
ispartof | Seventh IEEE International Symposium on Multimedia (ISM'05), 2005, p.8 pp. |
issn | |
language | eng |
recordid | cdi_ieee_primary_1565852 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Broadcast technology ; Costs ; Delay ; Dynamic scheduling ; Global communication ; Multimedia communication ; Multimedia systems ; Performance gain ; Runtime ; Transportation |
title | Reducing operand communication overhead using instruction clustering for multimedia applications |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T02%3A25%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Reducing%20operand%20communication%20overhead%20using%20instruction%20clustering%20for%20multimedia%20applications&rft.btitle=Seventh%20IEEE%20International%20Symposium%20on%20Multimedia%20(ISM'05)&rft.au=Hongkyu%20Kim&rft.date=2005&rft.spage=8%20pp.&rft.pages=8%20pp.-&rft.isbn=0769524893&rft.isbn_list=9780769524894&rft_id=info:doi/10.1109/ISM.2005.95&rft_dat=%3Cieee_6IE%3E1565852%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1565852&rfr_iscdi=true |
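
The description above says that dynamic instruction clustering groups chains of dependent instructions within a basic block and turns global operand communication into local communication. The following is a minimal sketch of that idea only, not the authors' hardware mechanism: the `Instr` type, the greedy chaining rule, the `max_len` limit, and the notion that an edge counts as "localized" when producer and consumer land in the same chain are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: one destination register, a tuple of source registers."""
    dest: str
    srcs: tuple


def dependence_edges(block):
    """Return (producer_index, consumer_index) pairs for true dependences in one basic block."""
    last_writer = {}
    edges = []
    for i, ins in enumerate(block):
        # dict.fromkeys drops repeated source registers while keeping their order
        for reg in dict.fromkeys(ins.srcs):
            if reg in last_writer:
                edges.append((last_writer[reg], i))
        last_writer[ins.dest] = i
    return edges


def cluster_chains(block, max_len=4):
    """Greedily append each instruction to a chain whose current tail produces one of its operands."""
    edges = dependence_edges(block)
    producers = {}
    for p, c in edges:
        producers.setdefault(c, []).append(p)

    chains = []    # each chain is a list of instruction indices
    chain_of = {}  # instruction index -> index of its chain
    for i in range(len(block)):
        for p in producers.get(i, []):
            cid = chain_of[p]
            if chains[cid][-1] == p and len(chains[cid]) < max_len:
                chains[cid].append(i)
                chain_of[i] = cid
                break
        else:
            # no producer is a chain tail with room left: start a new chain
            chain_of[i] = len(chains)
            chains.append([i])

    # assumption: an edge is "localized" when both ends sit in the same chain
    local = [(p, c) for p, c in edges if chain_of[p] == chain_of[c]]
    return chains, edges, local


block = [
    Instr("r1", ("a",)),
    Instr("r2", ("b",)),
    Instr("r3", ("r1", "r2")),
    Instr("r4", ("r3",)),
    Instr("r5", ("r4", "r1")),
]
chains, edges, local = cluster_chains(block)
print(chains)
print(f"{len(local)} of {len(edges)} dependence edges localized")
```

The fraction printed for this toy block is unrelated to the 28% and 30% figures reported in the description, which were measured over an instruction window on 8-way and 16-way machines.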