Three-dimensional memory vectorization for high bandwidth media memory systems

Vector processors have good performance, cost and adaptability when targeting multimedia applications. However, for a significant number of media programs, conventional memory configurations fail to deliver enough memory references per cycle to feed the SIMD functional units. This paper addresses th...

Bibliographic details
Main authors: Corbal, J., Espasa, R., Valero, M.
Format: Conference proceeding
Language: eng
container_end_page 160
container_issue
container_start_page 149
container_title
container_volume
creator Corbal, J.
Espasa, R.
Valero, M.
description Vector processors offer good performance, cost and adaptability when targeting multimedia applications. However, for a significant number of media programs, conventional memory configurations fail to deliver enough memory references per cycle to feed the SIMD functional units. This paper addresses the problem of memory bandwidth. We propose a novel mechanism suitable for 2-dimensional vector architectures and targeted at providing high effective bandwidth for SIMD memory instructions. The basis of this mechanism is the extension of the scope of vectorization at the memory level, so that 3-dimensional memory patterns can be fetched into a second-level register file. By fetching long blocks of data and by reusing 2-dimensional memory streams at this second-level register file, we obtain a significant increase in the effective memory bandwidth. As side benefits, the new 3-dimensional load instructions provide high robustness to memory latency and a significant reduction in cache activity, thus reducing power and energy requirements. For an investment of 50% more area than a regular SIMD register file, we have measured an average speed-up of 13% and the potential for power savings in the L2 cache of 30%.
doi_str_mv 10.1109/MICRO.2002.1176246
format Conference Proceeding
fullrecord <record><control><sourceid>csuc_6IE</sourceid><recordid>TN_cdi_csuc_recercat_oai_recercat_cat_2072_284254</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1176246</ieee_id><sourcerecordid>oai_recercat_cat_2072_284254</sourcerecordid><originalsourceid>FETCH-LOGICAL-c261t-cce0a116f9fc22305646eb4a126c01ee2bc49bbd291ba27a111ae7bec46153253</originalsourceid><addsrcrecordid>eNpFkN1OwzAMhSMBEmPsBeCmL9ARu0naXKKJn0mDSWhcV0nq0qB1RUkBjacniCEuLOvY3zmSzdgF8DkA11cPy8XTeo6cY9KlQqGO2EyXFS-VllBJrY7ZBHiJuRASTtlZjK-c8yptJ-xx0wWivPE97aIfdmab9dQPYZ99kBuH4L_MmMZZO4Ss8y9dZs2u-fTN2CWu8eaPjvs4Uh_P2UlrtpFmhz5lz7c3m8V9vlrfLRfXq9yhgjF3jrgBUK1uHWLBpRKKrDCAynEgQuuEtrZBDdZgmVAwVFpyQoEsUBZTBr-5Lr67OpCj4MxYD8b_i5_CdHaNlUApkufy1-OJqH4LvjdhXx9eVnwDM-9hJQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Three-dimensional memory vectorization for high bandwidth media memory systems</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Corbal, J. ; Espasa, R. ; Valero, M.</creator><creatorcontrib>Corbal, J. ; Espasa, R. ; Valero, M.</creatorcontrib><description>Vector processors have good performance, cost and adaptability when targeting multimedia applications. However, for a significant number of media programs, conventional memory configurations fail to deliver enough memory references per cycle to feed the SIMD functional units. This paper addresses the problem of the memory bandwidth. We propose a novel mechanism suitable for 2-dimensional vector architectures and targeted at providing high effective bandwidth for SIMD memory instructions. The basis of this mechanism is the extension of the scope of vectorization at the memory level, so that 3-dimensional memory patterns can be fetched into a second-level register file. 
By fetching long blocks of data and by reusing 2-dimensional memory streams at this second-level register file, we obtain a significant increase in the effective memory bandwidth. As side benefits, the new 3-dimensional load instructions provide a high robustness to memory latency and a significant reduction of the cache activity, thus reducing power and energy requirements. At the investment of a 50% more area than a regular SIMD register file, we have measured and average speed-up of 13% and the potential for power savings in the L2 cache of a 30%.</description><identifier>ISSN: 1072-4451</identifier><identifier>ISBN: 9780769518596</identifier><identifier>ISBN: 0769518591</identifier><identifier>DOI: 10.1109/MICRO.2002.1176246</identifier><language>eng</language><publisher>IEEE</publisher><subject>Area measurement ; Arquitectura de computadors ; Bandwidth ; Costs ; Delay ; Feeds ; High bandwidth media memory systems ; Informàtica ; Investments ; Multimedia systems ; Parallel processing (Electronic computers) ; Power consumption ; Power measurement ; Processament en paral·lel (Ordinadors) ; Registers ; Robustness ; Sistemes multimèdia ; Vector processor systems ; Vector processors ; Àrees temàtiques de la UPC</subject><ispartof>35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). 
Proceedings, 2002, p.149-160</ispartof><rights>info:eu-repo/semantics/openAccess</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1176246$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>230,309,310,780,784,789,790,885,2058,4050,4051,26974,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1176246$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Corbal, J.</creatorcontrib><creatorcontrib>Espasa, R.</creatorcontrib><creatorcontrib>Valero, M.</creatorcontrib><title>Three-dimensional memory vectorization for high bandwidth media memory systems</title><title>35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings</title><addtitle>MICRO</addtitle><description>Vector processors have good performance, cost and adaptability when targeting multimedia applications. However, for a significant number of media programs, conventional memory configurations fail to deliver enough memory references per cycle to feed the SIMD functional units. This paper addresses the problem of the memory bandwidth. We propose a novel mechanism suitable for 2-dimensional vector architectures and targeted at providing high effective bandwidth for SIMD memory instructions. The basis of this mechanism is the extension of the scope of vectorization at the memory level, so that 3-dimensional memory patterns can be fetched into a second-level register file. By fetching long blocks of data and by reusing 2-dimensional memory streams at this second-level register file, we obtain a significant increase in the effective memory bandwidth. 
As side benefits, the new 3-dimensional load instructions provide a high robustness to memory latency and a significant reduction of the cache activity, thus reducing power and energy requirements. At the investment of a 50% more area than a regular SIMD register file, we have measured and average speed-up of 13% and the potential for power savings in the L2 cache of a 30%.</description><subject>Area measurement</subject><subject>Arquitectura de computadors</subject><subject>Bandwidth</subject><subject>Costs</subject><subject>Delay</subject><subject>Feeds</subject><subject>High bandwidth media memory systems</subject><subject>Informàtica</subject><subject>Investments</subject><subject>Multimedia systems</subject><subject>Parallel processing (Electronic computers)</subject><subject>Power consumption</subject><subject>Power measurement</subject><subject>Processament en paral·lel (Ordinadors)</subject><subject>Registers</subject><subject>Robustness</subject><subject>Sistemes multimèdia</subject><subject>Vector processor systems</subject><subject>Vector processors</subject><subject>Àrees temàtiques de la UPC</subject><issn>1072-4451</issn><isbn>9780769518596</isbn><isbn>0769518591</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2002</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><sourceid>XX2</sourceid><recordid>eNpFkN1OwzAMhSMBEmPsBeCmL9ARu0naXKKJn0mDSWhcV0nq0qB1RUkBjacniCEuLOvY3zmSzdgF8DkA11cPy8XTeo6cY9KlQqGO2EyXFS-VllBJrY7ZBHiJuRASTtlZjK-c8yptJ-xx0wWivPE97aIfdmab9dQPYZ99kBuH4L_MmMZZO4Ss8y9dZs2u-fTN2CWu8eaPjvs4Uh_P2UlrtpFmhz5lz7c3m8V9vlrfLRfXq9yhgjF3jrgBUK1uHWLBpRKKrDCAynEgQuuEtrZBDdZgmVAwVFpyQoEsUBZTBr-5Lr67OpCj4MxYD8b_i5_CdHaNlUApkufy1-OJqH4LvjdhXx9eVnwDM-9hJQ</recordid><startdate>2002</startdate><enddate>2002</enddate><creator>Corbal, J.</creator><creator>Espasa, R.</creator><creator>Valero, M.</creator><general>IEEE</general><general>Institute of Electrical and 
Electronics Engineers (IEEE)</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope><scope>XX2</scope></search><sort><creationdate>2002</creationdate><title>Three-dimensional memory vectorization for high bandwidth media memory systems</title><author>Corbal, J. ; Espasa, R. ; Valero, M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c261t-cce0a116f9fc22305646eb4a126c01ee2bc49bbd291ba27a111ae7bec46153253</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2002</creationdate><topic>Area measurement</topic><topic>Arquitectura de computadors</topic><topic>Bandwidth</topic><topic>Costs</topic><topic>Delay</topic><topic>Feeds</topic><topic>High bandwidth media memory systems</topic><topic>Informàtica</topic><topic>Investments</topic><topic>Multimedia systems</topic><topic>Parallel processing (Electronic computers)</topic><topic>Power consumption</topic><topic>Power measurement</topic><topic>Processament en paral·lel (Ordinadors)</topic><topic>Registers</topic><topic>Robustness</topic><topic>Sistemes multimèdia</topic><topic>Vector processor systems</topic><topic>Vector processors</topic><topic>Àrees temàtiques de la UPC</topic><toplevel>online_resources</toplevel><creatorcontrib>Corbal, J.</creatorcontrib><creatorcontrib>Espasa, R.</creatorcontrib><creatorcontrib>Valero, M.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection><collection>Recercat</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Corbal, J.</au><au>Espasa, R.</au><au>Valero, M.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Three-dimensional memory vectorization for high bandwidth media memory systems</atitle><btitle>35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings</btitle><stitle>MICRO</stitle><date>2002</date><risdate>2002</risdate><spage>149</spage><epage>160</epage><pages>149-160</pages><issn>1072-4451</issn><isbn>9780769518596</isbn><isbn>0769518591</isbn><abstract>Vector processors have good performance, cost and adaptability when targeting multimedia applications. However, for a significant number of media programs, conventional memory configurations fail to deliver enough memory references per cycle to feed the SIMD functional units. This paper addresses the problem of the memory bandwidth. We propose a novel mechanism suitable for 2-dimensional vector architectures and targeted at providing high effective bandwidth for SIMD memory instructions. The basis of this mechanism is the extension of the scope of vectorization at the memory level, so that 3-dimensional memory patterns can be fetched into a second-level register file. By fetching long blocks of data and by reusing 2-dimensional memory streams at this second-level register file, we obtain a significant increase in the effective memory bandwidth. As side benefits, the new 3-dimensional load instructions provide a high robustness to memory latency and a significant reduction of the cache activity, thus reducing power and energy requirements. At the investment of a 50% more area than a regular SIMD register file, we have measured and average speed-up of 13% and the potential for power savings in the L2 cache of a 30%.</abstract><pub>IEEE</pub><doi>10.1109/MICRO.2002.1176246</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1072-4451
ispartof 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings, 2002, p.149-160
issn 1072-4451
language eng
recordid cdi_csuc_recercat_oai_recercat_cat_2072_284254
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Area measurement
Arquitectura de computadors
Bandwidth
Costs
Delay
Feeds
High bandwidth media memory systems
Informàtica
Investments
Multimedia systems
Parallel processing (Electronic computers)
Power consumption
Power measurement
Processament en paral·lel (Ordinadors)
Registers
Robustness
Sistemes multimèdia
Vector processor systems
Vector processors
Àrees temàtiques de la UPC
title Three-dimensional memory vectorization for high bandwidth media memory systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T12%3A01%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-csuc_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Three-dimensional%20memory%20vectorization%20for%20high%20bandwidth%20media%20memory%20systems&rft.btitle=35th%20Annual%20IEEE/ACM%20International%20Symposium%20on%20Microarchitecture,%202002.%20(MICRO-35).%20Proceedings&rft.au=Corbal,%20J.&rft.date=2002&rft.spage=149&rft.epage=160&rft.pages=149-160&rft.issn=1072-4451&rft.isbn=9780769518596&rft.isbn_list=0769518591&rft_id=info:doi/10.1109/MICRO.2002.1176246&rft_dat=%3Ccsuc_6IE%3Eoai_recercat_cat_2072_284254%3C/csuc_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1176246&rfr_iscdi=true