Chameleon: Virtualizing idle acceleration cores of a heterogeneous multicore processor for caching and prefetching

Heterogeneous multicore processors have emerged as an energy- and area-efficient architectural solution to improving performance for domain-specific applications such as those with a plethora of data-level parallelism. These processors typically contain a large number of small, compute-centric cores...

Bibliographic details
Published in:	ACM transactions on architecture and code optimization, 2010-04, Vol.7 (1), p.1-35
Main authors:	Woo, Dong Hyuk; Fryman, Joshua B.; Knies, Allan D.; Lee, Hsien-Hsin S.
Format:	Article
Language:	English
Online access:	Full text

container_end_page	35
container_issue	1
container_start_page	1
container_title	ACM transactions on architecture and code optimization
container_volume	7
creator	Woo, Dong Hyuk; Fryman, Joshua B.; Knies, Allan D.; Lee, Hsien-Hsin S.
description	Heterogeneous multicore processors have emerged as an energy- and area-efficient architectural solution to improving performance for domain-specific applications such as those with a plethora of data-level parallelism. These processors typically contain a large number of small, compute-centric cores for acceleration while keeping one or two high-performance ILP cores on the die to guarantee single-thread performance. Although a major portion of the transistors are occupied by the acceleration cores, these resources will sit idle when running unparallelized legacy codes or the sequential part of an application. To address this underutilization issue, in this article, we introduce Chameleon, a flexible heterogeneous multicore architecture to virtualize these resources for enhancing memory performance when running sequential programs. The Chameleon architecture can dynamically virtualize the idle acceleration cores into a last-level cache, a data prefetcher, or a hybrid between these two techniques. In addition, Chameleon can operate in an adaptive mode that dynamically configures the acceleration cores between the hybrid mode and the prefetch-only mode by monitoring the effectiveness of the Chameleon cache mode. In our evaluation with SPEC2006 benchmark suite, different levels of performance improvements were achieved in different modes for different applications. In the case of the adaptive mode, Chameleon improves the performance of SPECint06 and SPECfp06 by 31% and 15%, on average. When considering only memory-intensive applications, Chameleon improves the system performance by 50% and 26% for SPECint06 and SPECfp06, respectively.
doi_str_mv	10.1145/1736065.1736068
format	Article
fulltext	fulltext
identifier	ISSN: 1544-3566
ispartof	ACM transactions on architecture and code optimization, 2010-04, Vol.7 (1), p.1-35
issn	1544-3566 1544-3973
language	eng
recordid	cdi_crossref_primary_10_1145_1736065_1736068
source	ACM Digital Library; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
title	Chameleon: Virtualizing idle acceleration cores of a heterogeneous multicore processor for caching and prefetching
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-12T23%3A47%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Chameleon:%20Virtualizing%20idle%20acceleration%20cores%20of%20a%20heterogeneous%20multicore%20processor%20for%20caching%20and%20prefetching&rft.jtitle=ACM%20transactions%20on%20architecture%20and%20code%20optimization&rft.au=Woo,%20Dong%20Hyuk&rft.date=2010-04&rft.volume=7&rft.issue=1&rft.spage=1&rft.epage=35&rft.pages=1-35&rft.issn=1544-3566&rft.eissn=1544-3973&rft_id=info:doi/10.1145/1736065.1736068&rft_dat=%3Ccrossref%3E10_1145_1736065_1736068%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true