GREEN Cache: Exploiting the Disciplined Memory Model of OpenCL on GPUs

As graphics processing unit (GPU) architectures are deployed across a broad computing spectrum, from hand-held and embedded devices to high-performance computing servers, OpenCL has become the de facto standard programming environment for general-purpose computing on GPUs. Unlike its CPU counterpart, OpenCL has several distinct features, such as its disciplined memory model, which is partially inherited from conventional 3D graphics programming models. At the same time, due to ever-increasing memory bandwidth pressure and low-power requirements, the capacity of on-chip caches in GPUs keeps growing over time. Given these trends, we believe there are interesting programming model/architecture co-optimization opportunities, in particular in how to utilize large on-chip GPU caches energy-efficiently. In this paper, as a showcase, we study the characteristics of the OpenCL memory model and propose a technique called the GPU Region-aware Energy-Efficient Non-inclusive cache hierarchy, or GREEN cache hierarchy. With the GREEN cache, our simulation results show that we can save 56 percent of dynamic energy in the L1 cache, 39 percent of dynamic energy in the L2 cache, and 50 percent of leakage energy in the L2 cache, with practically no performance degradation and no increase in off-chip accesses.
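
For context, here is a minimal OpenCL C kernel sketch (illustrative only, not code from the paper; the kernel and argument names are hypothetical). It shows the four disjoint address spaces of OpenCL's disciplined memory model (__global, __constant, __local, and __private), which is the kind of per-access region information a region-aware cache hierarchy such as GREEN can exploit:

```c
/* Illustrative OpenCL C kernel (a sketch, not code from the paper).
 * Every buffer is statically tagged with one of OpenCL's disjoint memory regions:
 *   __global   - off-chip device memory, visible to all work-items
 *   __constant - read-only data, uniform across the kernel launch
 *   __local    - on-chip memory shared within one work-group
 *   private    - per-work-item variables (the default for kernel locals)
 * The kernel and argument names below are hypothetical. */
__kernel void scale_copy(__global const float *in,   /* global region */
                         __global float *out,        /* global region */
                         __constant float *coeff,    /* constant region */
                         __local float *scratch)     /* local region */
{
    int gid = get_global_id(0);
    int lid = get_local_id(0);

    float v = in[gid] * coeff[0];   /* 'v' is private to this work-item */

    scratch[lid] = v;               /* staged in on-chip local memory */
    barrier(CLK_LOCAL_MEM_FENCE);   /* local memory is only guaranteed
                                       consistent within the work-group
                                       at a barrier */

    out[gid] = scratch[lid];        /* written back to global memory */
}
```

Because every access is statically bound to one of these regions, hardware can, in principle, manage accesses to different regions differently; this disciplined structure is the property the GREEN cache hierarchy builds on, per the abstract.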

Bibliographic Details
Published in: IEEE Transactions on Computers, 2015-11, Vol. 64 (11), p. 3167-3180
Main authors: Jaekyu Lee, Dong Hyuk Woo, Hyesoon Kim, Mani Azimi
Format: Article
Language: English
Subjects: Cache; Computation; Computational modeling; Computer simulation; Devices; Dynamics; Energy conservation; GPU; Graphics boards; Graphics processing units; Hardware; Hierarchies; Kernel; Memory management; Microprocessors; OpenCL; Product development; Programming; Semiconductors; Training
Online access: Order full text
Publisher: IEEE, New York
DOI: 10.1109/TC.2015.2395435
ISSN: 0018-9340
EISSN: 1557-9956
CODEN: ITCOB4