Data prefetching by dependence graph precomputation

Data cache misses reduce the performance of wide-issue processors by stalling the data supply to the processor. Prefetching data by predicting the miss address is one way to tolerate the cache miss latencies. But current applications with irregular access patterns make it difficult to accurately predict the address sufficiently early to mask large cache miss latencies.

Full description

Saved in:
Bibliographic details
Main authors: Annavaram, Murali, Patel, Jignesh M., Davidson, Edward S.
Format: Conference proceedings
Language: eng
Subjects:
Online access: Full text
container_end_page 61
container_issue
container_start_page 52
container_title Computer architecture news
container_volume
creator Annavaram, Murali
Patel, Jignesh M.
Davidson, Edward S.
description Data cache misses reduce the performance of wide-issue processors by stalling the data supply to the processor. Prefetching data by predicting the miss address is one way to tolerate the cache miss latencies. But current applications with irregular access patterns make it difficult to accurately predict the address sufficiently early to mask large cache miss latencies. This paper explores an alternative to predicting prefetch addresses, namely precomputing them. The Dependence Graph Precomputation scheme (DGP) introduced in this paper is a novel approach for dynamically identifying and precomputing the instructions that determine the addresses accessed by those load/store instructions marked as being responsible for most data cache misses. DGP's dependence graph generator efficiently generates the required dependence graphs at run time. A separate precomputation engine executes these graphs to generate the data addresses of the marked load/store instructions early enough for accurate prefetching. Our results show that 94% of the prefetches issued by DGP are useful, reducing the D-cache miss stall time by 47%. Thus DGP takes us about half way from an already highly tuned baseline system toward perfect D-cache performance. DGP improves the overall performance of a wide range of applications by 7% over tagged next line prefetching, by 13% over a baseline processor with no prefetching, and is within 15% of the perfect D-cache performance.
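The description above outlines the two steps of DGP: dynamically extracting the chain of instructions that produce a miss-prone load's address, and executing that chain ahead of the load to generate a prefetch address. As an illustration only, the following is a minimal Python sketch of that idea under stated assumptions: the toy instruction format, register names, and three-operation ISA are inventions for this example, not the paper's hardware design.

# Hypothetical sketch of dependence graph precomputation (illustration only):
# build the backward dependence slice for a marked load's address operands
# from a small instruction trace, then execute only that slice to precompute
# the address it will access.

from dataclasses import dataclass

@dataclass
class Instr:
    op: str                 # e.g. "add", "load", "mul", "li" (toy ISA)
    dst: str                # destination register
    srcs: tuple = ()        # source registers
    imm: int = 0            # immediate operand, if any

def backward_slice(trace, marked_idx):
    """Walk the trace backwards from the marked load, collecting the most
    recent producer of each register still needed to form its address."""
    needed = set(trace[marked_idx].srcs)    # address operands of the marked load
    slice_idxs = []
    for i in range(marked_idx - 1, -1, -1):
        ins = trace[i]
        if ins.dst in needed:
            slice_idxs.append(i)
            needed.discard(ins.dst)
            needed.update(ins.srcs)
    return list(reversed(slice_idxs))

def precompute_address(trace, slice_idxs, marked_idx, regs):
    """Execute only the dependence slice to produce the load address."""
    regs = dict(regs)                       # snapshot of live register values
    for i in slice_idxs:
        ins = trace[i]
        if ins.op == "li":
            regs[ins.dst] = ins.imm
        elif ins.op == "add":
            regs[ins.dst] = regs[ins.srcs[0]] + regs[ins.srcs[1]]
        elif ins.op == "mul":
            regs[ins.dst] = regs[ins.srcs[0]] * regs[ins.srcs[1]]
    load = trace[marked_idx]
    return regs[load.srcs[0]] + load.imm    # base + offset addressing

# Toy example: r3 = r1 * r1; r4 = r2 + r3; load from [r4 + 16]
trace = [
    Instr("li",  "r5", imm=42),             # unrelated, excluded from the slice
    Instr("mul", "r3", ("r1", "r1")),
    Instr("add", "r4", ("r2", "r3")),
    Instr("load", "r7", ("r4",), imm=16),   # marked as a frequent misser
]
sl = backward_slice(trace, 3)
addr = precompute_address(trace, sl, 3, {"r1": 8, "r2": 0x1000})
print(sl, hex(addr))                        # [1, 2] 0x1050

Per the abstract, DGP builds these graphs at run time from the instruction stream and executes them on a separate precomputation engine; the sketch mirrors only the dataflow idea, not the timing, storage structures, or marking heuristics.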
doi_str_mv 10.1145/379240.379251
format Conference Proceeding
fullrecord Proceeding record: Data prefetching by dependence graph precomputation. Computer architecture news, 2001, p.52-61 (10 pages). New York, NY, USA: ACM. ISBN: 0769511627; ISBN: 9780769511627. ISSN: 0163-5964. DOI: 10.1145/379240.379251. Rights: 2001 Authors.
fulltext fulltext
identifier ISSN: 0163-5964
ispartof Computer architecture news, 2001, p.52-61
issn 0163-5964
language eng
recordid cdi_acm_books_10_1145_379240_379251
source IEEE Electronic Library (IEL) Conference Proceedings; ACM Digital Library Complete
subjects Computer systems organization -- Dependable and fault-tolerant systems and networks
Computing methodologies -- Modeling and simulation
Computing methodologies -- Modeling and simulation -- Model development and analysis
Computing methodologies -- Modeling and simulation -- Model development and analysis -- Model verification and validation
Computing methodologies -- Modeling and simulation -- Simulation evaluation
General and reference -- Cross-computing tools and techniques -- Evaluation
General and reference -- Cross-computing tools and techniques -- Metrics
General and reference -- Cross-computing tools and techniques -- Performance
Mathematics of computing -- Discrete mathematics
Networks -- Network performance evaluation
Software and its engineering -- Software organization and properties -- Contextual software domains -- Operating systems -- Process management
title Data prefetching by dependence graph precomputation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T16%3A01%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_acm_b&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Data%20prefetching%20by%20dependence%20graph%20precomputation&rft.btitle=Computer%20architecture%20news&rft.au=Annavaram,%20Murali&rft.date=2001-01-01&rft.spage=52&rft.epage=61&rft.pages=52-61&rft.issn=0163-5964&rft.isbn=0769511627&rft.isbn_list=9780769511627&rft_id=info:doi/10.1145/379240.379251&rft_dat=%3Cproquest_acm_b%3E31172402%3C/proquest_acm_b%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=31172402&rft_id=info:pmid/&rfr_iscdi=true