Improving hash join performance through prefetching

Hash join algorithms suffer from extensive CPU cache stalls. We show that the standard hash join algorithm for disk-oriented databases (i.e. GRACE) spends over 73% of its user time stalled on CPU cache misses, and explore the use of prefetching to improve its cache performance. Applying prefetching...

Bibliographic Details
Main authors:	Chen, S., Ailamaki, A., Gibbons, P.B., Mowry, T.C.
Format:	Conference proceeding
Language:	eng
Subjects:	Applied sciences ; Computer science; control theory; systems ; Costs ; Database systems ; Delay ; Electric breakdown ; Exact sciences and technology ; Information systems. Data bases ; Memory organisation. Data processing ; Partitioning algorithms ; Prefetching ; Probes ; Software

container_end_page	127
container_issue
container_start_page	116
container_title
container_volume
creator	Chen, S. ; Ailamaki, A. ; Gibbons, P.B. ; Mowry, T.C.
description	Hash join algorithms suffer from extensive CPU cache stalls. We show that the standard hash join algorithm for disk-oriented databases (i.e. GRACE) spends over 73% of its user time stalled on CPU cache misses, and explore the use of prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. These schemes achieve 2.0-2.9X speedups for the join phase and 1.4-2.6X speedups for the partition phase over GRACE and simple prefetching approaches. Compared with previous cache-aware approaches (i.e. cache partitioning), the schemes are at least 50% faster on large relations and do not require exclusive use of the CPU cache to be effective.
doi_str_mv	10.1109/ICDE.2004.1319989
format	Conference Proceeding
fullrecord	<record><control><sourceid>pascalfrancis_6IE</sourceid><recordid>TN_cdi_pascalfrancis_primary_18596791</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1319989</ieee_id><sourcerecordid>18596791</sourcerecordid><originalsourceid>FETCH-LOGICAL-i203t-a124b19ba1181e79278bd6361554476c6e26f631777bb36460f7faa0fa972cf93</originalsourceid><addsrcrecordid>eNpFkEtLxEAQhAcf4LLmB4iXXDwmds9ketJHWVcNLHhR8LZM4swmy-bBJAr-ewMRrEsd6qMoSogbhBQR-L7YPG5TCZClqJA55zOxksroBCR9nIuITQ6GWEsgDRdihUAqIZXLKxGN4xFmcYaoYSVU0Q6h_266Q1zbsY6PfdPFgwu-D63tKhdPdei_DnU8BOfdVNUzeS0uvT2NLvrztXh_2r5tXpLd63OxedgljQQ1JRZlViKXFjFHZ1iavPwkRah1lhmqyEnypNAYU5aKMgJvvLXgLRtZeVZrcbf0Dnas7MmHeVAz7ofQtDb87DHXTIZx5m4XrnHO_cfLNeoXgg5Txw</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Improving hash join performance through prefetching</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Chen, S. ; Ailamaki, A. ; Gibbons, P.B. ; Mowry, T.C.</creator><creatorcontrib>Chen, S. ; Ailamaki, A. ; Gibbons, P.B. ; Mowry, T.C.</creatorcontrib><description>Hash join algorithms suffer from extensive CPU cache stalls. We show that the standard hash join algorithm for disk-oriented databases (i.e. GRACE) spends over 73% of its user time stalled on CPU cache misses, and explore the use of prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. These schemes achieve 2.0-2.9X speedups for the join phase and 1.4-2.6X speedups for the partition phase over GRACE and simple prefetching approaches. Compared with previous cache-aware approaches (i.e. 
cache partitioning), the schemes are at least 50% faster on large relations and do not require exclusive use of the CPU cache to be effective.</description><identifier>ISSN: 1063-6382</identifier><identifier>ISBN: 9780769520650</identifier><identifier>ISBN: 0769520650</identifier><identifier>EISSN: 2375-026X</identifier><identifier>DOI: 10.1109/ICDE.2004.1319989</identifier><language>eng</language><publisher>Los Alamitos CA: IEEE</publisher><subject>Applied sciences ; Computer science; control theory; systems ; Costs ; Database systems ; Delay ; Electric breakdown ; Exact sciences and technology ; Information systems. Data bases ; Memory organisation. Data processing ; Partitioning algorithms ; Prefetching ; Probes ; Software</subject><ispartof>Proceedings. 20th International Conference on Data Engineering, 2004, p.116-127</ispartof><rights>2007 INIST-CNRS</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1319989$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,777,781,786,787,2052,4036,4037,27906,54901</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1319989$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18596791$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Chen, S.</creatorcontrib><creatorcontrib>Ailamaki, A.</creatorcontrib><creatorcontrib>Gibbons, P.B.</creatorcontrib><creatorcontrib>Mowry, T.C.</creatorcontrib><title>Improving hash join performance through prefetching</title><title>Proceedings. 20th International Conference on Data Engineering</title><addtitle>ICDE</addtitle><description>Hash join algorithms suffer from extensive CPU cache stalls. 
We show that the standard hash join algorithm for disk-oriented databases (i.e. GRACE) spends over 73% of its user time stalled on CPU cache misses, and explore the use of prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. These schemes achieve 2.0-2.9X speedups for the join phase and 1.4-2.6X speedups for the partition phase over GRACE and simple prefetching approaches. Compared with previous cache-aware approaches (i.e. cache partitioning), the schemes are at least 50% faster on large relations and do not require exclusive use of the CPU cache to be effective.</description><subject>Applied sciences</subject><subject>Computer science; control theory; systems</subject><subject>Costs</subject><subject>Database systems</subject><subject>Delay</subject><subject>Electric breakdown</subject><subject>Exact sciences and technology</subject><subject>Information systems. Data bases</subject><subject>Memory organisation. 
Data processing</subject><subject>Partitioning algorithms</subject><subject>Prefetching</subject><subject>Probes</subject><subject>Software</subject><issn>1063-6382</issn><issn>2375-026X</issn><isbn>9780769520650</isbn><isbn>0769520650</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2004</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNpFkEtLxEAQhAcf4LLmB4iXXDwmds9ketJHWVcNLHhR8LZM4swmy-bBJAr-ewMRrEsd6qMoSogbhBQR-L7YPG5TCZClqJA55zOxksroBCR9nIuITQ6GWEsgDRdihUAqIZXLKxGN4xFmcYaoYSVU0Q6h_266Q1zbsY6PfdPFgwu-D63tKhdPdei_DnU8BOfdVNUzeS0uvT2NLvrztXh_2r5tXpLd63OxedgljQQ1JRZlViKXFjFHZ1iavPwkRah1lhmqyEnypNAYU5aKMgJvvLXgLRtZeVZrcbf0Dnas7MmHeVAz7ofQtDb87DHXTIZx5m4XrnHO_cfLNeoXgg5Txw</recordid><startdate>2004</startdate><enddate>2004</enddate><creator>Chen, S.</creator><creator>Ailamaki, A.</creator><creator>Gibbons, P.B.</creator><creator>Mowry, T.C.</creator><general>IEEE</general><general>IEEE Computer Society</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope><scope>IQODW</scope></search><sort><creationdate>2004</creationdate><title>Improving hash join performance through prefetching</title><author>Chen, S. ; Ailamaki, A. ; Gibbons, P.B. ; Mowry, T.C.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i203t-a124b19ba1181e79278bd6361554476c6e26f631777bb36460f7faa0fa972cf93</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Applied sciences</topic><topic>Computer science; control theory; systems</topic><topic>Costs</topic><topic>Database systems</topic><topic>Delay</topic><topic>Electric breakdown</topic><topic>Exact sciences and technology</topic><topic>Information systems. Data bases</topic><topic>Memory organisation. 
Data processing</topic><topic>Partitioning algorithms</topic><topic>Prefetching</topic><topic>Probes</topic><topic>Software</topic><toplevel>online_resources</toplevel><creatorcontrib>Chen, S.</creatorcontrib><creatorcontrib>Ailamaki, A.</creatorcontrib><creatorcontrib>Gibbons, P.B.</creatorcontrib><creatorcontrib>Mowry, T.C.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Chen, S.</au><au>Ailamaki, A.</au><au>Gibbons, P.B.</au><au>Mowry, T.C.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Improving hash join performance through prefetching</atitle><btitle>Proceedings. 20th International Conference on Data Engineering</btitle><stitle>ICDE</stitle><date>2004</date><risdate>2004</risdate><spage>116</spage><epage>127</epage><pages>116-127</pages><issn>1063-6382</issn><eissn>2375-026X</eissn><isbn>9780769520650</isbn><isbn>0769520650</isbn><abstract>Hash join algorithms suffer from extensive CPU cache stalls. We show that the standard hash join algorithm for disk-oriented databases (i.e. GRACE) spends over 73% of its user time stalled on CPU cache misses, and explore the use of prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. 
These schemes achieve 2.0-2.9X speedups for the join phase and 1.4-2.6X speedups for the partition phase over GRACE and simple prefetching approaches. Compared with previous cache-aware approaches (i.e. cache partitioning), the schemes are at least 50% faster on large relations and do not require exclusive use of the CPU cache to be effective.</abstract><cop>Los Alamitos CA</cop><pub>IEEE</pub><doi>10.1109/ICDE.2004.1319989</doi><tpages>12</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1063-6382
ispartof	Proceedings. 20th International Conference on Data Engineering, 2004, p.116-127
issn	1063-6382 2375-026X
language	eng
recordid	cdi_pascalfrancis_primary_18596791
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Applied sciences ; Computer science; control theory; systems ; Costs ; Database systems ; Delay ; Electric breakdown ; Exact sciences and technology ; Information systems. Data bases ; Memory organisation. Data processing ; Partitioning algorithms ; Prefetching ; Probes ; Software
title	Improving hash join performance through prefetching
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T09%3A33%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Improving%20hash%20join%20performance%20through%20prefetching&rft.btitle=Proceedings.%2020th%20International%20Conference%20on%20Data%20Engineering&rft.au=Chen,%20S.&rft.date=2004&rft.spage=116&rft.epage=127&rft.pages=116-127&rft.issn=1063-6382&rft.eissn=2375-026X&rft.isbn=9780769520650&rft.isbn_list=0769520650&rft_id=info:doi/10.1109/ICDE.2004.1319989&rft_dat=%3Cpascalfrancis_6IE%3E18596791%3C/pascalfrancis_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1319989&rfr_iscdi=true
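
The description field names group prefetching as one of the two techniques. As a rough illustration only, the idea can be sketched in C for a simplified probe loop: the hash and the prefetch are issued for a whole group of probe keys first, and the buckets are inspected afterwards, so the cache misses of one group may overlap instead of being paid one at a time. This is not the authors' code. The bucket layout, the constants `NBUCKETS`, `GROUP` and `SLOTS`, and all function names are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler for `__builtin_prefetch`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 1024u  /* hypothetical table size, a power of two */
#define GROUP    16u    /* hypothetical number of probes handled per group */
#define SLOTS    4u     /* hypothetical fixed bucket capacity */

/* Hypothetical in-memory bucket: a small array of build-side keys. */
typedef struct {
    uint32_t count;
    uint32_t keys[SLOTS];
} bucket_t;

/* Multiplicative hash, masked to the table size. */
uint32_t hash_key(uint32_t k) { return (k * 2654435761u) & (NBUCKETS - 1u); }

/* Build side: returns 1 on success, 0 if the bucket is full
   (overflow handling is left out of this sketch). */
int build_insert(bucket_t *table, uint32_t key) {
    bucket_t *b = &table[hash_key(key)];
    if (b->count == SLOTS) return 0;
    b->keys[b->count++] = key;
    return 1;
}

/* Baseline probe: each key may stall on its own bucket miss. */
size_t probe_simple(const bucket_t *table, const uint32_t *keys, size_t n) {
    size_t matches = 0;
    for (size_t i = 0; i < n; i++) {
        const bucket_t *b = &table[hash_key(keys[i])];
        for (uint32_t s = 0; s < b->count; s++)
            matches += (b->keys[s] == keys[i]);
    }
    return matches;
}

/* Grouped probe: stage 1 hashes a group of keys and prefetches their
   buckets; stage 2 visits the buckets once the prefetches are in flight. */
size_t probe_grouped(const bucket_t *table, const uint32_t *keys, size_t n) {
    size_t matches = 0;
    const bucket_t *pending[GROUP];
    for (size_t base = 0; base < n; base += GROUP) {
        size_t g = (n - base < GROUP) ? n - base : GROUP;
        for (size_t j = 0; j < g; j++) {              /* stage 1 */
            pending[j] = &table[hash_key(keys[base + j])];
            __builtin_prefetch(pending[j], 0, 3);     /* read, keep in cache */
        }
        for (size_t j = 0; j < g; j++) {              /* stage 2 */
            const bucket_t *b = pending[j];
            for (uint32_t s = 0; s < b->count; s++)
                matches += (b->keys[s] == keys[base + j]);
        }
    }
    return matches;
}

/* Usage sketch: build 500 keys, probe 1000, return 1 if both probes agree. */
int demo_probes_agree(void) {
    static bucket_t table[NBUCKETS];
    uint32_t probe_keys[1000];
    for (uint32_t i = 0; i < NBUCKETS; i++) table[i].count = 0;
    for (uint32_t k = 0; k < 500; k++) {
        int ok = build_insert(table, k);
        assert(ok);
        (void)ok;
    }
    for (uint32_t k = 0; k < 1000; k++) probe_keys[k] = k;
    return probe_simple(table, probe_keys, 1000) ==
           probe_grouped(table, probe_keys, 1000);
}
```

The sketch only checks that the grouped loop finds the same matches as the baseline. It makes no claim about speedups, which depend on table size, memory latency and the chosen group size.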