Comparative evaluation of latency tolerance techniques for software distributed shared memory

A key challenge in achieving high performance on software DSMs is overcoming their relatively large communication latencies. In this paper, we consider two techniques which address this problem: prefetching and multithreading. While previous studies have examined each of these techniques in isolatio...

Bibliographic details
Main authors: Mowry, T.C., Chan, C.Q.C., Lo, A.K.W.
Format: Conference proceeding
Language: eng
container_end_page 311
container_issue
container_start_page 300
container_title
container_volume
creator Mowry, T.C.
Chan, C.Q.C.
Lo, A.K.W.
description A key challenge in achieving high performance on software DSMs is overcoming their relatively large communication latencies. In this paper, we consider two techniques which address this problem: prefetching and multithreading. While previous studies have examined each of these techniques in isolation, this paper is the first to evaluate both techniques using a consistent hardware platform and set of applications, thereby allowing direct comparisons. In addition, this is the first study to consider combining prefetching and multithreading in a software DSM. We performed our experiments on real hardware using a full implementation of both techniques. Our experimental results demonstrate that both prefetching and multithreading result in significant performance improvements when applied individually. In addition, we observe that prefetching and multithreading can potentially complement each other by using prefetching to hide memory latency and multithreading to hide synchronization latency.
doi_str_mv 10.1109/HPCA.1998.650569
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_650569</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>650569</ieee_id><sourcerecordid>650569</sourcerecordid><originalsourceid>FETCH-LOGICAL-i104t-8d1a44f9bd4ee7e81340d847ca5a2590d7cadf8b61781ddc150515dcc70db6983</originalsourceid><addsrcrecordid>eNotkEFLwzAcxQMiqHN38ZQv0Jq0SZocR1EnDPSgRxlp8g-LtM1M0km_vYXtXd6Pd3jwHkIPlJSUEvW0_Wg3JVVKloITLtQVuiOSSiHrqhY3aJ3SD1lUK05UdYu-2zAcddTZnwDDSffTgmHEweFeZxjNjHPoIerRAM5gDqP_nSBhFyJOweU_HQFbn3L03ZTB4nRYEosHGEKc79G1032C9cVX6Ovl-bPdFrv317d2sys8JSwX0lLNmFOdZQANSFozYiVrjOa64orYhayTnaCNpNYauiyj3BrTENsJJesVejz3egDYH6MfdJz35wPqf7QHU8Q</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Comparative evaluation of latency tolerance techniques for software distributed shared memory</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Mowry, T.C. ; Chan, C.Q.C. ; Lo, A.K.W.</creator><creatorcontrib>Mowry, T.C. ; Chan, C.Q.C. ; Lo, A.K.W.</creatorcontrib><description>A key challenge in achieving high performance on software DSMs is overcoming their relatively large communication latencies. In this paper, we consider two techniques which address this problem: prefetching and multithreading. While previous studies have examined each of these techniques in isolation, this paper is the first to evaluate both techniques using a consistent hardware platform and set of applications, thereby allowing direct comparisons. In addition, this is the first study to consider combining prefetching and multithreading in a software DSM . We performed our experiments on real hardware using a full implementation of both techniques. Our experimental results demonstrate that both prefetching and multithreading result in significant performance improvements when applied individually. 
In addition, we observe that prefetching and multithreading can potentially complement each other by using prefetching to hide memory latency and multithreading to hide synchronization latency.</description><identifier>ISBN: 0818683236</identifier><identifier>ISBN: 9780818683237</identifier><identifier>DOI: 10.1109/HPCA.1998.650569</identifier><language>eng</language><publisher>IEEE</publisher><subject>Application software ; Communication system software ; Computer science ; Delay ; Electric breakdown ; Hardware ; Multithreading ; Prefetching ; Software performance ; Workstations</subject><ispartof>Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture, 1998, p.300-311</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/650569$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2052,4036,4037,27902,54895</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/650569$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Mowry, T.C.</creatorcontrib><creatorcontrib>Chan, C.Q.C.</creatorcontrib><creatorcontrib>Lo, A.K.W.</creatorcontrib><title>Comparative evaluation of latency tolerance techniques for software distributed shared memory</title><title>Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture</title><addtitle>HPCA</addtitle><description>A key challenge in achieving high performance on software DSMs is overcoming their relatively large communication latencies. In this paper, we consider two techniques which address this problem: prefetching and multithreading. 
While previous studies have examined each of these techniques in isolation, this paper is the first to evaluate both techniques using a consistent hardware platform and set of applications, thereby allowing direct comparisons. In addition, this is the first study to consider combining prefetching and multithreading in a software DSM . We performed our experiments on real hardware using a full implementation of both techniques. Our experimental results demonstrate that both prefetching and multithreading result in significant performance improvements when applied individually. In addition, we observe that prefetching and multithreading can potentially complement each other by using prefetching to hide memory latency and multithreading to hide synchronization latency.</description><subject>Application software</subject><subject>Communication system software</subject><subject>Computer science</subject><subject>Delay</subject><subject>Electric breakdown</subject><subject>Hardware</subject><subject>Multithreading</subject><subject>Prefetching</subject><subject>Software performance</subject><subject>Workstations</subject><isbn>0818683236</isbn><isbn>9780818683237</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>1998</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotkEFLwzAcxQMiqHN38ZQv0Jq0SZocR1EnDPSgRxlp8g-LtM1M0km_vYXtXd6Pd3jwHkIPlJSUEvW0_Wg3JVVKloITLtQVuiOSSiHrqhY3aJ3SD1lUK05UdYu-2zAcddTZnwDDSffTgmHEweFeZxjNjHPoIerRAM5gDqP_nSBhFyJOweU_HQFbn3L03ZTB4nRYEosHGEKc79G1032C9cVX6Ovl-bPdFrv317d2sys8JSwX0lLNmFOdZQANSFozYiVrjOa64orYhayTnaCNpNYauiyj3BrTENsJJesVejz3egDYH6MfdJz35wPqf7QHU8Q</recordid><startdate>1998</startdate><enddate>1998</enddate><creator>Mowry, T.C.</creator><creator>Chan, C.Q.C.</creator><creator>Lo, 
A.K.W.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>1998</creationdate><title>Comparative evaluation of latency tolerance techniques for software distributed shared memory</title><author>Mowry, T.C. ; Chan, C.Q.C. ; Lo, A.K.W.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i104t-8d1a44f9bd4ee7e81340d847ca5a2590d7cadf8b61781ddc150515dcc70db6983</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>1998</creationdate><topic>Application software</topic><topic>Communication system software</topic><topic>Computer science</topic><topic>Delay</topic><topic>Electric breakdown</topic><topic>Hardware</topic><topic>Multithreading</topic><topic>Prefetching</topic><topic>Software performance</topic><topic>Workstations</topic><toplevel>online_resources</toplevel><creatorcontrib>Mowry, T.C.</creatorcontrib><creatorcontrib>Chan, C.Q.C.</creatorcontrib><creatorcontrib>Lo, A.K.W.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mowry, T.C.</au><au>Chan, C.Q.C.</au><au>Lo, A.K.W.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Comparative evaluation of latency tolerance techniques for software distributed shared memory</atitle><btitle>Proceedings 1998 Fourth International Symposium on High-Performance Computer 
Architecture</btitle><stitle>HPCA</stitle><date>1998</date><risdate>1998</risdate><spage>300</spage><epage>311</epage><pages>300-311</pages><isbn>0818683236</isbn><isbn>9780818683237</isbn><abstract>A key challenge in achieving high performance on software DSMs is overcoming their relatively large communication latencies. In this paper, we consider two techniques which address this problem: prefetching and multithreading. While previous studies have examined each of these techniques in isolation, this paper is the first to evaluate both techniques using a consistent hardware platform and set of applications, thereby allowing direct comparisons. In addition, this is the first study to consider combining prefetching and multithreading in a software DSM . We performed our experiments on real hardware using a full implementation of both techniques. Our experimental results demonstrate that both prefetching and multithreading result in significant performance improvements when applied individually. In addition, we observe that prefetching and multithreading can potentially complement each other by using prefetching to hide memory latency and multithreading to hide synchronization latency.</abstract><pub>IEEE</pub><doi>10.1109/HPCA.1998.650569</doi><tpages>12</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISBN: 0818683236
ispartof Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture, 1998, p.300-311
issn
language eng
recordid cdi_ieee_primary_650569
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Application software
Communication system software
Computer science
Delay
Electric breakdown
Hardware
Multithreading
Prefetching
Software performance
Workstations
title Comparative evaluation of latency tolerance techniques for software distributed shared memory
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T11%3A32%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Comparative%20evaluation%20of%20latency%20tolerance%20techniques%20for%20software%20distributed%20shared%20memory&rft.btitle=Proceedings%201998%20Fourth%20International%20Symposium%20on%20High-Performance%20Computer%20Architecture&rft.au=Mowry,%20T.C.&rft.date=1998&rft.spage=300&rft.epage=311&rft.pages=300-311&rft.isbn=0818683236&rft.isbn_list=9780818683237&rft_id=info:doi/10.1109/HPCA.1998.650569&rft_dat=%3Cieee_6IE%3E650569%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=650569&rfr_iscdi=true