Hiding communication latency and coherence overhead in software DSMs
In this paper we propose the use of a PCI-based programmable protocol controller for hiding communication and coherence overheads in software DSMs. Our protocol controller provides three different types of overhead tolerance: a) moving basic communication and coherence tasks away from computation pr...
Published in: | Computer architecture news 1996-01, Vol.24 (Special Issu), p.198-209 |
---|---|
Main authors: | Bianchini, R ; Kontothanassis, L I ; Pinto, R ; De Maria, M ; Abud, M ; Amorim, C L |
Format: | Article |
Language: | eng |
Online access: | Full text |
container_end_page | 209 |
---|---|
container_issue | Special Issu |
container_start_page | 198 |
container_title | Computer architecture news |
container_volume | 24 |
creator | Bianchini, R ; Kontothanassis, L I ; Pinto, R ; De Maria, M ; Abud, M ; Amorim, C L |
description | In this paper we propose the use of a PCI-based programmable protocol controller for hiding communication and coherence overheads in software DSMs. Our protocol controller provides three different types of overhead tolerance: a) moving basic communication and coherence tasks away from computation processors; b) prefetching of diffs; and c) generating and applying diffs with hardware assistance. We evaluate the isolated and combined impact of these features on the performance of TreadMarks. We also compare performance against two versions of the Shrimp-based AURC protocol. Using detailed execution-driven simulations of a 16-node network of workstations, we show that the greatest performance benefits provided by our protocol controller come from our hardware-supported diffs. Reducing the burden of communication and coherence transactions on the computation processor is also beneficial but to a smaller extent. Prefetching is not always profitable. Our results show that our protocol controller can improve running time performance by up to 50% for TreadMarks, which means that it can double the TreadMarks speedups. The overlapping implementation of TreadMarks performs as well as or better than AURC for 5 of our 6 applications. We conclude that the simple hardware support we propose allows for the implementation of high-performance software DSMs at low cost. Based on this conclusion, we are building the NCP2 parallel system at COPPE/UFRJ. |
format | Article |
fulltext | fulltext |
identifier | ISSN: 0163-5964 |
ispartof | Computer architecture news, 1996-01, Vol.24 (Special Issu), p.198-209 |
issn | 0163-5964 |
language | eng |
recordid | cdi_proquest_miscellaneous_26217609 |
source | Access via ACM Digital Library |
title | Hiding communication latency and coherence overhead in software DSMs |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T09%3A00%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Hiding%20communication%20latency%20and%20coherence%20overhead%20in%20software%20DSMs&rft.jtitle=Computer%20architecture%20news&rft.au=Bianchini,%20R&rft.date=1996-01-01&rft.volume=24&rft.issue=Special%20Issu&rft.spage=198&rft.epage=209&rft.pages=198-209&rft.issn=0163-5964&rft_id=info:doi/&rft_dat=%3Cproquest%3E26217609%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=26217609&rft_id=info:pmid/&rfr_iscdi=true |