Optimization of Collective Communication Operations in MPICH
Saved in:
Published in: | The international journal of high performance computing applications 2005-04, Vol.19 (1), p.49-66 |
---|---|
Main Authors: | Thakur, Rajeev; Rabenseifner, Rolf; Gropp, William |
Format: | Article |
Language: | eng |
Subjects: | Algorithms; Communication; Computer networks; Data communications; High performance systems; Information networks; Management; Optimization; Studies |
Online Access: | Full text |
container_end_page | 66 |
---|---|
container_issue | 1 |
container_start_page | 49 |
container_title | The international journal of high performance computing applications |
container_volume | 19 |
creator | Thakur, Rajeev; Rabenseifner, Rolf; Gropp, William |
description | We describe our work on improving the performance of collective communication operations in MPICH for clusters connected by switched networks. For each collective operation, we use multiple algorithms depending on the message size, with the goal of minimizing latency for short messages and minimizing bandwidth use for long messages. Although we have implemented new algorithms for all MPI (Message Passing Interface) collective operations, because of limited space we describe only the algorithms for allgather, broadcast, all-to-all, reduce-scatter, reduce, and allreduce. Performance results on a Myrinet-connected Linux cluster and an IBM SP indicate that, in all cases, the new algorithms significantly outperform the old algorithms used in MPICH on the Myrinet cluster, and, in many cases, they outperform the algorithms used in IBM's MPI on the SP. We also explore in further detail the optimization of two of the most commonly used collective operations, allreduce and reduce, particularly for long messages and nonpower-of-two numbers of processes. The optimized algorithms for these operations perform several times better than the native algorithms on a Myrinet cluster, IBM SP, and Cray T3E. Our results indicate that to achieve the best performance for a collective communication operation, one needs to use a number of different algorithms and select the right algorithm for a particular message size and number of processes. (An illustrative sketch of this message-size-based algorithm selection appears after the record fields below.) |
doi_str_mv | 10.1177/1094342005051521 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 1094-3420 |
ispartof | The international journal of high performance computing applications, 2005-04, Vol.19 (1), p.49-66 |
issn | 1094-3420; 1741-2846 |
language | eng |
recordid | cdi_proquest_miscellaneous_35029898 |
source | Access via SAGE; Alma/SFX Local Collection |
subjects | Algorithms; Communication; Computer networks; Data communications; High performance systems; Information networks; Management; Optimization; Studies |
title | Optimization of Collective Communication Operations in MPICH |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T20%3A08%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimization%20of%20Collective%20Communication%20Operations%20in%20MPICH&rft.jtitle=The%20international%20journal%20of%20high%20performance%20computing%20applications&rft.au=Thakur,%20Rajeev&rft.date=2005-04-01&rft.volume=19&rft.issue=1&rft.spage=49&rft.epage=66&rft.pages=49-66&rft.issn=1094-3420&rft.eissn=1741-2846&rft_id=info:doi/10.1177/1094342005051521&rft_dat=%3Cgale_proqu%3EA141082538%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=220758941&rft_id=info:pmid/&rft_galeid=A141082538&rft_sage_id=10.1177_1094342005051521&rfr_iscdi=true |
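
The abstract above describes the paper's central strategy: for each collective operation, switch between a latency-oriented algorithm for short messages and a bandwidth-oriented algorithm for long ones. The C/MPI sketch below is only an illustration of that selection idea, not the paper's or MPICH's actual source: it dispatches an allgather between recursive doubling for short messages and a ring algorithm for long messages. The 32 KiB cutoff, the power-of-two process-count restriction for recursive doubling, and the MPI_DOUBLE-only interface are simplifying assumptions made to keep the example short.

```c
/* allgather_sketch.c -- illustrative only; NOT the paper's or MPICH's code.
 * Dispatches an allgather between recursive doubling (short messages) and a
 * ring algorithm (long messages). The 32 KiB cutoff, the power-of-two
 * restriction for recursive doubling, and the MPI_DOUBLE-only interface are
 * assumptions made for this sketch. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Ring: p-1 steps; each process forwards the block it received last step.
 * Bandwidth-friendly, since every byte crosses each link only once. */
static void allgather_ring(const double *sendbuf, double *recvbuf,
                           int count, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    memcpy(recvbuf + (size_t)rank * count, sendbuf, count * sizeof(double));

    int left = (rank - 1 + size) % size, right = (rank + 1) % size;
    for (int i = 0; i < size - 1; i++) {
        int sendblock = (rank - i + size) % size;      /* block forwarded now */
        int recvblock = (rank - i - 1 + size) % size;  /* block arriving now  */
        MPI_Sendrecv(recvbuf + (size_t)sendblock * count, count, MPI_DOUBLE,
                     right, 0,
                     recvbuf + (size_t)recvblock * count, count, MPI_DOUBLE,
                     left, 0, comm, MPI_STATUS_IGNORE);
    }
}

/* Recursive doubling: log2(p) steps, doubling the data exchanged each step.
 * Latency-friendly; assumes a power-of-two number of processes here. */
static void allgather_recursive_doubling(const double *sendbuf, double *recvbuf,
                                         int count, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    memcpy(recvbuf + (size_t)rank * count, sendbuf, count * sizeof(double));

    int curr_blocks = 1;  /* contiguous blocks held so far */
    for (int mask = 1; mask < size; mask <<= 1) {
        int partner = rank ^ mask;
        int my_root = rank & ~(mask - 1);          /* first block of my group      */
        int partner_root = partner & ~(mask - 1);  /* first block of partner group */
        MPI_Sendrecv(recvbuf + (size_t)my_root * count,
                     curr_blocks * count, MPI_DOUBLE, partner, 0,
                     recvbuf + (size_t)partner_root * count,
                     curr_blocks * count, MPI_DOUBLE, partner, 0,
                     comm, MPI_STATUS_IGNORE);
        curr_blocks *= 2;
    }
}

/* Pick an algorithm from the per-process message size, in the spirit of the
 * strategy described in the abstract. */
static void allgather_by_size(const double *sendbuf, double *recvbuf,
                              int count, MPI_Comm comm)
{
    const size_t SHORT_MSG_BYTES = 32 * 1024;  /* assumed cutoff; tune per system */
    int size;
    MPI_Comm_size(comm, &size);
    int pof2 = (size & (size - 1)) == 0;

    if (pof2 && count * sizeof(double) <= SHORT_MSG_BYTES)
        allgather_recursive_doubling(sendbuf, recvbuf, count, comm);
    else
        allgather_ring(sendbuf, recvbuf, count, comm);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int count = 4;  /* doubles contributed per process */
    double *sendbuf = malloc(count * sizeof(double));
    double *recvbuf = malloc((size_t)size * count * sizeof(double));
    for (int i = 0; i < count; i++) sendbuf[i] = rank;

    allgather_by_size(sendbuf, recvbuf, count, MPI_COMM_WORLD);
    if (rank == 0)
        printf("block from last rank starts with %g (expected %d)\n",
               recvbuf[(size_t)(size - 1) * count], size - 1);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```

Compiled with mpicc and launched with mpiexec, the example prints a small sanity check on rank 0. A production implementation would additionally handle arbitrary datatypes, non-power-of-two process counts in the short-message algorithm, and per-platform tuning of the cutoff, which is the kind of message-size and process-count dependent selection the abstract argues is needed for best performance.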