Multi-toroidal Interconnects: Using Additional Communication Links to Improve Utilization of Parallel Computers

Three-dimensional torus is a common topology of network interconnects of multicomputers due to its simplicity and high scalability. A parallel job submitted to a three-dimensional toroidal machine typically requires an isolated, contiguous, rectangular partition connected as a mesh or a torus. Such...

Detailed description

Bibliographic details
Main authors: Aridor, Yariv; Domany, Tamar; Goldshmidt, Oleg; Shmueli, Edi; Moreira, Jose; Stockmeier, Larry
Format: Conference proceeding
Language: eng
Subjects:
Online access: Full text
container_end_page 159
container_issue
container_start_page 144
container_title
container_volume
creator Aridor, Yariv
Domany, Tamar
Goldshmidt, Oleg
Shmueli, Edi
Moreira, Jose
Stockmeier, Larry
description Three-dimensional torus is a common topology of network interconnects of multicomputers due to its simplicity and high scalability. A parallel job submitted to a three-dimensional toroidal machine typically requires an isolated, contiguous, rectangular partition connected as a mesh or a torus. Such partitioning leads to fragmentation and thus reduces resource utilization of the machines. In particular, toroidal partitions often require allocation of additional communication links to close the torus. If the links are treated as dedicated resources (due to the partition isolation requirement) this may prevent allocation of other partitions that could, otherwise, use those links. Overall, on toroidal machines, the likelihood of successful allocation of a new partition decreases as the number of toroidal partitions increases. This paper presents a novel “multi-toroidal” interconnect topology that is able to accommodate multiple adjacent meshed and toroidal partitions at the same time. We prove that this topology allows connecting every free partition of the machine as a torus without affecting existing partitions. We also show that for toroidal jobs this interconnect topology increases machine utilization by a factor of 2 to 4 (depending on the workload) compared with three-dimensional toroidal machines. This effect exists for different scheduling policies. The BlueGene/L supercomputer being developed by IBM Research is an example of a multi-toroidal interconnect architecture.
doi_str_mv 10.1007/11407522_8
format Conference Proceeding
fullrecord <record><control><sourceid>pascalfrancis_sprin</sourceid><recordid>TN_cdi_pascalfrancis_primary_16894934</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>16894934</sourcerecordid><originalsourceid>FETCH-LOGICAL-p218t-6f156552aab6a35c3712a84323bda146685168b4d2cbbd8ed65a9380465ba7723</originalsourceid><addsrcrecordid>eNpFkEtLAzEUheMLrLUbf0E2gpvRJDevcVeKj0JFF3Y9ZF4ldpoMSSror3fqCN7N4fKdcxYHoStKbikh6o5STpRgrNBH6AIEJ0BVLuAYTaikNAPg-ckImAAg5BRNCBCW5YrDOZrF-EGGA6oFVxPkX_ZdslnywdvadHjpUhMq71xTpXiP19G6DZ7XtU3Wu4Ev_G63d7Yyhx-vrNtGnDxe7vrgPxu8Traz3yP0LX4zwXRd8xvr90NzvERnreliM_vTKVo_PrwvnrPV69NyMV9lPaM6ZbKlQgrBjCmlAVGBosxoDgzK2lAupRZU6pLXrCrLWje1FCYHTbgUpVGKwRRdj729iZXp2mBcZWPRB7sz4asYwjnPgQ--m9EXB-Q2TShK77exoKQ4rF38rw0_-hJtpg</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Multi-toroidal Interconnects: Using Additional Communication Links to Improve Utilization of Parallel Computers</title><source>Springer Books</source><creator>Aridor, Yariv ; Domany, Tamar ; Goldshmidt, Oleg ; Shmueli, Edi ; Moreira, Jose ; Stockmeier, Larry</creator><contributor>Schwiegelshohn, Uwe ; Rudolph, Larry ; Feitelson, Dror G.</contributor><creatorcontrib>Aridor, Yariv ; Domany, Tamar ; Goldshmidt, Oleg ; Shmueli, Edi ; Moreira, Jose ; Stockmeier, Larry ; Schwiegelshohn, Uwe ; Rudolph, Larry ; Feitelson, Dror G.</creatorcontrib><description>Three-dimensional torus is a common topology of network interconnects of multicomputers due to its simplicity and high scalability. A parallel job submitted to a three-dimensional toroidal machine typically requires an isolated, contiguous, rectangular partition connected as a mesh or a torus. Such partitioning leads to fragmentation and thus reduces resource utilization of the machines. 
In particular, toroidal partitions often require allocation of additional communication links to close the torus. If the links are treated as dedicated resources (due to the partition isolation requirement) this may prevent allocation of other partitions that could, otherwise, use those links. Overall, on toroidal machines, the likelihood of successful allocation of a new partition decreases as the number of toroidal partitions increases. This paper presents a novel ”multi-toroidal” interconnect topology that is able to accommodate multiple adjacent meshed and toroidal partitions at the same time. We prove that this topology allows connecting every free partition of the machine as a torus without affecting existing partitions. We also show that for toroidal jobs this interconnect topology increases machine utilization by a factor of 2 to 4 (depending on the workload) compared with three-dimensional toroidal machines. This effect exists for different scheduling policies. The BlueGene/L supercomputer being developed by IBM Research is an example of a multi-toroidal interconnect architecture.</description><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 3540253300</identifier><identifier>ISBN: 9783540253303</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 3540317953</identifier><identifier>EISBN: 9783540317951</identifier><identifier>DOI: 10.1007/11407522_8</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer Berlin Heidelberg</publisher><subject>Additional Link ; Allocation Unit ; Applied sciences ; Computer science; control theory; systems ; Computer systems and distributed systems. User interface ; Connection Rule ; Exact sciences and technology ; Machine Utilization ; Operational research and scientific management ; Operational research. 
Management science ; Rectangular Partition ; Scheduling, sequencing ; Software</subject><ispartof>Job Scheduling Strategies for Parallel Processing, 2005, p.144-159</ispartof><rights>Springer-Verlag Berlin Heidelberg 2005</rights><rights>2005 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/11407522_8$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/11407522_8$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>309,310,777,778,782,787,788,791,4038,4039,27912,38242,41429,42498</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=16894934$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><contributor>Schwiegelshohn, Uwe</contributor><contributor>Rudolph, Larry</contributor><contributor>Feitelson, Dror G.</contributor><creatorcontrib>Aridor, Yariv</creatorcontrib><creatorcontrib>Domany, Tamar</creatorcontrib><creatorcontrib>Goldshmidt, Oleg</creatorcontrib><creatorcontrib>Shmueli, Edi</creatorcontrib><creatorcontrib>Moreira, Jose</creatorcontrib><creatorcontrib>Stockmeier, Larry</creatorcontrib><title>Multi-toroidal Interconnects: Using Additional Communication Links to Improve Utilization of Parallel Computers</title><title>Job Scheduling Strategies for Parallel Processing</title><description>Three-dimensional torus is a common topology of network interconnects of multicomputers due to its simplicity and high scalability. A parallel job submitted to a three-dimensional toroidal machine typically requires an isolated, contiguous, rectangular partition connected as a mesh or a torus. Such partitioning leads to fragmentation and thus reduces resource utilization of the machines. 
In particular, toroidal partitions often require allocation of additional communication links to close the torus. If the links are treated as dedicated resources (due to the partition isolation requirement) this may prevent allocation of other partitions that could, otherwise, use those links. Overall, on toroidal machines, the likelihood of successful allocation of a new partition decreases as the number of toroidal partitions increases. This paper presents a novel ”multi-toroidal” interconnect topology that is able to accommodate multiple adjacent meshed and toroidal partitions at the same time. We prove that this topology allows connecting every free partition of the machine as a torus without affecting existing partitions. We also show that for toroidal jobs this interconnect topology increases machine utilization by a factor of 2 to 4 (depending on the workload) compared with three-dimensional toroidal machines. This effect exists for different scheduling policies. The BlueGene/L supercomputer being developed by IBM Research is an example of a multi-toroidal interconnect architecture.</description><subject>Additional Link</subject><subject>Allocation Unit</subject><subject>Applied sciences</subject><subject>Computer science; control theory; systems</subject><subject>Computer systems and distributed systems. User interface</subject><subject>Connection Rule</subject><subject>Exact sciences and technology</subject><subject>Machine Utilization</subject><subject>Operational research and scientific management</subject><subject>Operational research. 
Management science</subject><subject>Rectangular Partition</subject><subject>Scheduling, sequencing</subject><subject>Software</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>3540253300</isbn><isbn>9783540253303</isbn><isbn>3540317953</isbn><isbn>9783540317951</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2005</creationdate><recordtype>conference_proceeding</recordtype><recordid>eNpFkEtLAzEUheMLrLUbf0E2gpvRJDevcVeKj0JFF3Y9ZF4ldpoMSSror3fqCN7N4fKdcxYHoStKbikh6o5STpRgrNBH6AIEJ0BVLuAYTaikNAPg-ckImAAg5BRNCBCW5YrDOZrF-EGGA6oFVxPkX_ZdslnywdvadHjpUhMq71xTpXiP19G6DZ7XtU3Wu4Ev_G63d7Yyhx-vrNtGnDxe7vrgPxu8Traz3yP0LX4zwXRd8xvr90NzvERnreliM_vTKVo_PrwvnrPV69NyMV9lPaM6ZbKlQgrBjCmlAVGBosxoDgzK2lAupRZU6pLXrCrLWje1FCYHTbgUpVGKwRRdj729iZXp2mBcZWPRB7sz4asYwjnPgQ--m9EXB-Q2TShK77exoKQ4rF38rw0_-hJtpg</recordid><startdate>2005</startdate><enddate>2005</enddate><creator>Aridor, Yariv</creator><creator>Domany, Tamar</creator><creator>Goldshmidt, Oleg</creator><creator>Shmueli, Edi</creator><creator>Moreira, Jose</creator><creator>Stockmeier, Larry</creator><general>Springer Berlin Heidelberg</general><general>Springer</general><scope>IQODW</scope></search><sort><creationdate>2005</creationdate><title>Multi-toroidal Interconnects: Using Additional Communication Links to Improve Utilization of Parallel Computers</title><author>Aridor, Yariv ; Domany, Tamar ; Goldshmidt, Oleg ; Shmueli, Edi ; Moreira, Jose ; Stockmeier, Larry</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p218t-6f156552aab6a35c3712a84323bda146685168b4d2cbbd8ed65a9380465ba7723</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2005</creationdate><topic>Additional Link</topic><topic>Allocation Unit</topic><topic>Applied sciences</topic><topic>Computer science; control theory; systems</topic><topic>Computer systems and distributed systems. 
User interface</topic><topic>Connection Rule</topic><topic>Exact sciences and technology</topic><topic>Machine Utilization</topic><topic>Operational research and scientific management</topic><topic>Operational research. Management science</topic><topic>Rectangular Partition</topic><topic>Scheduling, sequencing</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Aridor, Yariv</creatorcontrib><creatorcontrib>Domany, Tamar</creatorcontrib><creatorcontrib>Goldshmidt, Oleg</creatorcontrib><creatorcontrib>Shmueli, Edi</creatorcontrib><creatorcontrib>Moreira, Jose</creatorcontrib><creatorcontrib>Stockmeier, Larry</creatorcontrib><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Aridor, Yariv</au><au>Domany, Tamar</au><au>Goldshmidt, Oleg</au><au>Shmueli, Edi</au><au>Moreira, Jose</au><au>Stockmeier, Larry</au><au>Schwiegelshohn, Uwe</au><au>Rudolph, Larry</au><au>Feitelson, Dror G.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Multi-toroidal Interconnects: Using Additional Communication Links to Improve Utilization of Parallel Computers</atitle><btitle>Job Scheduling Strategies for Parallel Processing</btitle><date>2005</date><risdate>2005</risdate><spage>144</spage><epage>159</epage><pages>144-159</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>3540253300</isbn><isbn>9783540253303</isbn><eisbn>3540317953</eisbn><eisbn>9783540317951</eisbn><abstract>Three-dimensional torus is a common topology of network interconnects of multicomputers due to its simplicity and high scalability. A parallel job submitted to a three-dimensional toroidal machine typically requires an isolated, contiguous, rectangular partition connected as a mesh or a torus. Such partitioning leads to fragmentation and thus reduces resource utilization of the machines. 
In particular, toroidal partitions often require allocation of additional communication links to close the torus. If the links are treated as dedicated resources (due to the partition isolation requirement) this may prevent allocation of other partitions that could, otherwise, use those links. Overall, on toroidal machines, the likelihood of successful allocation of a new partition decreases as the number of toroidal partitions increases. This paper presents a novel ”multi-toroidal” interconnect topology that is able to accommodate multiple adjacent meshed and toroidal partitions at the same time. We prove that this topology allows connecting every free partition of the machine as a torus without affecting existing partitions. We also show that for toroidal jobs this interconnect topology increases machine utilization by a factor of 2 to 4 (depending on the workload) compared with three-dimensional toroidal machines. This effect exists for different scheduling policies. The BlueGene/L supercomputer being developed by IBM Research is an example of a multi-toroidal interconnect architecture.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/11407522_8</doi><tpages>16</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0302-9743
ispartof Job Scheduling Strategies for Parallel Processing, 2005, p.144-159
issn 0302-9743
1611-3349
language eng
recordid cdi_pascalfrancis_primary_16894934
source Springer Books
subjects Additional Link
Allocation Unit
Applied sciences
Computer science; control theory; systems
Computer systems and distributed systems. User interface
Connection Rule
Exact sciences and technology
Machine Utilization
Operational research and scientific management
Operational research. Management science
Rectangular Partition
Scheduling, sequencing
Software
title Multi-toroidal Interconnects: Using Additional Communication Links to Improve Utilization of Parallel Computers
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T04%3A55%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_sprin&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Multi-toroidal%20Interconnects:%20Using%20Additional%20Communication%20Links%20to%20Improve%20Utilization%20of%20Parallel%20Computers&rft.btitle=Job%20Scheduling%20Strategies%20for%20Parallel%20Processing&rft.au=Aridor,%20Yariv&rft.date=2005&rft.spage=144&rft.epage=159&rft.pages=144-159&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=3540253300&rft.isbn_list=9783540253303&rft_id=info:doi/10.1007/11407522_8&rft_dat=%3Cpascalfrancis_sprin%3E16894934%3C/pascalfrancis_sprin%3E%3Curl%3E%3C/url%3E&rft.eisbn=3540317953&rft.eisbn_list=9783540317951&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true