Graph based routing algorithm for torus topology and its evaluation for the Angara interconnect


Bibliographic details
Published in: Journal of parallel and distributed computing 2024-01, Vol.183, p.104765, Article 104765
Main authors: Mukosey, Anatoly; Semenov, Alexander; Tretiakov, Aleksandr
Format: Article
Language: English
Description
Summary: Several approaches and techniques exist to resolve the load balancing problem in general networks and in torus topology networks. Graph methods are a natural way to balance routing paths. A routing balancing algorithm must operate within the constraints of the underlying network architecture, which limits several parameters, such as the number of logical paths in the network. In this paper, we consider the torus topology, one of the common topologies used in high performance computing systems. We introduce a routing graph that corresponds to an interconnect topology and abstracts the interconnect routing rules, providing a one-to-one correspondence between network routes and graph paths. We propose a deadlock-free routing algorithm based on a fast single-source shortest path algorithm in a routing graph for the deterministic routing of a torus topology interconnect with a single virtual channel. The proposed algorithm is a tradeoff between two existing routing algorithms; it is optimized for multidimensional torus topologies and can be applied to any torus topology HPC system. We present a complete description of a routing graph that abstracts the Angara interconnect with 4D torus topology. We evaluated the proposed routing algorithm and measured a maximum performance improvement of 71% for the Alltoall pattern for torus topology systems of up to 432 nodes on the Angara interconnect simulator. A performance improvement of more than 5% was obtained for the NPB FT and IS application kernels on a 32-node supercomputer.
• A routing graph abstracts network routing rules.
• A deadlock-free routing algorithm is based on a fast single-source shortest path algorithm and optimized for a torus topology.
• A routing graph for the Angara network reflects the restrictions on a further packet route with the previous route steps in mind.
• The maximum performance improvement is 71% for the Alltoall pattern on the simulated torus topology systems up to 432 nodes.
• The performance improvement on the 32-node Angara based cluster is 5% for the NPB FT and IS application kernels.
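
The idea of a routing graph whose vertices carry the route history can be illustrated with a small sketch. It is not the algorithm from the article and does not model the Angara routing rules. As a stand-in restriction it assumes plain dimension-order routing (a packet never returns to a lower dimension), it uses breadth-first search in place of the fast single-source shortest path algorithm the summary mentions, and it makes no attempt to guarantee deadlock freedom. All function names are hypothetical.

```python
from collections import deque


def torus_neighbors(node, dims):
    """Yield (dimension, neighbor) for one hop in the + or - direction."""
    for d, size in enumerate(dims):
        for step in (1, -1):
            nxt = list(node)
            nxt[d] = (nxt[d] + step) % size  # wraparound link of the torus
            yield d, tuple(nxt)


def routing_graph_edges(vertex, dims):
    """Successors of a routing-graph vertex (node, min_dim).

    min_dim records the route history: the highest dimension used so far.
    Assumed rule (illustration only): later hops may use only dimensions
    >= min_dim, so every graph path is a permitted network route.
    """
    node, min_dim = vertex
    for d, nxt in torus_neighbors(node, dims):
        if d >= min_dim:
            yield (nxt, d)


def shortest_routes(src, dims):
    """BFS from src over the routing graph; one shortest route per node."""
    start = (src, 0)          # no hop taken yet, every dimension allowed
    parent = {start: None}
    best = {src: start}       # first (closest) routing-graph vertex per node
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in routing_graph_edges(v, dims):
            if w not in parent:
                parent[w] = v
                best.setdefault(w[0], w)
                queue.append(w)
    routes = {}
    for node, v in best.items():
        path = []
        while v is not None:  # walk parents back to the source
            path.append(v[0])
            v = parent[v]
        routes[node] = path[::-1]
    return routes


if __name__ == "__main__":
    routes = shortest_routes((0, 0), (4, 4))
    print(len(routes[(2, 3)]) - 1, "hops to (2, 3)")
```

On a 4x4 torus, the route from (0, 0) to (2, 3) found this way takes 3 hops: two in dimension 0 and one over the wraparound link in dimension 1. A load-balancing variant would replace the unit hop cost with link weights that reflect how many routes already use each link, which is the kind of choice the article's algorithm addresses.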
ISSN: 0743-7315, 1096-0848
DOI: 10.1016/j.jpdc.2023.104765