Dynamic load-balancing for a parallel electromagnetic particle-in-cell code
Main Authors: | , , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Order full text |
Summary: | QUICKSILVER is a 3-D electromagnetic particle-in-cell simulation code developed and used at Sandia to model relativistic charged particle transport. It was originally written for shared-memory, multi-processor supercomputers such as the Cray X/MP. A new parallel version of QUICKSILVER has been developed to enable large-scale simulations to be efficiently run on massively-parallel distributed memory supercomputers with thousands of processors, such as the DOE ASCI (Accelerated Strategic Computing Initiative) platforms. The new parallel code implements all features of the original QUICKSILVER and runs on any platform that supports the message-passing interface (MPI) standard as well as on single-processor workstations. The original QUICKSILVER code was based on a multiple-block grid, which provided a natural strategy for extending the code to partition a simulation among multiple processors. By adding the automated capability to divide QUICKSILVER's existing blocks into subblocks and then distribute those subblocks among processors, a simulation's spatial domain can be easily and efficiently partitioned. Based upon this partitioning scheme as well as QUICKSILVER's existing particle-handling infrastructure, an efficient algorithm has been developed for dynamically rebalancing the particle workload on a timestep-by-timestep basis. This paper will elaborate on the strategies used and describe the algorithms developed to parallelize and dynamically load-balance the code. Results of several benchmark simulations will be presented that illustrate the code's performance and parallel efficiency for a wide variety of simulation conditions. These calculations have as many as 10^8 grid cells and 10^9 particles and were run on thousands of processors. |
---|---|
DOI: | 10.1109/PPPS.2001.1001711 |