Dynamic load-balancing for a parallel electromagnetic particle-in-cell code

Summary form only given. QUICKSILVER is a 3-D electromagnetic particle-in-cell simulation code developed and used at Sandia to model relativistic charged particle transport. It was originally written for shared-memory, multi-processor supercomputers such as the Cray X/MP. A new parallel version of Q...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Seidel, D.B., Plimpton, S.J., Pasik, M.F., Coats, R.S., Montry, G.R.
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Summary form only given. QUICKSILVER is a 3-D electromagnetic particle-in-cell simulation code developed and used at Sandia to model relativistic charged particle transport. It was originally written for shared-memory, multi-processor supercomputers such as the Cray X/MP. A new parallel version of QUICKSILVER has been developed to enable large-scale simulations to be efficiently run on massively-parallel distributed memory supercomputers with thousands of processors, such as the Intel Tflops and Cplant machines at Sandia. The new parallel code implements all the features of the original QUICKSILVER and can be run on any platform that supports the message-passing interface (MPI) standard as well as on single-processor workstations. The original QUICKSILVER code was based on a multiple-block grid, which provided a natural strategy for extending the code to partition a simulation among multiple processors. By adding the automated capability to divide QUICKSILVER's existing blocks into sub-blocks and then distribute those sub-blocks among processors, a simulation's spatial domain can be easily and efficiently partitioned. Based upon this partitioning scheme as well as QUICKSILVER's existing particle-handling infrastructure, an algorithm has been developed for dynamically rebalancing the particle workload on a timestep-by-timestep basis that has proven to be very efficient. This paper will elaborate on the strategies used and describe the algorithms developed to parallelize and dynamically load-balance the code. Results of several benchmark simulations will be presented that illustrate the code's performance and parallel efficiency for a wide variety of simulation conditions. These calculations have as many as 10/sup 8/ grid cells and 10/sup 9/ particles and were run on thousands of processors.
DOI:10.1109/PPPS.2001.960907