On the Parallelization of a Cache-Optimal Iterative Solver for PDEs Based on Hierarchical Data Structures and Space-Filling Curves

Competitive numerical simulation codes solving partial differential equations have to tap the full potential of both modern numerical methods – like multi-grid and adaptive grid refinement – and available computing resources. In general, these two are rival tasks. Typically, hierarchical data struct...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Günther, Frank, Krahnke, Andreas, Langlotz, Markus, Mehl, Miriam, Pögl, Markus, Zenger, Christoph
Format: Buchkapitel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Competitive numerical simulation codes solving partial differential equations have to tap the full potential of both modern numerical methods – like multi-grid and adaptive grid refinement – and available computing resources. In general, these two are rival tasks. Typically, hierarchical data structures resulting from multigrid and adaptive grid refinement impede efficient usage of modern memory architectures on the one hand and complicate the efficient parallelization on the other hand due to scattered data for coarse-level-points and unbalanced data trees. In our previous work, we managed to bring together high performance aspects in numerics as well as in hardware usage in a very satisfying way. The key to this success was to integrate space-filling curves consequently not only in the programs flow control but also in the construction of data structures which are processed linearly even for hierarchical multilevel data. In this paper, we present first results on the second challenge, namely the efficient parallelization of algorithms working on hierarchical data. It shows that with the same algorithms as desribed above, the two main demands on good parellel programs can be fulfilled in a natural way, too: The balanced data partitioning can be done quite easily and cheaply by cutting the queue of data linearized along the space-filling curve into equal pieces. Furtheron, this partitioning is quasi-optimal regarding the amount of communication. Thus, we will end up with a code that overcomes the quandary between hierarchical data and efficient memory usage and parallelization in a very natural way by a very deep integration of space-filling-curves in the underlying algorithm.
ISSN:0302-9743
1611-3349
DOI:10.1007/978-3-540-30218-6_58