Parallelization of particle-mass-transfer algorithms on shared-memory, multi-core CPUs

Bibliographic details
Published in: Advances in water resources 2024-11, Vol. 193, p. 104818, Article 104818
Main authors: Benson, David A.; Pribec, Ivan; Engdahl, Nicholas B.; Pankavich, Stephen; Schauer, Lucas
Format: Article
Language: English
Description
Summary: Simulating the transfer of mass between particles is not straightforwardly parallelized because it involves the calculation of the influence of many particles on each other. Engdahl et al. (2019) intuited that the number of matrix operations used for mass transfer grows quadratically with the number of particles, so that dividing the domain geometrically into sub-domains will give speed and memory advantages, even on a single processing thread. Those authors also showed the speed scalability of several one-dimensional examples on multiple cores. Here, we extend those results for more general cases, both in terms of spatial dimensions and algorithmic implementation. We show that there is an optimal subdivision scheme for naive, full-matrix calculations on a multi-processor, or multi-threading shared-memory machine. A similar sparse-matrix implementation that also uses row-and-column-sum normalization often greatly reduces the memory requirements. We also introduce a completely new mass transfer algorithm that uses a non-geometric domain decomposition and only matrix row-sum normalization. This allows the mass-transfer “matrix” to be constructed and solved one row at a time in parallel, so it is faster and vastly more memory efficient than previous methods, but requires more care for suitable accuracy.
• Analysis of memory requirements for three mass-transfer particle-tracking (MTPT) algorithms.
• Theoretical and empirical analysis of parallel speedup of MTPT algorithms on shared-memory, multi-core CPUs.
• Introduction of novel MTPT algorithm with many orders-of-magnitude memory savings.
• Derivation of constraints for MTPT matrix construction for mass conservation.
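The row-at-a-time idea in the summary can be illustrated with a minimal sketch. This is not the authors' algorithm: the Gaussian kernel, the bandwidth `h`, the 1-D positions, and the function name `transfer_row_by_row` are all assumptions made for illustration. Each particle's new mass is a row-normalized weighted average of all masses, and each row of the weight "matrix" is built, used, and discarded, so only O(N) values are held at once rather than N × N.

```python
import math


def transfer_row_by_row(x, m, h):
    """Row-normalized weighted averaging of masses, one row at a time.

    x: 1-D particle positions, m: particle masses, h: kernel bandwidth.
    The Gaussian kernel and all names here are illustrative assumptions,
    not taken from the paper.
    """
    n = len(x)
    m_new = [0.0] * n
    # Each iteration reads only x and m and writes only m_new[i],
    # so the rows are independent and could be spread over threads.
    for i in range(n):
        # Build row i of the weight "matrix"; it is discarded after use.
        row = [math.exp(-((x[i] - x[j]) ** 2) / (2.0 * h * h)) for j in range(n)]
        row_sum = sum(row)  # > 0 because the j == i term equals 1
        m_new[i] = sum(w * mj for w, mj in zip(row, m)) / row_sum
    return m_new


if __name__ == "__main__":
    positions = [0.0, 1.0, 2.0]
    print(transfer_row_by_row(positions, [1.0, 1.0, 1.0], 1.0))
    print(transfer_row_by_row(positions, [1.0, 0.0, 0.0], 1.0))
```

With row-sum normalization alone, a uniform mass field is left unchanged, but total mass is conserved only if the column sums of the normalized weights also equal one. In the second call above they do not, so the total changes, which is consistent with the summary's caution that this approach needs more care for accuracy.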
ISSN: 0309-1708
DOI: 10.1016/j.advwatres.2024.104818