Dynamic load/propagate/store for data assimilation with particle filters on supercomputers


Bibliographic details
Published in: Journal of computational science 2024-03, Vol.76, p.102229, Article 102229
Main authors: Friedemann, Sebastian, Keller, Kai, Lu, Yen-Sen, Raffin, Bruno, Bautista-Gomez, Leonardo
Format: Article
Language: English
Description
Summary: Several ensemble-based Data Assimilation (DA) methods rely on a propagate/update cycle, in which a potentially compute-intensive simulation code propagates multiple states for several consecutive time steps; the propagated states are then analyzed to update the states to be propagated in the next cycle. In this paper we focus on DA methods where the update can be computed by gathering only lightweight data obtained independently from each of the propagated states. This encompasses particle filters, where one weight is computed from each state, but also methods like Approximate Bayesian Computation (ABC) or Markov Chain Monte Carlo (MCMC). Such methods can be very compute-intensive, and running them efficiently at scale on supercomputers is challenging. This paper proposes a framework based on an elastic and fault-tolerant runner/server architecture that minimizes data movements while enabling dynamic load balancing. Our approach relies on runners that load, propagate and store particles from an asynchronously managed distributed particle cache, which permits particles to move from one runner to another in the background while particle propagation proceeds. The framework is validated with a bootstrap particle filter using the WRF simulation code. We handle up to 2555 particles on 20,442 compute cores. Compared to a file-based implementation, our solution spends up to 2.84 times fewer resources (cores × seconds) per particle.
ISSN:1877-7503
1877-7511
DOI:10.1016/j.jocs.2024.102229
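
The propagate/update cycle named in the summary can be illustrated with a minimal bootstrap particle filter. This is only a sketch under stated assumptions, not the framework from the article: the 1D random-walk model, the Gaussian observation likelihood, and the names `propagate`, `weight`, `resample` and `assimilation_cycle` are all hypothetical stand-ins. It shows the property the summary relies on: each state is propagated independently, and the update needs only one scalar weight per state.

```python
import math
import random


def propagate(state, rng, steps=3):
    """Toy stand-in for the simulation code: a 1D random walk over several time steps."""
    for _ in range(steps):
        state += rng.gauss(0.0, 0.5)
    return state


def weight(state, observation, obs_std=1.0):
    """Unnormalized Gaussian likelihood of the observation given one state (assumed model)."""
    return math.exp(-0.5 * ((observation - state) / obs_std) ** 2)


def resample(weights, rng):
    """Draw parent indices with probability proportional to the weights."""
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)


def assimilation_cycle(particles, observation, rng):
    # Propagation: every particle is advanced independently of the others.
    propagated = [propagate(p, rng) for p in particles]
    # Update: only one lightweight scalar per particle has to be gathered.
    weights = [weight(p, observation) for p in propagated]
    # Resampling decides which propagated states seed the next cycle.
    parents = resample(weights, rng)
    return [propagated[i] for i in parents], weights, parents


rng = random.Random(0)
particles = [rng.gauss(0.0, 2.0) for _ in range(8)]
new_particles, weights, parents = assimilation_cycle(particles, observation=1.0, rng=rng)
```

In a distributed setting the list comprehension over `particles` is the part that would be spread across workers, while the weights and the parent indices are small enough to be exchanged cheaply. How the article's runners and server actually divide this work is not specified in the summary.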