Parallelisation of the Lagrangian atmospheric dispersion model NAME

Bibliographic details
Published in: Computer Physics Communications, 2013-12, Vol. 184 (12), p. 2734-2745
Main authors: Müller, Eike H., Ford, Rupert, Hort, Matthew C., Huggett, Lois, Riley, Graham, Thomson, David J.
Format: Article
Language: English
Description
Abstract: The NAME Atmospheric Dispersion Model is a Lagrangian particle model used by the Met Office to predict the propagation and spread of pollutants in the atmosphere. The model is routinely used in emergency response applications, where it is important to obtain results as quickly as possible. This requirement for a short runtime, together with the increasing number of cores in commonly available CPUs such as the Intel Xeon series, has motivated the parallelisation of NAME in the OpenMP shared-memory framework. In this work we describe the implementation of this parallelisation strategy in NAME and discuss the performance of the model for different setups. Due to the independence of the model particles, the parallelisation of the main compute-intensive loops is relatively straightforward. The random number generator for modelling sub-grid scale turbulent motion needs to be adapted to ensure that different particles use independent sets of random numbers. We find that on Intel Xeon X5680 CPUs the model shows very good strong scaling up to 12 cores in a realistic emergency response application for predicting the dispersion of volcanic ash in the North Atlantic airspace. We implemented a mechanism for asynchronous reading of meteorological data from disk and demonstrate how this can reduce the runtime if disk access plays a significant role in a model run. To explore the performance on different chip architectures, we also ported the part of the code that calculates the gamma dose from a cloud of radioactive particles to a graphics processing unit (GPU) using CUDA-C. We were able to demonstrate a significant speedup of around one order of magnitude relative to the serial CPU version.
ISSN: 0010-4655
1879-2944
DOI: 10.1016/j.cpc.2013.06.022
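
Illustrative note: the abstract states that the random number generator must be adapted so that different particles use independent sets of random numbers. The following is a minimal sketch of that general idea, not the scheme used in NAME (the paper's OpenMP implementation is not reproduced in this record). All names here (BASE_SEED, advance_particle, run) and the random-walk update are hypothetical. Each particle gets its own generator, seeded from its index, so the result does not depend on how particles are distributed over threads. Distinct seeds give distinct streams but are no formal guarantee of statistical independence.

```python
import random
from concurrent.futures import ThreadPoolExecutor

BASE_SEED = 2013  # hypothetical run-level seed, chosen for illustration


def advance_particle(particle_id, n_steps=100, sigma=0.1):
    """Random-walk a single particle using a generator private to it."""
    # Seeding from (run seed, particle index) ties the stream to the
    # particle rather than to whichever thread happens to process it.
    rng = random.Random(f"{BASE_SEED}:{particle_id}")
    x = 0.0
    for _ in range(n_steps):
        x += rng.gauss(0.0, sigma)  # stand-in for a turbulent displacement
    return x


def run(n_particles, n_workers):
    """Advance all particles, distributing them over n_workers threads."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in particle order.
        return list(pool.map(advance_particle, range(n_particles)))


if __name__ == "__main__":
    serial = run(8, n_workers=1)
    threaded = run(8, n_workers=4)
    print(serial == threaded)
```

Python threads are used here only to show that the output is reproducible for any worker count; they are not expected to give the kind of speedup an OpenMP loop over particles would.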