Parallel GPU architecture framework for the WRF Single Moment 6-class microphysics scheme

Bibliographic details
Published in: Computers & geosciences, 2015-10, Vol. 83, p. 17-26
Main authors: Huang, Melin; Huang, Bormin; Gu, Lingjia; Allen Huang, H.-L.; Goldberg, Mitchell D.
Format: Article
Language: English
Description
Summary: An Earth-observing remote sensing instrument is used to collect information about the physical environment within its instantaneous field of view and is often placed aboard a suborbital or satellite platform for maximal spatial coverage. Remote sensing inversion techniques can extract valuable meteorological parameters that are subsequently passed through weather models for research and forecasting. One of the several microphysics packages for clouds and precipitation in weather models is the WRF Single Moment 6-class (WSM6) scheme, which is now widely used. With the advancement in Graphics Processing Units (GPUs), implementation of a fast and parallel WSM6 scheme is achievable. This paper describes a massively parallel GPU design of the WSM6 scheme. The performance is compared to a CPU implementation running on an Intel Xeon E5-2603 at 1.8 GHz. Our implementation shows a speedup of 216× using a single NVIDIA K40 GPU as compared to its CPU counterpart running on one CPU core, whereas the speedup for one CPU socket (4 cores) with respect to one CPU core is only 3.7×.
• We develop a massively parallel GPU design of the WRF WSM6 microphysics scheme.
• The NVIDIA CUDA C programming language is used to implement the WSM6 scheme.
• We show a speedup of 216× using a single NVIDIA K40 GPU.
• The speedup for one CPU socket (4 cores) with respect to one CPU core is only 3.7×.
ISSN: 0098-3004, 1873-7803
DOI: 10.1016/j.cageo.2015.06.014
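
The two speedup figures quoted in the summary share the same single-core baseline, so they can be combined with simple arithmetic. The sketch below is only an illustration of that arithmetic and is not taken from the paper: the function names are hypothetical, and the only inputs are the 216× and 3.7× figures and the 4-core count stated in the summary.

```python
# Illustrative arithmetic on the speedups quoted in the summary.
# Function and variable names are hypothetical, not from the paper.

def parallel_efficiency(speedup: float, workers: int) -> float:
    """Fraction of ideal linear scaling achieved by `workers` cores."""
    return speedup / workers


def relative_speedup(speedup_a: float, speedup_b: float) -> float:
    """Speedup of configuration A over B when both use the same baseline."""
    return speedup_a / speedup_b


GPU_SPEEDUP = 216.0     # one NVIDIA K40 GPU vs. one CPU core (from the summary)
SOCKET_SPEEDUP = 3.7    # one 4-core CPU socket vs. one CPU core (from the summary)
SOCKET_CORES = 4

efficiency = parallel_efficiency(SOCKET_SPEEDUP, SOCKET_CORES)
gpu_vs_socket = relative_speedup(GPU_SPEEDUP, SOCKET_SPEEDUP)

print(f"CPU socket parallel efficiency: {efficiency:.1%}")
print(f"GPU speedup over the full CPU socket: {gpu_vs_socket:.1f}x")
```

Under these figures the 4-core socket reaches about 92.5% of ideal linear scaling, and the single GPU is roughly 58 times faster than the whole socket.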