21 May 2015
Parallel implementation of WRF double moment 5-class cloud microphysics scheme on multiple GPUs
Abstract
The Weather Research and Forecasting (WRF) Double Moment 5-class (WDM5) mixed ice microphysics scheme predicts the mixing ratios of hydrometeors and the number concentrations of the warm rain species, including clouds and rain. WDM5 can be computed in parallel over the horizontal domain using multi-core GPUs. To obtain better GPU performance, we manually rewrite the original WDM5 Fortran module as a highly parallel CUDA C program. We explore the use of coalesced memory access and asynchronous data transfer. Our GPU-based WDM5 module scales to run on multiple GPUs. On one NVIDIA Tesla K40 GPU, our optimized implementation of this scheme achieves a speedup of 252x with respect to its CPU counterpart, the Fortran code running on one CPU core of an Intel Xeon E5-2603, whereas the speedup for one CPU socket (4 cores) with respect to one CPU core is only 4.2x. With two NVIDIA Tesla K40 GPUs, the speedup of this scheme rises to 468x with respect to one CPU core.
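The speedup figures above can be put in relation to one another with a little arithmetic. The following is a minimal sketch, not taken from the paper: the constants are the values quoted in the abstract, while `scaling_efficiency` and `split_columns` are hypothetical helper names, and the even column split is only an assumed illustration of how independent horizontal columns could be divided among GPUs.

```python
# Speedups quoted in the abstract, all relative to one Xeon E5-2603 core.
SPEEDUP_ONE_GPU = 252.0
SPEEDUP_TWO_GPUS = 468.0
SPEEDUP_CPU_SOCKET = 4.2  # one socket, 4 cores


def scaling_efficiency(speedup_n, speedup_1, n_devices):
    """Fraction of ideal linear scaling reached when going from 1 to n devices."""
    return speedup_n / (speedup_1 * n_devices)


def split_columns(n_columns, n_devices):
    """Hypothetical even split of independent horizontal columns across devices.

    Returns a list of (start, end) half-open index ranges, one per device.
    This is an assumed decomposition, not the one used by the authors.
    """
    base, extra = divmod(n_columns, n_devices)
    bounds = []
    start = 0
    for d in range(n_devices):
        size = base + (1 if d < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


# Gain from adding the second GPU, and how close that is to ideal 2x scaling.
gain = SPEEDUP_TWO_GPUS / SPEEDUP_ONE_GPU
eff = scaling_efficiency(SPEEDUP_TWO_GPUS, SPEEDUP_ONE_GPU, 2)

# One GPU compared with the full 4-core CPU socket.
gpu_vs_socket = SPEEDUP_ONE_GPU / SPEEDUP_CPU_SOCKET

print(f"two GPUs vs one GPU: {gain:.2f}x")
print(f"two-GPU scaling efficiency: {eff:.1%}")
print(f"one GPU vs one CPU socket: {gpu_vs_socket:.1f}x")
print("example split of 10 columns over 3 devices:", split_columns(10, 3))
```

Under these numbers, the second GPU yields roughly 1.86x over a single GPU (about 93% of ideal linear scaling), and a single K40 is about 60x faster than the full 4-core socket.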
© (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Melin Huang, Bormin Huang, Allen H.-L. Huang, "Parallel implementation of WRF double moment 5-class cloud microphysics scheme on multiple GPUs", Proc. SPIE 9501, Satellite Data Compression, Communications, and Processing XI, 95010K (21 May 2015); doi: 10.1117/12.2180039; https://doi.org/10.1117/12.2180039
PROCEEDINGS
10 PAGES

