The Weather Research and Forecasting (WRF) model is a numerical weather prediction and atmospheric
simulation system designed for both forecasting and research. The WRF software infrastructure consists of several
components such as dynamic solvers and physical simulation modules. WRF includes several Land-Surface
Models (LSMs). The LSMs use atmospheric information from the surface layer scheme, radiative forcing from
the radiation scheme, and precipitation forcing from the microphysics and convective schemes, together with
the land's state variables and land-surface properties, to provide heat and moisture fluxes over land and sea-ice points. The
WRF 5-layer thermal diffusion scheme is an LSM based on the MM5 5-layer soil temperature model with
an energy budget that includes radiation, sensible heat flux, and latent heat flux. The WRF LSMs are well suited
to massively parallel computation because there are no interactions among horizontal grid points. A growing number of
scientific applications have adopted graphics processing units (GPUs) to accelerate computation.
This study presents our massively parallel GPU implementation of the WRF 5-layer thermal diffusion
scheme. Because this scheme is only an intermediate module of the full WRF model, I/O transfer is
not involved in the intermediate process. Without data transfer, this module can achieve a speedup of 36x with
one GPU and 108x with four GPUs compared to a single-threaded CPU implementation. With a hybrid CPU/GPU
strategy, this module can achieve an even higher speedup: ~114x with one GPU and ~240x with four GPUs.
We are also exploring other approaches to further improve the speed.
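
To illustrate why the absence of interactions among horizontal grid points makes such a scheme easy to parallelize, the following is a minimal Python sketch of an explicit soil heat-diffusion step applied column by column. It is not the WRF implementation: the function names `step_column` and `step_grid`, the uniform layer thickness, the insulated lower boundary, and the default conductivity and heat capacity are all assumptions chosen for illustration.

```python
def step_column(temps, surface_flux, dt, dz=0.1,
                conductivity=1.0, heat_capacity=2.0e6):
    """Advance one soil column by one explicit time step.

    temps         layer temperatures in K, top layer first
    surface_flux  net downward energy flux into the top layer (W m^-2)
    dt            time step in s
    dz            layer thickness in m (assumed uniform here)
    conductivity  thermal conductivity in W m^-1 K^-1 (illustrative value)
    heat_capacity volumetric heat capacity in J m^-3 K^-1 (illustrative value)
    """
    n = len(temps)
    # Downward heat flux at each layer interface; index 0 is the surface.
    flux = [surface_flux]
    for k in range(n - 1):
        flux.append(conductivity * (temps[k] - temps[k + 1]) / dz)
    flux.append(0.0)  # assumed insulated lower boundary
    # Each layer warms by the difference between incoming and outgoing flux.
    return [temps[k] + dt * (flux[k] - flux[k + 1]) / (heat_capacity * dz)
            for k in range(n)]


def step_grid(grid, surface_fluxes, dt):
    """Advance every column of a horizontal grid by one time step.

    Each column reads only its own state and forcing, so the iterations
    are independent and could be mapped one column per GPU thread.
    """
    return [step_column(column, flux, dt)
            for column, flux in zip(grid, surface_fluxes)]
```

Because `step_grid` never reads a neighboring column, changing the forcing at one grid point leaves the result at every other grid point unchanged, which is the property that a one-thread-per-column GPU mapping relies on.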