Estimation of daily average temperature using multisource spatial data in data sparse regions of Central Asia

Abstract The distribution of the daily average air temperature with high spatial resolution is vital for hydro-ecological applications. Air temperature is usually recorded at fixed-point stations, which provide little information on its spatial distribution, and the record suffers from the scarcity and uneven distribution of stations in data sparse regions. In this study, a method based on multisource spatial data was developed to estimate the spatial distribution of daily average temperature, especially for data sparse regions. In this method, the instantaneous temperature was first retrieved from moderate resolution imaging spectroradiometer data and then transformed to a daily value using transformation equations. Second, the global land data assimilation system air temperature data were spatially downscaled and used to improve the accuracy of the step 1 result at low temperatures. This method was applied in the Ili River basin in Central Asia, and the results were evaluated against observations from two stations and in situ data from a field test site. The results showed that the correlation coefficient varies from 0.90 to 0.94 and the root mean square deviation is ∼3°C, indicating that the generated temperature matched the observations well. This suggests that the method is an alternative for data sparse regions.

Introduction
Air temperature is one of the most important parameters for a wide range of applications in ecology and hydrology. Near surface air temperature with varying spatial and temporal resolution is often required in many hydro-ecological modeling techniques. The values of surface air temperature are now usually recorded at a fixed point at stations, which provide little distribution of air temperature over regions. However, spatial distribution of air temperature is vital for the regional hydro-ecological modeling.
Researchers have explored various methods to generate spatial air temperature. Some researchers like Peterson et al., 1 Anderson, 2 and Florio et al. 3 used spatial interpolation to estimate the spatial distribution of air temperatures. However, the reliability of interpolation results greatly depends on the density of stations and the number of observations; thus, these methods are of limited use over large areas, especially in data sparse regions.
Other researchers have used remote sensing data to derive spatial distribution of air temperatures because the observations from satellites can provide high temporal-spatial distribution information of the underlying surface, which can be used to derive the air temperature. Land surface temperature (LST), which can be obtained through the split-window technique, is usually used to derive the near surface air temperature. [4][5][6] Specific algorithms have been developed for different sensors, such as advanced very high resolution radiometer, moderate resolution imaging spectroradiometer (MODIS), and Meteosat to get LST. [7][8][9][10] At present, the methods used to derive near surface air temperature from LST can be divided into three main categories: the statistical approach, [11][12][13][14] the temperature-vegetation index (TVX) approach, [15][16][17][18][19] and the energy-balance approach. [20][21][22] The statistical approach is usually based on the linear regression between observed air temperature and LST, which is relatively simple and commonly used. Air temperature data can be derived with high accuracy in the regions where the methods or the relationship were established. To enhance the physical basis of the statistical methods, many researchers have attempted to introduce more parameters reflecting the effects of various factors (seasons, vegetation, topography, etc.) when the relationship was built. 23,24 However, these methods suffer from the dependence on the calibration data, which limits the application of these methods, especially in data sparse areas.
The TVX approach is based on the assumptions that a linear and negative relationship exists between LST and normalized difference vegetation index (NDVI) and that the air temperature is equal to the radiometric temperature in a fully vegetated area. This method involves only two factors (LST and NDVI) to retrieve the air temperature and seems easy to implement, but it has limitations as well. Many factors like solar radiation, local topography, and seasonality are overlooked in this approach, and only NDVI is taken into account besides LST, so this approach is sometimes inadequate when the NDVI does not work well. 25 Energy-balance methods are physically based. In these methods, the physical processes of energy transportation and transformation are considered, instead of using only empirical or statistical relationships. However, energy-balance models require soil, canopy, and atmospheric data that are often unavailable over large areas.
In summary, the methods mentioned above face challenges in deriving the spatial distribution of temperature over large areas, especially in data sparse regions, because of their strong dependence on station calibration data, their limited consideration of underlying surface conditions, or the difficulty of acquiring the required data. In addition, the temperature derived using remote sensing is often the instantaneous value at the time of the satellite overpass, while the daily value is required in many hydrological models.
In this study, a new method combining multisource spatial data was developed to estimate the daily mean near surface air temperature at 2 m height (T 2m ), especially for data sparse regions. In this new method, the Klemen algorithm created by Zakšek 24 was used to first obtain the instant T 2m values on the basis of MODIS data products. The Klemen method was adopted for its consideration of various radiation and underlying surface factors and its access to remote sensing data. The instant T 2m from Klemen method was transformed to the daily mean temperature value by introducing the National Center for Atmospheric Research/National Centers for Environmental Prediction (NCAR/NCEP) reanalysis temperature data. The derived daily mean T 2m was finally adjusted with the downscaled temperature data from global land data assimilation system (GLDAS) to achieve higher accuracy at low temperatures.

Study Area and In Situ Measurement
The temperature data inversion in this study was a part of the hydrological modeling in ungauged regions in Central Asia. This study was carried out in the region contributing runoff to the Ili River (an international river lying between China and Kazakhstan, Fig. 1), covering an area of 115,000 km². The outlet of the study area is the same as the outlet of the Kapchagay Reservoir. The precipitation in this area varies greatly with topography: it can reach up to 1000 mm/year in the mountains compared with 200 mm/year in the valley. The temperature also changes extensively because of the great differences in altitude.
Daily average temperature data from meteorological stations (at Yining and Zhaosu, located in the plain and mountainous areas, respectively) were collected from 2005 to 2009, corresponding to the remote sensing-based temperature simulation and hydrological modeling period.
These data were collected to validate the derived daily average temperature. Note that the daily average temperature derived using the Klemen method based on multisource spatial data refers to a height of 2 m above the surface, 24 while the observations from Chinese meteorological stations are recorded at 1.5 m above ground according to the "Technical regulation of meteorological observations" published by the China Meteorological Administration. The derived daily average temperature was therefore transformed to the height of 1.5 m above ground before it was validated using observations. A field experiment was carried out at a test site (81.10°E, 42.78°N and 2017 m altitude) near Aheyz, in the southern part of the study area (Fig. 1). The instantaneous air temperatures were recorded using the automatic meteorological station HOBO and are available at 4 min intervals from August to November in 2009 (see Table 1 for details on the instrumentation for the temperature measurements). The instantaneous temperature recorded by HOBO was used to test the simulated instantaneous temperature from the Klemen method, and the daily mean values from the HOBO records were used to validate the daily mean values derived from the Klemen method after temporal transformation and from the downscaled GLDAS temperature. The HOBO measurements were therefore taken 2 m above the ground to be consistent with the height described in Zakšek. 24

Spatial Data
Multisource spatial data platform products were used in this study to estimate daily mean values of near surface air temperatures (T 2m ), including the MODIS, GLDAS, Shuttle Radar Topography Mission (SRTM), and NCAR/NCEP products. These data products were selected for their global coverage and common use in various research fields, both of which may offer this study wider application in other regions.  Three MODIS products at 1 km spatial resolution (version 5) were used to derive the instant T 2m : (1) 8-day albedo (MCD43B3), (2) 16-day vegetation index (MOD13A2), and (3) daily land surface temperature and emissivity (MOD11A1).
NCAR/NCEP near surface air temperature data were used in this study to transform the instant T 2m to a daily mean value. [26][27][28] This NCAR/NCEP dataset has a spatial resolution of 1.875 deg and a temporal resolution of 6 h, so there are four instant values (UTC 00:00, 06:00, 12:00, and 18:00) for each day.
GLDAS provides the near surface air temperature data with a spatial resolution of 0.25 deg and a temporal resolution of 3 h. 29 These data were then spatially downscaled to 1 km using the SRTM elevation data on the basis of an assumed temperature lapse rate. The downscaled results were then used to adjust the MODIS retrieval results.

Methodology
Daily average air temperature data were simulated and compared with the observations from two weather stations and a field observation site. For this comparison, two processes were applied: (1) the instant temperature value when the satellite (Terra) passed above was derived first from MODIS products and then was transformed to daily mean value using a statistical method presented in this study; (2) the derived daily average air temperature from step (1) was modified by replacing the simulations at low temperatures with downscaled GLDAS air temperature to achieve better accuracy when simulating low temperatures. The daily average air temperature data from GLDAS with coarse spatial resolution were downscaled to the resolution of 1 km using the relationship between altitude and temperature.

Instantaneous temperature retrieval using the Klemen method
The method developed by Zakšek 24 was used to obtain instant air temperature values at a height of 2 m. This method [Eq. (1)] was adopted because of its consideration of various effects including season, solar radiation, and underlying surface conditions. The Klemen method accounts for more physical processes compared with the simple statistical methods mentioned above, and all the parameters required for this method can be prepared using multisource spatial data.
where T 2m is the instantaneous temperature at 2 m height in K; LST is the instantaneous daytime land surface temperature from MODIS MOD11A1 in K; z is the solar zenith angle in rad; α is the solar azimuth measured from the south in rad; AL is the surface albedo; i is the solar incident angle in rad; s is the slope in rad; R s ↓ is the down-welling surface short-wave radiation flux in W/m²; and Δh is the difference between the pixel elevation and the mean elevation within the vicinity of 20 km in km.
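The relative elevation term Δh defined above can be sketched as follows. This is a minimal illustration, not the original implementation: it assumes a regular DEM grid in km, approximates the 20 km vicinity by a square moving window that is truncated at the grid edges, and uses the hypothetical names `relative_elevation` and `radius_px`.

```python
import numpy as np


def relative_elevation(dem_km, radius_px):
    """Pixel elevation minus the mean elevation of the surrounding window.

    dem_km    : 2-D array of elevations in km (assumed regular grid).
    radius_px : half-width of the square window in pixels, e.g. 20 for a
                20 km vicinity on a 1 km grid (assumption: square window).
    """
    dem = np.asarray(dem_km, dtype=float)
    rows, cols = dem.shape
    out = np.empty_like(dem)
    for r in range(rows):
        for c in range(cols):
            # The window is clipped at the grid border.
            window = dem[max(r - radius_px, 0): r + radius_px + 1,
                         max(c - radius_px, 0): c + radius_px + 1]
            out[r, c] = dem[r, c] - window.mean()
    return out
```

Positive values mark pixels that stand above their surroundings (ridges), negative values mark valleys. The double loop is slow on a full 1 km grid; a production version would more likely use a filtered or integral-image mean.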
The key factor in this step is the accuracy and representativeness of the LST. As described in the spatial data section, the LST data used in this study were instantaneous values from the MODIS product (MOD11A1). The LST contained in the MODIS MOD11A1 dataset was generated using the generalized split-window method proposed by Wan and Dozier 30 on the basis of MODIS thermal infrared bands 31 and 32. According to Wan and Dozier, 30 the atmospheric effects in the LST retrieval were corrected on the basis of differential absorption in adjacent thermal infrared bands rather than on absolute atmospheric transmission in a single band, so the retrieval is less sensitive to uncertainties in the optical properties of the atmosphere, and no profiles of atmospheric water vapor and temperature were needed. The accuracy of the MODIS LST has been discussed by many authors, and the nominal accuracy is considered to be ±1 K. [30][31][32] Because this study was a part of the hydrological modeling in ungauged regions and was designed to provide daily average surface air temperature estimates for distributed hydrological models, especially in data sparse regions, no further atmospheric correction was applied to the MODIS LST data.

Time-scale transformation to daily average value
The temperature obtained directly from the Klemen method is the instantaneous value when the satellite (Terra) passes by, while the daily mean value is required in most hydrological models. The temporal scale of the simulated temperature must be transformed before it can be used for hydrological modeling. In this study, the time-scale transformation was based on the statistical relationship between the instant temperature and the daily mean value. Unlike in the research carried out by Colombi, few weather stations exist in the study area, so an insufficient number of observations are available to establish the relationship between instant temperature and daily mean temperature. The NCAR/NCEP reanalysis temperature data were introduced to enhance this relationship.
As described above (in Sec. 2.2), the NCAR/NCEP reanalysis temperature data provide four instantaneous temperature values in the study area (UTC 00:00, 06:00, 12:00, and 18:00) each day. The local time in the study area is ∼5 h ahead of UTC, which means that UTC 06:00 is approximately 11:00 in the study area, very close to the time the satellite (Terra) passes this region (the Terra overpass time varies in local time from about 10:00 a.m. to about 12:00 noon). Under two simple assumptions, the daily average temperature can be derived: (1) a strong relationship exists between the NCAR/NCEP 06:00 temperature and the daily mean NCAR/NCEP temperature and (2) the same relationship exists between the derived instantaneous Klemen T 2m and the simulated daily mean temperature. Figure 2 shows the 10 NCAR/NCEP pixels of the study area; the relationship between the instantaneous temperature value (UTC 06:00) and the daily mean value (the mean of the four NCAR/NCEP instantaneous values) was established for each pixel [Eqs. (2) to (11)].
Fig. 2 Location of the National Center for Atmospheric Research/National Centers for Environmental Prediction pixels over the study area.
Good correlation between the instantaneous NCAR/NCEP value (UTC 06:00) and the daily mean value (obtained from the NCAR/NCEP instantaneous values) can be seen in Eqs. (2) to (11), each of which gives the linear relationship between the NCAR/NCEP 06:00 instantaneous temperature and the daily mean NCAR/NCEP temperature within one NCAR/NCEP pixel. The R 2 of these statistical equations varies from 0.96 to 0.98 and most RMSE values are <2°C, supporting assumption 1 in all 10 NCAR/NCEP pixels.
Though assumption 2 was not verified directly like assumption 1, it is supported indirectly by the following analysis. The NCAR/NCEP reanalysis data were generated through a data assimilation system using multisource observations with advanced quality control and monitoring components. [26][27][28] They are widely used in the research and climate monitoring communities. The NCAR/NCEP air temperature values in this study were therefore assumed to be adequate to represent the overall conditions within NCAR/NCEP pixels, and the temporal scale transformation equations [Eqs. (2) to (11)] from the statistical analysis of NCAR/NCEP data were assumed to be reasonable and representative within NCAR/NCEP pixels. As described above, the time of the derived instantaneous T 2m was close to the time of the NCAR/NCEP 06:00 temperature, so the temporal scale transformation equations between the NCAR/NCEP 06:00 temperature and the daily mean value established under assumption 1 can be applied to estimate the daily mean temperature from the derived instantaneous T 2m .
In addition, the derived instantaneous T 2m data using Klemen method has higher spatial resolution than the NCAR/NCEP pixels, so there are many instantaneous T 2m pixels within an NCAR/NCEP pixel. In this study, the temporal scale transformation equation of each NCAR/NCEP pixel was applied to the smaller T 2m pixels within its coverage.
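The time-scale transformation described above can be sketched as a per-pixel linear fit that is then reused for every fine pixel inside the coarse pixel. This is an illustration under stated assumptions: the coefficients of Eqs. (2) to (11) are not reproduced in this text, so the sample reanalysis values below are invented for demonstration, and the function names `fit_line` and `daily_mean_from_instant` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = a * x + b; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def daily_mean_from_instant(t_instant, coeffs):
    """Apply a coarse-pixel relationship to one fine-pixel instantaneous value."""
    a, b = coeffs
    return a * t_instant + b


# Illustrative (made-up) reanalysis values for one coarse pixel, in deg C,
# ordered as UTC 00:00, 06:00, 12:00, 18:00 for each day.
days = [
    (-12.0, -6.0, -8.0, -11.0),
    (-2.0, 5.0, 3.0, -1.0),
    (8.0, 16.0, 14.0, 9.0),
    (15.0, 24.0, 22.0, 17.0),
]
t06 = [d[1] for d in days]              # predictor: UTC 06:00 value
t_mean = [sum(d) / 4.0 for d in days]   # target: mean of the four values
coeffs = fit_line(t06, t_mean)

# Assumption 2: the same coefficients are applied to the satellite-derived
# instantaneous T2m of every 1 km pixel inside this coarse pixel.
t_daily = daily_mean_from_instant(12.5, coeffs)
```

Fitting one line per reanalysis pixel keeps the calibration independent of station data, which is the point of the approach in data sparse regions.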

New Daily Average Temperature from Downscaled GLDAS Data and Klemen Method
The GLDAS temperature data were used to improve the accuracy of the simulations from Sec. 3.1 at low temperatures. The original GLDAS datasets provide the near surface air temperature with a high temporal resolution of 3 h and a coarse spatial resolution of 0.25 deg. Before adjusting the simulated temperature data based on the MODIS products, the GLDAS temperature data were downscaled to the resolution of the MODIS grid. The downscaling process rested on the assumption that the GLDAS temperature value of each grid cell represents the average condition within that cell, so the GLDAS temperature value of each cell should equal the temperature at the average elevation of the cell. On the basis of this assumption, we introduced the temperature lapse rate to quantify the extent of temperature change with elevation [Eq. (12)]. Along with the assumption described above, the core of this method is the introduction of a digital elevation model (DEM) with higher spatial resolution (1 km) than GLDAS to refine the spatial distribution of the GLDAS temperature values, together with the assignment of a proper temperature lapse rate.
$$\Delta T = \Delta H \times \delta \tag{12}$$
where ΔT stands for the temperature difference between two places A and B, ΔH is the altitude difference between B and A, and δ is the temperature lapse rate.
The lapse rate at which air cools with elevation change varies from ∼0.98°C per 100 m for dry air (i.e., the dry-air adiabatic lapse rate) to ∼0.40°C per 100 m (i.e., the saturated adiabatic lapse rate). 33 In this study, the temperature lapse rate was fixed at 0.56°C/100 m according to previous research carried out in the study area. 34 The downscaling of GLDAS consists of four steps: (1) the original GLDAS daily mean temperature (the mean of the eight 3 h instantaneous temperature values) was calculated; (2) the average elevation at a spatial resolution of ∼25 km, matching the original GLDAS grid size, was prepared; (3) the temperature at sea level with a spatial resolution of 25 km was calculated according to Eq. (12) using data from (1) and (2) and then bilinearly interpolated to 1 km; (4) the final downscaled temperature with a spatial resolution of 1 km was calculated from the result of (3) and the 1 km DEM using Eq. (12). Figure 3 gives the procedure for downscaling.
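The downscaling steps can be sketched as below. This is a minimal illustration under several assumptions: the coarse and fine grids are aligned, the fine grid is an integer multiple of the coarse grid, temperature decreases with elevation at the fixed lapse rate, and the interpolation is a simple bilinear scheme that clamps at the grid edges. The names `block_mean`, `bilinear_resize`, and `downscale` are hypothetical.

```python
import numpy as np

LAPSE_RATE = 0.56 / 100.0  # deg C per metre (0.56 deg C per 100 m, as in the text)


def block_mean(fine, factor):
    """Step 2: average a fine grid over factor x factor blocks.

    Assumes both grid dimensions are exact multiples of `factor`.
    """
    rows, cols = fine.shape
    return fine.reshape(rows // factor, factor, cols // factor, factor).mean(axis=(1, 3))


def bilinear_resize(grid, factor):
    """Bilinearly interpolate a 2-D grid to `factor` times as many rows and columns."""
    rows, cols = grid.shape
    # Fine-cell centres expressed in coarse-cell index coordinates.
    r_fine = (np.arange(rows * factor) + 0.5) / factor - 0.5
    c_fine = (np.arange(cols * factor) + 0.5) / factor - 0.5
    # np.interp clamps outside the data range, so border cells repeat the edge value.
    tmp = np.array([np.interp(c_fine, np.arange(cols), row) for row in grid])
    return np.array([np.interp(r_fine, np.arange(rows), col) for col in tmp.T]).T


def downscale(t_coarse, h_coarse_mean, dem_fine, factor):
    """Steps 3 and 4: reduce to sea level, interpolate, restore fine elevation."""
    t_sea = t_coarse + LAPSE_RATE * h_coarse_mean   # step 3: sea-level temperature
    t_sea_fine = bilinear_resize(t_sea, factor)     # step 3: interpolate to fine grid
    return t_sea_fine - LAPSE_RATE * dem_fine       # step 4: apply fine DEM
```

Interpolating the sea-level field rather than the raw temperature keeps the elevation signal out of the interpolation, so the fine-scale pattern comes entirely from the DEM in step 4.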
Based on the two daily mean air temperature data calculated above, new combined temperature data were generated. According to our simulation results (details in Sec. 4), the daily mean values based on the Klemen method show good heterogeneity of spatial distribution of temperature and perform better over a certain threshold, with these values becoming less accurate below this threshold. The downscaled GLDAS daily mean temperature data were used to replace the values based on the Klemen method under the threshold because of their higher accuracy compared with the station observations at low temperature. However, above the threshold, the values based on the Klemen method were retained because of their ability to represent the spatial heterogeneity of temperatures affected by various underlying parameters, unlike the downscaled temperature data, which are dominated by elevation only. The flow chart of the new daily mean temperature data generation is shown in Fig. 4.

Klemen Method-Based Daily Average Temperature
Fig. 3 Flow chart of the global land data assimilation system (GLDAS) temperature downscaling. T sea and T cell mean the air temperature at sea level and pixel, respectively. H cell is the pixel elevation and δ means the temperature lapse rate.
As described above, the station observations used in this study were recorded at a height of 1.5 m, which differs from the height of the simulated daily average air temperature. So, before the validation was carried out, the simulations were transformed to the height of 1.5 m. The temperature lapse rate in Sec. 3.2 was used for this transformation. The lapse rate of 0.56°C/100 m should be appropriate because it comes from a local study in this region based on weather station observations.
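For scale, the size of this height correction follows directly from Eq. (12) with the fixed lapse rate and the 0.5 m difference between the 2 m and 1.5 m levels. The sign is an assumption here: temperature is taken to decrease with height, so the 1.5 m value is slightly warmer than the 2 m value.

$$\Delta T = \Delta H \times \delta = 0.5\ \text{m} \times \frac{0.56\,^{\circ}\text{C}}{100\ \text{m}} = 0.0028\,^{\circ}\text{C}$$

The adjustment is therefore about three orders of magnitude smaller than the RMSE values reported in the validation.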

Derived instantaneous temperature
The instant air temperature was simulated and validated with the field observations. The recorded instantaneous air temperature using HOBO at Ahyz was selected according to the transit time of the Terra satellite. The simulation results and HOBO observations with comparable time were then used for the accuracy analysis. Because of the data missing from the MODIS products (no values were available in some image pixels), only part of the observations from HOBO during the observation period have matched simulated values (Fig. 5).
The derived instantaneous temperature agrees satisfactorily with the observations. The instantaneous air temperature based on multisource spatial data is highly correlated with the observations: the R 2 reaches 0.73 and the RMSE between the derived and observed air temperature is 2.65°C.
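The two accuracy measures used throughout the validation can be computed as sketched below. The text does not give the exact formulas, so this assumes that R 2 is the squared Pearson correlation from a linear regression of simulated on observed values and that RMSE is the root mean square of the differences; the function names are hypothetical.

```python
import math


def rmse(sim, obs):
    """Root mean square error between simulated and observed values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def r_squared(sim, obs):
    """Squared Pearson correlation between simulated and observed values."""
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov * cov / (var_s * var_o)


# A constant bias leaves R^2 untouched but shows up fully in the RMSE.
obs = [-10.0, 0.0, 10.0, 20.0]
sim = [o - 5.0 for o in obs]
bias_r2 = r_squared(sim, obs)
bias_rmse = rmse(sim, obs)
```

Reporting both measures matters because a systematically biased simulation can still have a high R 2, so the two should be read together.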

Daily average temperature
The derived instantaneous air temperature was transformed to a daily average value and was then validated using the observed daily average temperature from 2005 to 2009 at the Zhaosu and Yining stations; the daily mean temperature observations from HOBO equipment were also used. Figure 6 shows that the daily average temperature after temporal transformation was less accurate than the instantaneous values (Fig. 5). The RMSE and R 2 at the Zhaosu station during the simulation period are 4.9°C and 0.91, respectively. These two indexes performed better at the Yining station, where the RMSE is 4.19°C and the R 2 is 0.94. When validated using the observed HOBO daily average air temperature, the results based on the Klemen method perform well, with an R 2 of 0.79 and an RMSE of 2.29°C. According to Fig. 6, the derived daily average temperature was more consistent with the observations at high temperatures. The simulated daily average temperature was lower than the actual value when the temperature was low, and the difference increases as the temperature decreases. The threshold below which this underestimation appears is ∼0°C.
Fig. 4 The flow chart of daily average temperature estimation using the combined method.

New Daily Average Air Temperature
Similarly, the observed daily average temperature from 2005 to 2009 at the Zhaosu and Yining stations, and the observations from HOBO were also used for downscaled GLDAS daily average temperature validation. The results are shown in Fig. 7.
As shown in Fig. 7, the downscaled GLDAS daily average temperature performed well at both weather stations. According to the regression analysis, the R 2 at the Zhaosu station is 0.96 and the RMSE is 2.27°C during the simulation period, while the R 2 and RMSE at the Yining station are 0.96 and 2.73°C, respectively. According to the validation at the HOBO location, the downscaled GLDAS data have an accuracy similar to that of the data from the Klemen method. The R 2 and RMSE are 0.86 and 2.26°C, respectively.
The new daily average temperature data are a combination of the daily mean temperature derived on the basis of remote sensing and the downscaled GLDAS data. Through the analysis of Fig. 6, 0°C was selected as the threshold value mentioned in Sec. 3.3 to determine whether the simulated daily mean temperature using the Klemen method should be replaced with the GLDAS downscaled value. The new daily average temperature shows significant improvement at low temperatures compared with data obtained using the Klemen method (Fig. 8). During the simulation period, the R 2 between the combined daily average temperature and the observations is 0.90 at the Zhaosu station and 0.94 at the Yining station. The RMSE is 3.38°C at the Zhaosu station and 3.13°C at the Yining station.
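The replacement rule can be sketched in a few lines. This is an illustration that assumes the threshold test is applied to the Klemen-based value itself (the text does not state which field is tested) and that both grids share the same 1 km layout; the function name `combine` is hypothetical.

```python
import numpy as np


def combine(t_klemen, t_gldas, threshold=0.0):
    """Keep the Klemen-based value at or above the threshold, else use GLDAS.

    Both inputs are daily mean temperatures in deg C on the same grid.
    """
    t_klemen = np.asarray(t_klemen, dtype=float)
    t_gldas = np.asarray(t_gldas, dtype=float)
    use_gldas = t_klemen < threshold  # assumption: test on the Klemen value
    return np.where(use_gldas, t_gldas, t_klemen)


merged = combine([5.0, -3.0, 0.0], [4.0, -1.0, 1.0])
```

Because the switch is a hard threshold, neighbouring pixels just above and just below 0°C can come from different sources, which may leave a visible seam in the merged field.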

Klemen Method-Based Daily Average Temperature
The instantaneous temperature simulated on the basis of multisource spatial data using the Klemen method performs better than the daily average value derived by temporal transformation. Two factors can possibly explain the obvious decrease in accuracy: temporal transformation and the characteristics of the Klemen method.
The relationship between the instantaneous temperature at the satellite transit time and the daily mean value was assumed to be linear in this study. This simplification will certainly introduce uncertainties. The relationship established using NCAR/NCEP data represents the aggregate conditions within each NCAR/NCEP grid cell, so error will also be introduced when this relationship is applied to pixels with higher spatial resolution. However, considering the high correlation between the temperature at the transit time and the daily average value described in Sec. 3.1.2, the high accuracy of the simulations validated using HOBO observations at both instantaneous and daily time scales, and the obvious underestimation shown in Fig. 6 for low temperatures, the temporal transformation should not be primarily responsible for the decrease in accuracy in the daily average temperature simulation.
Equation (1) in Sec. 3.1.1 indicates that many parameters were included in the Klemen method to depict the influence of various factors, such as radiation, season, vegetation, and topography. The air temperature derived on the basis of this method shows sufficient spatial distribution details, and the spatial pattern of the air temperature depending on the underlying surface conditions is more reasonable (see Sec. 5.3). The Klemen method was proposed by Zakšek 24 using observations (from May to December 2005) from dozens of meteorological stations in Slovenia, Germany, and France. Due to the warm and humid climate in Europe, most of the observations from these stations in 2005 fell into the temperature span above 0°C. Thus, Eq. (1), which is based on these observations, may be innately more suitable for use at high temperatures. The simulated instantaneous air temperature and daily average air temperature validated using HOBO observations [Figs. 5 and 6(c)] have high accuracy, which may benefit from the field observation period between August and September, when the temperature is high. The climate conditions in the study area are different from those where Eq. (1) was established; the temperature span is wider and more low temperatures (<0°C) appear, which may mean that the regional LST and NDVI in the study area no longer fit the relationship calibrated in Europe at low temperatures.
In this study, no further atmospheric correction was applied to the LST data from MODIS products. Atmospheric effects such as air moisture were partly removed or corrected when the LST was generated using the generalized split-window method; the accuracy of MODIS LST is ∼±1 K. Air moisture, which decreases at low temperatures, will certainly influence the simulation accuracy, but it should not be the main cause of the low accuracy of this method at low temperatures (the overall RMSE of the daily average simulation is >4 K, Fig. 6).
In addition, in this study, the transformation equations obtained at coarse spatial resolution from NCAR/NCEP data were applied to the smaller, MODIS-sized T 2m pixels within each NCAR/NCEP pixel. Uncertainties are introduced in this procedure because the statistical relationship between instantaneous air temperature and the daily average value varies with spatial resolution. This approach was adopted for its simplicity and feasibility in data sparse areas, where transformation equations with high spatial resolution are difficult to establish and computationally expensive. In further study, transformation equations that are less dependent on spatial resolution are needed.

Downscaled GLDAS Daily Average Temperature Data
The downscaled GLDAS daily average temperature performed better at two validated weather stations than the data based on the Klemen method [Figs. 7(a) and 7(b)]. The RMSE between the downscaled data and the observation is within 3 K, while the RMSE is >4K in Figs. 6(a) and 6(b). Though the downscaled data have better accuracy at the stations, this improved accuracy did not equate to higher data quality. The downscaling process was based on the temperature lapse rate and the assumption of a relationship between the air temperature and the elevation (see Sec. 3.2). During this process, no factors other than elevation in this area were taken into account to downscale the GLDAS air temperature data. Without considering the underlying surface complexity, the downscaled GLDAS temperature data may face challenges where the underlying surface conditions (water surface, dense vegetation, aspect, and so on) have significant influences. The comparison between Figs. 6(c) and 7(c) shows that the data from downscaled GLDAS and the Klemen method have similar accuracy or the data from the Klemen method perform a little better at high temperatures using HOBO observations during the observation period. It is reasonable to believe that the downscaled daily average air temperature may not be more accurate than the data from the Klemen method for other locations (such as the valley in Fig. 9) when the temperature is high. The reasonable regional spatial distribution and interrelation of parameters are important in the hydrological or ecological applications. The spatial pattern of the downscaled air temperature data was determined only by the elevation, following the variation of DEM. Due to more physically based parameters being introduced into the simulation using the Klemen method, the derived daily average temperature has a better spatial pattern and a better depiction of the distribution, especially at high temperatures.
In addition, although the temperature lapse rate used in the spatial downscaling of GLDAS data and in translating the derived daily air temperature data to the height of 1.5 m above the ground was based on a previous study in this area using meteorological observations, problems still exist. The temperature lapse rate in this study was fixed at a constant value for the whole study area, whereas spatially varying values would be more reasonable, though it is difficult to obtain the spatial distribution of the lapse rate in a data sparse region. The temperature lapse rate is better suited to describing the variation of temperature with elevation at large scales, so it may be too coarse for a temperature transformation over a height difference of only 0.5 m. A more accurate algorithm should be discussed in further study to account for the height difference between the simulated daily average temperature and the observations.

New Daily Average Temperature Data
The new data were generated from the combination of the two daily average air temperatures discussed previously. Following the discussion in Sec. 5.2, the daily average temperature based on the Klemen method was adopted at high temperatures (>0°C) to keep its reasonable spatial pattern and was replaced by the downscaled GLDAS temperature data (see Sec. 3.2) to increase the accuracy at low temperatures (<0°C). At high temperatures, more details about the air temperature distribution are shown in Figs. 9(a) and 9(c), reflecting the influence of various underlying surface conditions. A large reservoir named Kapchagay exists in the study area downstream of the Ili River. The water surface evaporation cools the air above the reservoir, leading to temperatures (dark gray area under the red rectangle) lower than the surrounding temperatures. A large area of the Gobi Desert lies within the study area (the bright parts), where the temperature is obviously higher. Most of the light gray areas in Figs. 9(a) and 9(c) cover well-developed vegetation. Many of these spatial pattern details cannot be seen in the downscaled air temperature distribution [Fig. 9(b)].
At low temperatures [the black areas in Fig. 9(a)], most of which occur on the mountain tops, the data from the Klemen method become less accurate. A useful spatial texture of the air temperature can no longer be observed, and the accuracy of the simulated temperatures clearly decreases. The downscaled GLDAS air temperature data, which carry the DEM texture, were therefore used in these areas.
In this study, the derived daily average air temperature based on the Klemen method was simply replaced with the downscaled GLDAS data at low temperatures to improve the overall simulation accuracy. In the future, weights may be introduced to integrate these two temperature datasets. Different weight values, assigned according to certain rules, would then reflect the contribution of the Klemen method data and of the downscaled GLDAS data, instead of simply replacing one with the other. To achieve better overall simulation accuracy, the coefficients used in the Klemen method can also be recalibrated over a wider temperature span to make the method more suitable for different regions.
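The difference between simple replacement and a weighted integration can be sketched as follows. This is only an illustration under stated assumptions: the Klemen-based value is taken as the variable that decides which regime applies, the linear ramp and its `half_width` of 5°C are hypothetical choices (the text does not specify the weighting rules), and the function names are invented for this example.

```python
def merge_replace(t_klemen, t_gldas, threshold=0.0):
    """Hard switch: keep the Klemen-based value above the threshold (deg C),
    otherwise use the downscaled GLDAS value.
    Assumption: the Klemen-based value decides the regime; the behavior at
    exactly the threshold is not specified in the text."""
    return t_klemen if t_klemen > threshold else t_gldas


def merge_weighted(t_klemen, t_gldas, threshold=0.0, half_width=5.0):
    """Hypothetical weighted blend: the Klemen weight ramps linearly from 0 to 1
    across [threshold - half_width, threshold + half_width]."""
    w = (t_klemen - (threshold - half_width)) / (2.0 * half_width)
    w = min(1.0, max(0.0, w))  # clip the weight to [0, 1]
    return w * t_klemen + (1.0 - w) * t_gldas


# Per-pixel use on two small lists of temperatures (deg C).
klemen = [12.0, 0.0, -10.0]
gldas = [10.0, -2.0, -15.0]
replaced = [merge_replace(k, g) for k, g in zip(klemen, gldas)]
blended = [merge_weighted(k, g) for k, g in zip(klemen, gldas)]
```

With a ramp of this kind the two datasets agree with the replacement rule far from 0°C, while pixels near the threshold receive a mixture of both, which avoids an abrupt seam in the merged map.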
The temporal scale transformation equations used in this study were based on the NCAR/NCEP reanalysis temperature data (details in Sec. 3.1.2). NCAR/NCEP provides four instantaneous values (UTC 00:00, 06:00, 12:00, and 18:00) each day. If the Terra overpass time in a given region is close to any of these four times, the temporal scale transformation equations can be built, and the daily average air temperature derived, in the same way. In this study, the time at which the satellite passes the study area varies from about 10:00 a.m. to about 12:00 p.m. (noon) local time, which is within about an hour of UTC 06:00 (11:00 a.m. local time). Here, "close" means that the difference between the local satellite passing time and one of the four NCAR/NCEP times is <1 h. Figure 10 shows the regions around the world where the method for deriving the daily average air temperature in this study can be used.
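The closeness criterion can be expressed as a small check. The sketch below is illustrative only: it assumes the overpass time is already expressed in UTC decimal hours, and the function names are hypothetical.

```python
# The four NCAR/NCEP reanalysis times, in UTC hours.
SYNOPTIC_HOURS_UTC = (0.0, 6.0, 12.0, 18.0)


def hours_to_nearest_synoptic(overpass_utc_hour):
    """Smallest time difference (h) between an overpass time and any of the
    four reanalysis times, treating the day as circular (23:30 is 0.5 h from 00:00)."""
    diffs = []
    for s in SYNOPTIC_HOURS_UTC:
        d = abs(overpass_utc_hour - s) % 24.0
        diffs.append(min(d, 24.0 - d))
    return min(diffs)


def method_applicable(overpass_utc_hour, max_diff_hours=1.0):
    """True if the overpass is within max_diff_hours of a reanalysis time
    (strict inequality, following the <1 h criterion)."""
    return hours_to_nearest_synoptic(overpass_utc_hour) < max_diff_hours
```

For example, an overpass at 05:30 UTC lies 0.5 h from the 06:00 value and would qualify, whereas an overpass at 09:00 UTC lies 3 h from the nearest value and would not.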

Conclusions
This study shows how the daily average air temperature at 2 m height can be obtained using multisource spatial datasets over a large area, especially in data sparse regions. An integrated method that combines an advanced statistical model (the Klemen method) with temporal transformation and spatial downscaling was proposed. Using the integrated method, air temperature was estimated during the study period with an accuracy of ∼3 K for two stations in the Ili River basin in Central Asia; the portability of this method is also discussed.
The Klemen method provides a depiction of the spatial pattern and distribution of the air temperature by incorporating parameters such as radiation, season, and underlying surface conditions. At high temperatures, this method has high simulation accuracy. One disadvantage of this approach is its less accurate simulation at low temperatures. The Klemen method was developed and validated using observations from meteorological stations in Germany, Slovenia, and France between May and December 2005, when low temperatures (<0°C) seldom occur. The statistical relationship between the air temperature and the parameters developed under such conditions is therefore not suited to application at low temperatures. Further research could recalibrate the coefficients in the Klemen equations over a wider temperature span to make this method more suitable at low temperatures.
The downscaling procedure used in this study is based on the vertical temperature gradient. The original GLDAS temperature data, which have a coarse spatial resolution, were given finer spatial detail by introducing a DEM with higher resolution. Though this procedure does not suffer from inaccuracy at low temperatures, it does not account for other factors influencing the air temperature. For further applications, additional factors related to the season, radiation, and underlying surface conditions should be taken into consideration during the downscaling.
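A minimal sketch of lapse-rate-based downscaling is given below. It is not the exact procedure of Sec. 3.2: it assumes that each fine DEM cell inherits the temperature and mean elevation of its parent coarse cell, the lapse rate of 0.0065 K/m is a generic placeholder rather than the value used in this study, and all function and parameter names are hypothetical.

```python
def downscale_cell(t_coarse, z_coarse, z_fine, lapse_rate=0.0065):
    """Adjust a coarse-cell temperature to a fine-cell elevation:
    T_fine = T_coarse - lapse_rate * (z_fine - z_coarse).
    Elevations in m; lapse_rate in K/m (placeholder value, an assumption)."""
    return t_coarse - lapse_rate * (z_fine - z_coarse)


def downscale_grid(t_coarse, z_coarse, z_fine_grid, lapse_rate=0.0065):
    """Apply the elevation adjustment to every fine DEM cell (nested lists)
    that falls inside one coarse cell."""
    return [
        [downscale_cell(t_coarse, z_coarse, z, lapse_rate) for z in row]
        for row in z_fine_grid
    ]


# One coarse cell at 5 deg C with a mean elevation of 1500 m,
# covered by a 2 x 2 block of fine DEM elevations (m).
fine_dem = [[1000.0, 1500.0], [2000.0, 2500.0]]
fine_temperature = downscale_grid(5.0, 1500.0, fine_dem)
```

Because elevation is the only predictor, the resulting field reproduces the DEM texture and nothing else, which is why additional factors such as season, radiation, and surface conditions would have to be added explicitly.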
The daily average temperature data obtained by the integrated method provided both sufficient spatial distribution details and high simulation accuracy, which were especially valuable because no field observations were used in the simulation. In this study, the integrated data comprised data from the Klemen method after temporal transformation for high temperatures and downscaled GLDAS data for low temperatures. More sophisticated methods can be developed by combining these two datasets in future applications.
According to the discussion, the method proposed in this study can be used to estimate the daily average air temperature in regions where the satellite (Terra) passing time is close to any of the four UTC times provided by NCAR/NCEP; these regions cover about one third of the total global area. The feasibility of this method in other regions of the world needs further study.