Satellite-based thermal infrared remote sensing is the main method used to acquire distributed land surface thermal information. However, the spatial resolution of existing satellite-based remote sensing thermal infrared images is generally low. For example, the spatial resolution of thermal infrared imageries collected by the moderate resolution imaging spectroradiometer (MODIS) and the advanced very high resolution radiometer (AVHRR) is , and the resolution of thermal infrared images collected by the Geostationary Operational Environmental Satellite Imager is 4000 m. Such low spatial resolution significantly restricts the usefulness of these images in drought monitoring and other agricultural and forestry practices,12.3.–4 particularly in mountainous regions with complex landscape. High-resolution images can be obtained through sharpening of coarse resolution images under the premise that most physical information is maintained. Sharpening is an effective method used to solve the problem of low spatial resolution of thermal images.3 Thermal image sharpening, which is also called thermal information downscaling or subpixel temperature estimation, has been an important topic in the international remote sensing field in recent years.126.96.36.199.188.8.131.52.10.11.12.13.14.–15 The two following paragraphs will provide an analysis of previous studies from the study area types and explanatory variables.
The studied areas in previous studies can be classified into agricultural areas (e.g., see Refs. 2, 3, and 5), urban areas (e.g., see Refs. 11, 1617.18.–19), and areas with complex landscapes (e.g., see Refs. 8 and 20). The performance of a certain method in different landscapes was reported.14,21 Most of these areas featured flat terrain. Thus far, studies on thermal image sharpening in mountainous areas have been relatively rare.1 Functioning as complicated energy budget systems, mountainous regions refer to regions that are characterized by a certain altitude, relative height, and gradient.22 Different from other geographical units, mountainous regions have their own laws in weather and climate, including: (1) the temperature, water vapor pressure, solar radiation, rainfall, and other factors vary with the change in altitude.23,24 (2) Solar radiation, wind speed, and other factors vary with the difference in slope aspect and gradient.2324.25.–26 (3) Different terrains exert differentiated aggregating effects on water, mineral substances, plant litter, and other substances.22 (4) Terrain has a significant effect on plant species and growth status.22 The aforementioned influences have direct or indirect effects on the surface energy balance (SEB) in mountainous regions, and incoming and outgoing items of SEB vary with topographical change. As a state variable in the surface energy budgeting process, land surface temperature (LST) in mountainous regions results in more obvious spatial variation than that in flat areas and has intrinsic relationships with more environmental factors.
At the beginning of this research field, disTrad (later known as TsHARP) was the typical thermal sharpening method.27 It used the normalized difference vegetation index (NDVI) as the explanatory variable (the term of “explanatory variable” mentioned in the paper is basically tantamount to “kernal” and “scale factor” involved in similar literature, which have been summarized in Ref. 1). Subsequent scholars improved TsHARP and used the transformed NDVI as explanatory variables [e.g., Refs. 5 and 28 used vegetation coverage (VC) as explanatory variable] or introduced new explanatory variables (e.g., albedo11 and normalized multiband drought index20). Some scholars still used NDVI as explanatory variables but introduced new regression approaches or interpolation methods (such as least-median square regression downscaling,2 regression-kriging,3 the combination of TsHARP and thin plate spline,21 and so on). An emerging tendency for this research field is to use multiple factors as explanatory variables and adopt machine learning algorithm for modeling.8,14 Most previous studies in this field used merely the optical image, without further exploring the role that external data [such as digital elevation model (DEM), meteorological data, microwave data, and so on] play on thermal image sharpening (see the discussion in Ref. 1). As stated earlier, LST in mountainous regions features special laws in spatial variation and is subject to many environmental factors. Viewed from this perspective, it is impossible to satisfyingly sharpen thermal images in mountainous regions if only explanatory factors from previous studies are used. Still further studies should be conducted on issues like the selection of explanatory variables suited to thermal image sharpening in mountainous regions and the ways for spatially depicting these explanatory variables at high resolution.
To improve the effectiveness and accuracy of the thermal sharpening technique in mountainous regions, this study proposed a method based on the geographical statistical model (GSM). Given that LST is the result of land SEB,29,30 it is determined by land surface properties and energy balance together. This study selected various parameters in the SEB process as the explanatory variables. These variables were spatially simulated to generate their raster layers. On this basis, the local adaptation statistical relationship between low-resolution top-of-atmosphere brightness temperature (BT) and explanatory variables, i.e., the GSM, was established through multivariate adaptive regression splines (MARS).31 Then, the GSM was applied to high-resolution explanatory variables, thereby obtaining high-resolution BT image. In this paper, the GSM method was assessed by using it in sharpening a simulated coarse resolution (1026 m) BT image to fine resolution (57 m). A series of experimental designs was made in this study to analyze the influences of explanatory variable combination, sampling method, and residual error correction on sharpening results, as well as their influence mechanisms.
Study Area and Data Source
The study area, which is located west of Oregon in the northwestern United States, is an approximate rectangular region between the west longitude of 123 to 124 deg and the northern latitude of 43 to 44 deg. The elevation of the region is 0 to 1299 m (335 m on average), and the slope degree is 0 to 54.5 deg (15.9 deg on average). The main land cover types (LCTs) in this region include forest, cutover land, grassland, farmland, and water bodies. This region was selected as the study area because (1) it is a typical mountainous region, (2) it has rich LCTs and rather complicated landscape, and (3) sufficient data were available in this region, especially DEM with high spatial resolution and accuracy.
The Landsat enhanced thematic mapper plus (ETM+) remote sensing images used, whose path and row number was P46R30, were generated on September 7, 1999. These images were L1T terrain correction data products downloaded from United States Geological Survey (USGS) Earth Explorer. The resolution of these images is 57 m for thermal band and 28.5 m for visible and near-infrared (Vis-NIR) bands. The thermal band was used to simulate a coarse resolution (1026 m) thermal image (see Sec. 2.2 for details). In the following process, the coarse resolution thermal image would be sharpened back to 57-m resolution, and then the sharpening result would be verified based on the initial 57-m resolution thermal image. Using ETM+ data in the study is justified by the following considerations: first, it provides a resolution (57 m) in the thermal infrared band much higher than most legacy and current satellite-borne sensors do. As to regions having a strong spatial heterogeneity in LCT, a more practical value may be provided if the study effort focuses on how to sharpen the thermal images from a 1000- to 60-m resolution level. Second, from May 2003 to the present, about a quarter of the data in ETM+ images are missing because of a scan line corrector (SLC) failure, but fortunately, the remaining three quarters are accurate. Using these data to evaluate the thermal image sharpening effect creates a new application of these data. (As a basic study, this paper used the ETM+ data before the SLC failure; however, the approach developed in such a way is likewise suitable for post-SLC-failure data).
A DEM, with a spatial resolution of 10 m and a mean square error of , was obtained from the National Elevation Dataset of USGS. Precipitation data 2 weeks before the imaging date and air temperature (AT) and atmospheric precipitable water data on the imaging date were collected from the U.S. National Climatic Data Center of National Oceanic and Atmospheric Administration. The solar radiation data of the stations on the imaging date and the atmospheric data related with solar radiation were obtained from the solar radiation database of the American National Renewable Energy Laboratory (ANREL). The National Land Cover Database (NLCD; Edition 2) of USGS was used as a reference in the classification of remote sensing images.
Preparation of Images for Sharpening
The sharpening object in previous studies can be divided into three classes: digital number (DN) images (e.g., Ref. 6), top-of-atmosphere BT images (e.g., Refs. 11 and 14), and land surface real temperature images (e.g., Refs. 2, 5, and 10). All these classes can be called a “thermal image.” Because there are various retrieval methods from DN or top-of-atmosphere BT to land surface real temperature (e.g., Refs. 3184.108.40.206.–36) and different methods will get different results,37,38 the relatively original data, top-of-atmosphere BT images, are better sharpening objects compared to the land surface real-temperature images. This study used the top-of-atmosphere BT image as a sharpening object.
First, the original DN image of ETM+ Band6 with 57-m resolution was converted into the radiance image at the entrance pupil () according to the gain and bias parameters in its header file. Then, was converted into a top-of-atmosphere BT image (abbreviated as BT57) according to the Planck formula. Finally, the BT image with 1026-m resolution (abbreviated as BT1026) was simulated based on BT57. In the following procedure (see Sec. 2.7), BT1026 would be used as sharpening object. The reasons that we selected 1026 m as the source resolution were as follows: (1) 1026 m is integral multiple of the resolution of BT57, which made the aggregating process relatively easy; (2) 1026 m is approximately equal to the spatial resolution of thermal infrared images collected by the MODIS or AVHRR; via the GSM’s performance in sharpening the simulated 1026-m resolution image, its performance in sharpening MODIS or AVHRR thermal image can be evaluated. The procedure of generating BT1026 was as follows:14 every BT57 pixel aggregates into one pixel with a resolution of 1026 m, and the pixel value was calculated from the values of the corresponding BT57 pixels with the equation:
Selection of Explanatory Variables and Establishment of Their Raster Layers
LST is related to many natural factors, and a good understanding of the intrinsic relationships between LST and its determinants is the basis of downscaling. Since the basic idea of this method was to simulate LST as the result of SEB, the main environmental factors of the SEB were selected as explanatory variables. These environmental factors were judged on the basis of previous studies focusing on SEB.29,30,39,40 In addition, some other explanatory variables were proxy variables, which were used to replace those environmental factors proved by previous studies as directly related to the BT, whereas their raster layers were difficult to obtain. This process could guarantee the simplicity and feasibility of this method. In this study, 13 kinds of environmental factors were selected according to the aforementioned principles. The specific selection basis and the raster layer establishment method of these factors are detailed in Secs. 2.3.1–2.3.6
Total solar radiation
In mountainous regions, the spatial variation of total solar radiation () received by the land surface caused by topographic relief25,26 is a significant cause of the small-scale spatial variation of surface temperature in mountainous regions. For instance, a simulation study based on an ecohydrological model shows that for the climate conditions considered in that study, the incoming solar radiation is one of the major factors controlling LST spatial distribution.41 In addition, because there is a time lag between the surface temperature intraday change and intraday change, the effect of in some period before imaging on thermal image is possibly superior to the effect of simultaneous with imaging on thermal image;42 Ref. 42 shows that under the conditions of their experiment, a time lag of 1 h considerably increases the linear relation between vegetation canopy temperature and local insolation angle. Therefore, in several tiny periods before imaging was selected as an alternative explanatory variable and the simulation method is as follows:
1. Tiny period definition: to facilitate calculation, we divided the period from 8:06:00 to 10:36:00 (i.e., 2.5 h before ETM+ imaging) into 30 tiny periods, i.e., each tiny period is 5 min. The tiny periods were named 5-min-periods. In each 5-min-period, all factors related to solar radiation were assumed to remain unchanged and be the status at the 5-min-period’s midpoint.
2. Astronomical solar radiation (): it was determined whether each DEM grid cell is irradiated directly in a specific 5-min-period(s) with the method indicated in Ref. 43. For each DEM grid cell that can be irradiated directly, was calculated as follows:25 and is the cosine of the insolation angle on a tilted surface, which can be calculated as follows:4325 and is the solar hour angle (rad).
3. Direct solar radiation (): of each grid cell that is irradiated directly in clear sky condition was calculated according to Iqbal C model25 as follows:25. Among critical parameters needed in the calculation, atmosphere moisture content and aerosol optical thickness came from the solar radiation database of ANREL, and vertical thickness of the ozone layer was calculated with latitude and date according to Ref. 44.
4. Scatting radiation: First, solar scattering radiation in the clear sky () was calculated on the assumption of “level ground surface without obstruction around” on basis of Iqbal C model.25 Then, solar scattering radiation in clear sky () was calculated on the condition of “possibly inclined ground surface with potential obstruction around toward sky” according to Ref. 43.
5. : is the sum of and . of six 5-min-periods before imaging was used as alternative explanatory variables, which are denoted as (). In addition, every three adjacent 5-min-periods were taken as one group, and their was summed up generating eight (), which were used as alternative explanatory variables also. In other words, two kinds of with different duration were generated including with the period of 5 min and with the period of 15 min. The initial resolution of and layers was 10 m depending on DEM used. We converted these layers to 57-m resolution layers via the following process: at first, via a bilinear resample method, we converted these layers into 11.4-m ones (11.4, a number close to 10, was obtained by dividing 57 by integer 5). Then, we converted layers of 11.4-m resolution into 57-m resolution layers using aggregation method, which meant that each set was grouped into one pixel, whose average formed the value in their corresponding pixel in the new layers.
Precipitation before imaging and topographic wetness index
Soil moisture affects the plant canopy temperature by influencing the transpiration intensity of vegetation and has a profound effect on the energy distribution ratio to sensible heat flux and latent heat flux.30,39,45,46 Generally, the more moist the soil, the more evaporation removes heat. Therefore, soil moisture is an indispensable factor of thermal image modeling.41 Because it is difficult to simulate soil moisture accurately, precipitation data 2 weeks before the imaging date (PBI) and topographic wetness index (TWI) were used as proxy variables of soil moisture.
Spatial interpolation was conducted to PBI of 37 meteorological stations in the study area and its surrounding areas by using the common Kriging method, thereby obtaining 57-m resolution raster data of PBI.
The 57-m resolution layer of TWI was generated via the following process: first, using the bilinear resample method, a 10-m-resolution DEM was converted to a 11.4-m-resolution DEM. Second, the latter was depression filled. Third, flow direction was calculated using the single-flow direction algorithm47 on the basis of the depression-filled DEM, thereby deriving the flow accumulation (). Thus, TWI could be calculated as follows:2.3.1.
Vegetation index and vegetation coverage
Transpiration, which is the significant expenditure item in the energy balance of vegetation canopy, controls vegetation canopy temperature.29 Transpiration is largely determined by vegetation quantity and flourishing degree. Therefore, the terrain-corrected NDVI (NDVI′) and VC were selected as the explanatory variables. Their computation methods are described in the following.
The original DN image of the third and fourth bands was converted into the radiance image at the entrance pupil. On this basis, the top-of-atmosphere reflectance was calculated. Then, the calibration model was used for topographical correction, as follows:48,49
The raster layer of VC was calculated by using the 57-m resolution NDVI′, as follows:
Emissivity, water vapor content of air column at imaging, and air temperature at imaging
Given that the sharpening object is a top-of-atmosphere BT image, according to the radiative transfer equation,32,38 atmospheric transmissivity, atmospheric downward and upward long-wave radiations, and emissivity () should be considered. Their simulation or proxy methods are as follows:
1. Although atmospheric transmissivity of the thermal infrared band is influenced by many factors (e.g., air pressure, aerosol content, and content of various atmospheric components), its spatial variation mainly depends on the atmospheric water vapor content (AWVC).50 Therefore, the AWVC during imaging was used as the proxy factor of atmospheric transmissivity. The atmospheric precipitable water data of 19 meteorological stations in the study area and its surrounding areas were used to generate the 57-m-resolution AWVC raster data on the basis of the macrofactor regression and residual interpolation method.51
2. The AT at imaging was used to represent the atmospheric downward and upward long-wave radiations indirectly. Based on the AT data of 12 meteorological stations in the study area and its surrounding areas, 57-m resolution AT raster data were generated by using the combined method.52
3. Based on VC, 57-m resolution raster data were generated by using the method introduced in Ref. 37.
Profile curvature, slope degree, and elevation
Profile curvature (PC) affects the acceleration or deceleration of flow across the surface. Many environmental factors, such as soil thickness, litter thickness, and soil humidity, have a certain degree of correspondence with PC.53,54 Slope degree () and elevation (ELEV) were employed as proxy variables of factors, which are related to terrain but unknown currently and their raster data were generated. The 57-m-resolution layers of these three variables were generated via the following process: first, via the bilinear resample method, we converted the 10-m-resolution DEM to a 11.4-m-resolution DEM. Then, we generated 11.4-m-resolution layers for PC and from the 11.4-m-resolution DEM using corresponding tools in ArcGIS. Finally, we converted these layers of 11.4-m resolution to layers of 57-m resolution using aggregation method mentioned in Sec. 2.3.1.
Land cover type
Aside from VC, vegetation index, soil moisture, and emissivity, several other thermodynamic factors of the land surface (e.g., specific heat, soil density, thermal conductivity, surface roughness, and albedo) also affect surface temperature. Unfortunately, the spatial distributions of these factors are difficult to describe accurately. Previous studies proved that there is a significant relationship between these factors and LCT.5556.–57 Therefore, the spatial variation of the LCT is one of the major factors causing the spatial variation of LST and SEB.41,58,59 Reference 41 shows that in addition to solar radiation, the land cover variability is an another major factor causing LST spatial variation. The procedure of generating the LCT raster layer is as follows: first, clustering was conducted with unsupervised classification (iterative self-organization analysis algorithm). Second, the attribute of the unsupervised classification results was judged according to the high-resolution remote sensing images (from Google Earth) and NLCD. As a result, eight LCTs (i.e., evergreen forest, mixed forest, open forest, shrubbery, grassland, farmland, water body, and construction land) were obtained. LCT is a categorical variable, which is unsuitable for scale conversion. Therefore, LCT data were converted to quantitative variables, that is, the area percentage of each LCT in each 57-m resolution pixel; these variables were recorded as , ().
Radiance of visible and near infrared bands
Radiance at the entrance pupil of the first to fifth and seventh bands of ETM+ [, (] could also be used as explanatory variables to determine whether adding into the aforementioned variables could further improve the sharpening effects. So, at-entrance-pupil radiance layers of these six bands were calculated using the parameters of gain and bias in metadata. The layers’ resolution was converted from 28.5 to 57 m using the aggregation method.
Through the aforementioned steps, thirty-eight 57-m-resolution raster layers (including six , eight , PBI, TWI, NDVI′, VC, AWVC, AT, , PC, , ELEV, eight , and six ) of the explanatory variables were acquired. For every 57-m-resolution raster layer of the explanatory variables, every set was divided into one group (corresponding to a BT1026 pixel) for pixel value averaging, thereby obtaining thirty-eight 1026-m-resolution raster layers of the explanatory variables.
Combining Alternative Explanatory Variables
Considering that testing all combinations of 38 variables was laborious and unnecessary, only specific variable groups with particular purposes were tested (Table 1). Among these groups, group 1 was the default explanatory variable group by GSM, and groups 2 to 9 were used to test the influence of cutting or adding explanatory variables on sharpening.
Different combinations of alternative explanatory variables.
|Group No.||Alternative explanatory variables||Description|
|1||, , NDVI′, VC, TWI, PC, ELEV, , , PBI, AT, , AWVC||Basic explanatory variable group|
|2||NDVI′, VC, TWI, PC, ELEV, , , PBI, AT, , AWVC||Subtract solar radiation based on group 1|
|3||, , TWI, PC, ELEV, , , PBI, AT, , AWVC||Subtract NDVI′ and VC based on group 1|
|4||, , NDVI′, VC, PC, ELEV, , , PBI, AT, , AWVC||Subtract TWI based on group 1|
|5||, , NDVI′, VC, TWI, PC, ELEV, , , PBI, AT, AWVC||Subtract based on group 1|
|6||, , TWI, PC, ELEV, , PBI, AT, AWVC||Subtract all variables derived from the remote sensing image based on group 1|
|7||PBI, AT, AWVC, NDVI,||Subtract the variables derived from DEM based on group 1|
|8||, , NDVI′, VC, TWI, PC, ELEV, , ,||Subtract the macro variables based on group 1|
|9||, , NDVI′, VC, TWI, PC, ELEV, , , PBI, AT, , AWVC,||Add radiance at the Vis-NIR bands based on group 1|
Selecting the Sample Pixels
After this process, images of BT and explanatory variables at 1026-m resolution were prepared. All or a part of the pixels in these images could be selected as sample pixels, values of which (i.e., values of BT and explanatory variables) would be used for building the MARS model. The internal homogeneity degrees varied for different sample pixels. Previous research posited that the statistical model built based on samples with high homogeneity degrees is less scale dependent than one built based on samples with low homogeneity degrees.14 To observe the influence of the homogeneity degree of sample pixels on the model for the mountainous region, two groups of pixel samples were selected. The first group was the homogeneous pixel sample (HoPS) group. To select HoPSs, the variation coefficients of values at 57-m resolution corresponding to each pixel at 1026-m resolution were calculated and ordered from high to low for each kind of explanatory variable; if the variation coefficients of all explanatory variables of a pixel at 1026-m resolution were listed out of top 30%, then this pixel was regarded as a HoPS. The second group was the heterogeneous pixel sample (HePS) group, in which all pixels at 1026-m resolution were selected as the pixel samples without considering their homogeneity degrees.
Multivariate Adaptive Regression Spline
MARS is an automatic regression modeling method that can predict an output variable according to a group of explanatory variables. This method excels at finding the complex data structure that often hides in high-dimensional data. The method divides the entire high-dimensional data space into several regions. In each specific region, an independent function is used for fitting. The core formula of MARS model is
In this study, nine groups of the aforementioned explanatory variables (see Table 1) were taken as the dependent variables () to build the MARS models one by one. Given that two sampling methods (i.e., HoPS and HePS methods) were used, 18 MARS models were developed.
Simulation of Thermal Images at 57-m Resolution and Residual Redistributing
The 57-m-resolution explanatory variable raster layers were input into the 18 developed MARS models using the raster calculation tool in GIS software, and then 18 simulation results of BT at 57-m resolution (i.e., sharpening results, written as BT57′ images) were obtained. BT1026 image was used to modify each BT57′ image through the following procedures. First, a BT57′ image was reaggregated into an image at 1026-m resolution (each set of BT57′ pixels were aggregated into a pixel at 1026-m resolution, and its pixel value was the average of the corresponding BT57′ pixels) written as BT1026′. Second, the BT1026′ image was subtracted from the BT1026 image, resulting in , which was then added to the BT57′ image to obtain the result after modification, written as BT57″; this process is called residual redistributing (RR) below. The BT57′ and BT57″ images are the final sharpening results.
Validation of the Sharpening Results
The mean absolute error (MAE) and root mean-square error (RMSE) of the sharpening results (BT57′ and BT57″) were calculated with the BT57 image as a reference to measure accuracy. The MAE and RMSE were calculated as follows:
Evaluation of the Sharpening Effect
As previously mentioned, 18 MARS models and 36 sharpening results (18 BT57′ images and 18 BT57″ images) were obtained in this study. Table 2 lists the basic parameters of these MARS models and the MAEs and RMSEs of the corresponding sharpening results. In Table 2, the sharpening results of group 1, group 3, group 4, and group 9 based on the HePS method have the lowest MAEs and RMSEs. These MAEs before and after RR are as low as about 1.3 and 1.2 K respectively, showing the sharpening results are of high accuracy.
Basic parameters of the MARS models and the sharpening errors. The data before and after “\” are the errors of results of HePS and HoPS method, respectively.
|Group No.||Basic parameter of MARS model||Sharpening error (K)|
|Number of alternative explanatory variables||Number of BFs||Self error of the model||BT57′ image MAE||BT57″ image MAE||BT57′ image RMSE||BT57″ image RMSE|
To measure the sharpening effects of the proposed method, we compared them with the effects of two simple processing methods (bilinear resampling and cubic convolution resampling) that can be regarded as benchmarks for measuring the effect of any thermal sharpening method. The results show that the MAE of both simple processing methods is 1.776 K. The MAE of the best results obtained in this study is smaller than that of the simple processing methods by .
We qualitatively evaluate the sharpening effects by visual inspection. The visual presentation of the various sharpening results is closely correlated to their error. The smaller the error is, the better the visual presentation. To save space, we only present part of the sharpening results of the models constructed based on the first group of explanatory variables (Fig. 1). Comparing Fig. 1(e) with Fig. 1(b) reveals that the sharpening result based on the default explanatory variable group, and the HePS method shows significant improvement in its ability to reflect details of spatial variation of thermal information. Comparing Fig. 1(e) with Fig. 1(c) reveals that this sharpening result is very similar to the reference image. Figure 1(e) is much better than the result of cubic convolution resampling [see Fig. 1(d)]. Figures 1(g) and 1(h) are less similar than Figs. 1(e) or 1(f) to the reference image, indicating that the HoPS method is worse than the HePS method in a mountainous region. Very slight artificial box-like traces are found in specific areas in Fig. 1(f), and rather serious artificial box-like traces are found in Fig. 1(h).
Effect of Sampling on Sharpening
Table 2 shows that in terms of the self errors, the errors of the models built by the HoPS method are smaller than those of the models built by the HePS method. However, in terms of the errors of the sharpening results (MAE and RMSE), the errors of the model built by the HoPS method are larger than those of the models built by the HePS method (before and after RR). The reason behind this condition was the fact that the HoPS method pursued the consistency of 57-m resolution pixels covered by each 1026-m-resolution sample pixel on explanatory variables, and thereby the majority of the extracted sample pixels belonged to the flat region with weak regional representativeness of the large area of rugged terrain. The values of the HoPSs were more concentrated than the overall value distribution. The MARS models built by the HoPS method had a simple model structure (i.e., small number of BFs) and small self error. Nonetheless, given the obvious difference between the value distribution of the HoPSs and the overall value distribution, large errors generated when the HoPS-based models were applied to the entire research region. Thus, the HoPS method cannot adapt to the thermal image sharpening of the mountainous region. The analysis in the subsequent paragraph analyzes only the HePS-based models without considering HoPS-based models.
Effect of Explanatory Variables on Sharpening
Table 2 indicates that the errors of the sharpening effect obtained by different explanatory variable groups significantly varied. The variables in group 1 were all alternative ones [except ], which generated a sharpening result with very low MAE and RSME. Groups 2 to 8 eliminated one or several variables based on group 1. The MAE and RSME change after specific variable(s) was eliminated that could reflect the importance of the eliminated variable(s) to sharpening. The elimination of solar radiation increased MAE by (see group 2), indicating that this variable is valuable for thermal sharpening. Eliminating NDVI′ and VC only slightly influenced MAE (see group 3). The reason behind the case was the condition that another explanatory variable was highly correlated with NDVI′; replaced the explanatory effect of NDVI′ and VC. After the elimination of TWI, MAE did not rise but reduced slightly (see group 4). This finding may be due to the fact that TWI inaccurately reflected the spatial differentiation of soil moisture decided by the terrain and introduced numerous new noises. After was eliminated, MAE rose by (see group 5), implying that these variables are very valuable for thermal sharpening. If all variables derived from images at Vis-NIR bands were eliminated [i.e., NDVI′, VC, , and (see group 6)], then the MAE of the sharpening result would reach over 2 K. This phenomenon suggests that thermal sharpening could not be separated from these variables completely. If all variables reflecting topographic features (see group 7) were eliminated, then MAE would increase by nearly 0.2 K. If three macrovariables (i.e., PBI, AT, and AWVC) were eliminated simultaneously, then the MAEs of BT57′ and BT57″ images would increase by and 0.6 K (see group 8), respectively. The common characteristic of the macrovariables is that they have the spatial variation only on the macroscale and that they are nearly uniform inside each 1026 pixel. As such, the BT variation inside the area should bear no relation to these macrovariables. However, the results of group 8 show the importance of macrovariables to sharpening. The reason behind this event is the case that the loss of macrovariables sharply increased the self errors of the MARS model (see Table 2), which in turn caused large errors in the sharpening result. Group 9 added the Vis-NIR radiance variables based on the variables in group 1, and the MAE of the sharpening results reduced slightly. Nevertheless, group 9 did not get rid of the dependence on synchronously acquired Landsat Vis-NIR images. Vis-NIR radiance changes rapidly with date; hence, the Vis-NIR radiance of the target date cannot be accurately calculated with Vis-NIR images on other dates. This makes such explanatory variable group less feasible in practical applications.
The explanatory variable group built based on the prior knowledge about the energy budge and thermal infrared electromagnetic radiation transmission (i.e., group 1 in this study) could accurately conduct sharpening on thermal image. In practice, choosing the explanatory variables is difficult, especially in regions with complex environmental conditions. This study suggests that in this case, the explanatory variable group should be determined based on the laws of land surface energy budget and thermal infrared electromagnetic radiation transmission, and thereby the blind choice of explanatory variables can be avoided.
Comparison with Previous Methods
In previous studies of thermal imagery sharpening based on statistical models, the NDVI or its transformed form was most commonly used as a scale factor.2,3,5,21,27 Reference 27 performed sharpening of multiple resolutions using the NDVI as the scale factor. As in the current study, that study downscaled the resolution from 1536 to 192 m, and the RMSE of the results was 1.61 K. Reference 5 used four forms of NDVI-based function to sharpen the resolution from 960 to 60 m, and the RMSEs ranged from 1.20 to 3.27 K. Reference 10 used a global model, a resolution-adjusted global model, a piecewise regression model, a stratified model, and a local model to sharpen the resolution from 990 to 90 m, and the RMSEs ranged from 3.24 to 3.68 K. Using an artificial neural network method and 11 explanatory variables including the NDVI, difference vegetation index, and leaf area index, Ref. 8 sharpens LST images; the RMSEs of the results sharpened from 1080 to 90 m were 2.443 K (vegetated area) and 1.573 K (bare area). Using a regression tree method and six Vis-NIR bands as the explanatory variables, Ref. 14 sharpens BT images; the MAEs of the sharpening results in flat areas were generally smaller than 1 K, while large errors were obtained in mountainous regions (960 m to 120 m, MAE 1.67 K).
Previous research has shown that within mountainous regions, it is more difficult to sharpen the thermal imagery.14,15 In the current study, we performed sharpening of the thermal imagery in a typical mountainous region. The errors in the current results can be controlled at a relatively low level. For the model built with the default explanatory variable group of GSM and based on HePS method, the MAEs were 1.258 K (before RR) and 1.134 K (after RR) and the RMSE were 1.985 K (before RR) and 1.871 K (after RR). Thus, we tentatively put forward that the GSM-based method is superior to previous methods for mountainous regions.
For a more accurate comparison with TsHARP, we performed sharpening using five TsHARP-based methods proposed by previous study (an NDVI-based linear function model,5,10,27 an NDVI-based quadratic function model,5 a vegetation cover model,5 a simplified vegetation cover model,5 and a local NDVI-based model10), and the thermal image to be sharpened was BT1026. The MAEs of the sharpening results were 1.631, 1.627, 1.668, 1.668, and 1.591 K, and the RMSE values were 2.223, 2.425, 2.226, 2.226, and 2.214 K, respectively. These MAEs and RMSEs were larger than corresponding MAEs and RMSEs of the sharpening results in the GSM-based method.
Localization of the Model
Statistical sharpening models can be divided into two kinds, namely, global and local models. The global model refers to a single model built by the sample pixels from the entire study area. Thermal image sharpening with a global model assumes that the target variable–explanatory variables relation is uniform across the entire area, but this assumption often contradicts the reality. In the real world, the target variable–explanatory variables relation cannot remain the same globally. If samples are collected in each local region in the form of a sliding window and the special model for each local region is established based on these samples, then the model is a local model. Several studies have shown that for thermal image sharpening, a local model performs better than a global one.14,18
In this study, the statistical models between BT and explanatory variables were established through MARS instead of with the method based on sliding window. In essence, the MARS models in this study are local models. To explain this problem, the model built with the variables in group 1 was chosen as an example. The model is depicted as follows:
Feasibility of Geographical Statistical Model in Practice
The presented results indicate that the GSM method is capable of generating a fine spatial resolution thermal image from a coarse spatial resolution image ( resolution, e.g., the thermal images of MODIS and AVHRR) accurately. However, the following issue needs further discussion. In this study, the NDVI′, , and raster data at 57-m resolution used by GSM were generated based on the ETM+ Vis-NIR data. How can these data be obtained in practical applications? For special regions, most satellite-borne moderate-resolution sensors (e.g., ETM+, Operational Land Imager, and Advanced Spaceborne Thermal Emission and Reflection Radiometer) cannot produce Vis-NIR data every day. Thus, when conducting thermal image sharpening on the ’th day, we may need the Vis-NIR data of moderate resolution sensors on the ’th or ’th day to generate NDVI′, , and . Generally, NDVI′ and are stable within several days, and LCT is stable in a year. Thus, it is feasible to using NDVI′, , and LCT of the ’th or ’th day () to serve as the corresponding variables on the ’th day. Other variables in the default explanatory variable group of GSM are all stable in the time domain (e.g., slope and elevation), or the data of these variables can be observed or simulated every day (e.g., precipitation, AT, and solar radiation). Therefore, it can be concluded that the GSM sharpening method can eliminate the dependence on synchronous imagery at VIS-NIR bands.
It should be noted that the GSM-based method needs raster layers of many variables involved in land surface energy budget and thermal infrared electromagnetic radiation transmission. The establishment of these layers requires sufficient initial data (DEM, meteorological data, and so on) and much calculation time. In terms of data availability and implementation simplicity, several methods proposed by previous studies (e.g., Refs. 3, 5, 14, and so on) are better than the GSM-based method.
The robustness of this method needs further investigations; specifically, future work should address the variation of the importance of explanatory variables in various regions. In addition, some satellite remote-sensing data products (like MODIS water vapor, MODIS atmosphere profile, and TRMM precipitation products) possess well-defined physical meaning; they are likely suitable for acting as explanatory variables and can enhance the thermal sharpening effect. Once these data are used as explanatory variables, they should be deliberately analyzed in respect of spatial resolution, data quality, and the degree to which they synchronize temporally with the sharpened objects.
A sharpening method of thermal image based on GSM was proposed in this study. This method attached more importance to geography law and obtained sharpening results with fewer errors and good visual effects in mountainous areas with complex environment. The GSM-based sharpening method can avoid the blind choice of explanatory variables and reduce the dependence on synchronous imagery at VIS-NIR bands, which makes it more feasible in practice. The method is helpful to the agricultural and forestry studies and practices (e.g., drought monitoring and wildfire danger modeling) in mountainous regions using coarse spatial resolution thermal images (e.g., the images of MODIS and AVHRR). Furthermore, this study found that: (1) the homogeneous sample pixel model is not suitable for sharpening of a thermal image in a mountain area; (2) although macrovariables, such as PBI, AT, and AWVC, can only present spatial variability in macroscale, they are indispensable for thermal image sharpening; and (3) the MARS modeling method is capable of reflecting the locality of the relationship between the target variable and explanatory variables, thereby reducing errors of sharpening results efficiently.
This work was supported by the National Natural Science Foundation of China (No. 41201099) and Postdoctoral Research Funds of Henan University (Grant No. BH2012042). We are thankful to the United States Geological Survey (USGS) for providing the ETM+, the National Elevation Dataset, and the National Land Cover Database. We are also thankful to the National Climatic Data Center of National Oceanic and Atmospheric Administration of US for providing meteorological data. We also deeply appreciate the American National Renewable Energy Laboratory for providing the solar radiation data and the atmospheric data related with solar radiation. We also thank the two anonymous reviewers for their insightful comments, which improved this paper.
Pengcheng Qi received his PhD degree in physical geography from Lanzhou University, Lanzhou, China, in 2009. Since 2010, he has been working at Nanyang Normal University, where he is currently an associate professor with the Laboratory of Remote Sensing Monitoring of Natural Disaster. His research interests include remote sensing digital image processing and applications of thermal infrared remote sensing in ecology.
Shixiong Hu received his PhD degree in physical geography and environmental analysis from State University of New York at Buffalo, United States, in 2004. Since 2004, he has been working in the Department of Geography, East Stroudsburg University of Pennsylvania. His research interests include hillslope processes, environmental modeling with GIS, and soil and water conservation.
Haijun Zhang received the BS degree in geographic information systems and the ME degree in cartography and geographic information engineering from Chang’an University, China. He is currently a lecturer in the School of Environmental Science and Tourism, Nanyang Normal University, China. His research interests focus on the integration of remote sensing and GIS in addressing environmental issues.
Guangmeng Guo received his PhD from Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, in 2004. He is currently a professor with the Laboratory of Remote Sensing Monitoring of Natural Disaster, Nanyang Normal University. His research interests focus on applications of remote sensing in disaster monitoring.