Since the 1970s earth observation remote sensing satellites have been providing an enormous amount of information in the form of images for monitoring various environmental phenomena.1 These images can be categorized into several classes on the basis of their spectral characteristics, such as (i) visible (i.e., in the range 0.4 to ); (ii) near-infrared (NIR: 0.7 to ); (iii) shortwave infrared (SWIR: 1.0 to ); (iv) thermal infrared (TIR: 3 to ); and (iv) microwave (i.e., 3 mm to 3 m). In general, different satellites may acquire images over a particular spectral range; however, these images may differ significantly in both spatial and temporal resolutions. Usually the relatively high spatial resolution images have low temporal resolution, and vice versa.2 For example, the moderate-resolution imaging spectroradiometer (MODIS) and Landsat satellites have similar spectral ranges and also provide consistent surface reflectance34.–5 while they vary in their spatial and temporal resolutions (see Table 1 for details). It is interesting to note that some environmental issues (e.g., agricultural drought, irrigation management, and grassland) require high spatial (e.g., 30 m) and high temporal (e.g., weekly) resolutions67.–8 due to the rapid changes in these applications. In order to address this, a new data fusion research area (called as spatiotemporal data fusion) has emerged during the past several years. These techniques have been applied and successfully predicted synthetic high resolution images for different environmental issues such as vegetation indices,5,9 evapotranspiration,10 urban heat island,11 forest change detection,12 and surface temperature.2 In most of these instances, these fusion techniques have merged Landsat with MODIS images in order to generate synthetic images at the spatial resolution of Landsat (i.e., 30 m) and the temporal resolution of MODIS (i.e., 1 to 8 days).13220.127.116.11.–18 In addition, some researchers have used other data, such as: (i) Landsat images and medium resolution imaging spectrometer (MERIS) having spatial resolution in the range 300 to 1200 m with 3 days temporal resolution;19,20 and (ii) HJ-1 CCD satellite images having 30-m spatial resolution with 4 days temporal resolution and MODIS.21 In general, these techniques can be broadly divided into three groups, (i) the spatial–temporal adaptive reflectance-fusion model (STAR-FM)-based techniques, (ii) unmixing-based fusion techniques, and (iii) sparse representation-based fusion techniques. Some of the example cases are briefly described in Table 2.
Comparison between spectral, spatial, and temporal resolutions of MODIS and Landsat-8 images.
|Satellite||Spectral resolution (in μm)||Spatial resolution (in m)||Temporal resolution (in day)|
|MODIS||Visible (0.45 to 0.67)||250 to 500||1|
|NIR (0.841 to 0.876)||250 to 500|
|SWIR (1.230 to 2.155)||500|
|TIR (10.78 to 12.27)||1000|
|Landsat-8||Visible (0. 43 to 0.67)||15 to 30||16|
|NIR (0.85 to 0.88)||30|
|SWIR (1.57 to 2.29)||30|
|TIR (10.6 to 12.51)||100|
Description of some of the spatiotemporal data fusion techniques implemented over the visible and shortwave infrared spectral bands.
|Source and name of the technique||Description||Comments|
|Gao et al.;13 spatial–temporal adaptive reflectance fusion model (STARFM)||Predicted synthetic images by fusing Landsat and MODIS images. It consisted of three main steps: (i) selecting spectrally similar pixels within a window of interest using Landsat images; (ii) determining a weighting function factor as a function of both Landsat and MODIS images; and (iii) generating synthetic Landsat images at time 2 [i.e., synth-L(t2)] by multiplying the weightage factor defined in step (ii) with the sum of difference between two MODIS images taken at two different times [i.e., M(t2)−M(t1)] and Landsat image taken at time 1 [i.e., L(t1)]||There were two issues, i.e., the window size was variable depending on the study area; and not applicable over heterogeneous land cover types15|
|Zhu et al.;15 enhanced STARFM (ESTARFM)||Modified the STARFM technique proposed by Gao et al.13 in order to apply over heterogeneous land cove types. In this case, they used MODIS images taken at three times [i.e., M(t1), M(t2), and M(t3)] and Landsat images taken at two times [i.e., L(t1) and L(t3)] in order to predict synth-L(t2). The technique had four main steps: (i) selecting the spectrally similar pixels using two Landsat images; (ii) calculating the weighting factor for all similar pixels; (iii) determining a conversion coefficient for each similar pixel by linear regression of Landsat and MODIS images [i.e., L(t1) with M(t1)] and [L(t3) with M(t3)]; and (iv) generating synth-L(t2) by integrating the outcomes of steps (ii) and (iii), and M(t2)||This proposed technique was computationally expensive and unable to predict synthetic images for near real applications as the generation of such images were a type of hindcasting|
|Zurita-Milla et al.;19 unmixing data fusion technique||Used an unmixing data fusion technique based on a linear mixing model. The technique had two main steps: (i) unmixing MERIS time series images using a high-resolution land-use map; and (ii) generating high spatial and temporal synthetic images by assigning the unmixed signals to the corresponding land-use class presented in the central pixel of kxk MERIS neighborhood||The accuracy of this technique would highly depend on the quality of the employed land-use maps|
|Zhang et al.;18 enhanced spatial and temporal data fusion model (ESTDFM)||Generated synth-L(t2) by using MODIS images taken at three times [i.e., M(t1), M(t2), and M(t3)] and Landsat images taken at two times [i.e., L(t1) and L(t3)]. It had four main steps: (i) classifying Landsat images using a patch-based ISODATA classification technique and calculating the abundance of endmembers within a moving window; (ii) unmixing of the three MODIS images using those endmembers; (iii) predicting two synthetic images by calculating the sum of L(t1) and L(t3) with the corresponding difference images of unmixed MODIS images; and (iv) the two predicted images were temporally weighted to generate the final predicted image||This technique eliminated the requirement of a high-resolution land-use map as in the case in Zurita-Milla et al.19 However, it would also suffer from the same hindcasting problem like ESTARFM15|
|Huang and Song;22 sparse-representation-based spatiotemporal reflectance fusion model (SPSTFM)||Used MODIS images taken at three times [i.e., M(t1), M(t2), and M(t3)] and Landsat images taken at two times [i.e., L(t1) and L(t3)] to generate synth-L(t2). The technique consisted of four steps: (i) enhancing the three MODIS images to the equivalent spatial resolution of Landsat image (i.e., 30 m) through a sparse representation technique; (ii) building a dictionary pair using the two counterpart MODIS and Landsat images [i.e., M(t1) and L(t1); M(t3) and L(t3)] and the other MODIS image [i.e., M(t2)]; (iii) predicting the difference image of Landsat data using the learned dictionary pair; and (iv) reconstructing the predicted Landsat image using different weighting parameters||The technique provided better results than the STARFM-based fusion techniques; however, it still had the same hindcasting problem like ESTAFM15 and ESTDFM.18 Furthermore, it was computationally expensive|
|Song and Huang;23 sparse-representation-based spatiotemporal reflectance fusion model (SPSTFM)||Employed sparse representation (including dictionary training and spars coding) in order to enhance the spatial resolution of MODIS images (known as transition images) to the equivalent spatial resolution of Landsat image. Then two transition images taken at t1 and t2 and the Landsat image [i.e., L(t1)] were used to predict the synth-L(t2) by using a high pass modulation technique||This technique used only three input images to predict the synthetic image. Thus, it eliminated the problem of hindcasting of SPSTFM;22 however, it required relatively longer processing time due to training a dictionary|
In general, the above mentioned techniques were complex and required relatively long processing times. In order to address these, we developed a new data fusion technique called spatiotemporal image-fusion model (STI-FM) and demonstrated its effectiveness in enhancing the temporal resolution of Landsat-8 surface temperatures using MODIS data.24 Here, we proposed to evaluate the applicability of the STI-FM with required modifications in generating a synthetic Landsat-8 image [i.e., synth-] for the red, NIR, and SWIR spectral bands by integrating a pair of MODIS images [i.e., and ] and a Landsat-8 image [i.e., ]. The rationale behind choosing these spectral bands were that these had been widely used in the calculation of vegetation greenness and wetness conditions and are effective in monitoring these parameters in short time periods, such as the plants’ growing seasons. Our aim with regard to this paper was to implement this technique over a heterogeneous agriculture-dominant semiarid region in Jordan, Middle East.
General Description of the Study Area
The country of Jordan is located in the Middle East. It is divided into three major geographic areas (i.e., Jordan Valley, Mountainous Heights Plateau, and Eastern Desert) [see Fig. 1(a)]. Our study area [Fig. 1(b)] is located in the northwestern part of the Mountainous Heights Plateau, where the elevation varies in the range 600 to 1100 m above mean sea level. Geographically, it is located between 31°47′N to 32°29′N and 35°37′E to 36°00′E covering approximately [see Fig. 1(b)]. In terms of climate, it experiences Mediterranean climatic conditions with a hot and dry summer (i.e., average temperature with no rainfall during May to August); and cool and wet winter (i.e., average temperature to 7°C with 250 to 650 mm rainfall during November to February); and two transition seasons (i.e., spring during March and April; and fall during September and October). In general, about 70% of the annual evaporation is observed in the dry season (i.e., May–August) with an annual potential evaporation of . In our study area, the major agriculture activities include: (i) agricultural cereal crops (i.e., wheat and/or barley grown between November and July) account for of the area; (ii) orchards (i.e., olives, apple, nectarine, etc.) occupy ; and (iii) grazing and forestry account for .25,26
Data Requirements and Its Preprocessing
In this study, we employed remote sensing data acquired by two satellite systems: (i) Landsat-8 Operational Land Imager freely available from United States Geological Survey (USGS); and (ii) MODIS freely available from National Aeronautics and Space Administration (NASA). For both satellites, we selected the spectral bands of red (i.e., in the range 0.62 to ), NIR (i.e., 0.84 to ), and SWIR (i.e., 2.10 to ). For Landsat-8, we obtained five almost cloudy-free images acquired between June and August 2013 at 30-m spatial resolution. On the other hand, we obtained nine MODIS-based 8-day composite of surface reflectance products (i.e., MOD09A1) at 500-m spatial resolution for the same period. The 8-day composite images would lessen the probability of cloud-contamination of the daily MODIS products.27 The selected dates are presented in Fig. 2.
Preprocessing of MODIS images
The acquired MODIS surface reflectance products were originally provided in sinusoidal projection. We used the MODIS reprojection tool (MRT 4.1)28 to subset the images into the spatial extent of the study area and reproject them into the coordinate system of Landsat-8 images [i.e., Universal Transverse Mercator (Zone 36N-WGS84)]. Then, we coregistered these images using Landsat-8 images to allow for accurate geographic comparisons and to reduce the potential geometric errors (e.g., position and orientation) as effects of spatial miss-registration can influence the derived information. Finally, we also checked the cloud-contaminated pixels by using the quality control band (i.e., 500 m flag; another layer available in the MOD09A1 dataset) and excluded them from further analysis.
Preprocessing of Landsat-8 images
The Landsat-8 images were available in the form of digital numbers (DN). These DN values were converted into top of atmosphere (TOA) reflectance using the following equation illustrated in Ref. 29:
In order to transform the TOA reflectance into surface reflectance, we employed a simple but effective atmospheric correction method that would not require any information about the atmosphere conditions during the image acquisition period. This was done using MODIS surface reflectance images based on the fact that MODIS and Landsat have consistent surface reflectance values.3,4,13,30 This was accomplished in three distinct steps. In the first step, we employed an averaging method over a moving window of (i.e., approximately the equivalent of ) for up-scaling pixels of Landsat-8 images into the spatial resolution of MODIS images (i.e., 500 m). This was done in order to make both the Landsat-8 and MODIS images similar in the context of their spatial resolutions and to increase the spectral reliability.31 In this way, the spectral variance between the images would decrease while the spatial autocorrelation would increase; these were investigated in different studies.3233.34.–35 In the second step, we determined linear relationships between the up-scaled Landsat-8 and MODIS images for each of the spectral bands by generating scatter plots between them. The coefficients of the linear relationships (i.e., slope and intercept) were then used with the original Landsat-8 images (i.e., 30 m) in order to generate Landsat-8 surface reflectance images in the scope of the third step. It would be worthwhile to mention that the use of the Landsat Ecosystem Disturbance Adaptive Processing System atmospheric correction algorithm was not applicable for Landsat-8 due to the absence of climate data records.36 Finally, we employed the Landsat-8 quality assessment (QA) band for determining the cloud-contaminated pixels and excluded them from further analysis.
Figure 3 shows a schematic diagram of the STI-FM framework. It consisted of two major components, (i) establishing the relationships between MODIS images acquired at two different times [i.e., and ], and (ii) generating the synthetic Landsat-8 surface reflectance images at time two [i.e., synth-] by combining the Landsat-8 images acquired at time 1 [i.e., ] and the relationship constructed in the first component and its validation.
In order to establish relations between the two MODIS images [i.e., and ], we performed the following steps:
• We calculated a ratio image [i.e., ] using the pair of MODIS images for each of the spectral bands of interest in order to determine the rate of the temporal change between the two dates.
• We plotted by in order to assess the trends of changes in surface reflectance between the two images; this could be determined using a pixel-wise linear regression analysis.
• Based on the results of the previous two steps, the ratio image was classified into three clusters of land cover change on the basis of assuming that approximately variation (rate of change) in surface reflectance (i.e., albedo) would be common for various natural surfaces (e.g., conifer forests, deciduous forest, agriculture crops, grass, etc.).37,38 These clusters included: (i) negligible changes [i.e., rate of change is within ; ]; (ii) negative change [i.e., ; ]; and (iii) positive change [i.e., ; ]. We considered that this was an important issue that was not taken in consideration in other spatial and temporal fusion models. For each of the three clusters, we produced cluster-specific scatter plots between and , and performed linear regressions (see Fig. 4 for details).
In generating the synthetic Landsat-8 surface reflectance image [i.e., synth-] at 8-day intervals, we employed the Landsat-8 image acquired at time 1 [i.e., ] in conjunction with the classified image and cluster-specific linear regression models derived from the previous steps. In order to perform this combination, we applied different conditional linear functions based on MODIS classified image to assign the surface reflectance value of each pixel in the synthetic image instead of using one linear equation for the entire scene. For example, in order to generate synthetic Landsat-8 image of the day of year (DOY) 185, we first calculated the linear regression between MODIS image on DOY 169 to 176 and DOY 185 to 192, and then we applied the regression coefficients to Landsat-8 LST image of DOY 169. Here, our image-fusion model assumed that the linear relationship between two MODIS images should be applicable and comparable to the linear relationship between the two corresponding Landsat-8 images as they would have consistent values. Therefore, the use of this linear regression would be practical for generating synthetic Landsat-8 images.
Upon producing the synth- images, we evaluated them with the actual images acquired at 16-day intervals as the actual Landsat-8 were only available at every 16 day temporal resolution. In this case, we used two methods: (i) qualitative evaluation that involved visual examination and (ii) quantitative evaluation using statistical measurements, such as coefficient of determination (), root mean square error (RMSE), and absolute average difference (AAD). The equations for these statistical measures are as follows:
Results and Discussion
Evaluation of the Relationships Between MODIS Images Acquired at Two Different Times
Figure 5 shows the relation between 8-day composites of MODIS images acquired in two different dates for the spectral bands of red, NIR, and during the period June 2 to August 12, 2013. It revealed that a strong relation existed for each of the clusters (i.e., negligible change, negative change, and positive change) over all of the spectral bands during the period of observation. For example: the , slope, and intercept values were in the range: (i) 0.85 to 0.95, 0.95 to 1.07, and 0.003 to 0.01, respectively, for the negligible change cluster; (ii) 0.81 to 0.94, 0.77 to 0.94, and 0.0005 to 0.04, respectively, for the negative change cluster; and (iii) 0.79 to 0.91, 1.05 to 1.21, and 0.01 to 0.08 respectively, for the positive change cluster. Regression analysis showed that the negligible change clusters revealed the highest correlation values because no significant changes occurred in the study area during the two 8-day composite MODIS images of interest at 16-day intervals. Note that we were unable to compare our findings as there were no similar studies found in the literature so far. Although the use of 8-day composites of MODIS might reduce the cloud-contamination, it might bring another issue. For example, two consecutive MODIS images might potentially be apart in the range of 2 to 16 days as the 8-day composite were generated based on the minimum-blue criterion,27 which coincided with the most clear-sky day during the composite period of interest. The quantification of the impact of such an 8-day composite against daily data was, in fact, beyond the scope of this paper, but might be an interesting issue for further exploration.
Evaluation of the Synthetic Landsat-8 Surface Reflectance Images
Prior to conducting quantitative evaluations, we performed qualitative evaluations by comparing the actual and synthetic Landsat-8 images. In these cases, we generated pseudocolor composite images by putting the NIR, red, and SWIR spectral bands in the red, green, and blue color planes of the computer; and such an example is shown in Fig. 6. Figure 6 shows the actual Landsat-8 image acquired on June 18, 2013 (DOY 169) [Fig. 6(a)] and its corresponding synthetic image Landsat-8 image [Fig. 6(b)], which was produced using an image pair of Landsat-8 and MODIS acquired in the DOY 153 and one MODIS image acquired in the DOY 169. In fact, we evaluated four different land cover types [i.e., agricultural lands in Figs. 6(a1) and 6(b1); forests in Figs. 6(a2) and 6(b2); water bodies in Figs. 6(a3) and 6(b3); and urban areas in Figs. 6(a4) and 6(b4)]. In general, we observed that the visual clues (e.g., location, shape, size, and texture in particular) were reproduced in the synthetic images with negligible differences in comparison to that of the actual images. However, the tones (i.e., the DN representing the surface reflectance values) had some differences. These might happen due to the use of 500-m spatial resolution MODIS surface reflectance images in calculating the Landsat-8 surface reflectance values at 30-m resolution.
Figure 7 shows the relationship between the actual Landsat-8 surface reflectance images and the synthetic Landsat-8 surface reflectance images for red, NIR, and spectral bands for the DOY 169, 185, 201, and 217. It demonstrated that strong relations existed between the actual and synthetic images for all the spectral bands of interest over the period of study. In the context of linear regression analysis, the , slope, and intercept values were in the range: (i) 0.72 to 0.82, 0.86 to 0.98, and 0.0009 to 0.042, respectively, for red spectral band; (ii) 0.71 to 0.79, 0.80 to 0.87, and 0.166 to 0.0800, respectively, for NIR spectral band; and (iii) 0.78 to 0.83, 0.91 to 0.94, and 0.0096 to 0.0367, respectively, for spectral band. In the context of RMSE analyses, they were: (i) in between 0.003 an d0.004 for red spectral band; (ii) 0.005 for NIR spectral band; and (iii) 0.004 for spectral band. In addition, the AAD values were 0.0002, 0.0003, and between 0.0001 and 0.0002 for the red, NIR, and spectral bands, respectively.
It would be worthwhile to note that our findings were quite similar or even better in some cases compared to other studies. For example: (i) Gao et al.13 implemented STARFM over boreal forest and obtained AAD values of 0.004, 0.0129, and 0.0078 for red, NIR, and spectral bands, respectively; (ii) Roy et al.14 applied a semiphysical fusion model over two study sites in the United States (Oregon and Idaho) and got AAD values of 0.015, 0.22, and 0.28 for Oregon site for red, NIR, and spectral bands, respectively; (iii) Zhu et al.15 applied ESTARFM over heterogeneous regions and achieved AAD values of 0.0095 and 0.0196 for red and NIR spectral bands, respectively; (iv) Walker et al.5 used STARFM to generate synthetic Landsat ETM+ surface reflectance images over dry-land forests; and found that the values were 0.85 and 0.51 for red and NIR spectral bands; (v) Song and Hang23 employed a sparse representation-based synthetic technique over boreal forests and found values of 0.71 and 0.90; RMSE values of 0.02 and 0.03; and AAD values of 0.01 and 0.21 for red and NIR spectral bands, respectively; and (vi) Zhang et al.18 used ESTDFM and observed values of 0.73 and 0.82, and AAD values of 0.009 and 0.0167 for red and NIR spectral bands, respectively. It is also important to mention that the proposed model would be applicable for other satellite systems that would have similar spectral and orbital characteristics, such as other Landsat series, MODIS, MERIS, and ASTER. In addition, it is also interesting to point out that the model is a relatively simple and easily reproducible approach, which might satisfy most of the user’s needs; such a simple and less sophisticated method might be the most suitable for different applications. Although our results demonstrated strong relations between actual and synthetic Landsat-8 images, some issues would be worthwhile to consider for further improvements, such as:
• In this study, we used MODIS surface reflectance images at 500-m spatial resolution. However, it would be possible to use such images acquired at 250-m spatial resolution in the case of red and NIR spectral bands in particular, which might enhance the quality of the synthetic image.31
• One of the major requirements for the input images [i.e., , , and ] was to be free from cloud-contamination. However, it might not be possible to have images completely free from such contamination. In such events, we might use the cloud-infilling algorithm described in Chowdhury and Hassan39 that required images acquired at current and previous dates.
• Although the use of MODIS surface reflectance products to generate Landsat-8 surface reflectance images led to the prediction of reasonable synthetic Landsat-8 surface reflectance images, it would be interesting to use climate data records and compare its outcome with the method adopted here. However, such climatic records were not available for Landsat-8 images at the time of this study.40
In this study, we demonstrated the applicability of the STI-FM technique for enhancing the temporal resolution of Landsat-8 images from 16 to 8 days using 8-day MODIS based surface reflectance images and demonstrated its implementation over heterogeneous agriculture-dominant semiarid region in Jordan. Our results showed that the proposed method could generate synthetic Landsat-8 surface reflectance images for red, NIR, and SWIR spectral bands with relatively strong accuracies (, RMSE, and AAD values were in the range 0.71 to 0.83; 0.003 to 0.005; and 0.0001 to 0.0003, respectively). In general, our method would be considered as a simple one because it would not require any correction parameters or high quality land-use maps in order to predict the synthetic images. Despite the accuracy and simplicity, we would recommend that the proposed method should be thoroughly evaluated prior to adoption in other environmental conditions except for semiarid regions like ours.
We would like to thank Yarmouk University in Jordan for providing partial support in the form of a PhD scholarship to Mr. K. Hazaymeh; and the National Sciences and Engineering Research Council of Canada for a Discovery grant to Dr. Q. Hassan. We would also like to thank USGS and NASA for providing Landsat-8 and MODIS images free of cost.
Khaled Hazaymeh received his BA degree in geography and spatial planning from Yarmouk University, Jordan, in 2004 and his MSc degree in remote sensing and GIS from the University Putra Malaysia in 2009. He is currently pursuing his PhD degree in earth observation in the Department of Geomatics Engineering at the University of Calgary, Canada. His research interests focus on environmental modeling using remote sensing techniques.
Quazi K. Hassan received his PhD degree in remote sensing and ecological modeling from the University of New Brunswick, Canada. He is currently an associate professor in the Department of Geomatics Engineering at the University of Calgary. His research interests include the integration of remote sensing, environmental modeling, and GIS in addressing environmental issues.