Estimation of soil moisture using a vegetation scattering model in wheat fields

Abstract. We develop a vegetation scattering model to eliminate the effect of vegetation and surface roughness on the radar signal. The canopy water content is an important variable associated with the scattering effect of vegetation, and it can be calculated using the leaf area index, which is retrieved from PROSAILH optical model based on Landsat-8 images. The scattering model introduced the direct scattering contribution of underlying ground into the water cloud model. The experimental correlation length was replaced by a fitting parameter from C-band RADSARSAT-2 radar data to calculate the scattering contribution of underlying ground. Results demonstrate that the vegetation scattering model has a good performance in soil moisture retrieval with R2 of 0.805 and root-mean-square error of 0.039  m3  ·  m  −  3. The applicability and capability of the scattering model will provide the operational potential of C-band radar data for soil moisture retrieval in an agricultural region.


Introduction
Synthetic aperture radar (SAR) has great potential for providing valuable insight in soil moisture retrieval irrespective of meteorological conditions. Increasing interest has been drawn because of its outstanding sensing capacities in the presence of persistent cloud cover and extraordinary sensitivity to soil dielectric constant. [1][2][3] Several radar backscattering models have been developed to simulate the radar signal. Sensor configurations (e.g., incidence angle, frequency, and polarization), soil properties (e.g., soil moisture, texture, and surface roughness), and vegetation parameters [e.g., normalized difference vegetation index (NDVI), leaf area index (LAI), biomass, and canopy water content] play an important role in these models. Over bare soil, soil moisture is estimated using semiempirical models (e.g., Oh, 4-7 Dubois et al., 8 and Shi et al. 9 model) or physical models [e.g., integral equation model (IEM) 10 and advanced IEM (AIEM) [11][12][13][14]. However, measuring the surface roughness parameters accurately in the field experiment is rather difficult. 15 Baghdadi et al. [16][17][18][19] proposed an empirical calibration of IEM by replacing the correlation length with a fitting parameter in C-and X-bands. The fitting parameter is called calibrated or optimal correlation length, which depends on polarization, incidence angle, and root-mean-square (RMS) height. Álvarez-Mozos et al. 20 used the optimal parameter in IEM and retrieved promising soil moisture based on RADARSAT-1 scenes.
Meanwhile, the optimal correlation length was successfully applied in many other studies using radar data. [21][22][23][24][25] Additionally, Michigan microwave canopy scattering (MIMICS) model, 26 water cloud model (WCM), 27 and crop model 28 over vegetation-covered areas. WCM is one of the most widely used models for vegetation-covered soil moisture retrieval. It consists of two terms: direct contribution from vegetation and attenuation component. In WCM, one vegetation parameter is used to separate the vegetation effect from the observed backscatter coefficient. The vegetation parameter could be parameterized using NDVI, enhanced vegetation index, canopy water content, LAI, or leaf water area index. 22,[29][30][31][32][33] El Hajj et al. 34 developed an inversion technique to invert WCM for soil moisture estimation based on a coupling scenario between SAR (TerraSAR-X and COSMO-SkyMed) and optical (SPOT 4/5 and Landsat 7/8) images over irrigated grassland areas in southeastern France. NDVI and LAI are treated as investigated vegetation descriptors derived from optical data. Bai et al. 35 coupled WCM and AIEM to retrieve soil moisture based on timeseries VV-polarized Sentinel-1A data in the northeastern Tibetan Plateau. LAI is designated as the vegetation descriptor, which is smoothed and interpolated from Terra moderate resolution imaging spectroradiometer LAI eight-day products. Bao et al. 36 used Sentinel-1 SAR and Landsat data to present a methodology for retrieving soil moisture under conditions of partial vegetation cover at two experiment sites in the UK and Spain. Vegetation water content was obtained using the Landsat spectral index to remove the effect of vegetation. However, for the herbaceous or sparse vegetation-covered areas, the strong influence from the underlying ground is a key mechanism in the total backscattering. Thus an empirical parameter is introduced to separate the scattering contribution of vegetation and underlying ground effectively from the total backscatter coefficient in the study.
This study presented an approach coupling WCM and AIEM to retrieve soil moisture using C-band SAR observations. The direct backscattering of underlying ground was applied to optimize the scattering mechanisms and improve the estimation accuracy of the WCM. The vegetation descriptor was represented by canopy water content, which was modeled and derived from the optical data. An effective fitting parameter was applied in the AIEM. Then soil moisture retrievals were performed and compared with the results acquired from other inversion algorithms. Finally, the soil moisture map was generated from full scene of RADARSAT-2 image.

Study Area
The study area is located in the middle of Guanzhong Plain, known as "Qinchuan in eight hundred miles." It is a homogeneous and relatively flat area of Yangling agricultural high-tech industries in Shaanxi province, China (Fig. 1). This district belongs to a semihumid and semiarid climate region, which has obvious characteristics of continental monsoon climate. The annual average temperature and rainfall are 12.9°C and 636.1 mm, respectively.

RADARSAT-2 data
RADARSAT-2 radar satellite is widely applied in ocean monitoring, disaster management, agriculture, hydrology, and other fields. It is a C-band multipolarization satellite. The center frequency is 5.405 GHz with the image resolution of about 8 m. The near-and far-incidence angles are 26.8 deg and 28.7 deg, respectively. This study selected two single look complexformat images on March 29 and April 22, 2014. The start time of raw data at each day is around UTC10:41:47.153442. The preprocessing of RADARSAT-2 images is mainly conducted in Sentinel Application Platform (SNAP, version 6.0.0) and ENVI 5.2 software, including multilooking, radiometric calibration, terrain correction, and speckle filtering. In order to improve the image interpretability, multilooking processing is conducted by reducing the inherent speckled appearance in a SAR image. The SNAP multilooking tool is used to make the mean GR square pixel reach to 30 m with the number of range and azimuth looks of 3 and 6, respectively. The main purpose of radiometric calibration is to transform the digital number (DN) value into backscatter coefficient with the unit of dB. Sigma0 look-up table (LUT) in the RADARSAT-2 product is applied to perform the radiometric calibration using the Radiometric Calibrate tool. Then the Lee filter is done to suppress the speckle with a window size of 5 pixels × 5 pixels. Finally, Range Doppler Terrain Correction tool is used to implement orthorectification and eliminate topography related phase changes combining with the SRTM 3Sec. The radar image is geometrically corrected using 20 GCPs (ground control points) with the correction error of one pixel.

Landsat-8 data
Landsat-8 land satellite was launched by the National Aeronautics and Space Administration at Vandenberg Air Force Base in California on February 11, 2013. It mainly provides data support for Earth resources, water, forest, environment, and urban planning. The operational land imager (OLI) and the thermal infrared sensor are the two main loads of Landsat-8 satellite. The OLI terrestrial imager consists of nine bands with a spatial resolution of 30 m. In this paper, two Landsat-8 optical images were selected on March 24 and April 9, 2014. The start time of data at each day is around UTC06:02:40. All data were downloaded from the USGS website. The preprocessing of Landsat-8 data mainly includes radiometric calibration, atmospheric correction, and geometric correction in ENVI5.2 software. Radiometric calibration is conducted to convert the DN values to the equivalent apparent radiance data. Then FLAASH Atmospheric Correction toolbox is used to perform atmospheric correction on the processed radiance data of Landsat-8 data. In order to ensure the geometric accuracy of the optical images, the geometric correction is also applied using the GCPs with the correction error of about half a pixel.

In Situ Measurements
This study carries out satellite-ground synchronous experiments during the late jointing stage of winter wheat on March 29 and the flowering stage on April 22, 2014. There are three fields including Yangling, Fufeng, and Wugong in the study area ( Fig. 1) and Table 1 shows the field measurements of wheat coincident with RADARSAT-2 images mainly including LAI, surface soil moisture, and roughness parameters. LAI is measured using LAI-2000 plant canopy analyzer (LI-COR, Inc., Lincoln, Nebraska). 37,38 The measured time is mainly selected at 7:00 to 9:30 a.m. and 16:00 to 18:00 p.m. to avoid uncertainties and errors caused from direct sunlight. Ten effective measurements are performed under diffuse illumination conditions. Time domain reflectometry is selected to collect soil moisture content with the probe length of 7.6 cm. Clean probe is vertically inserted into soil and the stable soil moisture values are recorded when the probe has been completely covered by soil. A total of five measurements are obtained for each sampling points. Then the sampling points are randomly divided into two parts including the correction and prediction dataset according to the proportion of 2:1. The roughness parameters, including RMS height and correlation length, are mainly measured using the profile plate method. The profile plate used in this experiment consists of a 2-m-long board with 200 probes and the interval between probes is 0.5 cm. The roughness plate is placed vertically and each sampling point is photographed in the south-north and east-west directions. The measured photos are digitized to depict the surface fluctuation and the contour curve of soil rough surface is obtained. Finally, the RMS height and correlation length are calculated according to the relative length of the roughness plate in the photograph and the height relationship between the marked points.

Vegetation scattering model
WCM was first proposed by Attema and Ulaby 27 to retrieve the vegetation parameters based on SAR observations. It is a semiempirical model and simplified the scattering mechanisms of vegetation cover by ignoring the multiple scattering effects between vegetation and surface. 27 The total backscatter coefficient is expressed as an incoherent sum of contribution of vegetation and scattering contribution of underlying soil, which is attenuated by the vegetation layer. It can be represented by the following form: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 1 ; 1 1 6 ; 3 0 4 where σ 0 veg indicates the volume scattering from the vegetation. σ 0 soil represents the scattering contribution from ground surface. γ 2 is a two-way attenuation coefficient through the canopy. They are calculated as follows: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 2 ; 1 1 6 ; 2 3 1 E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 3 ; 1 1 6 ; 1 8 2 where A relates to the backscatter coefficient at full cover, B acts as a canopy attenuation coefficient, m veg means canopy water content, and θ is the incidence angle. In this study, the direct scattering from the underlying ground surface is added into the WCM. Equation (1) can be rewritten as follows: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 4 ; 1 1 6 ; 1 2 1 C indicates the proportion of vegetation canopy and bare soil in a pixel, which can be calculated from Landsat-8 optical data. It can be expressed as follows: Note: Late jointing stage is that point in which the internodal tissue in the wheat leaf begins to elongate and form a stem.
where NDVI is normalized differential vegetation index. NDVI max and NDVI min are the maximum and minimum of NDVI, respectively.

Leaf area index estimation
LAI is an important parameter of vegetation monitoring and climate change. Nowadays, visible and near-infrared remote sensing methods are usually used for LAI retrieval in agricultural areas. [39][40][41] The most commonly used model is PROSAILH optical model, which is composed of PROSPECT model and the SAILH model. [42][43][44] PROSPECT model is developed to simulate the directional-hemispherical reflectance and transmittance with the solar spectrum from 400 to 2500 nm. [45][46][47] Scattering by arbitrary inclined leave (SAIL) model is a canopy reflectance model based on the 1-D model proposed by Suits. [48][49][50] It is introduced to simulate the bidirectional reflectance of vegetation canopy. In order to consider the hotspot effect in the vegetation canopy, Nilson and Kuusk 51 introduced a hot-spot parameter and developed a modified model called SAILH. The parameter is expressed as the ratio between leaf size and canopy height. According to the field measurements, the ranges and step sizes of the input variables are determined in Table 2. PROSAILH model simulates the canopy reflectance using different parameter combinations to establish the relationship between LAI and canopy reflectance. Then LAI value can be retrieved according to the relationship and ground measurements.

Bare soil scattering model and calibration
AIEM is a bare soil scattering model based on the theory of electromagnetic wave scattering. [11][12][13][14] In AIEM, soil surface roughness is an important parameter and plays a vital role in simulating scattering characteristics of actual natural surface. RMS height, correlation length, and autocorrelation function (ACF) are used to describe the surface roughness in the AIEM. Many studies have showed that the exponential ACF have a better performance than other functions in characterizing roughness. 5,9,52 Comparing with RMS height, the correlation length is more difficult to be accurately measured and characterized. Thus a semiempirical calibration method was proposed to reduce the effect of surface roughness on radar backscatter. The method introduces a modeled correlation parameter (l opt ) proposed by Baghdadi et al., [16][17][18] which can ensure a good fit between the backscatter coefficient simulated by the AIEM and that acquired from the SAR image data. The optimal correlation length l opt is dependent on RMS height, polarization mode, and incidence angle. It is expressed as a power form for both exponential and fractal correlation functions and a linear relationship for Gaussian correlation function. These significant improvements and calibrations about the reproduction of radar backscatter coefficient were reported at L-, C-, and X-bands. 19,23,53,54 In this study, the optimal correlation length is described as a power relationship with the measured RMS height for the exponential ACF, which is expressed as follows: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 6 ; 1 1 6 ; 6 0 4 where k and t can be calculated by statistical regression analysis and s is the RMS height of ground surface.

Soil moisture retrieval
In this study, the Dobson dielectric mixing model 55 is used to calculate the value of soil moisture based on the soil dielectric constant calculated from the AIEM. In order to retrieve the value of soil dielectric constant, the coefficients in Eq. (4) need to be determined. The main steps for retrieving soil moisture are described as follows: Step 1: Improving the performance of the AIEM. The forward AIEM is applied to calculate the optimal correlation length based on the field measurements and SAR observations. Soil texture parameters and measured soil moisture contents are used to obtain the soil dielectric constant.
Step 2: Calibration of the vegetation scattering model. PROSAILH model is demonstrated to calculate LAI and canopy water content. Then canopy water content, RMS height, optimal correlation length, soil dielectric constant, and incidence angle are used as inputs of the vegetation scattering model in Eq. (4). Two-thirds of sampling points are randomly selected as training data to parameterize and calibrate the vegetation scattering model. Levenberg-Marquardt nonlinear least-squares method is applied to calculate the coefficients A and B in the 1stOpt software.
Step 3: Establishing the LUT. The vegetation scattering model is simulated to establish the LUT when the coefficients are obtained. The canopy water content and soil moisture content are regarded as free variables. The canopy water content ranges from 0.1 to 1.4 kg · m −2 with an interval of 0.01 kg · m −2 , and the soil moisture content increases from 0.08 to 0.52 m 3 · m −3 with an interval of 0.01 m 3 · m −3 . The LUT indicates a one-to-one correspondence relationship among canopy water content, soil moisture, and backscatter coefficient.
Step 4: Retrieving soil moisture content. One-third of sampling points are used as validation data to retrieve soil moisture and validate the vegetation scattering model. The sum of squared deviations and the estimated error of canopy water content are conducted. Then soil moisture will be retrieved when the two constraints reach to their minimum values simultaneously. If they could not reach their minima simultaneously, soil moisture corresponding to the minimum sum of squared deviations will be considered as the final value.

Effect of Vegetation Canopy on SAR Observations
In the proposed model, canopy water content is the only unknown variable, which needs to be solved. Assuming that there are M leaves per unit volume in the study area and the average weight water content of each leaf is l veg . Then the canopy water content can be expressed as The leaf is considered to be elliptical and its major and minor semiaxes are a and b, respectively. The canopy height is expressed as h. Then M can be rewritten as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 8 ; 1 1 6 ; 6 8 8 By substituting Eq. (7) into Eq. (6), the expression of canopy water content m veg is expressed as follows: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 9 ; 1 1 6 ; 6 2 2 where l veg , a, b, and h can be obtained from field measurements. LAI can be retrieved by PROSAILH optical model. In this study, 28 sampling points were randomly selected to establish the fitting relationship between canopy water content and LAI. There is a good linear relationship between canopy water content and LAI with R 2 of 85.3%. According to the empirical relationship between canopy water content and LAI, the canopy water content value can be obtained to build a semiempirical coupling model including radar model as the main model and optical model as the auxiliary model.
In this paper, the scattering model is applied to eliminate the scattering contribution of vegetation from the total backscatter coefficient. The backscatter coefficients at each sampling point before and after applying the scattering model are illustrated in Fig. 2. The blue and red lines represent the backscatter coefficient before and after vegetation removal in HH and VV polarizations, respectively.
It can be seen that backscatter coefficients at each sampling point are attenuated to different values after applying the scattering model. On March 29, vegetation effects on radar signal at points 10 and 17 are relatively bigger than that at other sampling points. The relative scattering contribution of bare soil is becoming smaller on April 22 than that on March 29, which might be due to the wheat growth from sparse to dense cover. Farmland irrigation could be also one of the factors that decrease the influence of underlying ground on the backscatter coefficient.

Evaluation of Vegetation Scattering Model on Calibration Points
In this study, datasets including totally 80 and 70 sampling points on March 29 and April 22 are randomly divided into calibration and validation data elements. Winter wheat during two growth stages is the main applied cultivation at this site. The terrain in this area is relatively flat and it has little influence on the soil moisture retrieval. Meanwhile, the scattering contribution of wheat is an important mechanism responsible for radar signal under wheat-covered area. Vegetation parameters or vegetation descriptors calculated from optical images are suitable for eliminating the influence of vegetation scattering. For the different growth stages, wheat has a different influence on radar signal. Thus Two RADARSAT-2 images and corresponding field measurements are used to simulate backscatter coefficient and explore the effect on radar signal during the late jointing and flowering growth stages of wheat. In order to calibrate the vegetation scattering model, the coefficients A and B are calculated based on field measurements and SAR observations in Table 3. The vegetation scattering model can be evaluated based on the calibration points when these coefficients are obtained. Figure 3 demonstrated the estimated results calculated from the validation datasets in HH and VV polarizations during two growth stages.
As shown in Fig. 3, the fitting relationship between the simulated backscatter coefficient and the radar observations approximates linear irrespective of any polarization and growth stage. Similarly, this study uses the original WCM and PROSAILH model to construct a combined semiempirical model (WCM + PROSAILH model). To compare with the vegetation scattering model, the combined model is conducted to acquire the empirical coefficients (Table 3) and simulated backscatter coefficient. Figure 4 shows the simulated results based on calibration data in HH and VV polarizations. It can be concluded that both the two models can effectively simulate the backscatter coefficient contributed by vegetation canopy and bare soil in the vegetationcovered area. However, the vegetation scattering model performs better than the combined model with the root-mean-square error (RMSE) of 0.967 dB in HH polarization and 0.985 dB in VV

Soil Moisture Estimations on Validation Points
In this study, soil moisture is estimated based on validation data using the vegetation scattering model, temperature vegetation dryness index (TVDI), and simplified MIMICS, respectively. TVDI is a simplified feature space of land surface temperature and NDVI. 56 This feature space divides the surface coverage into bare soil, partial vegetation-covered and dense vegetationcovered areas. It consists of four edges, which are related to the wet and dry conditions of soil surface. The index is proposed to demonstrate soil moisture detection and has been widely used in the drought monitoring. [57][58][59] MIMICS is proposed to depict the backscattering behavior of a forest based on radiative transfer theory. 26 Geometrically, this model divides the tree canopy into three components: crown layer, trunk, and underlying ground. According to the scattering characteristics of herbaceous vegetation, scattering component associated with trunk was deleted to characterize total scattering mechanisms. 28 The three models are performed to obtain the values of soil moisture based on RADARSAT-2 observations and field measurements on March 29, 2014. Figure 5 demonstrates the inversion results calculated using the validation points. As can be observed from Fig. 5, the vegetation scattering model produced the best performance in soil moisture retrieval (R 2 of 0.805 and RMSE of 0.039 m 3 · m −3 ). Errors associated with TVDI (R 2 of 0.604 and RMSE of 0.060 m 3 · m −3 ) were higher than that of the scattering model, the combined model, and simplified MIMICS (R 2 of 0.718 and RMSE of 0.050 m 3 · m −3 ) at the late jointing stage. It is possibly because the optical model TVDI was less sensitive to surface soil moisture than microwave models. Meanwhile, the temporal gap between SAR and optical data could affect the results of soil moisture retrieval. According to the results calculated from simplified MIMICS, the accuracy of soil moisture estimation was better than that of TVDI and worse than that of the scattering model. The reason is that too many unknown model parameters were the main factor that affected the applicability and estimated accuracy of simplified MIMICS. Moreover, the direct scattering information of bare soil was ignored in the simplified MIMICS.

Soil Moisture Maps
The vegetation scattering model was used to retrieve soil moisture using calibration and validation points in the wheat fields based on RADARSAT-2 images. The unseen data on the images needed to be applied to demonstrate the application of the scattering model. A typical subset of RADARSAT-2 image acquired on March 29, 2014 was selected to validate the scattering model and produce the spatial sequence of soil moisture map. Figure 6 showed the soil moisture map based on the full scenes of RADARSAT-2 data. Parts of the image were masked according to wheat-planted area. The soil moisture distribution of the whole study area could depict the real situation of the actual features. The dark red areas were densely planted farmlands, and the green parts were sparse crop areas. The soil moisture content of the farmland area was about 0.3 to 0.5 m 3 · m −3 in the densely planted area. During the late jointing stage, the demand of winter wheat for water was relatively large and frequent field irrigation led to high values of soil moisture in farmland areas.

Conclusions
In this study, C-band RADARSAT-2 observations and field measurements were used to establish a vegetation scattering model for soil moisture retrieval in agricultural regions. The scattering model was formulated by the WCM, the PROSAILH optical model, and the calibrated AIEM. Direct scattering contribution from underlying ground was added into the WCM to characterize the total backscattering mechanism and separate the scattering influence of vegetation from that of bare soil. The latter was described using the calibrated AIEM and parameterized with an effective roughness parameter. Canopy water content was calculated using a fitting relationship based on the values of LAI derived from the PROSAILH model. Very promising results were found for evaluation of the vegetation scattering model and soil moisture retrieval. Results demonstrated that coupling the SAR and optical models was validated to be a significant method for soil moisture retrieval in agricultural regions. Compared with the combined model formulated by the original WCM and PROSAILH model, the scattering model performed better with the R of 81.9%, RMSE of 0.967 dB in HH polarization, and the R of 84.9%, RMSE of 0.985 m 3 · m −3 in VV polarization. For soil moisture retrieval, the scattering model also had better simulation accuracy than TVDI and simplified MIMICS with the R 2 of 0.805 and the RMSE of 0.039 m 3 · m −3 . The results indicate that the scattering model can provide accurate soil moisture by integrating the advantages of optical and radar methods in agricultural regions.
The effects of the vegetation and surface roughness parameters on the radar signal lead to errors and uncertainties in soil moisture retrieval. Future work for acquiring more extensive data to validate this methodology is necessary. More satellites carrying payload of polarimetric SAR need to be used to extend the application of the proposed method and make a significant contribution to regional soil moisture monitoring.
Liangliang Tao received his BS degree in map surveying from China University of Mining and Technology in 2011 and his PhD in cartography and GIS from Beijing Normal University in 2017. He is a lecturer at the School of Geographical Sciences, Nanjing University of Information Science and Technology, China. His current research interest is soil moisture retrieval using remote sensing technology. Other interests include LAI and cab monitoring using optical and SAR data.
Guojie Wang received his PhD in meteorology direction from the Department of Earth Sciences, Vrije University Amsterdam, Amsterdam, The Netherlands, in 2010. Currently, he is a professor in the School of Geographical Sciences, Nanjing University of Information Science and Technology, Nanjing, China. His research interests include global and regional scale water circulation of satellite remote sensing and land surface atmosphere interaction.
Xi Chen received his BS degree in mathematics from Wuhan University of Science and Technology in 2011, and received his PhD in cartography and geographic information system in the Faculty of Geographical Science, Beijing Normal University. Currently, he is a postdoc researcher in School of Earth and Space Sciences, Peking University. His research focuses on fusing multisource geographical information for decision support, including the fusion of remote sensing images, marine acoustic data and statistical information.
Jing Li received his BS degree in geography and his MS degree in graphics and remote sensing from Peking University in 1982 and 1985, respectively. From 2003 to the present, he has been a professor with Beijing Normal University and a chief engineer of the National Disaster Reduction Committee, Ministry of Civil Affairs of China. His research interests include remote sensing applications for natural disaster reduction, and monitoring of oceanic and coastal ecology by remote sensing.
Qingkong Cai received his BS degree in geomatics engineering from Henan University of Urban Construction in 2010 and his PhD degree in photogrammetry and remote sensing from China University of Mining and Technology in 2015. He is a lecturer at Henan Institute of Engineering, China. His current research interests include application of hyperspectral remote sensing in agriculture and environmental monitoring.