Filling method for soil moisture based on BP neural network

Abstract. Soil moisture data obtained by inversion of Fengyun 3B remote sensing data, are widely used in drought monitoring and global climate change research, however, some regional data are missing in this data set, which reduces the application effect. Based on backpropagation neural network (BPNN), we established a filling method and filled the missing area with moderate resolution imaging spectroradiometer (MODIS) inversion products, including land surface temperature, normalized difference vegetation index, and albedo. We named it the multilayer BPNN filling algorithm. The algorithm consists of two neural network layers. The first network layer is used for the spatial scaling of MODIS inversion products, and the second network layer uses the scaling products to further generate soil moisture values. We compared the proposed method to a discrete cosine transform and partial least square (DCT-PLS) and a kriging using the same data set. The experiments demonstrate that our method could obtain good filling results in both homogeneous areas and areas with high data variations, whereas DCT-PLS and kriging could only get good filling results in homogeneous areas.


Introduction
Spatiotemporal continuous soil moisture data are an important index for drought monitoring, global climate change research, and ecological environmental change research. It also has important theoretical and practical significance for regional water resources, agriculture, and animal husbandry management. 1,2 Remote sensing technology has advantages in terms of full space coverage, continuous time coverage, and low cost. Therefore, it has become an important tool for acquiring soil moisture data. 3 Fengyun 3B (FY-3B), launched on November 5, 2010, is one of China's second-generation polar-orbiting meteorological satellites; 4 the spatial resolution of band data is 250 m × 250 m, which can be used to obtain soil moisture data. 5 Soil moisture data obtained by inversion of FY-3B remote sensing data are widely used in drought monitoring and global climate change research. However, the presence of clouds, aerosols, and haze leads to missing value in some regional areas, and it further leads to missing soil moisture data. The gap reduces the application effect. Figure 1 shows these types of gaps; the red regions represent gaps, and the black line represents the bound of Shandong Province, China, which is our study area.
Gap filling methods use acquired effective observation values to fill in missing values. Common filling methods include spatial interpolation, filtering, and curve fitting. Spatial interpolation fills gaps based on the spatial continuity of surface parameters. Müller 6 filled the gaps in chlorophyll images using kriging. From the perspective of spatial data, Casey et al. 7 proposed a filling method using the values of neighboring pixels. This type of spatial interpolation algorithm can achieve the desired filling effect when the area of the gaps is relatively small and homogeneous. Otherwise, these algorithms often fail. Filtering algorithms decompose original signals into different frequencies and use the low-frequency information to perform gap filling. Kandasamy et al. 8 used a method based on caterpillar singular spectrum analysis to fill the gaps in data from moderate resolution imaging spectroradiometer (MODIS) eight-day leaf area index products. Sirjacobs et al. 9 and He et al. 10 used a data interpolating empirical orthogonal function to fill the gaps in sea surface temperature data and sea surface chlorophyll concentration data. Zhang et al. 11 reconstructed missing data from the advanced microwave scanning radiometer 2 based on a discrete cosine transformation combined with a penalized partial least square (DCT-PLS) method. Chen et al. 12 filled normalized difference vegetation index (NDVI) time series gaps by means of Savitzky-Golay filtering combined with the upper envelope of NDVI data. By regarding the high-frequency portion of a signal as noise, this type of filtering algorithm typically achieves as better smoothing effect, but the details of signals are lost. Curve fitting techniques adjust parameters by minimizing a cost function that is typically the sum of quadratic differences between observations and simulations. Because this adjustment is performed over a limited temporal window, only a limited amount of information is used when filling gaps. Beck et al. 13 reconstructed MODIS NDVI time series data using a double-logistic function and achieved acceptable filling results. Jönsson et al. 14 proposed a filling algorithm based on an asymmetric Gauss function by fitting peaks and valleys to the gaps in NDVI data.
Artificial neural networks (ANN) comprising an input layer, several hidden layers, and an output layer, are based on the biological neural network of the human brain. The backpropagation neural network (BPNN) is one type of ANN model; it is self-organizing, self-teaching, and nonlinear, and widely used in the analysis of the nonlinear system considering various influencing factors. The goal of BPNN is to minimize the root-mean-square error between the predicted outputs and the target outputs by adopting the backpropagation algorithm. 15,16 Theoretically, three-layer neural network can approximate a nonlinear function with any precision, as long as the number of neurons is enough.
The BPNN is widely used for prediction tasks for its excellent nonlinear mapping ability. [17][18][19] Prediction using a BPNN does not require determination of the mathematical relationships between inputs and outputs. During training, a BPNN can adjust its own parameters to obtain the desired outputs. Soil moisture has strong correlations with time, surface temperature, land cover, albedo, and digital elevation model (DEM) data. 20,21 Therefore, we established a filling algorithm based on BPNN and filled the missing area of soil moisture data obtained by inversion of FY-3B with MODIS inversion products, including land surface temperature (LST), NDVI, and albedo; DEM data were also used as input. We named it the multilayer BPNN filling algorithm (MBPNN-filling). The reason why MODIS inversion products were chosen as the data source is that MODIS data product has a short period and good credibility. [22][23][24] The following summarizes the proposed method for filling the gap of soil moisture and its evaluation.
(1) We built a network structure using BPNN; the network consists of two neural networks layers. The first network layer is used for the spatial scaling of MODIS inversion products, and the second network layer uses the scaling products to further generate soil moisture values. (2) In the model training stage, we use stochastic gradient descent (SGD) 25,26 to train the network layer by layer. (3) Finally, the filling experiment was carried out by using the trained model, and the filling results were compared with the original data, and the comparison results were analyzed.

Methods
We describe the network structure in detail in Sec.  Figure 2 shows the network architecture of our approach. The function of LST-NN is to take LST, NDVI, and albedo, and DEM as input to generate FY-LST, where LST, NDVI, and albedo are inversion products of MODIS. Similarly, NDVI-NN is used to generate FY-NDVI. Albedo-NN is used to generate FY-albedo. The spatial resolution of FY-LST, FY-NDVI, and FY-albedo is the same as the spatial resolution of FY3B's corresponding inversion products. We added FY prefix before the generated data, indicating the same function as FY3B's inversion products. The function of SM-NN is to use FY-LST, FY-NDVI, FY-albedo, and DEM to generate soil moisture products.

Network Architecture
The main reason that we choose the given network structure is to avoid the adverse effect of spatial scale effect on the result. Through the layered design, the first layer is responsible for the transformation scale. In this layer, the main inputs LST, NDVI, and albedo are all from the same MODIS, so the effect of space effect will be less. The second layer is responsible for generating soil moisture data. The spatial resolution of input and output is consistent. This design is beneficial to improve the reliability of the model.

LST-NN
In this section, we describe the structure and the input of LST-NN. The structure of NDVI-NN and albedo-NN is similar to that of LST-NN except that the output is changed to FY-NDVI and FY-albedo. Figure 3 shows the structure of LST-NN.
LST-NN comprises an input layer that has 28 neurons, a hidden layer that has 10 neurons, and an output layer that has only one neuron. The spatial resolution of MODIS inversion product used in this study is 1 km. Among FY3B inversion product, the spatial resolution of NDVI and albedo is 250 m, the spatial resolution of NDVI and LST is 1 km. The spatial resolution of NDVI is DEM is 90 m. According to the above data, the area covered by an FY3B pixel may be associated with the area covered by four MODIS pixels at most and with 16 DEM pixels at most. Therefore, among the 28 neurons in the input layer, four neurons are used to represent MODIS-LST, four neurons are used to represent MODIS-NDVI, and four neurons are used to represent MODIS-albedo, and 16 neurons used to represent DEM. LST-NN output is FY-LST.
The hyperbolic tangent and linear function were chosen as the activation function for the hidden and output layers, respectively. 27 The datasets were normalized to give values between −0.95 and 0.95 prior to training. The initial connection weights were set to be random between −0.3 and 0.3. The learning rate and the momentum factor of the BPNN model were set to be 0.05 and 0.9, respectively. These were all empirical values. 28 To overcome the overfitting problem, the datasets were divided into three sets: the training set, validation set, and testing set, accounting for 60%, 20%, and 20%, respectively. The training set was used to adjust the weights of the neural network, the validation set was used to minimize overfitting, and the testing set was used only for testing the final solution in order to confirm the actual predictive power of the network.   Figure 4 shows the structure of SM-NN. SM-NN comprises an input layer that has 19 neurons, a hidden layer that has 10 neurons, and an output layer that has only one neuron. Among the 19 neurons in the input layer, one neuron is used to represent FY-LST, one neuron is used to represent FY-NDVI, one neuron is used to represent FY-albedo, and 16 neurons used to represent DEM. SM-NN output is soil moisture.

Network Training
The principle of selecting samples is: (1) MODIS products are consistent with the product time of FY3B and (2) MODIS products and FY3B can be covered in the selected area. The steps of generating a training sample are as follows: (1) extracting a subregion covered by a FY3B pixel from the selected sample region; (2) obtaining LST, NDVI, and albedo values of the FY3B inversion product of the pixel; (3) using the boundary of the subregion and the MODIS data performs topology operations to obtain LST, NDVI, and albedo values of associated pixels; and (4) performs topology operations using DEM data using the boundaries of the subregions to obtain DEM values of associated pixels.
In our training, the SGD method with momentum was used for parameter updates, the following expression illustrates the SGD method with momentum: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 1 ; 1 1 6 ; 2 5 4 where WðnÞ denotes the old parameters, Wðn þ 1Þ denotes the new parameters, and ΔWðn þ 1Þ is the increment for the current iteration. The iteration increment, which is a combination of old parameters, gradient, and historical increment, is calculated as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 2 ; 1 1 6 ; 1 8 6 where JðWÞ is the loss function, ϑ is the learning rate for step length control, d w denotes the weight decay, and m denotes the momentum.

Fill Gap Using the Trained Network
After successful training, the model can be used to generate data. According to our design, the output of the model is written to the corresponding pixel by location, replacing the original null value, and completing the data filling work.

Experiment Design
We designed a set of test experiments and comparative experiments to verify the feasibility of the proposed MBPNN-filling method. The proposed approach was implemented using MATLAB on a windows operating system.

Study Area and Data
The study area is Shandong Province, China. The annual precipitation is about 550 to 950 mm, with 75% concentrated from May (the beginning of the unfrozen season) to October (the beginning of the frozen season) due to the impact of the South Asian summer monsoon. The surface soil moisture has significant seasonal variations, lower in winter but higher in summer, with strong spatial heterogeneity. The main vegetation is crop, grass, and forest. In the pretreatment stage, we used the algorithms proposed by Zeng et al., 24 Liu et al., 16 and Jia et al. 22 to fill the gap of MODIS data.

DEM
DEM data from the Shuttle Radar Topography Mission (SRTM), an international project spearheaded by the National Geospatial-Intelligence Agency, the U.S. National Aeronautics and Space Administration, the Italian Space Agency, and the German Aerospace Center, with 90 m × 90 m resolution were downloaded from the website available in Ref. 31.

Soil Moisture Inversion from FY-3B
Zhang et al. 32 presented a method of soil moisture inversion using BPNN. First it created a BPNN and optimized node weights of BP network by genetic algorithm. Then, TM data (TM3, TM4, and TM6) and different ASAR data of polarization and polarization ratio (VV, VH, and VH/VV) were used as neural network inputs; the outputs were ground soil moisture data, part of which were used to train network and invert to get soil moisture distribution. Finally, it verified the effectiveness of genetic neural network optimization algorithm and the inversion results based on integration of active and passive remote sensing. The results showed that the new optimization algorithm was efficient and effective, and the inversion results based on integration of TM and ASAR data were obviously better than inversion individuals, that embodied advantages and potential of soil moisture inversion based on integration of active and passive remote sensing.
In this study, we used FY-3B multispectral data instead of TM data and used FY-3B MWSI data to represent ASAR for soil moisture inversion.

Samples and Test Data
We selected a batch of FY3B remote sensing image data of 2017, which are similar in time and can cover the whole study area. The soil moisture data set of the whole study area was obtained using the inversion method described in Sec. 3.2, and the data set has not missing value, as shown in Fig. 5. We use this data set to make samples and test data.
We developed a data extraction tool based on GDAL that can extract pixels that intersect or contain with its topology based on the specified coordinate location or range. We select 6000 pixels from soil moisture data set shown in Fig. 5, and extract soil moisture value of each pixel using the tool. Then, we calculate the accurate coverage area of each pixel, on this basis, extract data from MODIS product data sets, FY3B product data set, DEM data set, and form 6000 samples.
The method of making test data set is based on the original soil moisture data set, and the soil moisture in several areas is artificially changed to 0 as a missing area. The advantage of this stage is that after the completion of the filling test, we can compare the obtained filling value with the original value to verify the precision of the fill value.
We made two sets of test data sets, each containing three sets of test data sets. Figure 6 is the first set of test data sets, and Fig. 7 is the second set of data sets. In the first test set, the area of   each test area is very small, whereas in the second test data set, the area of each test area is relatively large. In Fig. 6, (a) Exp-1 is the missing region distribution formed by removing the value of 150 subregions artificially. (b) Exp-2 is the missing region distribution formed by removing the value of 250 subregions artificially. (c) Exp-3 is the missing region distribution formed by removing the value of 600 subregions artificially. Each subregion contains 3 to 6 pixels, the area of each pixel is 0.25 km 2 .

Comparison Model
We chose the DCT-PLS and the kriging model as the comparison models. A comparative experiment was conducted using methods established in the published literature.

DCT-PLS
DCT-PLS is a method of performing simultaneous verification of raw data, replacement of false and missing vectors, and some smooth and powerful postprocessing techniques based on a penalized least squares approach (PLS); it combines the use of the discrete cosine transform (DCT) and the generalized cross-validation, thus DCT-PLS algorithm can process the data quickly and smoothly. The DCT-PLS uses a robust procedure that is very little affected by the outliers. This method consists of constructing weights with a specified weighting function using the current residuals and updating them, from iteration to iteration, until the residuals remain unchanged. It is worth noting that the DCT-PLS does not contain any "outlier detector" based on a specific predicate such as used in the normalized median test. The algorithm rather assigns a low weight to outlying values in order to reduce their influence.

Kriging
Kriging interpolation algorithm is also known as the best interpolation of spatial autocovariance. It is a method of interpolation that predicts unknown values from data at known locations. Kriging uses variogram to express the spatial variation. This method is a mathematical model obtained through fitting semivariogram, then the distance of any given semivariance function is estimated, and the spatial weight is calculated. It estimates the unknown value using a weighed linear combination of the available samples.
In Kriging method, the weight is calculated from a variogram, which considers the distance as an independent variable. The method first considers the variability of spatial attribute in spatial location. Second, the influenced distance range of a point is determined and its value is to be estimated. At last, the attribute value of the point to be estimated is computed using the points within this range. Not only the relative positions between observed points and estimated points are considered but also the relative positions of observed points are taken into account in the Kriging method.

Test Results and Discussion
This paper provides an approach to fill the missing soil moisture data with auxiliary information using MBPNN-filling. Compared with other typical methods of the same kind, the proposed method obtains the best effect. We first give the result of test experiment in Sec. 4.1 and analyze the reasons in Sec. 4.2. Figure 8 shows the results of the first test experiment and Fig. 9 shows the results of the second test experiment. In Figs. 8 and 9, the first row is the original soil moisture inversion product using  our method, which descried in Sec. 3.2. Because the FY3B remote sensing data we selected can cover the whole region, the soil moisture inversion product is not missing, so the test results can be verified better. The second row is test data, the missing area is shown by manually removing the value of some pixels. The third row is the filling results of DCT-PLS. The fourth line is the filling results of kriging. The fifth line is the filling results of our approach.

Analysis
In order to compare the filling results, we calculated the mean square error (MSE) between the filling results and the original values after each test, and taking MSE as the evaluation index, the smaller the MSE, and the better the filling effect.
These MSE values of first test experiment are listed in Table 1. From the table, we can see that the three algorithms all achieved optimal filling results (MSE less than or equal to 8.08727E-7). Among the three algorithms, MBPNN-filling achieved the best filling results and Kriging achieved the worst filling results. Table 2 lists the MSE values of second test experiment. For the same missing proportion, MBP achieved the best filling results among the three algorithms. The MSE of kriging is too large to accurately reflect the distributions of soil moisture. With an increasing missing proportion, the performance of all three algorithms decreases significantly.
It can be seen from Exp.-1, Exp.-2, and Exp.-3, although soil moisture is a variable with strong spatial heterogeneity, when the area of the missing area is small enough, MBPNN-filling, DCT-PLS, and Kriging can still get good filling results. The reason is that when the area of the missing area is small, the algorithm can get enough information from the surrounding area to calculation. The influence of spatial heterogeneity can be ignored. The effect of filling mainly depends on the area of a single missing area, regardless of the number of missing areas.
It can be seen from Exp.

Conclusion
This paper proposes a method to fill the missing area of FY-3B soil moisture inversion product by using MODIS inversion product and DEM as auxiliary information. Compared with traditional DCT-PLS, Kriging, and other filling algorithms, the proposed method introduces external auxiliary information and adopts the strategy of spatial scale conversion of auxiliary variables and filling separately. It effectively avoids the influence of spatial heterogeneity on the filling algorithm and has a good filling effect. This article also provides a tool to generate pixel-level training samples, which can reduce manual workload and improve efficiency. The main limitation of our method is that when the time of auxiliary information from MODIS is too different from the time of FY3B, the effect of filling will be greatly reduced. In the next work, we will introduce time series data to further enhance its applicability.
Yang Xiaoxia is currently a lecturer working at the College of Information Science and Engineering of Shandong Agricultural University. Her main research areas are remote sensing and geographic information system in land use monitoring and evaluation, taken part in a number of agricultural remote sensing projects by Ministry of Science and Technology and Shandong Province. Currently, she is mainly engaged in the research of remote sensing technology in agriculture and environmental.
Zhang Chengming is currently a professor working at the College of Information Science and Engineering of Shandong Agricultural University. His main research areas are remote sensing and geographic information system in land use monitoring and evaluation, presided over a number of agricultural remote sensing projects by Ministry of Science and Technology and Shandong Province. Currently, he is mainly engaged in the research of Remote Sensing Technology in agriculture and environmental.
Cui Zhaoyun is a senior engineer at Tai'an Meteorological Bureau in Shandong Province. Her main research fields are agrometeorological technology and satellite remote sensing applications, and she presided over a number of scientific and technological issues of Shandong Meteorological Bureau. At present, she mainly engaged in agricultural meteorological monitoring methods and data application research.
Yu Fan is currently an associate professor at the Chinese Academy of Surveying and Mapping. His main research areas are remote sensing image processing, he has published several academic papers in this area. He also takes part in many research projects founded by Ministry of Science and Technology of China. Currently, he is mainly engaged in the research of remote sensing technology in urban management and city information acquisition.
Wang Jing is a senior engineer at the Ningxia Meteorological Science Institute. Her main research area is remote sensing in agriculture. Currently, she is mainly engaged in the research of fruit tree classification and planting area extraction by remote sensing technology.
Han Yingjuan is a senior engineer at the Ningxia Institute of Meteorology. Her main research area is the application of remote sensing of ecology and agriculture in arid and semiarid areas. Currently, she is carrying out remote sensing extraction of crop planting information and monitoring of ecological environment elements by remote sensing.