1. Introduction
Image segmentation is the precondition and foundation for information extraction and target identification in high-resolution remote sensing images.1 In a high-resolution image, spectral confusion is more serious, class differentiation is substantially reduced, and the accuracy of segmentation methods based on spectral statistics declines.2,3 Object-oriented image segmentation methods can overcome the influence of “salt and pepper” noise and improve accuracy by using object structure and spectral signatures. However, because this approach must adjust the segmentation scale to obtain an acceptable result, and a suitable scale is difficult to determine, development of object-oriented segmentation methods has been slow.4,5

With the development of machine learning technology, researchers began to apply algorithms such as neural networks (NNs)6,7 and support vector machines (SVMs)8,9 to the segmentation of high-resolution images.3,10,11 Studies have revealed that image segmentation based on machine learning algorithms can obtain better results than traditional statistical and object-oriented methods.12,13 Both SVMs and NNs are shallow learning algorithms14–16 owing to their limited network structures. Shallow learning algorithms have difficulty expressing complex functions effectively, so when sample size and diversity increase, shallow learning models cannot adapt to the increasing complexity.17,18

Advancements in deep learning enable us to address these problems with deep neural networks (DNNs).19–22 As one of the most important branches of deep learning, the convolutional neural network (CNN) is commonly applied to image data owing to its superior feature learning ability.23–25 The CNN is a deep learning network composed of multiple nonlinear mapping layers with strong learning abilities, and it obtains excellent results in image segmentation.26,27 Traditional deep learning methods include deep convolutional neural networks (DCNN)18,28 and deep deconvolutional neural networks (DeCNN).29 Building on these, many methods of remote sensing image segmentation based on CNNs have been developed.30,31 Many large CNNs, whose performance scales with the size of the training data, model complexity, and processing power, have achieved meaningful improvements in the object segmentation of images.32–39

A fully convolutional network (FCN) is a deep learning network for image segmentation originally proposed in 2015.39 Leveraging the advantages of convolutional computation in feature organization and extraction, an FCN establishes a multilayer convolution structure and appropriately configured deconvolution layers to realize pixel-by-pixel segmentation.40–42 Researchers have since developed a series of segmentation models based on convolution, including SegNet,43 U-Net,44 DeepLab,45 multiscale FCN,46 and ReSeg.47 Each of these segmentation models has its own strengths and works well with certain types of images.

Segmentation models such as FCN are effective because their multilayer structure adeptly handles the rich detail features of images. However, in regions of vegetation in Gaofen-2 (GF-2) imagery, one pixel usually contains several different types of plants or crops. Thus, the information between pixels does not reflect this variety, and the image texture is smoother.
Although a single tree is larger than most crop plants, a typically sized tree occupies only two or three pixels in GF-2 imagery, and smaller trees may occupy one pixel or less, so the problem of little information difference between pixels persists. Therefore, because the detail features of a vegetation region may be lacking, a deep-layer CNN may be less effective and may even introduce additional noise, resulting in poor segmentation accuracy. To obtain accurate segmentation results from GF-2 images, the coverage area of individual plants relative to the spatial resolution of the GF-2 imagery must be considered when designing a CNN.

Based on the above analysis, we constructed convolutional encode neural networks (CENN) to accurately distinguish and extract farmland and woodland from GF-2 imagery. Because the CENN accounts for the characteristics of vegetation regions in GF-2 imagery described above, such as the small coverage area of a single plant, the scarcity of detail features, and the continuous appearance of vegetation, our approach achieves improved accuracy compared to existing approaches. The following sections describe the proposed method for the segmentation of vegetation in GF-2 images and its evaluation.
2. Methods
In accordance with established conventions for image segmentation using CNNs, we divide the work into two stages, training and classification, as depicted in Fig. 1. The upper part of Fig. 1 shows the training stage. GF-2 imagery and the corresponding pixel-by-pixel artificial classification labels are input together to the CENN as training samples. The error between the predicted classification labels and the artificial classification labels is calculated and backpropagated through the network using the chain rule. The parameters of the CENN are then updated using the gradient descent method. This process iterates until the error is less than a predetermined threshold. The lower part of Fig. 1 details the classification stage, in which the trained CENN extracts vegetation from input GF-2 imagery.

2.1. Network Architecture
The CENN model is divided into four functional groups of components: input, convolution layer, encoder layer, and output, as shown in Fig. 2. In the training stage, the inputs are original images and artificial classification labels. In the classification stage, the inputs are the original GF-2 images, the output is a single-band file, and the content of each pixel in the output is the category number of the corresponding original image pixel. The CENN denotes farmland with category number 100, woodland with category number 150, and other land use with category number 200.

2.1.1. Convolution layer
In Sec. 1, we analyzed the characteristics of vegetation areas in GF-2 imagery. Based on this analysis, we constructed a convolution structure in the convolution layer known as “width convolution.” With this convolution structure, we can extract more features to improve segmentation accuracy. The convolutional kernels of the CENN are of the form $w \times h \times c$, where $w$ denotes the width of the convolutional kernel, $h$ denotes the height of the convolutional kernel, and $c$ denotes the number of channels of the convolutional kernel.
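As a rough illustration of what a kernel of width, height, and channel count means in practice, the sketch below slides a single kernel over a multi-band image and produces one feature value per pixel. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `apply_kernel`, the zero padding, the odd kernel size, and the toy values are all hypothetical choices made for this example.

```python
import numpy as np

def apply_kernel(image, kernel):
    """Slide one kernel of shape (h, w, c) over an image of shape (H, W, c).

    Zero padding (an assumption of this sketch) keeps the output the same
    size as the input, so every pixel receives exactly one feature value.
    Kernel height and width are assumed to be odd.
    """
    h, w, c = kernel.shape
    H, W, C = image.shape
    assert C == c, "kernel and image must have the same number of channels"
    ph, pw = h // 2, w // 2
    padded = np.pad(image, ((ph, ph), (pw, pw), (0, 0)), mode="constant")
    out = np.zeros((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            # Window centered on pixel (y, x), covering all channels.
            window = padded[y:y + h, x:x + w, :]
            out[y, x] = np.sum(window * kernel)
    return out

# Illustrative values only: a 3 x 3 x 3 averaging kernel on a constant image.
image = np.ones((4, 5, 3))
kernel = np.full((3, 3, 3), 1.0 / 27.0)
features = apply_kernel(image, kernel)
```

Because the output has the same height and width as the input, several kernels of different sizes can be applied to the same image and their per-pixel outputs stacked as features.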
In this paper, the number of channels $c$ is set to 3 because only three channels of GF-2 imagery are employed. As shown in items (b) and (c) of Fig. 2, we adopted four types of convolutional kernel in the CENN, referred to as A-type, B-type, C-type, and D-type. The width $w$ and height $h$ are set to 1 for A-type, 3 for B-type, 5 for C-type, and 7 for D-type kernels. This structure earns the name “width convolution” because all the convolutional kernels use the GF-2 images as direct inputs and process the images in parallel.

The primary function of the A-type convolutional kernels is to extract the color features of each pixel. The B-type convolutional kernels are divided into two groups. The kernels of the first group must be trained; they are used to extract the texture features of the central pixel and the surrounding eight pixels. The second group consists of eight convolutional kernels whose values are fixed, so no training is required. These eight kernels are used to calculate the absolute value of the color difference between the central pixel and each of the eight adjacent pixels surrounding it. The roles of the C-type and D-type convolutional kernels are similar to that of the B-type kernels, and these two types are also composed of two groups. However, the C-type and D-type kernels cover a wider range over which to exploit the features between center pixels and their surrounding pixels.

As shown in Fig. 2, all the convolutional kernels are divided into two groups, a farmland group (b) and a woodland group (c). Both groups contain A-type, B-type, C-type, and D-type convolutional kernels. The purpose of this two-group design is to enable the CENN to better express the characteristics of vegetation and thus improve its capacity to distinguish different vegetation types.

2.1.2. Encoder layer
As shown in Fig. 2, the encoder layer contains two sublayers of encoders to better simulate the nonlinear relationship between features and outputs. There are two encoders in the first encoder sublayer, the encoder-farmland (d) and the encoder-woodland (e). The role of each encoder is to regress and simulate the characteristics of its respective land use. The second encoder sublayer has a single encoder, the encoder-adjust (f). The role of the encoder-adjust (f) is to adjust the calculations of the upper sublayer so that farmland, woodland, and other lands are distinguished from each other. Then, the flag-function (g) maps the encoded result of the encoder-adjust (f) to the appropriate category number.

2.2. Network Training
2.2.1. Label sample
The training dataset comprised a total of 39 GF-2 remote sensing images (size ) of Shandong Province, China. Of these images, 21 were captured on February 17, 2016, and 18 were captured on May 12, 2017. The spatial resolution of the panchromatic band was 1 m, and the spatial resolution of the multispectral bands was 4 m. Environment for visualizing images (ENVI) software was used for preprocessing tasks, such as fusion and color stretching. We then selected bands 3, 2, and 1 as the R, G, and B bands to improve visual effects.

Artificial label samples are an important training foundation. Because the CENN uses pixels as the primary learning object, the pixels must be accurately labeled. We used ENVI software for labeling and designed a preprocessor to build the mask. The process of artificial labeling is as follows:
We manually labeled all images at the pixel level. Thus, for each image, there exists a label map with a row–column indexed pixel-class correspondence. We used 36 images for training and the remaining three images for testing. Figure 3 shows an example of one image-label pair.

2.2.2. Model training
Images from two different time periods were selected as training data. We selected images from different periods to increase the anti-interference ability of the CENN, mitigating complications such as the change of seasons and thus enhancing applicability. The training stage proceeded through the following steps: Training obtained the farmland group convolutional kernels and the woodland group convolutional kernels. Each group enhanced the farmland or woodland features, whereas features of all other types were suppressed as much as possible. For example, in the woodland group, woodland features were enhanced while features of all other types were suppressed.

In our training, stochastic gradient descent (SGD)41,48 with momentum was used for parameter updates. The update is

$$w_{t+1} = w_t + \Delta w_{t+1},$$

where $w_t$ denotes the old parameters, $w_{t+1}$ denotes the new parameters, and $\Delta w_{t+1}$ is the increment for the current iteration. The iteration increment, which is a combination of old parameters, gradient, and historical increment, is calculated as

$$\Delta w_{t+1} = \mu\,\Delta w_t - \eta\left(\nabla L(w_t) + \lambda w_t\right),$$

where $L$ is the loss function, $\eta$ is the learning rate for step length control, $\lambda$ denotes the weight decay, and $\mu$ denotes the momentum.

2.3. Segment Using the Trained Network
After successful training, the CENN can be used to segment the input imagery pixel by pixel. According to our design, the output is written into a new band. The benefit of this design is that saving the segmentation result to a new band avoids damaging the original file.

3. Experiments
We designed a set of test experiments and comparative experiments to verify the feasibility of the proposed CENN method.
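As a concrete illustration of the momentum update described in Sec. 2.2.2, the following NumPy sketch performs one parameter update per call. It assumes the common form in which the weight-decay term is added to the gradient before scaling by the learning rate; the function name `sgd_momentum_step`, the default hyperparameter values, and the toy quadratic loss are hypothetical and are not taken from the paper.

```python
import numpy as np

def sgd_momentum_step(w, v, grad, lr=0.01, momentum=0.9, weight_decay=5e-4):
    """One SGD step with momentum and weight decay (assumed form).

    v_new = momentum * v - lr * (grad + weight_decay * w)
    w_new = w + v_new
    """
    v_new = momentum * v - lr * (grad + weight_decay * w)
    w_new = w + v_new
    return w_new, v_new

# Single step with hand-checkable numbers (weight decay switched off).
w1, v1 = sgd_momentum_step(np.array([1.0]), np.array([0.0]),
                           grad=np.array([2.0]), lr=0.1, weight_decay=0.0)

# Toy loss L(w) = 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([1.0, -2.0])
v = np.zeros_like(w)
for _ in range(200):
    w, v = sgd_momentum_step(w, v, grad=w)
```

The increment `v` carries a decaying memory of previous steps, which is the "historical increment" in the text; on the toy loss the parameters spiral in toward the minimum at zero.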
The proposed approach was implemented using Python 2.7 on a Linux Ubuntu 16.04 operating system using an NVIDIA GeForce Titan X graphics device with 12 GB of graphics memory. The data and classification criteria used were described in Sec. 2.2.1.

3.1. Learning Ability Indicators
The primary functions of the CENN are reflected in its feature extraction and encoding abilities. The concentration degree of feature values was used as an index to examine feature extraction capability, and the degree of distinction between farmland, woodland, and other lands was used as an index to examine encoding capacity.

3.2. Comparison Model
We chose the deep belief network (DBN) model, the FCN model, and the DeepLab model as the comparison models. The comparative experiment was conducted using methods established in published literature.

3.2.1. Deep belief network
Dawei et al.2 presented a method of pixel-by-pixel classification for high-resolution images using a DBN. Their method calculated the texture features of an image through the non-subsampled contourlet transform and used the DBN to classify high-resolution remote sensing images based on spectral-texture features. The training process included two subprocesses: pretraining and fine-tuning. Pretraining was performed in an unsupervised manner. A greedy algorithm was used to perform layer-by-layer optimization during training, and the parameters of each restricted Boltzmann machine (RBM) were adjusted individually. After the upper layer was trained, its output was used as the input to train the RBM of the next layer. After the pretraining was completed, the last-level backpropagation network was trained in a supervised manner. The error was propagated backward through the layers, and the weights of the entire DBN were fine-tuned.

3.2.2. Fully convolutional network
For the FCN model, we directly employed the FCN-8s model proposed by Long et al.40 The architecture of the model was derived from the VGG-16 network.
After the upsampling operation, the final prediction was fused from the output of three branches: the primary network, the pool4 layer, and the pool3 layer. In the training phase, the input data and the training parameters for FCN-8s in the comparative experiment were the same as those used to train the proposed model. The testing stage also used the same classification parameters as applied in the proposed approach.

3.2.3. DeepLab
For the DeepLab model, we directly employed the DeepLab v3 model proposed by Chen et al.45 DeepLab was also developed based on the VGG network. Unlike the FCN model, to ensure that the output size would not be too small without excessive padding, DeepLab changed the stride of the pool4 and pool5 layers of the VGG network from the original 2 to 1, with a padding of 1. To compensate for the influence of the stride change on the receptive field, DeepLab used a convolution method called “atrous convolution” to ensure that the receptive field after pooling remains unchanged and the output is more refined. Finally, DeepLab incorporated a fully connected conditional random field (CRF) model to refine the segmentation boundary.

3.3. Results and Comparison
3.3.1. Learning ability of the CENN
Figure 4 graphs the distribution of the feature values learned from the farmland samples, woodland samples, and other samples using A-type convolutional kernels. Figure 4 reveals that after the convolution operation, the feature values of farmland and woodland are concentrated in two regions, while those of the other land use type are scattered. This is mainly because the different seasons in which the data were collected result in substantial differences in color values. The feature values of woodland are less concentrated than those of farmland, mainly because the seasonal color change of woodland is more dramatic than that of farmland. Figure 5 shows the feature values learned from the farmland samples, woodland samples, and other samples using B-type convolutional kernels.
As shown in Fig. 5, because the B-type convolutional kernels are primarily used to learn the color difference between adjacent pixels, the features of the farmland samples are more concentrated, and the dispersion of the woodland samples is substantially larger. The results reflect the smooth texture of farmland and the rough texture of woodland. Additionally, for farmland samples, the features from both the C-type and D-type convolutional kernels are less concentrated than those from the B-type kernels. This is because as the field of view expands, more pixels of other types are introduced at the boundary. For woodland samples, the C-type convolutional kernels achieve the best concentration of features. Figure 6 shows the learning results of the C-type convolutional kernels. Based on the learning results, we observe that although multispace convolutional kernels are more suitable than deep convolution for the extraction of farmland and woodland features, it remains necessary to combine multiple features to accurately determine the category to which each pixel belongs. Figure 7 shows the encoding result of the first encoder sublayer, and Fig. 8 shows the encoding result adjusted by the second encoder sublayer. As Fig. 8 demonstrates, the adjusted encoding result can already be used to segment the image.

3.3.2. Experiment result comparison
In the comparative experiment, we applied our trained model to three GF-2 images for segmentation. All images were equally sized at . These images were used only for testing and were not involved in training. Figure 9 illustrates the results obtained from the comparison methods and the proposed method. Table 1 lists the confusion matrix C of our classification results. From the table, we can see that our approach achieves higher classification performance. In the above example, our recall for farmland and woodland is 0.855.
The average proportions of “farmland” wrongly classified as “woodland” and “others” are 0.053 and 0.033, respectively. The average proportions of “woodland” wrongly classified as “farmland” and “others” are 0.047 and 0.053, respectively.

Table 1. Confusion matrix C of our approach for Fig. 9.
We employ precision, recall, and the kappa coefficient as indicators to evaluate our approach. These indexes are calculated from the confusion matrix C. Precision denotes the average proportion of pixels correctly classified to one class out of the total retrieved pixels. Precision is calculated as follows:

$$\text{precision} = \frac{1}{n}\sum_{i=1}^{n}\frac{c_{ii}}{\sum_{j=1}^{n} c_{ji}},$$

where $n$ is the number of categories, $c_{ii}$ denotes the number of pixels of category $i$ that are correctly classified, and $c_{ij}$ denotes the number of pixels of category $i$ that are misclassified into category $j$.

Recall represents the average proportion of pixels that are correctly classified in relation to the actual total pixels of a given class. Recall is computed as follows:

$$\text{recall} = \frac{1}{n}\sum_{i=1}^{n}\frac{c_{ii}}{\sum_{j=1}^{n} c_{ij}}.$$

The kappa coefficient measures the consistency of the predicted classes with the artificial labels. The indicator values are listed in Table 2.

Table 2. Comparison between approaches using DBN, FCN, DeepLab, and CENN.
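To show how the three indicators can be derived from a confusion matrix, the following NumPy sketch computes class-averaged precision, class-averaged recall, and the kappa coefficient. It assumes that rows hold the actual classes and columns hold the predicted classes, and it uses the standard Cohen's kappa definition; the function name `evaluate` and the counts in `C` are hypothetical and are not the values of Table 1.

```python
import numpy as np

def evaluate(confusion):
    """Class-averaged precision and recall, plus Cohen's kappa.

    confusion[i, j] is the number of pixels of actual class i that were
    predicted as class j (row = actual, column = predicted; an assumption).
    """
    c = np.asarray(confusion, dtype=np.float64)
    total = c.sum()
    diag = np.diag(c)
    row_sums = c.sum(axis=1)  # actual pixels per class
    col_sums = c.sum(axis=0)  # predicted pixels per class
    precision = np.mean(diag / col_sums)
    recall = np.mean(diag / row_sums)
    p_observed = diag.sum() / total
    p_expected = np.sum(row_sums * col_sums) / total ** 2
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return precision, recall, kappa

# Hypothetical counts for farmland, woodland, and others (illustration only).
C = [[90, 5, 5],
     [4, 88, 8],
     [6, 7, 87]]
precision, recall, kappa = evaluate(C)
```

Kappa discounts the agreement that would be expected by chance from the class totals alone, so it is lower than the raw proportion of correctly classified pixels.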
4. Discussion
This paper presents a classification approach that extracts vegetation from GF-2 images using the CENN. Compared with the three typical deep learning-based approaches, the proposed method substantially improves classification accuracy. In the following sections, we discuss the reasons for the improvement and the benefits of using the proposed approach to classify land use.

4.1. Deep Belief Network versus the Proposed Approach
In the method of pixel-by-pixel segmentation based on DBN, the texture features of an image are first calculated. The obtained two-dimensional texture features are converted into one-dimensional vectors. Then, the three RGB channel values are added to the vectors, and they are merged into a single vector. Finally, the DBN is constructed using each component value of the vector as an independent input to classify the pixels. Although texture features are completely different from spectral values, this method uses them in combination, resulting in logical confusion. If only the value of the texture feature is used, it cannot indicate the spatial relations represented by texture, which results in information loss in the texture extraction. Therefore, like traditional spectral-based methods, this method utilizes only the spectral characteristics of the pixel itself and effectively ignores the spatial relationship between pixels, which easily produces incorrect classifications.

Unlike the DBN method, the CENN model makes full use of the advantages of convolution in information aggregation and uses A-type convolutional kernels to extract the common features of the original spectral values. Three kinds of convolutional kernels, B-type, C-type, and D-type, are used to extract textural features at three sizes. The CENN uses the two-stage encoder to simulate nonlinear equations and encode the features. These strategies effectively improve the classification accuracy.
In GF-2 images, it is easy to confuse tall crops and small, dense trees owing to the relatively small differences in texture and spectrum. As observed in Fig. 10, the proposed method is substantially more effective than the DBN method.

4.2. Fully Convolutional Network versus the Proposed Approach
The advantage of the FCN model is that it maximizes the information available in the rich details of GF-2 imagery using deep convolution. This advantage is obvious when extracting a target object that covers many pixels. However, if the target object covers few pixels, or if a single pixel contains several target objects, the effect of deep-layer convolution drastically weakens, and additional noise may even be introduced, resulting in poor segmentation. When FCN is used to extract farmland and woodland from GF-2 images, although the farmland or woodland may cover many pixels, the advantages of FCN cannot be exploited because of the small differences between pixels.

Unlike the FCN model, which expands the view through deep convolution, the proposed CENN expands the view by using three kinds of convolutional kernels: B-type, C-type, and D-type. By combining these three kinds of kernels, the maximum observed area size is about , which is fully capable of covering most of the canopy. The CENN not only makes full use of the pixels’ own features but also fully exploits the spatial features between pixels, adeptly accounting for the continuous appearance of crops and trees. Additionally, the CENN fully considers the natural characteristics of farmland and woodland, providing further advantages in identifying corner pixels. Figure 11 demonstrates that the FCN model and the CENN have the same segmentation accuracy when identifying the interior regions of farmland and woodland. However, when identifying the pixels in the corner regions, the FCN model made many errors, whereas the CENN made almost none.
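One simple instance of a spatial feature between pixels is the fixed B-type group of Sec. 2.1.1, which takes the absolute difference between the central pixel and each of its eight neighbors. The NumPy sketch below illustrates that idea for a single band. It is only a sketch under stated assumptions: the names `NEIGHBOR_OFFSETS`, `fixed_difference_kernels`, and `neighbor_differences` are hypothetical, the replicated border is a choice made here, and how the three bands are combined is not specified in the text and is left out.

```python
import numpy as np

# Offsets (dy, dx) of the eight neighbors of the center pixel in a 3 x 3 window.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                    (0, -1), (0, 1),
                    (1, -1), (1, 0), (1, 1)]

def fixed_difference_kernels():
    """Return eight 3 x 3 kernels with +1 at the center and -1 at one neighbor."""
    kernels = []
    for dy, dx in NEIGHBOR_OFFSETS:
        k = np.zeros((3, 3))
        k[1, 1] = 1.0
        k[1 + dy, 1 + dx] = -1.0
        kernels.append(k)
    return kernels

def neighbor_differences(band):
    """Absolute center-to-neighbor differences for one band of shape (H, W).

    Returns an array of shape (8, H, W). Edge pixels use replicated borders,
    which is an assumption made for this sketch.
    """
    H, W = band.shape
    padded = np.pad(band.astype(np.float64), 1, mode="edge")
    out = np.empty((8, H, W))
    for i, k in enumerate(fixed_difference_kernels()):
        for y in range(H):
            for x in range(W):
                # Fixed (untrained) kernel response, then absolute value.
                out[i, y, x] = abs(np.sum(padded[y:y + 3, x:x + 3] * k))
    return out

# Toy band: one bright pixel surrounded by a uniform background.
band = np.array([[1, 1, 1],
                 [1, 5, 1],
                 [1, 1, 1]])
diffs = neighbor_differences(band)
```

On a smooth region all eight responses are near zero, whereas a rough region produces large responses, which matches the contrast between smooth farmland texture and rough woodland texture noted in Sec. 3.3.1.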
4.3. DeepLab versus the Proposed Approach
Compared to the FCN model, the DeepLab model has two important improvements: (1) the deconvolution part is improved and (2) the network uses the fully connected CRF model to refine the segmentation boundary. According to existing literature, when identifying large objects such as buildings, the segmentation accuracy of DeepLab at the boundary is better than that of the FCN. This is because DeepLab makes better use of the details of the image and the spatial correlation of pixels. However, when DeepLab was used to identify woodland and farmland, because the information between pixels does not noticeably change and the image texture is smoother, the advantages of DeepLab were lost. As explained in Sec. 4.2, the CENN not only makes full use of the pixels’ own features but also fully exploits the spatial features between adjacent pixels and accounts for the continuous appearance of crops and trees. Therefore, the CENN can effectively avoid the defects of the DeepLab approach and ensure segmentation accuracy. As shown in Fig. 12, the segmentation accuracy of the CENN at boundaries and corners is much better than that of the DeepLab approach.

4.4. Benefits of Using the Proposed Approach to Classify Land Use
Accurate land use classification results play an important role in scientific research and agricultural production. The use of remote sensing data to produce land use classification results is becoming more common. In GF-2 images, it is difficult to accurately distinguish between woodland and farmland using traditional methods. Because our approach handles this problem better, it has played an important role in agricultural surveys and improved their efficiency. Our approach has been applied in the Meteorology Bureau of Shandong Province, China.

5. Conclusions
This paper presents a CENN model that extracts vegetation in farmlands and woodlands from GF-2 images.
Compared to the traditional DBN model, FCN model, and DeepLab model, the proposed CENN fully considers the characteristics of farmland and woodland in GF-2 images. According to the characteristics of the model, categorical training was implemented to enable the model to effectively discriminate farmland and woodland from other land types and extract vegetation from GF-2 images with high accuracy. The paper also provides a software-based method of using RoI for sample annotation, which can reduce the manual workload and enhance the efficiency of marking. The main limitation of our approach is that the accuracy of the extraction results is greatly reduced when the CENN is applied to submeter-level images, which limits the application scope of the model. In future work, we will try to use multilayer convolutions to further enhance its applicability.

Acknowledgments
This work was supported by the Science Foundation of Shandong (Grant Nos. ZR2017MD018 and ZR2016DP01), the National Science Foundation of China (Grant Nos. 41471299 and 41671440), and the open research project of key laboratory on meteorological disaster monitoring, early warning, and risk management in characteristic agricultural areas of arid area (Grant No. CAMF-201701).

References
D. Zhipeng, W. Mi and L. I. Deren,
“A high resolution remote sensing image segmentation method by combining superpixels with minimum spanning tree,” Acta Geod. Cartographica Sin., 46 (6), 734–742 (2017). https://doi.org/10.11947/j.AGCS.2017.20160514
L. Dawei, H. Ling and H. Xiaoyong, “High spatial resolution remote sensing image classification based on deep learning,” Acta Opt. Sin., 36 (4), 0428001 (2016). https://doi.org/10.3788/AOS
C. Liu et al., “Fusion of pixel-based and multi-scale region-based features for the classification of high-resolution remote sensing image,” J. Remote Sens., 19 (2), 228–239 (2015). https://doi.org/10.11834/jrs.20154035
J. Jin, Z. Zou and C. Tao, “Compressed texton based high resolution remote sensing image classification,” Acta Geod. Cartographica Sin., 43 (5), 493–499 (2014). https://doi.org/10.13485/j.cnki.11-2089.2014.0086
Z. Wu et al., “On combining spectral, textural and shape features for remote sensing image segmentation,” Acta Geod. Cartographica Sin., 42 (1), 44–50 (2013).
D. M. Miller, E. J. Kaminsky and S. Rana, “Neural network classification of remote-sensing data,” Comput. Geosci., 21 377–386 (1995). https://doi.org/10.1016/0098-3004(94)00082-6
J. Mas and J. Flores, “The application of artificial neural networks to the analysis of remotely sensed data,” Int. J. Remote Sens., 29 617–663 (2008). https://doi.org/10.1080/01431160701352154
G. Camps-Valls and L. Bruzzone, “Kernel-based methods for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., 43 1351–1362 (2005). https://doi.org/10.1109/TGRS.2005.846154
G. Mountrakis, J. Im and C. Ogole, “Support vector machines in remote sensing: a review,” ISPRS J. Photogramm. Remote Sens., 66 247–259 (2011). https://doi.org/10.1016/j.isprsjprs.2010.11.001
F. Pacifici, M. Chini and W. J. Emery, “A neural network approach using multi-scale textural metrics from very high-resolution panchromatic imagery for urban land-use classification,” Remote Sens. Environ., 113 (6), 1276–1292 (2009). https://doi.org/10.1016/j.rse.2009.02.014
X. Huang and L. Zhang, “An SVM ensemble approach combining spectral, structural, and semantic features for the classification of high resolution remotely sensed imagery,” IEEE Trans. Geosci. Remote Sens., 51 (1), 257–272 (2013). https://doi.org/10.1109/TGRS.2012.2202912
Y. Yuan, J. Lin and Q. Wang, “Hyperspectral image classification via multitask joint sparse representation and stepwise MRF optimization,” IEEE Trans. Cybern., 46 2966–2977 (2016). https://doi.org/10.1109/TCYB.2015.2484324
Q. Wang, J. Lin and Y. Yuan, “Salient band selection for hyperspectral image classification via manifold ranking,” IEEE Trans. Neural Netw. Learn. Syst., 27 1279–1289 (2016). https://doi.org/10.1109/TNNLS.2015.2477537
Y. Bengio, “Learning deep architectures for AI,” Found. Trends Mach. Learn., 2 (1), 1–127 (2009). https://doi.org/10.1561/2200000006
H. Larochelle et al., “Exploring strategies for training deep neural networks,” J. Mach. Learn. Res., 10 (1), 1–40 (2009).
N. Jones, “The learning machines,” Nature, 505 (7842), 146–148 (2014). https://doi.org/10.1038/505146a
T. Nguyen, J. Han and D. C. Park, “Satellite image classification using convolutional learning,” in Proc. of the AIP Conf., 2237–2240 (2013). https://doi.org/10.1063/1.4825984
J. Wang et al.,
“Road network extraction: a neural-dynamic framework based on deep learning and a finite state machine,”
Int. J. Remote Sens., 36 3144
–3169
(2015). https://doi.org/10.1080/01431161.2015.1054049 IJSEDK 0143-1161 Google Scholar
R. Taormina and K. W. Chau,
“Data-driven input variable selection for rainfall-runoff modeling using binary-coded particle swarm optimization and extreme learning machines,”
J. Hydrol., 529 1617
–1632
(2015). https://doi.org/10.1016/j.jhydrol.2015.08.022 JHYDA7 0022-1694 Google Scholar
Z. Liang et al.,
“Fuzzy prediction of AWJ turbulence characteristics by using typical multi-phase flow models,”
Eng. Appl. Comput. Fluid Mech., 11 225
–257
(2017). https://doi.org/10.1080/19942060.2016.1277556 Google Scholar
S. A. I. Bellary et al.,
“Application of computational fluid dynamics and surrogate-coupled evolutionary computing to enhance centrifugal-pump performance,”
Eng. Appl. Comput. Fluid Mech., 10 171
–181
(2016). https://doi.org/10.1080/19942060.2015.1128359 Google Scholar
J. Zhang and K. W. Chau,
“Multilayer ensemble pruning via novel multi-sub-swarm particle swarm optimization,”
J. Univ. Comput. Sci., 15 840
–858
(2009). Google Scholar
W. C. Wang et al.,
“Improving forecasting accuracy of annual runoff time series using ARIMA based on EEMD decomposition,”
Water Resour. Manage., 29 2655
–2675
(2015). https://doi.org/10.1007/s11269-015-0962-6 WRMAEJ 0920-4741 Google Scholar
S. Zhang and K. W. Chau,
“Dimension reduction using semi-supervised locally linear embedding for vegetation leaf classification,”
Lect. Notes Comput. Sci., 5754 948
–955
(2009). https://doi.org/10.1007/978-3-642-04070-2 LNCSD9 0302-9743 Google Scholar
C. Wu, K. Chau and C. Fan,
“Prediction of rainfall time series using modular artificial neural networks coupled with data-preprocessing techniques,”
J. Hydrol., 389 146
–167
(2010). https://doi.org/10.1016/j.jhydrol.2010.05.040 JHYDA7 0022-1694 Google Scholar
M. Castelluccio et al.,
“Land use classification in remote sensing images by convolutional neural networks,”
(2015). Google Scholar
F. Hu et al.,
“Transferring deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery,”
Remote Sens., 7, 14680–14707 (2015). https://doi.org/10.3390/rs71114680
S. Saito, T. Yamashita and Y. Aoki,
“Multiple object extraction from aerial imagery with convolutional neural networks,”
J. Imaging Sci. Technol., 60, 10402-1–10402-9 (2016). https://doi.org/10.2352/J.ImagingSci.Technol.2016.60.1.010402
H. Noh, S. Hong and B. Han,
“Learning deconvolution network for semantic segmentation,”
in Proc. of the IEEE Int. Conf. on Computer Vision, 1520–1528 (2015). https://doi.org/10.1109/ICCV.2015.178
S. Paisitkriangkrai et al.,
“Effective semantic pixel labelling with convolutional networks and conditional random fields,”
in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition Workshops, 36–43 (2015). https://doi.org/10.1109/CVPRW.2015.7301381
G. Papandreou, I. Kokkinos and P. A. Savalle,
“Untangling local and global deformations in deep convolutional networks for image classification and sliding window detection,”
(2014).
V. Badrinarayanan, A. Handa and R. Cipolla,
“Segnet: a deep convolutional encoder-decoder architecture for robust semantic pixel-wise labeling,”
(2015).
S. Ioffe and C. Szegedy,
“Batch normalization: accelerating deep network training by reducing internal covariate shift,”
(2015).
J. Liu, B. Liu and H. Lu,
“Detection guided deconvolutional network for hierarchical feature learning,”
Pattern Recognit., 48, 2645–2655 (2015). https://doi.org/10.1016/j.patcog.2015.02.002
M. Volpi and V. Ferrari,
“Semantic segmentation of urban scenes by learning local class interactions,”
in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition Workshops, 1–9 (2015). https://doi.org/10.1109/CVPRW.2015.7301377
K. Simonyan and A. Zisserman,
“Very deep convolutional networks for large-scale image recognition,”
(2015).
T. Panboonyuen et al.,
“An enhanced deep convolutional encoder-decoder network for road segmentation on aerial imagery,”
Adv. Intell. Syst. Comput., 566, 191–201 (2017). https://doi.org/10.1007/978-3-319-60663-7_18
A. Kendall, V. Badrinarayanan and R. Cipolla,
“Bayesian segnet: model uncertainty in deep convolutional encoder-decoder architectures for scene understanding,”
(2016).
L. C. Chen et al.,
“Semantic image segmentation with deep convolutional nets and fully connected CRFs,”
(2016).
J. Long, E. Shelhamer and T. Darrell,
“Fully convolutional networks for semantic segmentation,”
in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, 3431–3440 (2015). https://doi.org/10.1109/CVPR.2015.7298965
G. Fu et al.,
“Classification for high resolution remote sensing imagery using a fully convolutional network,”
Remote Sens., 9, 498 (2017). https://doi.org/10.3390/rs9050498
J. Dolz,
“3D fully convolutional networks for subcortical segmentation in MRI: a large-scale study,”
Neuroimage, 170, 456–470 (2018). https://doi.org/10.1016/j.neuroimage.2017.04.039
V. Badrinarayanan, A. Kendall and R. Cipolla,
“SegNet: a deep convolutional encoder-decoder architecture for image segmentation,”
IEEE Trans. Pattern Anal. Mach. Intell., 39(12), 2481–2495 (2017). https://doi.org/10.1109/TPAMI.2016.2644615
O. Ronneberger, P. Fischer and T. Brox,
“U-Net: convolutional networks for biomedical image segmentation,”
(2015).
L. C. Chen et al.,
“DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs,”
(2017).
H. Lin, Z. Shi and Z. Zou,
“Maritime semantic labeling of optical remote sensing images with multi-scale fully convolutional network,”
Remote Sens., 9, 480–501 (2017). https://doi.org/10.3390/rs9050480
F. Visin et al.,
“Reseg: a recurrent neural network-based model for semantic segmentation,”
in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition Workshops, 41–48 (2016). https://doi.org/10.1109/CVPRW.2016.60
D. P. Kingma and J. Ba,
“Adam: a method for stochastic optimization,”
(2017).
Biography
Chengming Zhang is currently a professor at the College of Information Science and Engineering of Shandong Agricultural University. His main research areas are remote sensing and geographic information systems in land use monitoring and evaluation. He has presided over a number of agricultural remote sensing projects of the Ministry of Science and Technology and Shandong Province. Currently, he is mainly engaged in research on remote sensing technology in agriculture and the environment.
Jiping Liu is currently a professor at the Chinese Academy of Surveying and Mapping. His main research interests are the application of remote sensing in urban geographic information acquisition and the study of remote sensing data. He has participated in key projects in many countries and achieved excellent results. Currently, he focuses on research into the intelligent processing of remote sensing and big data.
Fan Yu is currently an associate professor at the Beijing University of Civil Engineering and Architecture. His main research area is remote sensing image processing, and he has published several academic papers in this area. He also takes part in many research projects funded by the Ministry of Science and Technology of China. Currently, he is mainly engaged in research on remote sensing technology in urban management and city information acquisition.
Shujing Wan is currently an engineer at the College of Network Information Center of Qufu Normal University. Her main research areas are remote sensing and digital image processing. Currently, she is mainly engaged in research on information construction and remote sensing technology in agriculture and the environment.
Yingjuan Han is a senior engineer at the Ningxia Institute of Meteorology. Her main research area is the application of remote sensing of ecology and agriculture in arid and semiarid areas. Currently, she is carrying out remote sensing extraction of crop planting information and monitoring of ecological environment elements by remote sensing.
Jing Wang is a senior engineer at the Ningxia Meteorological Science Institute. Her main research area is remote sensing in agriculture. Currently, she is mainly engaged in research on fruit tree classification and planting area extraction by remote sensing technology.
Gang Wang is currently a junior studying at the College of Information Science and Engineering of Shandong Agricultural University. His main research areas are remote sensing and geographic information systems. Currently, he is mainly engaged in research on remote sensing technology in classification and LUCC. |