Lawn plant identification and segmentation based on least squares support vector machine and multifeature fusion

Abstract. Different turfs have different growth characteristics, engendering differences in the number of maintenance cycles and amounts of pesticides used; therefore, studying their subtle color and shape differences through image recognition is crucial. Our study proposes an improved least squares support vector machine (LS-SVM) pixel classification method for this purpose. The sensitivity to local color changes in the hue, saturation, and value color space is considered, and the Sobel operator is used to extract the homogeneity as pixel-level color features. The maximum local energy, gradient, and second-order moment matrix of image pixels are obtained as texture features using a Gabor filter. Seven shape features of different plant leaves are calculated, multiple extracted features are used as LS-SVM classifier inputs, and samples are selected and trained with a dynamic threshold. The trained classifier can be used for segmentation. The experiments showed that it could use the local information of the color images and the excellent generalization ability of LS-SVM to segment lawn plants effectively. Under different weather conditions, the penalty coefficient, C, and kernel parameters with optimal generalization were Bayesian optimized to obtain a segmentation rate exceeding 95%. This algorithm yields a higher classification rate for plants with less obvious differences in texture and shape and optimizes space and time complexities.


Introduction
Lawn plants are classified based on corresponding standards to enable green units to formulate reasonable plans and select suitable lawn grass species. This can help maintenance staff and promote the construction of modern ornamental gardens and sport lawns. Different turf species have dissimilar growth characteristics, resulting in different stubble heights and thus different maintenance and pruning cycles; therefore, the plant species must be classified. This study focused on tall fescue, ryegrass, bluegrass, carpet grass, and other common lawn grasses and used image recognition technology to classify them. After lawn types are identified, the theoretical basis for setting the height of a lawnmower can be determined according to the respective growth characteristics of the lawns to realize scientific and efficient mowing operations.
In plant identification processes, flowers and fruits may last for only a few weeks, [1][2][3] whereas plant leaves remain nearly constant throughout the year. 4,5 Furthermore, leaf shape is one of the most important visual features for describing many plants; therefore, classification based on leaf characteristics is crucial. Common classification techniques include the k-nearest neighbor (KNN), probabilistic neural network (PNN), and support vector machine (SVM) classifiers; however, these techniques have obvious drawbacks. For example, the KNN technique is expensive because every instance must be tested, 6,7 and it is sensitive to noise and irrelevant inputs. Moreover, a PNN has a large network structure and an excessive number of attributes. 8,9 An SVM classifier cannot easily be trained on large-scale training samples and can only classify them into two categories. 10 This study addresses these shortcomings by effectively classifying leaves using a least squares SVM (LS-SVM) classifier. 11
Ingrouille and Laird 12 classified oak species with 27 leaf shape features using the principal component analysis method. Sixta 13 used the internal distance of shape context for leaf recognition. Rossatto et al. 14 used the volume fractal dimension and naive Bayesian classification for leaf image recognition. Mallah et al. 15 described a method for improving the recognition rate with a small training set and incomplete feature extraction. They used a k-nearest neighbor classifier combined with a feature vector and density estimation method to improve the recognition rate. The recognition rate reached 96% when three features were combined, whereas it was 91% when four features were used. Du et al. 16 extracted plant leaf shape features and image-invariant moments, and they used a mobile center hypersphere classifier to recognize more than 20 plants. Wu et al. 3 orthogonalized the shape and texture features and increased the number of recognition categories to more than 30 using a PNN. Priya et al. 17 used the same feature dataset as Ref. 3 to improve the recognition rate using an SVM. Elhariri et al. 18 combined the color features on the basis of the shape and texture features to recognize more than 30 types.
Most lawn leaves have different shapes, rich colors, and textures. Therefore, these are the main characteristics used to distinguish plant species. 19 Most of the leaves identified in articles, such as Refs. 3 and 12-18, are relatively flat and broad. However, the objects of leaf recognition in this paper are all slender and clustered lawn plants, and the background information is relatively more complex. Therefore, the study in this paper is conducive to the development of intelligent lawn mowers.
The main contributions of this study are as follows: (1) based on the LS-SVM segmentation algorithm, an algorithm was developed for effectively identifying and segmenting turf plant images; (2) common turf plants were classified using the improved algorithms; and (3) the accuracy and effectiveness of the turf identification were evaluated qualitatively for different weather conditions.

Lawn Recognition Algorithm
Lawn images may contain random noise due to the influence of light, uneven exposure, camera distortion, or other types of interference. To suppress the background interference and random noise and to highlight the object region of the image in the identification process, the image can be segmented using an LS-SVM method with a dynamic threshold. When a dynamic threshold is used for selecting the training sample, a stable and reasonable training sample can be obtained quickly. When images are identified using the LS-SVM method, the inequality constraint in the standard SVM is changed to an equality constraint, and the quadratic programming problem is transformed into a problem of linear equations. Therefore, the computational complexity is reduced considerably, and the computational speed becomes higher than that of a general SVM. Figure 1 shows the specific process steps for categorizing turf plants using an LS-SVM and a dynamic threshold.
In the experiment conducted in this study, a Daheng Image Company Mercury series mer-231-41u3c camera was used to collect lawn images. The camera resolution was 1920 × 1200, and it was matched with an m0814-mp2 8-mm-focal-length lens.
During the collection of lawn images, the lawn height was 30 to 180 mm. To ensure that the lawn was in the camera shooting range, the distance between the camera bracket and the lawn level was set at 600 mm, and the camera center was 90 mm from the ground. A total of 240 images of four common lawns (tall fescue, perennial ryegrass, Kentucky bluegrass, and carpet grass) were collected under sunny and cloudy conditions, as shown in Fig. 2.

Dynamic Threshold
Due to shadows, different background contrast levels, burst noise, and background gray-level changes, the lawn images cannot be segmented effectively using traditional methods, such as those involving a global fixed threshold 20,21 or an Otsu threshold. 22 Therefore, a set of dynamic thresholds related to pixel position can be used for local dynamic threshold processing in a local neighborhood. The basic steps of dynamic threshold segmentation are as follows:
1. Segment the whole image into subimages with a 50% overlap. The subimage size is determined by the mean filter, mainly by setting the size of the template window, for instance, as a 3 × 3 matrix.
2. Determine a histogram for each subimage.
3. Detect whether the histogram of each subimage is bimodal; if so, interpolate to obtain the threshold, $g_0$, of all the subimages; otherwise, no processing is performed.
4. Interpolate the threshold values of the subimages to obtain the threshold value, $g_t$, for the whole image.
5. Assume that the brighter part of the image is the grass sample. In each subimage, pixels satisfying the condition $g_0 \ge g_t + f$ are regarded as the object; the remaining pixels are considered as the background. Here, $f$ is the customized pixel value offset, which refers to the size of the mask. Its optimal range is [5, 40].
If $f$ is excessively small, many small areas containing noise will be extracted. By contrast, if $f$ is excessively large, the region cannot be extracted easily. In this scenario, the optimal value of $f$ can be set as 10 based on training with a large number of templates.
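The thresholding condition above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the implementation used in the study: the threshold surface $g_t$ is approximated by a local mean filter instead of the histogram and interpolation steps, and the names `mean_filter`, `dynamic_threshold`, and `radius` are hypothetical.

```python
def mean_filter(img, radius):
    """Local mean over a (2*radius+1)^2 window, clipped at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def dynamic_threshold(img, radius=1, f=10):
    """Mark a pixel as object (1) when g_0 >= g_t + f, otherwise background (0).

    Assumption: g_t is taken as the local mean of the gray values, which stands
    in for the interpolated subimage thresholds described in the text.
    """
    g_t = mean_filter(img, radius)
    return [[1 if g0 >= gt + f else 0 for g0, gt in zip(row, trow)]
            for row, trow in zip(img, g_t)]


# Toy gray image: one bright pixel on a dark, slowly varying background.
image = [[20, 22, 21, 23, 22],
         [21, 23, 22, 24, 23],
         [22, 24, 200, 25, 24],
         [23, 25, 24, 26, 25],
         [24, 26, 25, 27, 26]]
for row in dynamic_threshold(image, radius=1, f=10):
    print(row)
```

With a small offset such as `f=0`, pixels that are only marginally brighter than their neighborhood would also be marked, which mirrors the noise sensitivity described above.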

Color Spatial Feature Extraction
Compared with red, green, and blue (RGB) color space, hue, saturation, and value (HSV) color space can express the tone and brightness of color more directly. Furthermore, it can separate color and brightness information. The equation 23 for conversion from RGB to HSV is as follows:
\[ V = \max(R, G, B), \tag{1} \]
\[ S = \begin{cases} \dfrac{V - \min(R, G, B)}{V}, & V \neq 0, \\ 0, & V = 0, \end{cases} \tag{2} \]
\[ H = \begin{cases} \dfrac{60(G - B)}{V - \min(R, G, B)}, & V = R, \\ 120 + \dfrac{60(B - R)}{V - \min(R, G, B)}, & V = G, \\ 240 + \dfrac{60(R - G)}{V - \min(R, G, B)}, & V = B, \end{cases} \tag{3} \]
where $(R, G, B)$ represent the red, green, and blue coordinates, respectively; $S$ (saturation) and $V$ (value) lie in the range [0, 1], and $H$ (hue) lies in the range [0, 360], with 360 added to $H$ if it is negative. To eliminate the influence of light on color, only pixel-level color features from the H and S components are considered. Figure 3 shows the H, S, and V components of an image in the HSV color space.
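As a small worked example, the standard RGB-to-HSV conversion can be written in a few lines of Python. This is a sketch of the textbook formula with inputs scaled to [0, 1]; the function name `rgb_to_hsv` and the convention of returning hue 0 for gray pixels are assumptions, and the hue is returned in [0, 360).

```python
def rgb_to_hsv(r, g, b):
    """Convert R, G, B in [0, 1] to (H in degrees, S in [0, 1], V in [0, 1])."""
    v = max(r, g, b)
    delta = v - min(r, g, b)
    s = delta / v if v > 0 else 0.0
    if delta == 0:
        h = 0.0  # hue is undefined for gray pixels; 0 is an assumed convention
    elif v == r:
        h = 60.0 * (g - b) / delta
    elif v == g:
        h = 120.0 + 60.0 * (b - r) / delta
    else:
        h = 240.0 + 60.0 * (r - g) / delta
    if h < 0:
        h += 360.0  # wrap negative hues into the [0, 360) range
    return h, s, v


# A grass-like green pixel: hue falls between green (120) and cyan (180).
print(rgb_to_hsv(0.2, 0.6, 0.4))
```

Because brightness is isolated in V, dropping that component leaves H and S as the light-insensitive inputs for the color features.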
In the proposed algorithm, the pixel-level color features that conform to the human visual system are divided into two parts: discontinuity and standard deviation. Here, $P_{x,y} = (P^H_{x,y}, P^S_{x,y}, P^V_{x,y})$ indicates that the pixel is located at $(x, y)$ with three components in an $M \times N$ size image, and $CF_{x,y}$ represents the color features of pixel $P_{x,y}$.

Compute the noise discontinuity
Discontinuity refers to the inconsistency of pixel gray values at the junction of different regions; it mainly describes the edge amplitude. The discontinuity of color components, $P^k_{x,y}$ ($k = H, S$), selected in this study is represented by $c^k_{x,y}$. Sobel, Canny, Deriche, Laplacian, and other edge operators were used for the edge computation of the same noisy images. The results show that the performance of the second-order differential Laplacian operator is better than that of other traditional edge operators. However, the images must be enhanced and smoothed before the gradient calculation is performed.
Because determining the exact edge position is not necessary, the discontinuity and gradient can be calculated using the simple Sobel operator:
\[ c^k_{x,y} = \sqrt{\left(G^k_{x'}\right)^2 + \left(G^k_{y'}\right)^2}, \tag{4} \]
where $G^k_{x'}$ and $G^k_{y'}$ are the components of the gradient in the $x'$ and $y'$ directions, respectively.
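The Sobel-based discontinuity can be sketched as the gradient magnitude of a single color component. The following Python example is illustrative only; the function name `sobel_discontinuity` and the choice to leave border pixels at zero are assumptions.

```python
import math

# Standard 3 x 3 Sobel masks for the horizontal and vertical derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_discontinuity(channel):
    """Return the Sobel gradient magnitude per pixel (border pixels stay 0)."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    p = channel[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * p
                    gy += SOBEL_Y[j][i] * p
            out[y][x] = math.sqrt(gx * gx + gy * gy)
    return out


# A vertical step edge: the response is large along the edge and zero in flat areas.
step = [[0, 0, 1, 1] for _ in range(4)]
print(sobel_discontinuity(step)[1])
```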

Compute the standard deviation
By assuming that the signal is ergodic, the standard deviation, $\upsilon^k_{x,y}$, describes the intensity variation within a local image window and is calculated for a pixel component, $P^k_{x,y}$ ($k = H, S$), as follows:
\[ \upsilon^k_{x,y} = \sqrt{\frac{1}{d^2} \sum_{(p,q) \in \Omega_{x,y}} \left(P^k_{p,q} - \mu^k_{x,y}\right)^2}, \tag{5} \]
where $\mu^k_{x,y}$ is the mean value of the color component, $P^k_{x,y}$ ($k = H, S$), in the local window, $\Omega_{x,y}$, and is defined as
\[ \mu^k_{x,y} = \frac{1}{d^2} \sum_{(p,q) \in \Omega_{x,y}} P^k_{p,q}, \tag{6} \]
where $\Omega_{x,y}$ is a local window of $d \times d$ whose center is pixel $(x, y)$, and $d$ is an odd number >1.
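A minimal Python sketch of the windowed standard deviation is given below. The function name `local_std` is hypothetical, and clipping the window at the image border (so that border pixels use fewer than $d^2$ samples) is an assumption, since the border handling is not specified in the text.

```python
import math


def local_std(channel, d=3):
    """Standard deviation inside a d x d window centered on each pixel."""
    if d % 2 == 0 or d <= 1:
        raise ValueError("d must be an odd number greater than 1")
    r = d // 2
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Assumption: the window is clipped at the image border.
            vals = [channel[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            mu = sum(vals) / len(vals)
            out[y][x] = math.sqrt(sum((v - mu) ** 2 for v in vals) / len(vals))
    return out


# One outlier in a flat patch: eight zeros and one 9 give a variance of 8.
patch = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(local_std(patch, d=3)[1][1])
```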

Compute the pixel-level color features
Herein, the local homogeneity of image pixels is defined by pixel-level color features. Homogeneity is closely related to the local information extracted from an image, and it reflects the consistency of the color features in the local region. Because the aim of image segmentation is to divide the image into several homogeneous regions, the local homogeneity, as a color feature for regional segmentation, plays a major role in the process.
The local homogeneity consists of the standard deviation and discontinuity of the color component, $P^k_{x,y}$ ($k = H, S$), and it is represented by
\[ h(P^k_{x,y}) = 1 - E(P^k_{x,y}) \times V(P^k_{x,y}), \tag{7} \]
where
\[ E(P^k_{x,y}) = \frac{c^k_{x,y}}{\max\{c^k_{x,y}\}}, \tag{8} \]
\[ V(P^k_{x,y}) = \frac{\upsilon^k_{x,y}}{\max\{\upsilon^k_{x,y}\}}. \tag{9} \]
Here, $\upsilon^k_{x,y}$ and $c^k_{x,y}$ represent the standard deviation and discontinuity, respectively, of the pixel color component, $P^k_{x,y}$ ($k = H, S$), located at $(x, y)$.
Finally, the color feature of the pixel, $P_{x,y}$, at location $(x, y)$ is obtained as $CF_{x,y} = [h(P^H_{x,y}), h(P^S_{x,y})]$.
Figure 4 shows the pixel-level color characteristics of the H and S components.
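The combination of discontinuity and standard deviation into a homogeneity value can be sketched as follows. The example assumes that the homogeneity has the form $h = 1 - E \times V$, with $E$ and $V$ being the discontinuity and standard deviation normalized by their maxima over the image; this form, the function name `local_homogeneity`, and the handling of all-zero maps are assumptions rather than details confirmed by the text.

```python
def local_homogeneity(discontinuity, std_dev):
    """Combine per-pixel discontinuity and standard deviation maps into h = 1 - E * V."""
    c_max = max(max(row) for row in discontinuity)
    v_max = max(max(row) for row in std_dev)
    out = []
    for c_row, v_row in zip(discontinuity, std_dev):
        out_row = []
        for c, v in zip(c_row, v_row):
            e_norm = c / c_max if c_max > 0 else 0.0  # normalized discontinuity
            v_norm = v / v_max if v_max > 0 else 0.0  # normalized standard deviation
            out_row.append(1.0 - e_norm * v_norm)
        out.append(out_row)
    return out


# Pixels with both a strong edge response and a high local variation get h near 0;
# pixels in smooth regions get h near 1.
c_map = [[0.0, 2.0], [4.0, 4.0]]
v_map = [[1.0, 1.0], [0.0, 2.0]]
print(local_homogeneity(c_map, v_map))
```

Applying the function once to the H maps and once to the S maps would yield the two entries of the color feature vector.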

Texture Feature Extraction
Texture is usually combined with the color features and applied to image segmentation. In this algorithm, the local energy, local gradient, and local second moment are extracted from the six directional subbands of the Gabor filter as pixel-level texture features. The Gabor filter not only reduces the influence of illumination and noise but also preserves the edge information of texture in different scales and directions through its good bandpass and directional selectivity. Figure 5 shows the six directional subbands of the Gabor filter.

Selection of the color space
The S component of the HSV color space is selected to represent the texture because it closely matches the human perception of lightness, and this color space can control the color and brightness information independently.

Application of the Gabor filter to the S component
In this study, a Gabor filter with six orientations and two scale bands was used to decompose the S component. The research of Khan et al. 24 shows that 4 to 6 orientation subbands could approximate the directional selectivity of the human visual system, and two-level decomposition could be selected for turf images. Figure 6 shows the Gabor filter's six directional and two scale subbands.
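A bank of 12 Gabor kernels (six orientations, two scales) can be generated as in the sketch below. Only the number of orientations and scales comes from the text; the real-valued kernel form, the wavelengths (4 and 8 pixels), the ratio `sigma = 0.56 * wavelength`, the aspect ratio `gamma`, and the kernel size are illustrative assumptions.

```python
import math


def gabor_kernel(theta, wavelength, sigma, gamma=0.5, half_size=7):
    """Real part of a Gabor kernel: Gaussian envelope times a cosine carrier."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            xr = x * cos_t + y * sin_t    # coordinate along the carrier direction
            yr = -x * sin_t + y * cos_t   # coordinate across the carrier direction
            envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def gabor_bank(n_orientations=6, wavelengths=(4.0, 8.0)):
    """Build kernels indexed by (scale m, orientation n)."""
    bank = {}
    for m, lam in enumerate(wavelengths):
        for n in range(n_orientations):
            theta = n * math.pi / n_orientations  # 0, 30, ..., 150 degrees
            bank[(m, n)] = gabor_kernel(theta, lam, sigma=0.56 * lam)
    return bank


bank = gabor_bank()
print(len(bank), "kernels of size", len(bank[(0, 0)]), "x", len(bank[(0, 0)][0]))
```

Convolving the S component with each kernel would give the 12 subbands from which the texture features are taken.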

Extraction of the local energy
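The local energy of the S component is calculated 24 for each of the 12 Gabor subbands, and the maximum local energy over the subbands, $E_{x,y}$, is taken as a pixel texture feature at location $(x, y)$.
The gradient is used to measure the change in pixel value in the $x$ and $y$ directions, which is an important measure of image features. If a region is smooth, its gradient is small. In this algorithm, $G_{x,y}$ is used to represent the maximum value of the 12 gradient magnitudes at location $(x, y)$, which is another pixel texture feature at location $(x, y)$:
\[ G_{x,y} = \max_{m,n} G^{m,n}_{x,y}, \tag{12} \]
where $G^{m,n}_{x,y}$ is the gradient magnitude of the subband at scale $m$ and orientation $n$.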

Extraction of the local second moment matrix
To describe the texture of lawn leaves, a texture feature based on the local second-order moments can be used, which can be considered as the covariance matrices of two-dimensional random variables. The energy values in the two main directions in the neighborhood are represented by their eigenvalues. When one eigenvalue is much greater than the other, the local neighborhood is textured with a dominant orientation. The second moment, $M^{m,n}_{x,y}$, of the Gabor filter subband coefficient, $G^{m,n}(x, y)$, is expressed as follows:
\[ M^{m,n}_{x,y} = G_\sigma(x, y) * (\nabla I)(\nabla I)^T, \tag{13} \]
\[ (\nabla I)(\nabla I)^T = \begin{bmatrix} \left(G^{m,n}_x\right)^2 & G^{m,n}_x G^{m,n}_y \\ G^{m,n}_x G^{m,n}_y & \left(G^{m,n}_y\right)^2 \end{bmatrix}. \tag{14} \]
Here, $*$ represents the convolution, $\nabla I$ represents the gradient, $G^{m,n}_x$ and $G^{m,n}_y$, respectively, represent the components of the gradient in the $x$ and $y$ directions, and $G_\sigma(x, y)$ is a separable binomial approximation to a Gaussian smoothing kernel with a variance $\sigma^2$:
\[ G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right). \tag{15} \]
Here, $M^{m,n}_{x,y}$ is a symmetric semipositive definite matrix; $\lambda_1$ and $\lambda_2$ ($\lambda_1 > \lambda_2$) are defined as the eigenvalues of $M^{m,n}_{x,y}$; and $\phi$ is the main eigenvector of $M^{m,n}_{x,y}$. When $\lambda_1$ and $\lambda_2$ are negligible, the local neighborhood is approximated by a constant, representing a nontextured area. In contrast, a large value represents a textured area.
In the proposed algorithm, the sum of the eigenvalues $\lambda_1$ and $\lambda_2$ of the second moment matrix, $M^{m,n}_{x,y}$, is defined as the pixel-level feature, $V^{m,n}_{x,y}$. The maximum of the 12 eigenvalue sums is represented by $V_{x,y}$, which is the third pixel texture feature:
\[ V_{x,y} = \max_{m,n} V^{m,n}_{x,y}, \quad V^{m,n}_{x,y} = \lambda_1 + \lambda_2. \tag{16} \]
Finally, the texture feature, $TF_{x,y}$, of the image pixel, $P_{x,y}$, is obtained at location $(x, y)$:
\[ TF_{x,y} = (E_{x,y}, G_{x,y}, V_{x,y}). \tag{17} \]
The image of each pixel-level texture feature is shown in Fig. 7.
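The eigenvalue computation for a single pixel neighborhood can be sketched in Python. The example assumes a 3 × 3 binomial smoothing window and takes the gradient components of the nine neighborhood pixels as flat lists; the names `BINOMIAL_3X3` and `second_moment_eigenvalues` are hypothetical. Because the matrix is a symmetric 2 × 2 matrix, the eigenvalues have a closed form, and their sum equals the trace.

```python
import math

# 3 x 3 binomial weights (outer product of [1, 2, 1] / 4), flattened row by row.
BINOMIAL_3X3 = [wr * wc / 16.0 for wr in (1, 2, 1) for wc in (1, 2, 1)]


def second_moment_eigenvalues(gx, gy, weights=BINOMIAL_3X3):
    """Eigenvalues (lambda_1 >= lambda_2) of the smoothed second moment matrix.

    gx, gy: gradient components of the neighborhood pixels, in the same order
    as the weights.
    """
    a = sum(w * x * x for w, x in zip(weights, gx))          # smoothed Gx^2
    b = sum(w * x * y for w, x, y in zip(weights, gx, gy))   # smoothed Gx*Gy
    c = sum(w * y * y for w, y in zip(weights, gy))          # smoothed Gy^2
    half_trace = 0.5 * (a + c)
    root = math.sqrt(0.25 * (a - c) ** 2 + b * b)
    return half_trace + root, half_trace - root


# Gradients that all point along x: one dominant eigenvalue, i.e., oriented texture.
lam1, lam2 = second_moment_eigenvalues([1.0] * 9, [0.0] * 9)
print(lam1, lam2, "feature V =", lam1 + lam2)
```

The per-subband feature is the eigenvalue sum; taking its maximum over the 12 subbands gives the third texture feature.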

Shape Feature Extraction
In addition to the color and texture characteristics, shape features constitute another typical characteristic of lawn plants. Therefore, shape features can be added to the classification and recognition processes. In this paper, the method presented in Ref. 25 is used to extract the geometric parameters of four plants by the frequently used Fourier harmonic function. The shapes of different lawn plants have obvious differences, which can be adequately described by shape features such as their rectangularity, aspect ratio, circularity, and sphericity. The aspect ratio is defined as the ratio of the length to the width of the minimum external rectangle of the blade.
Rectangularity is defined as the ratio of the blade area to the minimum outer rectangle, which reflects the degree to which the object fills the minimum external rectangle and lies in the range (0, 1).
Circularity is defined as the ratio of 4π times the blade area to the square of the circumference, l; it reflects the degree of compact correlation between the blade and the circumferential circle.
Sphericity is the ratio between the blade area and the circumference of the smallest circumscribed rectangle.
Eccentricity is defined as the ratio between the long axis and the short axis of the blade.
Lobation is defined as the ratio of the shortest distance between the center of gravity of the blade area and the boundary to the short axis of the blade; it can reflect the amplitude characteristics of the blade boundary.
The circumference-to-diameter ratio is defined as the ratio of the blade circumference to the long axis (Table 1).
The geometric parameters of the blade region can be calculated according to the extracted blade profile of the lawn, including the blade area, $A_0$; the blade area perimeter, $l$; the length, $a_R$, and width, $b_R$, of the minimum external rectangle of the blade region; the long axis, $a$, and short axis, $b$, of the blade region; and the shortest distance, $l_{\min}$, from the center of the blade region to the boundary.
Table 2 shows the shape characteristics of lawn plants.
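As an illustration, the seven shape features can be computed directly from the geometric parameters by following the verbal definitions given above. The function and argument names are hypothetical, and taking the aspect ratio as the length-to-width ratio of the minimum external rectangle is an assumption based on the listed parameters.

```python
import math


def shape_features(area, perimeter, rect_len, rect_wid, long_axis, short_axis, l_min):
    """Seven blade shape features computed from the geometric parameters."""
    return {
        "aspect_ratio": rect_len / rect_wid,
        "rectangularity": area / (rect_len * rect_wid),
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        # Blade area over the circumference of the circumscribed rectangle.
        "sphericity": area / (2.0 * (rect_len + rect_wid)),
        "eccentricity": long_axis / short_axis,
        "lobation": l_min / short_axis,
        "circumference_to_diameter": perimeter / long_axis,
    }


# Sanity check with a unit circle: circularity is 1 and rectangularity is pi / 4.
features = shape_features(area=math.pi, perimeter=2.0 * math.pi,
                          rect_len=2.0, rect_wid=2.0,
                          long_axis=2.0, short_axis=2.0, l_min=1.0)
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

A slender grass blade would instead show a large aspect ratio and eccentricity and a circularity well below 1.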

Least Squares Support Vector Machine Identification Model
The classification and recognition of lawn plants is essentially a complex small-sample, multiclass discrimination problem that can be solved using an SVM classifier. Therefore, a classifier model based on an LS-SVM, an improved version of the traditional SVM method, was developed. This method adopts a least squares linear system as the loss function and solves a set of linear equations instead of the complex quadratic programming problem in the SVM. Moreover, this method has a low computational complexity and has the advantages of good generalization and a fast learning speed.
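Table 1 The lawn shape characteristic formulas.

| Shape feature | Formula |
| --- | --- |
| Leaf aspect ratio | $a_R / b_R$ |
| Rectangularity | $A_0 / (a_R b_R)$ |
| Circularity | $4\pi A_0 / l^2$ |
| Sphericity | $A_0 / [2(a_R + b_R)]$ |
| Eccentricity | $a / b$ |
| Lobation | $l_{\min} / b$ |
| Circumference-to-diameter ratio | $l / a$ |

Assume that the given training set is
\[ T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\}, \quad x_i \in \mathbb{R}^n, \tag{18} \]
where $x$ is the input vector, $l$ is the number of samples, $n$ is the dimension of the input vector, and $y$ is the type of input vector.
The basic idea of the LS-SVM method is to find, among the separating hyperplanes, the one with the smallest $\|\omega\|$, which is the same as in the classical SVM. However, the LS-SVM method gives an $e_i$ correction for each data point; therefore, the problem of finding the optimal hyperplane is transformed to a convex optimization problem:
\[ \min_{\omega, b, e} J(\omega, e) = \frac{1}{2}\omega^T\omega + \frac{\gamma}{2}\sum_{i=1}^{l} e_i^2, \quad \text{s.t. } y_i[\omega^T\varphi(x_i) + b] = 1 - e_i, \; i = 1, \ldots, l. \tag{20} \]
The Lagrangian method is used to solve the mentioned optimization problem, which is transformed into solving a linear equation:
\[ \begin{bmatrix} 0 & y^T \\ y & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{1} \end{bmatrix}, \tag{21} \]
where $\alpha$ is the Lagrange multiplier, $\Omega_{ij} = y_i y_j K(x_i, x_j)$ with kernel function $K$, $I$ is the identity matrix, and $\mathbf{1}$ is a vector of ones.
In the LS-SVM method, the supporting value is proportional to the error of the data points, $e_i$, and $\gamma$ is 1. This method nonlinearly maps input parameters to the high-dimensional feature space and constructs the same optimal decision function as the classical SVM according to the principle of structural risk minimization:
\[ f(x) = \operatorname{sgn}\left[\sum_{i=1}^{l} \alpha_i y_i K(x, x_i) + b\right]. \tag{22} \]
Then, each test sample, $x_i$, is substituted into $f(x)$, and the type of test sample can be obtained.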
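The key practical point of the LS-SVM, namely that training reduces to solving one linear system, can be illustrated with a small two-class sketch in plain Python. It follows the standard LS-SVM classifier formulation and is not the implementation used in the study. The RBF kernel form $\exp(-\|u - v\|^2 / 2\sigma^2)$, the use of the penalty coefficient as the regularization parameter `gamma`, the toy data, and all function names are assumptions.

```python
import math


def rbf(u, v, sigma2):
    """RBF kernel; the 2*sigma^2 scaling is an assumed convention."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2.0 * sigma2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def train_lssvm(X, y, gamma=8.0, sigma2=0.42):
    """Build and solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1]."""
    l = len(X)
    A = [[0.0] + [float(v) for v in y]]
    for i in range(l):
        row = [float(y[i])]
        for j in range(l):
            omega = y[i] * y[j] * rbf(X[i], X[j], sigma2)
            row.append(omega + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    sol = solve(A, [0.0] + [1.0] * l)
    return sol[0], sol[1:]  # bias b, multipliers alpha


def predict(x, X, y, b, alpha, sigma2=0.42):
    """Sign of the kernel expansion, returned as -1 or +1."""
    s = sum(a * yi * rbf(x, xi, sigma2) for a, yi, xi in zip(alpha, y, X)) + b
    return 1 if s >= 0 else -1


# Toy two-class data standing in for 2D feature vectors of two turf classes.
X = [[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]]
y = [-1, -1, 1, 1]
b, alpha = train_lssvm(X, y)
print(predict([0.0, 0.5], X, y, b, alpha), predict([3.0, 3.5], X, y, b, alpha))
```

A four-class turf problem would additionally need a multiclass scheme such as one-versus-rest built from several binary classifiers of this kind; which scheme the study used is not stated.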

Results and Analysis
An experiment was performed using a computer with an Intel (R) i5-2520M 2.5 GHz quad-core processor with 4 GB of memory. The algorithm was developed using VS2012 C# along with the HALCON© 12.0.1 image processing library (MVTec Software GmbH, Germany). The proposed algorithm was evaluated using three evaluation criteria. In Sec. 3.1, the segmentation performance of the fuzzy threshold method, maximum between-group variance method (Otsu), and the proposed algorithm are compared with a complex background. In Sec. 3.2, the radial basis function (RBF) kernel function of the LS-SVM and multifeature fusion are selected as the recognition method through several experimental comparisons. In Sec. 3.3, the stability and accuracy of the algorithm under different illumination conditions are verified.

Image Segmentation Evaluation Results
Four images with different complex backgrounds were selected from the collected samples to evaluate the proposed segmentation method. Figure 8(a) shows the original images, and Fig. 8(b) shows the segmented images obtained using the proposed algorithm. To further evaluate the segmentation performance, the proposed method was compared with the Otsu method and the fuzzy threshold method, and the results are presented in Figs. 8(c) and 8(d).
As presented in Fig. 8, all three image segmentation algorithms provided good segmentation performance levels for a single plant with a simple background. For the relatively complex background of the last three images ($a_2$ perennial ryegrass; $a_3$ tall fescue; $a_4$ carpet grass), the proposed algorithm could ignore the remaining small weed areas and identify the region of interest directly; this capability is consistent with the human visual observation characteristics and demonstrates the strong adaptability of the algorithm to changes in lighting conditions.

Discrimination Using Support Vector Machines
This study proposes an LS-SVM-based method for segmenting and recognizing common lawn plants. First, a square area of 3 × 3 pixels serves as the convolution unit, and the pixel color features of the H and S components are selected as two color input features. Second, the local energy, gradient, and second moment are selected as the texture input features. The leaf aspect ratio, rectangularity, circularity, sphericity, eccentricity, lobation, and the circumference-to-diameter ratio of the blade region are then selected as shape input features. Finally, a suitable local threshold is selected based on dynamic threshold processing, and 13 feature vectors are constructed as the input for LS-SVM training samples for the recognition and classification of lawn plant images.
In this study, four kinds of common lawn plants were selected as objects to be identified. A total of 240 images of the four selected plants are available, with 60 images for each kind of plant. A total of 120 lawn images were selected randomly as the SVM training set, and the others were used as the test set. To assess the recognition accuracy of the SVM with the RBF kernel, a linear kernel function, polynomial kernel function, RBF, and multilayer perceptron kernel function (sigmoid) were used for comparison. Table 3 shows the characteristics of the identification test data for the different kernel functions and their comparison.
Table 3 shows that when the RBF was used as the kernel function of the SVM, the classification rate for the training and test sets could reach 99.9% or more. This was the best result among all the kernel functions. Bayesian optimization, 26 grid search, 27 random search, and other methods have been applied by many researchers to determine the values of the penalty coefficient, $C$, and the kernel parameter, $\sigma$. After a certain range of sample testing and comparison, the Bayesian optimization was found to be the best choice in this study, and the best values of $C$ and $\sigma$ were obtained ($C = 8$ and $\sigma^2 = 0.42$).
Based on the same characteristics of the input vector, the performance of the traditional SVM method was compared with that of the proposed LS-SVM method. Table 3 shows the classification results. Figure 9 shows that the LS-SVM combined with color, texture, shape, and dynamic threshold parameters provided the best segmentation results, and the overall recognition rate for the four types of turf reached 92.88%. In addition, the recognition results for the four types of turf were compared; the recognition rate for carpet grass was the highest (94.1%), whereas that for Kentucky bluegrass was only 85.37%. This is because the difference between the shape and texture characteristics of Kentucky bluegrass and those of the other lawns is not obvious, and the recognition is affected by illumination and other noise. To further improve the recognition accuracy, the number of shape and texture feature parameters can be increased in a later stage.

Lighting Performance Discrimination
To verify the practicability of the proposed algorithm under different weather conditions, a green lawn area was selected as an experimental site. A total of 240 images were collected at two times: once at noon on a sunny day and once at ∼3:00 PM on a cloudy day. The experiment showed that the overall recognition rate on a sunny day was 92.3%, whereas that on a cloudy day was higher. The main reason is that the light on sunny days is direct, whereas the light on cloudy days is diffuse, which results in a higher contrast between the plants and the background on cloudy days. Furthermore, the average recognition time on a sunny day was 1.42 s longer than that on a cloudy day; nonetheless, this can meet the requirements of actual classification (Table 4).

Conclusion
This paper proposes an image segmentation method based on dynamic threshold and LS-SVM techniques. This method adopts a 3 × 3 pixel square area as the basic unit. It then selects two pixel color features, three texture features (local energy, gradient, and second moment), and seven blade shape regional characteristics as input characteristics. Next, it applies the feature vector as the LS-SVM input and uses the local dynamic threshold to obtain the training samples used to classify images taken for identification. The experiments showed that the algorithm is highly accurate in different environments. For the recognition of turf plants with similar shapes and texture features, more feature vectors can be added to the LS-SVM recognition model to train the samples. Considering the limitations of the proposed algorithm, it is necessary to further optimize the penalty coefficient, $C$, and kernel parameters for more complex environments to achieve the classification and recognition requirements. In future work, we plan to optimize and improve the algorithm and to classify more lawn plants to establish a complete database for lawn plant identification and classification. This can provide a theoretical basis for accurately setting the height of cutters for ZTR mowers.

Fig. 1 Flowchart of LS-SVM image identification based on a dynamic threshold.

Fig. 3 H, S, and V components of color images: (a) original image, (b) H component, (c) S component, and (d) V component.

\[ M^{m,n}_{x,y} = G_\sigma(x, y) * (\nabla I)(\nabla I)^T. \tag{13} \]

Fig. 7 Image of each pixel-level texture feature: (a) local energy, (b) local gradient, and (c) local second moment.

Fig. 8 Segmentation effect comparison of several segmentation methods: (a) original images of four species of lawn ($a_1$ Kentucky bluegrass, $a_2$ perennial ryegrass, $a_3$ tall fescue, and $a_4$ carpet grass); (b) image segmentation obtained by the algorithm in this paper; (c) image segmentation obtained by the Otsu method; and (d) image segmentation obtained by the fuzzy threshold method.

Table 3 Recognition rate using different kernel functions based on multiple features.