Detection of two intermixed invasive woody species using color infrared aerial imagery and the support vector machine classifier

Abstract Both the evergreen redberry juniper (Juniperus pinchotii Sudw.) and deciduous honey mesquite (Prosopis glandulosa Torr.) are destructive and aggressive invaders that affect rangelands and grasslands of the southern Great Plains of the United States. However, their current spatial extent and future expansion trends are unknown. This study was aimed at: (1) exploring the utility of aerial imagery for detecting and mapping intermixed redberry juniper and honey mesquite while both are in full foliage using the support vector machine classifier at two sites in north central Texas and (2) assessing and comparing the mapping accuracies between sites. Accuracy assessments revealed overall accuracies of 90% with an associated kappa coefficient of 0.86 for site 1 and 89% with an associated kappa coefficient of 0.85 for site 2. The Z-statistic (0.102 < 1.96) used to compare the classification results for the two sites indicated no significant difference between classifications at the 95% probability level. In most instances, juniper and mesquite were identified correctly, with <7% being mistaken for the other woody species. These results indicate that assessment of the current infestation extent and severity of these two woody species in a spatial context is possible using aerial remote sensing imagery.


Introduction
Invasive plant species are well known for their successful exploitation of natural resources (e.g., water, space, light, and nutrients). This stems from their aggressive and competitive behavior, prolific seed production, and seed longevity. 1,2 Invasive weeds are capable of moving from small, manageable infestations to larger areas, reaching levels where control is economically prohibitive and/or the infestation causes significant ecological damage. 3 Because of their rapid spreading potential and threat to biodiversity and ecosystem processes, invasive plant species have been a long-standing concern to natural resource managers, ecologists, and biological conservationists. [4][5][6] Invasion may alter nutrient cycling, regional hydrologic processes, carbon sequestration, herbaceous production, diversity and composition, and soil erosion characteristics. 2,5,[7][8][9] Some invasive species can dominate the vegetative canopy and eventually form monotypic stands. 10 Weed infestation is considered a major reason for loss in global biodiversity, including species extinction. 6,7,[11][12][13][14][15][16][17]
There is an abundance of redberry juniper (Juniperus pinchotii Sudw.; juniper hereafter) and honey mesquite (Prosopis glandulosa Torr.; mesquite hereafter) across vast areas of the southern Great Plains of the United States. Both species are native to the region but they are considered "native invasives." Their encroachment into grasslands and rangelands has been widely attributed to reduced fire intensity and frequency and livestock overgrazing. 15,16,[18][19][20] Although there is general agreement that fire suppression and livestock grazing have facilitated woody invasion in the region, some argue that decreased lumbering, increased carbon dioxide emission, landscape fragmentation, and climate change have also accelerated encroachment. 21,22 However, invasion by both mesquite and juniper species may also be beneficial to wildlife habitat, 23,24 ecosystem carbon storage, 25 recreational activity, 8 soil stabilization, 24 and bio-energy production. 8,18 Furthermore, juniper and mesquite encroachment in this region has given rise to increased above- and below-ground biomass, root density, soil nitrogen and carbon, and soil microbial biomass pool. [26][27][28]
Numerous remote sensing studies have been conducted for detection and mapping of a large number of native and non-native invasive plant species using imagery. 4,[6][7][8]11,24,[29][30][31][32][33] Remote detection of invasive plant species using geospatial imagery may substantially improve monitoring, planning, and management practices by overcoming some of the shortcomings of ground-based surveys, such as observer bias and inaccessibility of certain locations. Remote sensing techniques for accurate mapping of invasion offer a unique set of advantages over ground-based methods, including repeatability, large area coverage, and cost-effectiveness over time and space. [34][35][36] The extent of mesquite distribution has been well reported from the southwestern United States, South America, Australia, and India, [37][38][39][40] and juniper distribution is also well recorded. 19,[41][42][43] As both species occupy a significant area of grasslands and rangelands, their invasion has raised several environmental concerns around the world. Information about their specific canopy coverage and distinction from surrounding land cover classes, however, is lacking. In addition, since management practices and modes of interaction with ecosystem processes differ between these species, accurate identification of these species is critical, especially when they occur in intermixed stands.
Our objectives were to: (1) explore the ability of a gray scale near infrared (NIR) band of multispectral aerial imagery to detect and map land cover types dominated by juniper and mesquite at two sites in north central Texas, (2) use this method of analysis to separately map juniper and mesquite canopy cover, and (3) assess and compare the mapping accuracies between the sites.

Remote Sensing Imagery
A county-level color infrared aerial image of Hardeman County, covering both sites, was obtained from the National Agricultural Imagery Program (NAIP) provided by the Natural Resources Conservation Service Geospatial Data Gateway (http://datagateway.nrcs.usda.gov/). The NAIP image was a 3-band digital aerial image with a spatial resolution of 1-m taken on August 12, 2010. The image was projected to the Universal Transverse Mercator North American Datum 1983 Zone 14 North by the provider. The image was extracted for the study sites using ArcGIS (ESRI Inc., Redlands, California).

Imagery Classification
After visually evaluating several classification methods (minimum distance, maximum likelihood, spectral angle mapper, neural net, etc.) in Environment for Visualizing Images (ENVI; Exelis Visual Information Solutions, Boulder, Colorado), the support vector machine (SVM) classifier was selected for our objectives because of its superior ability to detect live vegetation. The SVM is a supervised machine learning method that performs classification based on statistical learning theory. The SVM classifies data by finding a hyperplane that provides the best separation between classes in a multidimensional feature space. This hyperplane is the decision surface on which the optimal class separation takes place. The optimal hyperplane is the one that maximizes the margin, that is, the distance between the hyperplane and the nearest positive and negative training examples. From a given set of training samples, the optimization problem is solved to find the hyperplane that leads to a sparse solution. Although the SVM is a binary classifier in its simplest form, the implementation of the SVM classifier in ENVI was extended to more than two classes by splitting the problem into a series of binary class separations (ENVI User's Guide).
To represent more complex shapes than linear hyperplanes, a variety of kernels, including the polynomial, the radial basis function, and the sigmoid, can be used for performing SVM classification in ENVI. The SVM was employed using the radial basis function kernel for performing the pairwise classification. A penalty parameter can also be introduced to the SVM classifier to allow for misclassification during the training process. The penalty parameter was set to its maximum value, whereas a classification probability threshold of zero was used in order to classify all pixels (ENVI User's Guide). Default settings of this classifier were used for image classifications. During the classification process, only the NIR band of the NAIP imagery was chosen using the spectral subset option in ENVI.
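The radial basis function kernel and the pairwise (one-against-one) strategy described above can be illustrated with a minimal Python sketch. It is not the ENVI implementation: the kernel width `gamma`, the toy single-band NIR values, and the stand-in binary decision rule (mean kernel similarity to each class's training samples, used here in place of a trained SVM) are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_similarity(pixel, samples, gamma):
    """Average kernel similarity between a pixel and one class's samples."""
    return sum(rbf_kernel(pixel, s, gamma) for s in samples) / len(samples)


def pairwise_vote(pixel, training, gamma=0.5):
    """One-against-one voting: every pair of classes casts a single vote.

    The binary rule below is a simplified stand-in for a trained binary SVM.
    """
    votes = Counter()
    for a, b in combinations(sorted(training), 2):
        sim_a = mean_similarity(pixel, training[a], gamma)
        sim_b = mean_similarity(pixel, training[b], gamma)
        votes[a if sim_a >= sim_b else b] += 1
    return votes.most_common(1)[0][0]


# Hypothetical NIR values scaled to 0-1 (darker juniper, brighter nonwoody)
training = {
    "juniper": [(0.20,), (0.22,), (0.25,)],
    "mesquite": [(0.40,), (0.42,), (0.45,)],
    "nonwoody": [(0.70,), (0.75,), (0.80,)],
}
label = pairwise_vote((0.41,), training, gamma=50.0)
print(label)
```

With three classes there are three binary comparisons, and the class that wins the most comparisons is assigned to the pixel; a real SVM would replace the similarity rule with a margin-maximizing decision function for each pair.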

Extraction of Training Samples
The 1-m NAIP image allowed clear, visual identification of all dominant land cover classes based on the spectral contrast among the live vegetation (juniper, mesquite, herbaceous) and senescent herbaceous or nonvegetative components (paved road, shadow, exposed soil, water) [Figs. 2(a), 2(b), 3(a), and 3(b)]. Previous studies have found significant differences in reflectance between Ashe juniper (Juniperus ashei Buchholz), mesquite, water, exposed soil, and herbaceous plants. 24,31,44,45 In addition, reflectance variation within a deciduous crown is greater than that within a coniferous tree crown because of the nonconical shape, larger branches, and shaded area caused by the neighboring branches. 46 Thus, respective training samples to perform image classification were manually extracted from isolated trees and areas on the image. 47 Training samples consisted of 5 to 10 polygons, each having 25 to 50 pixels, from pure canopy or each land cover type at identified locations on the ground and on the image. The SVM analysis was performed for the following land cover classes: juniper, mesquite, live herbaceous, senescent herbaceous, bare ground, water, shadow, and paved road. Exposed soil, shadow, dirt roads, live and senescent herbaceous land cover classes were grouped together into a nonwoody class that resulted in a five-category final classification map for each site: juniper, mesquite, nonwoody, water, and paved road.
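The grouping of the eight classified land cover classes into the five-category final map can be sketched as a simple lookup. The class names follow the text; the tiny 2 x 4 label grid is hypothetical, and "bare ground" is assumed here to cover exposed soil and dirt roads.

```python
from collections import Counter

# Eight SVM classes mapped to the five final map categories
MERGE = {
    "juniper": "juniper",
    "mesquite": "mesquite",
    "live herbaceous": "nonwoody",
    "senescent herbaceous": "nonwoody",
    "bare ground": "nonwoody",  # assumed to include exposed soil and dirt roads
    "shadow": "nonwoody",
    "water": "water",
    "paved road": "paved road",
}


def merge_classes(classified):
    """Relabel every pixel of a classified grid with its final category."""
    return [[MERGE[label] for label in row] for row in classified]


def cover_percentages(grid):
    """Percentage of pixels falling in each category."""
    counts = Counter(label for row in grid for label in row)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical classified pixels, for illustration only
classified = [
    ["juniper", "juniper", "shadow", "bare ground"],
    ["mesquite", "live herbaceous", "water", "paved road"],
]
final_map = merge_classes(classified)
percentages = cover_percentages(final_map)
print(percentages)
```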

Accuracy Assessment
Accuracy assessment for classification was made by constructing an error matrix for each classified image, which compares, on a group by group basis, the relationship between reference categories on the ground and corresponding classified categories on the image. Error matrices for each classification map were generated by comparing the classified classes with the ground verification data. Error matrices were computed to evaluate the classification accuracy including the overall, producer's, and user's accuracies.
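As a minimal sketch of how the overall, producer's, and user's accuracies follow from an error matrix, the Python example below uses a hypothetical three-class matrix (not the values in Table 1). It assumes rows hold the classified categories and columns hold the ground reference categories.

```python
def accuracy_summary(matrix, classes):
    """Overall, user's, and producer's accuracies from an error matrix.

    matrix[i][j] is the number of verification points classified as
    classes[i] (row) whose ground reference category is classes[j] (column).
    """
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(len(classes))]
    row_totals = [sum(row) for row in matrix]        # classified totals
    col_totals = [sum(col) for col in zip(*matrix)]  # reference totals

    overall = sum(diagonal) / total
    users = {c: diagonal[i] / row_totals[i] for i, c in enumerate(classes)}
    producers = {c: diagonal[i] / col_totals[i] for i, c in enumerate(classes)}
    return overall, users, producers


# Hypothetical counts, for illustration only
classes = ["juniper", "mesquite", "nonwoody"]
matrix = [
    [45, 5, 0],
    [3, 40, 2],
    [2, 5, 48],
]
overall, users, producers = accuracy_summary(matrix, classes)
print(round(overall, 3), users, producers)
```

User's accuracy reflects errors of commission (how often a mapped class is correct on the ground), whereas producer's accuracy reflects errors of omission (how often a ground class is captured by the map).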
There is no single established standard for selection of the image and ground areas for comparison. 46,[48][49][50] Because a pixel in an image represents only an arbitrary location on the ground, and positional errors of maps and global positioning system (GPS) receivers become significant with smaller pixel sizes, areas based on geographic information system polygons are used frequently. 48 However, using individual pixels is appropriate if a per-pixel classification is assessed for accuracy. 51,52 This avoids problems caused by generating "homogeneous" polygons on a landscape. It has also been observed that pixel positional error results in conservative bias of the accuracy assessment. 53 Therefore, the unavoidable positional error introduced into this assessment would result in lower, or conservative, estimates of mapping accuracy. 49 Field validation (accuracy assessment) was performed using verification data (ground control points) at both sites. Verification data for 250 locations at each site were randomly generated using the "create random points" function in ArcGIS [Figs. 2(a) and 3(a)]. The verification points were loaded into a real-time differential Trimble GeoXH Global Positioning System receiver (Trimble Navigation Limited, Sunnyvale, California) equipped with the ArcPad (ESRI Inc., Redlands, California) software package and a 4-m external antenna, providing submeter horizontal accuracy (10 cm), and were navigated to at the sites prior to image classification. This provided an unbiased field validation because the points were visited without any prior knowledge of how the classification method would delineate the respective land cover categories at those locations. The actual land cover type at each of the 250 locations was assessed at the sites and assigned to the points that were navigated to using the GPS unit. Subsequent to image classification, these points were overlain on the land cover map and a one-to-one matching was performed to construct an error matrix for each site.
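The random generation of 250 verification locations per site can be sketched in plain Python as a stand-in for the ArcGIS "create random points" function. The rectangular site extent (UTM easting and northing in meters) and the seed below are hypothetical and are not the study's coordinates.

```python
import random


def create_random_points(xmin, ymin, xmax, ymax, n, seed=None):
    """Draw n uniformly distributed (x, y) points inside a bounding box."""
    rng = random.Random(seed)
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]


# Hypothetical 1 km x 1 km site extent, for illustration only
XMIN, YMIN, XMAX, YMAX = 450000.0, 3780000.0, 451000.0, 3781000.0
points = create_random_points(XMIN, YMIN, XMAX, YMAX, n=250, seed=42)
print(len(points), points[0])
```

Fixing the seed makes the point set reproducible, so the same locations can be loaded into a GPS unit before classification and overlain on the classified map afterwards.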
In addition to accuracy assessment for classified maps and individual land cover classes, kappa coefficients and kappa variances were determined from the error matrix and a two-tailed Z-test (Zα/2 = Z0.025) was performed to compare image classification between sites at the 95% confidence level. [54][55][56] The kappa statistic is an estimate of agreement or accuracy between the imagery-derived classification map and the ground verification data. It takes into account: (1) the major diagonal of the error matrix and (2) the chance agreement estimated from the row and column totals. Kappa values range between 0 and 1, with values >0.80 representing strong agreement between the classified map and ground truth and values <0.40 representing poor agreement. Values between 0.40 and 0.80 indicate moderate agreement with the ground truth data. 57
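The kappa coefficient, its variance, and the Z-test between two classifications can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure of the cited references: the variance uses the commonly cited large-sample (delta method) approximation, the test statistic is taken as Z = |k1 - k2| / sqrt(var1 + var2), and the two 2 x 2 matrices are hypothetical, not the study's error matrices.

```python
import math


def kappa_and_variance(matrix):
    """Kappa coefficient and its approximate large-sample variance."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row = [sum(r) for r in matrix]        # row totals
    col = [sum(c) for c in zip(*matrix)]  # column totals

    t1 = sum(matrix[i][i] for i in range(k)) / n  # observed agreement
    t2 = sum(row[i] * col[i] for i in range(k)) / n ** 2  # chance agreement
    t3 = sum(matrix[i][i] * (row[i] + col[i]) for i in range(k)) / n ** 2
    t4 = sum(
        matrix[i][j] * (row[j] + col[i]) ** 2
        for i in range(k)
        for j in range(k)
    ) / n ** 3

    kappa = (t1 - t2) / (1 - t2)
    variance = (
        t1 * (1 - t1) / (1 - t2) ** 2
        + 2 * (1 - t1) * (2 * t1 * t2 - t3) / (1 - t2) ** 3
        + (1 - t1) ** 2 * (t4 - 4 * t2 ** 2) / (1 - t2) ** 4
    ) / n
    return kappa, variance


def z_statistic(kappa1, var1, kappa2, var2):
    """Z statistic for the difference between two independent kappas."""
    return abs(kappa1 - kappa2) / math.sqrt(var1 + var2)


# Hypothetical error matrices, for illustration only
site_a = [[40, 10], [10, 40]]
site_b = [[45, 5], [5, 45]]
k_a, v_a = kappa_and_variance(site_a)
k_b, v_b = kappa_and_variance(site_b)
z = z_statistic(k_a, v_a, k_b, v_b)
significant = z > 1.96  # two-tailed critical value at the 95% level
print(round(k_a, 2), round(k_b, 2), round(z, 3), significant)
```

A Z value below 1.96, such as the 0.102 reported for the two study sites, indicates that the two classifications do not differ significantly at the 95% level.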

Results
Figures 2(a), 2(b), 3(a), and 3(b) show the CIR composite (a) and the grayscale NIR NAIP images (b) for sites 1 and 2, respectively. The CIR composite imagery reveals distinct spatial patterns of juniper, mesquite, herbaceous, and other land cover classes on each site. On the CIR composite image, mesquite is characterized by a lighter reddish tone, juniper by a dark reddish color, herbaceous plants by a grayish to pinkish response (along intermittent stream segments), exposed soil and paved road by a white to light blue color, and water by black or blue tones [Figs. 2(a) and 3(a)]. On the corresponding grayscale NIR image, juniper has a distinctly darker gray tone than mesquite, which in turn is darker than the herbaceous species (along intermittent stream segments). Exposed soil (bare ground or dirt road) has the brightest white color with distinct shape and spatial geometry on both images. Paved road has a gray color similar to that of the herbaceous species, but with the same spatial characteristics as dirt road. Water varies from dark gray to bright, with a texture noticeably differing from the other features.
Image classification resulted in 27.74% juniper, 24.45% mesquite, 0.91% water, 0.36% road, and 46.54% nonwoody for site 1 [Fig. 2(c)], whereas 35.43%, 4.87%, 1.01%, 2.03%, and 56.66% were classified as juniper, mesquite, water, road, and nonwoody land cover classes, respectively, at site 2. Land cover classes were identified with an overall accuracy of 90.4% and an associated kappa coefficient of 0.86 for site 1, and an overall accuracy of 89.2% and a kappa coefficient of 0.85 for site 2 (Table 1). The producer's accuracies ranged from 85.7% for water to 100% for road, while the user's accuracies varied from 50% for water to 100% for paved road at site 1 (Table 1). The highest and the lowest producer's accuracies were found for the nonwoody (91.67%) and water (77.78%) classes, respectively, at site 2 (Table 1). The classification for the same site had the highest and the lowest user's accuracies of 94.20% and 53.85%, respectively, for the juniper and water land cover classes (Table 1). The Z-statistic (0.102 < 1.96) indicated no significant difference between the classifications for the two sites at the 95% probability level.
Both sites were similar in the degree to which juniper was misidentified as mesquite and mesquite was misidentified as juniper. Juniper was misidentified as mesquite 4.2% and 3.0% of the time at sites 1 (3/72) and 2 (2/67), respectively. In contrast, the error for misidentifying mesquite as juniper was slightly higher at 6.6% and 5.3% at sites 1 (4/61) and 2 (1/19), respectively. Averaged over both sites, misidentification of juniper as mesquite was 3.6% and mesquite as juniper was 6.0%.
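The cross-misidentification percentages above follow directly from the counts given in parentheses, as the short Python sketch below shows. Because the text does not state how the two sites were averaged, both a simple mean of the site rates and a pooled rate are computed; the function and variable names are illustrative only.

```python
# (misidentified points, total reference points) per site, from the text
counts = {
    "juniper_as_mesquite": {1: (3, 72), 2: (2, 67)},
    "mesquite_as_juniper": {1: (4, 61), 2: (1, 19)},
}


def percent(errors, total):
    """Misidentification rate as a percentage."""
    return 100.0 * errors / total


# Per-site rates
rates = {
    kind: {site: percent(e, t) for site, (e, t) in sites.items()}
    for kind, sites in counts.items()
}

# Simple mean of the two site rates
mean_rates = {kind: sum(r.values()) / len(r) for kind, r in rates.items()}

# Pooled rate: all errors divided by all reference points
pooled_rates = {
    kind: percent(sum(e for e, _ in sites.values()), sum(t for _, t in sites.values()))
    for kind, sites in counts.items()
}

for kind in counts:
    print(kind, rates[kind], mean_rates[kind], pooled_rates[kind])
```

The pooled rate weights each site by its number of reference points, so it can differ from the simple mean when the sites have very different sample sizes, as with mesquite here (61 versus 19 points).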

Discussion
This study evaluated the usefulness of geospatial aerial imagery to detect and map intermixed juniper and mesquite distribution in rangeland settings, a land cover type typical of much of the south central and southwestern US. Accurate and timely information concerning the current extent of juniper and mesquite over large and inaccessible areas, obtained using a relatively cost-effective and quick method, can be used in various fields of rangeland ecology and management. We used freely available NAIP imagery currently available for the study region to map mesquite and juniper at the species level. Our accuracy assessment indicated that land cover classes were successfully mapped with an overall accuracy >89%, with the individual class accuracies ranging from 50% to 100%. The lowest class accuracy (50%) was found for water. This probably resulted from: (1) similar spectral patterns for juniper tree shadow and water bodies in the presence of dense aquatic weeds, and (2) the few validation points randomly generated for the water land cover class. Separation of different rangeland woody species on an image has received limited attention in the literature. Phenological, structural, and spectral characteristics of plant species have been utilized to distinguish one woody species from another. In a study in central Texas, reflectance spectra of honey mesquite, Ashe juniper, senescing grass, mixed herbaceous, and some other woody plants were recorded in late summer using a hyperspectral handheld field spectroradiometer. 45 Mesquite had higher visible and NIR reflectance than Ashe juniper. The shift from higher to lower reflectance occurred around 720 nm. A similar observation was made by the authors in Ref. 44, who reported that Ashe juniper had lower visible and NIR reflectance compared to mesquite in the spectrum they examined.
Some juniper and mesquite plants with small canopies might have been classified as nonwoody land cover with the classification method employed in this study. We found that a juniper plant with a 1.75-m canopy diameter and 2.3-m² canopy area was undetected with the algorithm used to classify the same imagery. Visual inspection of ground-level images of this plant revealed that the plant was growing on 100% bare soil, which suggested that small plants may remain undetected on such soils. Accurate monitoring of semiarid vegetation using remote sensing is hindered by the effects of bare soil because of an "overexposure" effect that washes out smaller objects on the image. 58 The correlation between measured woody-plant cover from image classification and ground-based measurements depends strongly on the: (1) image resolution, (2) size of the plants or clusters under surveillance, and (3) time of image acquisition. [59][60][61][62][63] In a southern New Mexico study, only 29% of shrubs with canopy areas <2 m² in size were correctly classified, while 87% of all shrubs with canopies >2 m² were detected using an image with a spatial resolution of 0.86 m. 61 In contrast to these studies, a study in Arizona found that the overall classification accuracy for mapping shrub cover dominated by velvet mesquite (Prosopis velutina) derived from a 1-m spatial resolution image was greater than that derived from a 0.6-m resolution image. 59 An image with a spatial resolution of 1 m or less was recommended to estimate percent cover and the areas of individual shrubs in the south-central Mohave Desert in California. 60 An optimal pixel size of 6 m or less has been suggested for studying functional properties of chaparral and grassland in southern California using hyperspectral data. 62 It is likely that in our study, several newly recruited mesquite and juniper trees with canopy areas <1 m² were missed in the 1-m image unless the trees were clustered. 
However, it is also very likely that mesquite or juniper cover within mesquite or juniper clusters and thickets was overestimated because small gaps between canopies were missed. The same conclusion can be given regardless of image pixel size for individual trees or shrubs of different species within a dense patch of other woody species.
We are uncertain as to why a slightly higher level of inaccuracy occurred in misidentifying mesquite as juniper (average 6.0% over both sites) compared to misidentifying juniper as mesquite (average 3.6%). Due to within-canopy and canopy-versus-canopy shading effects, there may have been portions of mesquite canopies that exhibited a darker red spectral color, and these in some cases were misidentified as juniper canopies. The potential for this error toward a darker color is probably greater than that for sunlit portions of juniper canopies becoming sufficiently lighter red or pink to be mistaken for mesquite canopies. This may be due to the conical canopy shape and more uniform reflectance of evergreen juniper species compared to deciduous species, as has been noted. 46 Regardless, the error margins of cross misidentification between these two woody species were in all cases <7% and were thus not a large concern, but more sites with an equal mix of the two species need to be assessed.
Our ability to utilize aerial images for quantifying trends and patterns of woody plant cover depends on several factors. 59 Spatial, spectral, and radiometric resolutions, along with the image scale, image processing methods, atmospheric haze, shadow, terrain effects, angle between the sensor and vegetative layers, relative contrast between vegetative layers and background, canopy architecture, crown size and height, and plant density, greatly influence the detection capabilities of remotely sensed imagery. 64,65 In cases where canopies of individuals of the same species or of different species overlap, it cannot be reliably determined from a top-down perspective whether a given image object represents one large plant, multiple plants of the same species, or multiple plants of different species. 59

Conclusions
Remote sensing techniques offer a unique set of algorithms for detecting and mapping plant species and can potentially characterize the extent of an invasion by distinguishing the invading species from the rest of the vegetation mosaic in a timely and spatially explicit manner. This study investigated the use of aerial imagery with 1-m spatial resolution and the SVM classifier for separating juniper, mesquite, and co-occurring herbaceous vegetation in two grassland environments that had been invaded by the woody species mesquite and juniper. Given the economic and ecological consequences of invasion by these woody plants, our results indicate that aerial imagery is a valuable tool for identifying and mapping the extent of invasion with a high level of accuracy. The 1-m level of resolution appeared to be adequate for mapping both mesquite and juniper, with the exception of very small plants with <1-m-diameter canopies. For these plants a higher resolution image may be necessary. This might be critical if the goal is to detect the very early stages of invasion with juvenile plants. These maps can be used for monitoring and planning control measures for both species studied. The ability of the SVM classifier to accurately separate juniper canopies from those of mesquite also has advantages in estimating biomass levels of each species and determining the extent and type of treatments required for woody invasion mitigation, which can be different for different shrub species. The use of 1-m spatial resolution aerial imagery for obtaining estimates of infestation by undesirable rangeland species was demonstrated in this study, and it is recommended that this methodology and technology be considered when high scale maps are needed for research and land management purposes.
Srinivasulu Ale is an assistant professor and geospatial hydrologist at the Texas AgriLife Research in Vernon, Texas. He received his PhD in agricultural and biological engineering from Purdue University, West Lafayette, in 2009 and MS in agricultural engineering from G.B. Pant University of Agriculture & Technology, India, in 1992. His research interests include water resources management on croplands, rangelands and pasture production systems, water quality assessment and management, land use (focus on bioenergy-induced) and climate change impacts on hydrology and environment, and irrigation and drainage.
R. James Ansley is a professor of rangeland ecology at the Texas A&M AgriLife Research Center in Vernon, Texas. His areas of interest are rangeland shrub ecology, bioenergy, fire ecology, and range plant ecophysiology. His current research focus is to quantify the ecological impact of woody plant encroachment in semi-arid grasslands and rangelands, develop sustainable technologies to reduce woody plant effects, and determine the potential of rangeland woody plants for bioenergy uses. He also teaches introduction to biology as an adjunct instructor at Vernon College.