Remote detection of flowering Somei Yoshino (Prunus×yedoensis) in an urban park using IKONOS imagery: comparison of hard and soft classifiers

Abstract. Identification of flowering trees in urban areas is challenging due to weak spectral signals and the high heterogeneity of urban landscapes. We hypothesized that a soft classifier, such as mixture tuned matched filtering (MTMF), would be better able to identify pixels including blooming cherry trees than a hard classifier such as maximum likelihood (ML). To test this hypothesis, we compared the accuracy of MTMF and ML in classifying blossoms of Somei Yoshino (SY) cherry trees (Prunus×yedoensis) in an urban park in Tokyo using IKONOS imagery. An accuracy assessment demonstrated that the MTMF classifier (overall accuracy: 62.2%, kappa coefficient: 0.507, user’s accuracy of flowering SY: 48.1%) performed better than ML in identifying flowering SY (overall accuracy: 48.7%, kappa coefficient: 0.321, user’s accuracy of flowering SY: 38.9%). Our results suggest that both methods are able to classify cherry blossoms in an urban landscape, but MTMF is more accurate than ML. However, the producer’s accuracy of MTMF (72.7%) was slightly lower than that of ML (77.7%), suggesting that the accuracy of MTMF could decrease due to the limited number of available bands (four for IKONOS) and the existence of endmembers, such as dry grass in this study, with stronger signals than flowers.


Introduction
Plant phenology is gaining attention as an important indicator of global and local climate changes. [5] However, vegetation indices do not utilize the full information content of remotely sensed imagery in the way that image classification methods can, 8 especially for phenological events. Vegetation indices typically focus on certain spectral bands that represent the spectral reflectance of canopy greenness, and therefore provide less information on flowering status, flower abundance, and flowering dates. 9 Moreover, the spectral bands used by vegetation indices may sometimes represent ground features such as soil that can cause errors in classifying land cover type. 10
Image classification approaches have been used to identify tree species and their composition, 8 to detect land use changes, 11,12 and to identify plant conditions 13 based on the spectral signal of canopy greenness. Two approaches have been used in previous studies: (1) hard classification and (2) soft classification. Hard classification selects the class label with the greatest likelihood of being correct and unambiguously assigns each pixel to a single class. 14,15 The decision boundaries of the feature space are well defined for hard classification. In soft classification, pixels are assigned based on the relative abundance of each class in the spatially and spectrally integrated multispectrum of each pixel. 14 Therefore, the decision boundaries of the feature space are considered fuzzy 14 in soft classification because each pixel can have multiple or partial class memberships. 15,16 Due to its ability to assign multiple classes to a single pixel, soft classification has been widely used to monitor mineral, soil, and vegetation status, especially in highly heterogeneous areas, because it can separate multiple spectral responses within a pixel and provide proportional information for each class.
Cherry blossoms of Prunus species flower synchronously in the spring in temperate zones of the northern hemisphere. Cherry blossoms are of interest because they provide social and economic benefits from cherry blossom viewing, and they provide important information on the long-term impacts of climate change. 17,18 However, identification of cherry blossoms is challenging because urban environments are highly heterogeneous and the flowers produce a weak spectral signal. Therefore, we hypothesized that a soft classifier approach may be more useful to identify cherry blossoms in urban areas due to its ability to separate multiple spectral responses from different land cover types.
In this study, we explore the ability of hard and soft classifiers to identify cherry blossoms in an urban landscape from high-spatial resolution images. We chose the most common cherry cultivar in Japan, Somei Yoshino (hereafter SY) (Prunus × yedoensis), for identification of cherry blossoms. We used maximum likelihood (ML) as a hard classification method and mixture tuned matched filtering (MTMF) as a soft classification method. We compared the accuracy of these two classifiers using high-spatial resolution IKONOS imagery of an urban park in Tokyo, Japan.

Remotely sensed data
We used a multispectral IKONOS image [four bands: blue (445-516 nm), green (506-595 nm), red (632-698 nm), and near-infrared (NIR; 752-853 nm)] with 4-m resolution. The IKONOS data were recorded over the study area on April 1, 2006, and were purchased from Pasco, Japan. The image was chosen because SY was in full bloom at the time according to information provided by the Japanese Meteorological Agency (JMA). The purchased data were radiometrically corrected and geo-referenced to the Universal Transverse Mercator (UTM) coordinate system, zone 54, WGS84 datum. We converted the image data to reflectance before estimating areas of blooming SY. To avoid multiple spectral responses, asphalt roads and lakes were masked using a threshold approach. Each feature of the study site in the IKONOS image was first digitized, overlaid in Google Earth, and approximately measured.

Spectral data collection
To validate the spectral reflectance of flowering SY in the IKONOS image, we collected spectral reflectance data of flowering SY in Yanagisawanoike Park using a spectroradiometer (ASD Fieldspec Pro) in April 2014. The data were collected at a spectral range of 0.35-2.5 μm with a spectral interval of 3.3 nm. The spectral reflectances of 10 flowers from five blooming SY individuals were measured in a laboratory under dark conditions using a spectroradiometer mounted at a nadir position 20 cm above the target with a 25-deg field of view. We recorded 10 readings for each sample and calculated the average of the spectral data. The sensor was calibrated using a white Spectralon panel prior to data collection.

Ground data collection
In addition to the spectroradiometer measurements, we collected XY-coordinates of flowering SY trees, soil, dry grass, and evergreen trees using a handheld GPS unit (Garmin GPSmap 60CSx) on April 1, 2014. According to the park manager and Google Earth, the SY trees on this date were the same as in the 2006 imagery. We used these coordinates as reference data to assess classification accuracy.

Methods used to identify flowering SY
We used two types of image classifications to identify flowering SY from IKONOS imagery: hard classification and soft, or fuzzy, classification. We used ML for hard classification, as it has been widely used for many purposes, such as discrimination of tree species. 19,20 We used MTMF for soft classification because it has been used to identify targets in highly heterogeneous areas, such as urban areas, by decomposing the pixel into its constituent classes and estimating the proportion of each class.
Maximum likelihood classification. To obtain optimal classification using ML, we first examined spatial and spectral information for a set of training pixels. We collected spatial information on texture using the gray level co-occurrence matrix method on the IKONOS image with a 3 × 3 pixel window. We calculated the mean, variance, entropy, homogeneity, contrast, dissimilarity, second moment, and correlation of pixels for each training area (Fig. 1). Because there was spatial variability and contrast among classes, we used textural analysis in addition to the spectral information to improve the classification results.
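As a rough illustration of the texture step, the sketch below computes a single gray level co-occurrence matrix (GLCM) variance value for one 3 × 3 window. It is not the routine used in this study: the function name `glcm_variance`, the one-pixel horizontal offset, and the prior quantization of the band to a small number of integer gray levels are all assumptions made for the example.

```python
def glcm_variance(window, levels):
    """GLCM variance of one window (hypothetical helper, not the study's code).

    window: rows of integer gray levels, each in range(levels).
    Assumed offset: every pixel is paired with its right-hand neighbor.
    """
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in window:
        for left, right in zip(row, row[1:]):
            counts[left][right] += 1
            total += 1
    if total == 0:
        raise ValueError("window needs at least two columns")
    # Normalize the counts to joint probabilities p(i, j).
    p = [[c / total for c in r] for r in counts]
    # GLCM mean and variance taken over the reference gray level i.
    mean = sum(i * p[i][j] for i in range(levels) for j in range(levels))
    return sum((i - mean) ** 2 * p[i][j]
               for i in range(levels) for j in range(levels))


flat = [[2, 2, 2], [2, 2, 2], [2, 2, 2]]      # uniform patch
checker = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # alternating patch
print(glcm_variance(flat, levels=4))      # 0.0
print(glcm_variance(checker, levels=2))   # 0.25
```

Sliding such a window over each of the four bands would yield one variance layer per band, which is the kind of texture band that can be stacked with the spectral bands.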
We extracted spectral information from training pixels of the IKONOS image (Fig. 2). The spectral pattern of each class varied enough to discriminate the classes. Dry grass had a higher reflectance, and evergreen trees had a lower reflectance, compared to flowering SY. However, the spectral patterns and magnitudes of soil and evergreen trees were almost identical. Therefore, we conducted a spectral separability test to determine the distinctness of each class.
We applied transformed divergence (TD) to the IKONOS image to select the features with the greatest degree of statistical separability. TD is used to evaluate spectral variability among classes of training areas. A TD value of 1.90-2.00 indicates good to excellent separation between classes, while a value <1.70 indicates poor class separation. 21 The TD results demonstrated good class separability (TD = 2.00) among flowering SY, soil, dry grass, and evergreen trees. However, the TD value was 1.73 for flowering SY and dry grass and 1.83 for flowering SY and evergreen trees, indicating weak separability of these classes. Soil and evergreen trees had an even lower separability, with a TD value of 1.65. However, we were able to distinguish classes with a lower separability based on spatial evaluation (Fig. 1). Therefore, we used flowering SY, soil, dry grass, and evergreen trees as the training classes for ML classification. To obtain optimal accuracy of the ML classification, we supplemented the four spectral bands of the IKONOS imagery with four bands of local texture information (variance). Thus, a total of eight bands were used in this classification.
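For reference, the separability measure and the ML decision rule are commonly written as follows. These are textbook forms rather than equations given in this paper, and the symbols are notation introduced here for illustration: m_i and C_i are the mean vector and covariance matrix of training class i, P(ω_i) is an assumed prior probability of class ω_i, and x is the eight-band pixel vector.

```latex
% Pairwise divergence between classes i and j
D_{ij} = \tfrac{1}{2}\,\operatorname{tr}\!\left[(C_i - C_j)\left(C_j^{-1} - C_i^{-1}\right)\right]
       + \tfrac{1}{2}\,\operatorname{tr}\!\left[\left(C_i^{-1} + C_j^{-1}\right)(m_i - m_j)(m_i - m_j)^{\mathsf{T}}\right]

% Transformed divergence on the 0 to 2 scale
\mathrm{TD}_{ij} = 2\left[1 - \exp\!\left(-\frac{D_{ij}}{8}\right)\right]

% Gaussian maximum likelihood discriminant; x is assigned to the class with the largest g_i(x)
g_i(x) = \ln P(\omega_i) - \tfrac{1}{2}\ln\lvert C_i\rvert
       - \tfrac{1}{2}\,(x - m_i)^{\mathsf{T}} C_i^{-1} (x - m_i)
```

On this scale TD saturates at 2.00 as the pairwise divergence grows, which is why the thresholds quoted above lie between 0 and 2.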
Mixture tuned matched filtering. [23][24] There are two phases in the MTMF algorithm: the matched filter (MF) calculation to estimate abundance, and the mixture tuning (MT) calculation to identify false-positive results.
MT assesses the probability of an MF estimation error for each pixel based on mixing feasibility. Abundances in MTMF must obey two critical feasibility constraints: (1) they must be non-negative, and (2) the abundances for each pixel must sum to one. Calculated infeasibility represents the distance of the pixel from the line connecting the target spectrum and the background mean, measured in terms of standard deviations using the appropriate mixing distribution for the MF score of that pixel. MT and MF scores can be jointly interpreted to provide good subpixel detection and false-positive rejection. 25
The endmember of MTMF is a spectrum representing ground surface materials. 26 In this study, we assigned a single endmember for MTMF classification of flowering SY by selecting 10 pure pixels of flowering SY. We averaged the spectral data from the IKONOS imagery for these 10 data points to create a single composite target spectrum that was used as the endmember for MTMF classification.
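The MF phase can be sketched as below, under simplifying assumptions. The code averages a few pure pixels into one target spectrum and then projects a pixel onto the target direction, scaled so that the background mean scores 0 and the target scores 1. The helper names (`mean_spectrum`, `solve`, `mf_score`) and the two-band numbers are hypothetical, and the noise-whitening transform that MTMF implementations typically apply beforehand, as well as the MT (infeasibility) phase, are omitted.

```python
def mean_spectrum(pixels):
    """Average several pixel spectra band by band (composite endmember)."""
    n = len(pixels)
    return [sum(band) / n for band in zip(*pixels)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mf_score(pixel, target, bg_mean, bg_cov):
    """Matched filter score: 0 at the background mean, 1 at the target."""
    d = [t - m for t, m in zip(target, bg_mean)]
    w = solve(bg_cov, d)  # w = C^-1 (t - m); C is assumed symmetric
    num = sum(wi * (x - m) for wi, x, m in zip(w, pixel, bg_mean))
    den = sum(wi * di for wi, di in zip(w, d))
    return num / den


# Made-up two-band values, for illustration only.
bg_mean = [0.10, 0.30]
bg_cov = [[0.004, 0.001], [0.001, 0.009]]
target = mean_spectrum([[0.20, 0.40], [0.24, 0.44]])
half = [(t + m) / 2 for t, m in zip(target, bg_mean)]  # 50/50 mixture
print(round(mf_score(target, target, bg_mean, bg_cov), 3))   # 1.0
print(round(mf_score(bg_mean, target, bg_mean, bg_cov), 3))  # 0.0
print(round(mf_score(half, target, bg_mean, bg_cov), 3))     # 0.5
```

Because the score is linear in the pixel spectrum, a pixel that is an even mixture of target and background mean scores about 0.5, which is what allows the MF score to be read as an approximate subpixel abundance.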

Infeasibility scores
Infeasibility scores are used to confirm the classification of flowering SY from the MTMF classifier. The best match is indicated by an MF score close to one and an infeasibility score close to zero. 27 However, according to Brelsford and Shepherd, 28 certain spectral signatures can generate large positive MF scores that are indicated as false positives in MTMF. In this study, we used the cumulative distribution function to identify an infeasibility score for 36 points where flowering SY was confirmed by GPS ground truthing. These 36 points were distributed across 40 pixels in the IKONOS imagery. We assigned the MF scores of these 40 pixels to five groups to identify the best infeasibility score, which lies between 0.01 and 0.1 and represents the highest MF score (0.8 ≤ MF ≤ 1.2) (Fig. 3).
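The joint use of the two scores can be illustrated with a small sketch. The MF range (0.8 to 1.2) and the upper infeasibility bound (0.1) are taken from the text above; treating them as a simple pass/fail rule, the function names `empirical_cdf` and `is_flowering_sy`, and the sample values are assumptions made for the example.

```python
def empirical_cdf(values, threshold):
    """Fraction of values that are less than or equal to threshold."""
    return sum(v <= threshold for v in values) / len(values)


def is_flowering_sy(mf, infeasibility, mf_range=(0.8, 1.2), max_infeasibility=0.1):
    """Accept a pixel only if both MTMF criteria are met (assumed rule)."""
    return mf_range[0] <= mf <= mf_range[1] and infeasibility <= max_infeasibility


# Made-up (MF score, infeasibility) pairs, for illustration only.
pixels = [(0.95, 0.03), (1.10, 0.08), (1.05, 2.50), (0.40, 0.05), (1.60, 0.02)]
labels = [is_flowering_sy(mf, infeas) for mf, infeas in pixels]
print(labels)  # [True, True, False, False, False]
print(empirical_cdf([infeas for _, infeas in pixels], 0.1))  # 0.8
```

The third pixel shows why the infeasibility check matters: its MF score alone would suggest flowering SY, but its high infeasibility marks it as a likely false positive.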

Accuracy assessment
We assessed the accuracy of MTMF and ML classifications of flowering SY compared to ground-truthed data. We calculated both user's and producer's accuracy for both classification methods. According to Congalton and Green, 29 producer's accuracy is the ability of the IKONOS imagery to classify a certain target (number of correctly classified samples of a class/total number of reference samples of that class), while user's accuracy is the probability that a classified pixel actually represents that category (number of correctly classified pixels of a class/total number of pixels classified as that class). The percentage of all classes correctly classified was evaluated using overall accuracy and the kappa coefficient, which measures the level of agreement of the overall accuracy. We calculated the overall accuracy and kappa coefficient as in Eqs. (1) and (2).
Fig. 3 Best infeasibility scores for 36 points of flowering SY used to identify the feasibility of matched filter (MF) scores.
$$\mathrm{OA} = \frac{\sum_{k=1}^{q} n_{kk}}{n}, \tag{1}$$
$$\hat{\kappa} = \frac{n \sum_{k=1}^{q} n_{kk} - \sum_{k=1}^{q} n_{k+} n_{+k}}{n^{2} - \sum_{k=1}^{q} n_{k+} n_{+k}}, \tag{2}$$
where $q$ is the number of rows in the matrix, $n_{kk}$ is the number of observations in row $k$ and column $k$ of the error matrix, $n_{k+}$ and $n_{+k}$ are the marginal totals of row $k$ and column $k$, respectively, and $n$ is the total number of observations.
The number of flowering SY trees in Yanagisawanoike Park is limited by the presence of a lake. This made it impossible to take a random sample of at least 50 plots for each land cover class, which would be ideal. Laba et al. 30 had a similar problem due to the limited areas of certain vegetation classes, and suggested using the largest possible number of plots. The numbers of training and test pixels used for each class in ML and MTMF classifications are shown in Table 1.
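The overall accuracy, kappa coefficient, and user's and producer's accuracies described in this section can all be computed from an error matrix, as in the minimal sketch below. The function name `accuracy_report`, the convention that rows hold classified labels and columns hold reference labels, and the two-class counts are assumptions made for the example, not data from this study.

```python
def accuracy_report(matrix):
    """Accuracy measures from a square error matrix.

    matrix[i][j]: number of samples classified as class i whose
    reference (ground-truth) class is j (assumed convention).
    """
    q = len(matrix)
    n = sum(sum(row) for row in matrix)
    diag = sum(matrix[k][k] for k in range(q))
    row_tot = [sum(matrix[k]) for k in range(q)]                        # n_k+
    col_tot = [sum(matrix[i][k] for i in range(q)) for k in range(q)]   # n_+k
    chance = sum(row_tot[k] * col_tot[k] for k in range(q))
    return {
        "overall": diag / n,
        "kappa": (n * diag - chance) / (n * n - chance),
        # Classes with an empty row or column would need special handling.
        "users": [matrix[k][k] / row_tot[k] for k in range(q)],
        "producers": [matrix[k][k] / col_tot[k] for k in range(q)],
    }


# Made-up two-class error matrix, for illustration only.
report = accuracy_report([[35, 5], [10, 50]])
print(round(report["overall"], 3))       # 0.85
print(round(report["kappa"], 3))         # 0.694
print(round(report["users"][0], 3))      # 0.875
print(round(report["producers"][0], 3))  # 0.778
```

In this example the first class has a higher user's accuracy (35 of 40 classified pixels are correct) than producer's accuracy (35 of 45 reference samples are found), which is the kind of asymmetry the two measures are meant to expose.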
The MF scores, which represent the abundance of pixels in each category, ranged from -2.698 to 2.947 [Fig. 5(b)]. Pixels representing masked asphalt road and lake had negative MF scores. [32][33] The 36 points of flowering SY were distributed across 40 pixels with 0.8 ≤ MF ≤ 1.2 [Fig. 5(b)], indicating more than 80% flowering SY per pixel. Pixels with MF scores <0.8 represented bare soil and MF scores >1.2 represented dry grass and evergreen trees. Infeasibility scores from the MTMF classification ranged from 0.01 to 16.854. Each MF score in the MTMF classification had its own infeasibility score that indicated the class to which the pixel belonged. Pixels identified as flowering SY had infeasibility scores ranging from 0.01 to 0.1 (Fig. 4).
The IKONOS image used in this study had high variation and contrast among the training classes. Therefore, we supplemented the image with four gray level co-occurrence (variance) bands for the ML classification. However, the TD showed that separability of flowering SY, dry grass, and evergreen trees was poor. The ML classification identified most of the soil pixels as evergreen trees [Fig. 5(c)], even though texture analysis was conducted before ML classification.
The MTMF classification had 62.2% overall accuracy and a kappa coefficient of 0.507, compared to 48.7% overall accuracy and a kappa coefficient of 0.321 for the ML classification. User's accuracy of the MTMF classification of flowering SY (48.1%) was higher than that of the ML classification (39.4%). The poor overall accuracy of the ML classification was primarily due to misclassification of soil (user's accuracy: 37%, producer's accuracy: 25%). ML misclassified 60.6% of flowering SY as dry grass or evergreen trees [Fig. 5(c)]. However, the producer's

Discussion
Our results indicate that, in terms of overall accuracy and kappa coefficient, MTMF classified flowering SY in an urban park more accurately than ML. However, the producer's accuracy of the MTMF classification was slightly lower than that of the ML classification due to misclassification of flowering SY pixels as soil or dry grass (Table 2). This may be due to the limited number of bands available to MTMF (four bands for IKONOS). MTMF can achieve higher classification accuracy by using hyperspectral data. Williams and Hunt 22 demonstrated that MTMF classification worked well to identify leafy spurge in hyperspectral airborne visible infrared imaging spectrometer (AVIRIS) images. In addition, the existence of an endmember with a stronger signal than flowers, such as dry grass in this study, may have limited the user's accuracy of MTMF classification. Therefore, additional endmembers may be needed to improve the performance of MTMF for classifying flowering SY trees. In contrast, the ML classifier identified flowering SY with a relatively high producer's accuracy (Table 2). However, misclassification of soil as evergreen trees may be the cause of the low overall accuracy. Most of the pixels representing soil were assigned to evergreen trees, and pixels of deciduous trees were often assigned to soil. Cherry blossoms precede the leaf flushing of other deciduous trees, which had no leaves at the time of the imagery. Because soil has higher reflectance than branches or trunks, deciduous trees were often misclassified. Therefore, adding a training class for deciduous trees could improve the accuracy of ML classification.
Plant leaves, rather than flowers, have often been used [3][4][5] to observe plant phenology from remotely sensed data because the spectral signal of flowers is generally weaker than that of leaves.We confirmed that cherry blossoms of SY have weaker spectral signals than dry grass (Fig. 2), but MTMF classification has considerable potential in terms of enabling their accurate separation (Figs. 3 and 5).

Conclusion
Our results suggest that MTMF classification is more accurate than ML classification for identifying plant flowering phenology in a highly heterogeneous urban landscape. However, the number of spectral bands can limit the producer's accuracy of MTMF classification. Therefore, utilization of hyperspectral data with high-spatial resolution such as AVIRIS might be useful for identifying flowering phenology in urban ecosystems.

Fig. 1 Mean values of textural features calculated from training pixels. Textural analysis conducted on the IKONOS image included mean, variance, entropy, homogeneity, contrast, dissimilarity, second moment, and correlations for each class.

Fig. 4 Infeasibility score compared with MF score for different land cover types (evergreen trees, bare soil, SY, and dry grass).

Fig. 5 Land classification and identification of flowering SY in Yanagisawanoike Park. (a) Dominant features of the study area, (b) MF scores of MTMF classification (red represents flowering SY, with MF scores ranging from 0.8 to 1.2), and (c) maximum likelihood (ML) classification of the IKONOS image.

Table 1
Number of training pixels and test pixels for each class for maximum likelihood (ML) and mixture tuned matched filtering (MTMF) classifications.
7%). MTMF tended to misclassify flowering SY as dry grass or soil.

Table 2
Accuracy assessment for maximum likelihood (ML) and mixture tuned matched filtering (MTMF) classifications of flowering SY trees. The values for each class represent the number of ground-truthed points used to evaluate the accuracy of classification.