New hyperspectral difference water index for the extraction of urban water bodies by the use of airborne hyperspectral images

Abstract Extracting surface land-cover types and analyzing changes are among the most common applications of remote sensing. One of the most basic tasks is to identify and map surface water boundaries. Spectral water indexes have been successfully used in the extraction of water bodies in multispectral images. However, directly applying a water index method to hyperspectral images disregards the abundant spectral information and involves difficulty in selecting appropriate spectral bands. It is also a challenge for a spectral water index to distinguish water from shadowed regions. The purpose of this study is therefore to develop an index that is suitable for water extraction by the use of hyperspectral images, and with the capability to mitigate the effects of shadow and low-albedo surfaces, especially in urban areas. Thus, we introduce a new hyperspectral difference water index (HDWI) to improve the water classification accuracy in areas that include shadow over water, shadow over other ground surfaces, and low-albedo ground surfaces. We tested the new method using PHI-2, HyMAP, and ROSIS hyperspectral images of Shanghai, Munich, and Pavia. The performance of the water index was compared with the normalized difference water index (NDWI) and the Mahalanobis distance classifier (MDC). With all three test images, the accuracy of HDWI was significantly higher than that of NDWI and MDC. Therefore, HDWI can be used for extracting water with a high degree of accuracy, especially in urban areas, where shadow caused by high buildings is an important source of classification error.


Introduction
Extracting surface land-cover types and analyzing changes are among the most common applications of remote sensing. [1][2][3] One of the most basic tasks is to identify and map surface water boundaries. Optical remote sensing of water bodies is based on the difference in the spectral reflectance of land and water. Water absorbs most of the energy in the near-infrared (NIR) and the mid-infrared wavelengths, whereas vegetation, soil, and impervious surfaces have a higher reflectance in these wavelengths. Thus, in a multi/hyperspectral image, water appears in a darker tone in the IR bands and can be easily differentiated from the dry land surfaces. To date, various water body extraction algorithms for optical imagery have been developed, and they can be categorized into four basic types: 4 (a) thematic classification; [5][6][7][8][9][10][11][12][13][14][15] (b) spectral-unmixing; [16][17][18][19] (c) single-band thresholding; 17,[20][21][22] and (d) the spectral water index methods. [23][24][25][26][27][28][29][30][31][32][33][34] Among these methods, the spectral water index methods are the most commonly used water body extraction methods, because of the ease of use and low computational cost. 35 Even though a number of water body extraction methods have been proposed in the literature, water extraction methods often fail to distinguish low-albedo surfaces and shadows caused by clouds or other built-up objects in urban areas. 25,36 As far as the detailed mapping of urban water bodies is concerned, airborne hyperspectral remote sensing, which is characterized by very high spatial and spectral resolutions, 37,38 is one of the most valuable data sources for classification. [39][40][41] However, directly applying a water index method to hyperspectral images disregards the abundant spectral information and involves difficulty in selecting appropriate spectral bands.
In this paper, we introduce a multiple-band hyperspectral water index, called the hyperspectral difference water index (HDWI), with the objectives of: (a) improving the accuracy of water body extraction by automatically suppressing classification noise from shadow and other nonwater dark surfaces; and (b) extending the multispectral-designed water index method into the applications of hyperspectral images.

Review of Shadow Detection Methods and Spectral Water Indexes for Extracting Water Bodies

Spectral Water Indices
A spectral water index is a single number derived from an arithmetic operation (e.g., ratio, difference, or normalized difference) of two or more spectral bands. An appropriate threshold of the index is then established to separate water bodies from other land-cover features, based on the spectral characteristics. 8 Reference 23 introduced the normalized difference water index (NDWI) to delineate open water features using the green (band 2) and NIR (band 4) bands of Landsat TM. Reference 16 further applied bands 3 and 5 of Landsat TM in the NDWI. Reference 25 proposed a modified normalized difference water index (MNDWI) that modifies NDWI by replacing band 4 with band 5 of Landsat 5 TM, and it has become the most widely used water index to date. Reference 30 developed a normalized difference pond index (NDPI), which is expressed as the normalized difference of the green and short-wave infrared (SWIR) reflectance (SPOT-5 bands 1 and 4, respectively). Based on the water indexes proposed by Refs. 27 and 29, Ref. 27 further tested three groups of water indexes using bands 7 and 5, bands 5 and 4, and bands 7 and 2 of Landsat TM/ETM+, and suggested that the water index using bands 5 and 4 achieved the best performance for detecting water features. Reference 34 introduced a new automated water extraction index (AWEI), in which two water indexes are proposed using five spectral bands of Landsat 5 TM. Table 1 lists a summary of these water indexes. Two things need to be noted here: (a) Ref. 42 developed a different NDWI for estimating the water content of a vegetation canopy, which is calculated as the normalized difference of the NIR and the SWIR bands; and (b) the normalized difference vegetation index (NDVI) has also been used in Refs. 43 and 44 to map surface water bodies.
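As a minimal sketch of the normalized-difference form shared by NDWI, MNDWI, and NDPI, the snippet below computes (a - b) / (a + b) for two bands and applies a threshold. The function name, the reflectance values, and the threshold of 0 are illustrative assumptions, not values taken from the cited references.

```python
import numpy as np

def normalized_difference(band_a, band_b, eps=1e-12):
    """Return (a - b) / (a + b) element-wise; pixels with a zero denominator get 0."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    denom = a + b
    is_zero = np.abs(denom) < eps
    safe_denom = np.where(is_zero, 1.0, denom)
    return np.where(is_zero, 0.0, (a - b) / safe_denom)

# Made-up reflectances for two pixels: one water-like, one vegetation-like.
green = np.array([0.06, 0.08])
nir = np.array([0.02, 0.40])

ndwi = normalized_difference(green, nir)  # NDWI uses the green and NIR bands
water_mask = ndwi > 0.0                   # illustrative threshold only
```

Swapping the second argument for a SWIR band gives the MNDWI or NDPI form described above; in practice the threshold has to be tuned for each scene.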

Spectral Shadow Detection Methods
Shadows exist in most aerial remote sensing images and high-resolution satellite images, and in these images, shadow is generally produced by urban materials such as buildings and trees. 45 The existence of shadow affects the accuracy of land-cover classification 46 and change detection. 47 In general, spectral shadow detection algorithms do not require any a priori information to distinguish shadow areas from nonshadow areas, 48 and they can be directly applied to radiance or raw data, based on certain specific spectral and spatial assumptions. Compared to the shadow detection methods based on three-dimensional modeling, 49,50 spectral shadow detection methods have been proven to be simple and efficient in many applications. 51 The spectral shadow detection methods can be organized into three categories. (a) Threshold values from the histogram of a single spectral band; for example, Ref. 52 separated shadow from nonshadow by thresholding at a predetermined level and postprocessing the segmented regions, and the water bodies were distinguished by the variance of the segmented regions. (b) Values from an arithmetic operation of two or more spectral bands, which is similar to the spectral water indexes. Reference 53 used a linear combination of red, green, blue, and NIR bands to detect shadow, and the removal of water bodies from the shadow was based on the histogram of the blue band. Reference 54 developed a spectral shape index (SSI) using the blue, green, and red bands of QuickBird imagery to distinguish shadow from water bodies. Reference 55 constructed a morphological shadow index (MSI) based on the relationship between the spectral-structural characteristics of shadows and the corresponding morphological operators of panchromatic high-resolution images. Reference 56 presented a cloud shadow detection index (CSDI) to detect cloud shadows on homogeneous water bodies. (c) Invariant color models. 
In this category, the shadow chromaticity is expressed in a two-dimensional image, instead of reducing the dimensionality of the shadow information to a one-dimensional histogram. A color space 57 named C1, C2, and C3 is used as a suitable color space for the shadow detection. 48 Reference 58 presented a method which uses the spectral ratio image in the hue, intensity, and saturation space to segment shadow. Reference 59 further improved the spectral ratio of Ref. 58 and showed that their method achieves a better accuracy. Table 2 lists a summary of these spectral shadow detection methods.

Study Data
For the purpose of urban water extraction, three test sites were selected. The first test image is a subset of a Pushbroom Hyperspectral Imager II (PHI-2) image of Shanghai. The target area is Lujiazui, the city center of Shanghai, which is surrounded by the Huangpu River and is located at 121°29′38″E, 31°14′18″N (Fig. 1). The PHI-2 has been developed by the Shanghai Institute of Technical Physics, China, since 2001. It has been applied in the fields of environmental monitoring, geological studies, oil and gas prospecting, vegetation studies, ocean observation, city layout studies, agricultural monitoring, and forest fire prevention. 60 The hyperspectral image was acquired at about 12:00 a.m. (±2 h) at a relative flight elevation of about 1500 m, in cloud-free sky, on November 26, 2002. PHI-2 has a field-of-view of 23 deg, with a spatial resolution of 1.5 mrad, and 246 spectral bands covering 400 to 870 nm in wavelength, with a spectral resolution of better than 5 nm in full range. 60 The spatial resolution is about 1.5 m (at nadir). The spectra were calibrated between the PHI-2 data and the field spectra at the same spots, to eliminate the atmospheric effect, 61 and they were atmospherically corrected in the ENVI QUAC module. 62

The second test image is a small subset of a HyMAP image of Munich, Germany (48°8′N, 11°36′E). The target area is the Isar River (Fig. 2). The hyperspectral HyMAP data were acquired on July 30, 2004, with a spatial resolution of 4 m, and 128 spectral bands in the visible and IR channels (400 to 2400 nm). 63 This image was geo-referenced and atmospherically corrected with ATCOR 4. 64

The third test image is a subset of a reflective optics system imaging spectrometer (ROSIS) image of Pavia, Italy (45°11′N, 9°9′E). The target area is the Ticino River (Fig. 3). The ROSIS data were acquired on July 8, 2002, with a spatial resolution of 1.3 m, and 102 available bands covering 430 to 860 nm. 65 The data were atmospherically corrected but not geometrically corrected.

Spectral Characteristics of Water and Shadow
The total radiance reaching a satellite sensor consists of three parts: 54 (a) the contribution from the atmosphere, i.e., the path radiance due to light scattering; (b) the reflectance of the ground surface caused by direct sunlight; and (c) the reflectance from the scattered sunlight.
For an atmospherically corrected hyperspectral image, the contribution from the atmosphere has been corrected, so the first part is almost removed from the image and can be treated as zero. For a ground object covered by shadow, the second part, the reflectance caused by direct sunlight, is zero, and the surface-leaving radiance consists only of the scattered sunlight of the ground object, which is assumed to be proportional to the reflectance that direct sunlight would produce if it were not blocked.
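The three-part decomposition and the shadow assumption can be written compactly as follows. This is only a restatement of the two paragraphs above; the symbol names and the proportionality factor k are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
L_{\mathrm{sensor}}(\lambda) &= L_{\mathrm{path}}(\lambda) + L_{\mathrm{direct}}(\lambda) + L_{\mathrm{scatter}}(\lambda) \\
\text{after atmospheric correction:}\quad L_{\mathrm{path}}(\lambda) &\approx 0 \\
\text{in shadow:}\quad L_{\mathrm{direct}}(\lambda) &= 0
  \;\Rightarrow\;
  L_{\mathrm{shadow}}(\lambda) \approx L_{\mathrm{scatter}}(\lambda)
  \approx k \, L_{\mathrm{direct}}^{\mathrm{sunlit}}(\lambda), \qquad 0 < k < 1
\end{aligned}
```

Under this assumption, the spectrum of a shadowed surface is approximately a scaled-down copy of the spectrum of the same surface in sunlight, so its spectral shape is preserved while its amplitude drops.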
Typical pixel reflectance values of six major land-cover types were sampled from all the wavelengths of test image 1. The land-cover types were: water, vegetation, bright built-up, dark built-up, water shadow (water essentially, which is in shadow caused by buildings or trees), and shadow. Spectral data from these pixels were used to examine the reflectance patterns and to identify the land-cover types that affect water body extraction accuracy in urban areas, with the aim being to design a method that accurately discriminates between such surfaces and water, especially for shadow and water. Figure 4 shows the reflectance curves of water (red), dark building (blue), shadow (black), and water under shadow (brown), and it can be clearly observed that the reflectance of shadow is in proportion to the reflectance of dark building, whereas the reflectance of water under shadow is in proportion to the reflectance of water.
In the wavelength of blue light (450 to 520 nm), both the reflectance of water and shadow over water are lower than that of dark building and shadow. Reference 53 took advantage of this spectral property to separate water and shadow. In the wavelength of green light (520 to 600 nm) and red light (600 to 690 nm), the reflectance of water is higher than that of buildings. However, in the wavelength of NIR light (700 to 850 nm), the water absorbs the majority of the sunlight, and thus the reflectance in NIR is lower than for other types of ground surface. These spectral characteristics are used in the majority of spectral water indexes to detect water bodies. 16,23,25,30,34 However, in shadow regions, as the incoming radiance is limited, the reflectance curves of water and shadow are mixed from 490 to 650 nm, which means that the green light does not contribute to the development of a spectral water index for distinguishing water and shadow.

Development of the Hyperspectral Difference Water Index
The spectral curves of urban water bodies and shadow indicate that the spectral shape and the amplitude might be adequate to separate the water and the shadow regions for an entire image. Thus, the proposed water detection index, the HDWI, is constructed to increase the contrast between water and other dark surfaces [Eq. (1)]. HDWI amplifies the contrast between water and shadowed regions by taking advantage of the differences in the spectral amplitudes, particularly in the red and the NIR regions of the spectra (Fig. 4). The primary aim of the formulation of HDWI is to maximize the separability of water and nonwater pixels through spectral integration and differencing. To amplify the contrast, we integrate the reflectance over each of these two spectral regions and calculate the difference between the two integrals.

Fig. 4 The spectral reflectance curves of water and shadow.

Figure 5 plots the index value distributions of six land-cover types of test image 1 for HDWI, NDWI HIS, and NDWI. NDWI HIS and NDWI are derived from Ref. 23:

$$\mathrm{NDWI}_{\mathrm{HIS}} = \frac{\mathrm{green} - \mathrm{NIR}}{\mathrm{green} + \mathrm{NIR}} \qquad (2)$$

In Eq. (2), green and NIR denote single bands near the center wavelengths of 535 and 820 nm, respectively. From Fig. 5, it can be observed that when using NDWI HIS, the majority of the water and water shadow can be separated from the other land covers; however, NDWI HIS cannot avoid mixing with shadow. Using HDWI, the separability of water, water shadow, and shadow is greatly improved, which means that HDWI is capable of separating water and water shadow from other low-albedo ground surfaces.
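The integrate-then-difference idea described above can be sketched as follows. This is not the published HDWI equation, which is not reproduced in this text. The function names, the split of the 650 to 850 nm interval at 700 nm, the use of a plain band sum as the integral, and the absence of any normalization are all assumptions made for illustration, and the spectra are synthetic.

```python
import numpy as np

def band_sum(cube, wavelengths, lo_nm, hi_nm):
    """Sum reflectance over bands whose center wavelength lies in [lo_nm, hi_nm)."""
    wl = np.asarray(wavelengths, dtype=float)
    selected = (wl >= lo_nm) & (wl < hi_nm)
    if not selected.any():
        raise ValueError("no bands fall inside the requested wavelength range")
    return np.asarray(cube, dtype=float)[..., selected].sum(axis=-1)

def integrated_difference(cube, wavelengths, red=(650.0, 700.0), nir=(700.0, 850.0)):
    """Band-summed red reflectance minus band-summed NIR reflectance (assumed form)."""
    return band_sum(cube, wavelengths, *red) - band_sum(cube, wavelengths, *nir)

# Synthetic example: 20 band centers from 650 to 840 nm, two pixels.
wavelengths = np.arange(650.0, 850.0, 10.0)
is_red = wavelengths < 700.0
water = np.where(is_red, 0.05, 0.01)    # brighter in red, very dark in NIR
shadow = np.where(is_red, 0.02, 0.03)   # dark, slightly brighter in NIR
cube = np.stack([water, shadow])[None, :, :]  # shape (rows=1, cols=2, bands=20)

index = integrated_difference(cube, wavelengths)  # shape (1, 2)
```

With these made-up spectra the water pixel receives a higher index value than the shadowed pixel, which is the kind of contrast the index is meant to produce. Only the band center wavelengths are needed to select the bands.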

Classification and Accuracy Assessment
To compare the accuracy of the proposed water body extraction technique with other methods, we made preliminary tests of various water indices, including the NDWI of Ref. 23 55. Based on a preliminary evaluation, it appeared that all the indices, except for NDWI, performed poorly with our test images. We therefore only considered NDWI for comparison with the new index proposed in this paper. A supervised Mahalanobis distance classifier (MDC) was also included in the comparison, as this classifier performed the best with our test images among the other widely used methods (including maximum likelihood, K-means, spectral angle mapper, minimum distance, and parallelepiped, which had overall accuracy of 95.09%, 82.44%, 78.42%, 81.22%, and 87.14% in their respective study areas) in water classification. For the MDC, water and nonwater training data were produced for each test image.
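For readers unfamiliar with the Mahalanobis distance classifier, the following is a minimal sketch: each pixel is assigned to the class whose mean spectrum is nearest in Mahalanobis distance. Sharing one pooled covariance matrix across classes is an assumption of this sketch, as are the function name and the two-band training values; the paper does not describe its implementation.

```python
import numpy as np

def mahalanobis_classify(pixels, training):
    """Assign each pixel to the class with the smallest squared Mahalanobis distance.

    training maps a class name to an array of shape (n_samples, n_bands).
    One pooled covariance matrix is shared by all classes (an assumption).
    """
    names = list(training)
    samples = [np.asarray(training[name], dtype=float) for name in names]
    means = np.stack([s.mean(axis=0) for s in samples])  # (n_classes, n_bands)
    centered = np.vstack([s - m for s, m in zip(samples, means)])
    dof = centered.shape[0] - len(names)
    cov = centered.T @ centered / dof
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse tolerates a near-singular matrix
    diff = np.asarray(pixels, dtype=float)[:, None, :] - means[None, :, :]
    d2 = np.einsum("pcb,bd,pcd->pc", diff, inv_cov, diff)  # (n_pixels, n_classes)
    return [names[i] for i in d2.argmin(axis=1)]

# Made-up two-band training spectra (red, NIR) for the two classes.
training = {
    "water": [[0.05, 0.010], [0.06, 0.020], [0.04, 0.015]],
    "nonwater": [[0.10, 0.30], [0.12, 0.35], [0.08, 0.25]],
}
labels = mahalanobis_classify([[0.05, 0.02], [0.11, 0.32]], training)
```

With real hyperspectral data the number of bands can exceed the number of training samples, in which case the covariance estimate is rank deficient and some form of band selection or regularization is usually needed.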
To determine the optimal threshold for separating water and nonwater pixels in HDWI and NDWI HIS, the threshold with the minimum sum of commission errors and omission errors was chosen. The classification accuracies of the four methods, i.e., HDWI, NDWI, NDWI HIS, and MDC, were then assessed by calculating kappa coefficients and error matrices. The accuracy comparison between HDWI and NDWI HIS was made at their optimal thresholds.
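The threshold selection and kappa computation described above can be sketched as follows. The sketch assumes that pixels with an index value above the threshold are labeled water and that errors are counted in pixels rather than as rates. The function names, candidate thresholds, and sample values are hypothetical.

```python
import numpy as np

def best_threshold(index_values, is_water, candidates):
    """Return the candidate threshold minimizing commission + omission pixel counts."""
    values = np.asarray(index_values, dtype=float)
    truth = np.asarray(is_water, dtype=bool)
    best_t, best_err = None, None
    for t in candidates:
        pred = values > t                              # assumed labeling convention
        commission = np.count_nonzero(pred & ~truth)   # nonwater labeled as water
        omission = np.count_nonzero(~pred & truth)     # water labeled as nonwater
        err = commission + omission
        if best_err is None or err < best_err:
            best_t, best_err = float(t), int(err)
    return best_t, best_err

def cohen_kappa(pred, truth):
    """Cohen's kappa for a binary water / nonwater classification."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    observed = np.mean(pred == truth)
    expected = pred.mean() * truth.mean() + (1 - pred.mean()) * (1 - truth.mean())
    if expected == 1.0:  # degenerate case: a single class everywhere
        return 1.0
    return float((observed - expected) / (1 - expected))

# Made-up index values for six reference pixels.
values = np.array([-0.45, -0.25, -0.15, 0.05, 0.25, 0.35])
truth = np.array([False, False, False, True, True, True])
threshold, errors = best_threshold(values, truth, np.linspace(-0.5, 0.5, 11))
kappa = cohen_kappa(values > threshold, truth)
```

Because the threshold is tuned against reference pixels, an independent set of reference pixels is preferable for the final accuracy assessment.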
It is worth noting that the performance of the support vector machine (SVM), a trainable machine learning classification method, was also tested. Although the overall accuracy of the SVM classifier (99.49%) is slightly higher than that of HDWI, SVM required much more time for training, parameter optimization, and classification. Therefore, SVM was not used in our experiments, in order to have a fair comparison with the water indexes.

Fig. 5 Index value distributions of six land-cover types of test image 1. From left to right, the charts represent HDWI, NDWI HIS, and NDWI, respectively. Each box plot shows the location of the 25th, 50th, and 75th percentiles (boxes), and the extreme outliers (whiskers). 66

Water Extraction Maps
The water extraction maps produced by the three classifiers with the three test images are presented in Fig. 6. A visual inspection of Fig. 6 indicates that HDWI results in a better accuracy of surface water mapping than NDWI, NDWI HIS, and MDC. For test image 1 from Shanghai, in particular, the shadow over water is correctly extracted by the new HDWI index. In test image 1, the water extraction maps obtained with NDWI and NDWI HIS show noisy results. The MDC performs well in the classification of nonwater pixels in the urban area, but it fails to classify the shadowed water bodies as water. However, for test image 2 from Munich, all the maps of Fig. 6 indicate smaller differences between the three water extraction methods. All three methods perform well in the extraction of water bodies. HDWI confuses the fewest urban built-up areas with water, and NDWI HIS produces the worst result by visual inspection. For test image 3 from Pavia, HDWI produces a better result than NDWI and NDWI HIS in suppressing shadow and other nonwater surfaces. The MDC also produces a good result by visual inspection, with the exception of a large area in the city that is misclassified as a water body.

Fig. 6 Comparison of the water extraction results using the three classifiers with the three test images.

Classification Accuracy
The results of the mapping accuracy with the three test images are summarized in Tables 3-5. The accuracy achieved by HDWI is higher than that of the NDWI HIS, NDWI, and MDC classifiers. The total omission and commission errors of HDWI are less than 50% of those of the NDWI HIS, NDWI, and MDC classifiers with test image 1. With test image 2, HDWI, NDWI HIS, NDWI, and MDC produce low commission errors, and HDWI produces very low omission and commission errors. With test image 3, the classification accuracies for HDWI and MDC are quite similar, whereas NDWI and NDWI HIS achieve the worst accuracy.

Shadow Distinguishing Effects
To evaluate the shadow distinguishing effect of the newly proposed index in urban areas, the confusion between water, water shadow, shadow, and dark building (low-albedo surfaces) was analyzed; the results for test image 1 are given in Table 6. For test image 3, no shadow is found on the water surface, and thus only shadow and dark building are analyzed in Table 7. For test images 1 and 3, additional reference samples for water shadow, shadow, and dark building were selected, and the numbers of pixels classified as water or nonwater (in brackets) are listed. For test image 1, using the proposed HDWI, almost all the water shadow pixels are correctly classified as water (6985 of 7034 pixels), and only 53 shadow pixels and 89 dark building pixels are wrongly classified as water. NDWI HIS also correctly classifies the majority of the water shadow pixels as water (6928 of 7034 pixels), but 1162 shadow pixels and 275 dark building pixels are wrongly classified as water. NDWI performs worse than NDWI HIS except for water shadow recognition. The MDC performs very well in the classification of water pixels, shadow pixels, and dark building pixels, and only five shadow pixels and three dark building pixels are misclassified. However, MDC misclassifies over 40% of the water shadow pixels into the nonwater class (3010 of 7034 pixels).
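The pixel counts quoted above for the water shadow samples of test image 1 can be turned into rates with a few lines of arithmetic. The variable names are illustrative; the counts are the ones stated in the text.

```python
# Water shadow reference pixels in test image 1 and how many each method labels as water.
water_shadow_total = 7034
labeled_water = {
    "HDWI": 6985,
    "NDWI HIS": 6928,
    "MDC": water_shadow_total - 3010,  # 3010 pixels fall into the nonwater class
}

# Percentage of water shadow pixels correctly recognized as water.
recognition_rate = {
    name: 100.0 * count / water_shadow_total for name, count in labeled_water.items()
}
mdc_miss_rate = 100.0 * 3010 / water_shadow_total

for name, rate in recognition_rate.items():
    print(f"{name}: {rate:.2f}% of water shadow pixels labeled water")
```

HDWI and NDWI HIS both recognize more than 98% of the water shadow pixels as water, whereas the MDC miss rate of roughly 43% is consistent with the "over 40%" stated above.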
For test image 3, HDWI also performs the best in distinguishing shadow and dark buildings: no shadow pixels and only 13 dark building pixels are wrongly detected as water by HDWI. For NDWI, NDWI HIS, and MDC, the numbers of misclassified shadow pixels are 4, 4, and 337, and the numbers of misclassified dark building pixels are 337, 359, and 270, respectively.

Discussion
The water index proposed in this paper contributes to the efforts being made to apply water indexes to extract water bodies from hyperspectral images, and to improve the accuracy of surface water mapping for further environmental studies and applications. The proposed HDWI is specially designed for urban water detection, where shadows and other low-albedo surfaces have not been correctly classified in previous studies. HDWI uses the small differences between dark surfaces and dark water surfaces in the red and the NIR wavelengths. It is a simple and effective technique for enhancing the separability of water and other dark pixels, without using any additional data to remove them beforehand and without any color space transformation. The spectral integration is a sum over all the bands in a specific wavelength range, which is a simple calculation and adds little computational cost. If large areas need to be classified under stringent time requirements, only a subset of the bands in the wavelength range, sampled at a fixed wavelength step, can be used for classification. In our experiments, we found that when the data quality is good, this reduced process gives results as good as the original approach.
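The band subsampling mentioned above can be sketched as follows: instead of every band in a wavelength range, only every step-th band is used. The function name, the step value, and the synthetic spectrum are assumptions for illustration. The sketch compares band means rather than band sums so that the full and subsampled values stay on the same scale, which is a choice of this example and not something stated in the paper.

```python
import numpy as np

def strided_band_indices(wavelengths, lo_nm, hi_nm, step=1):
    """Indices of every step-th band whose center wavelength lies in [lo_nm, hi_nm)."""
    wl = np.asarray(wavelengths, dtype=float)
    return np.flatnonzero((wl >= lo_nm) & (wl < hi_nm))[::step]

# Synthetic sensor: 40 band centers from 650 to 845 nm, one water-like spectrum.
wavelengths = np.arange(650.0, 850.0, 5.0)
spectrum = np.where(wavelengths < 700.0, 0.05, 0.01)

full = strided_band_indices(wavelengths, 650.0, 700.0)            # all 10 red bands
coarse = strided_band_indices(wavelengths, 650.0, 700.0, step=2)  # every second band

mean_full = spectrum[full].mean()
mean_coarse = spectrum[coarse].mean()
```

On a smooth, low-noise spectrum the two means agree closely, which matches the remark that subsampling works when the data quality is good. With noisy bands, using fewer bands averages out less noise.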
We tested the new index using three different hyperspectral sensors: PHI-2 is manufactured in China; HyMAP is manufactured in Australia; and ROSIS is manufactured in Germany. The images were captured in three different cities with rivers flowing through them: Shanghai, China; Munich, Germany; and Pavia, Italy. The water extraction results show that the HDWI index is a sensor-free hyperspectral water index, and the only requirement for the hyperspectral image is that the center wavelengths of each spectral band should be known. HDWI works in the red and NIR wavelengths (650 to 850 nm), and these wavelengths are included in all hyperspectral sensors, which will certainly enlarge the application field of this index.
Although the new water index was tested with different sensors and in different cities, several issues that may affect the results were not considered. (1) With our test images, HDWI was operated on the reflectance curves, and the importance and type of atmospheric correction applied in the image preprocessing stage were not considered when evaluating the accuracies. (2) The composition of water, such as the phytoplankton, chlorophyll a, and suspended sediment content, which may lead to a change in the reflectance patterns, was not considered and evaluated in this paper. (3) In urban areas, open water near high buildings may also be difficult to classify. Our experimental data did not contain open water, and we believe that the same difficulty also exists in the rivers of our experimental data, where water, water shadow caused by buildings, and building shadows on land are highly mixed. Therefore, open water near high buildings has not been considered separately.

Conclusions
The new water extraction index introduced in this paper is designed to improve the accuracy of urban surface water mapping by the use of hyperspectral images. The proposed method uses a simple technique of spectral integration and enhancing class separability without any additional data to remove shadow and dark surface noises, which are often major causes of misclassification in urban surface water mapping.
Based on a review of spectral water index methods, spectral shadow detection methods, and spectral analysis of shadowed surfaces, we introduced a new HDWI for improving the water classification accuracy in the case where the area consists of shadow over water, shadow over other ground surfaces, and low-albedo ground surfaces. The proposed index uses spectral integration and operates on the reflectance differences between dark surfaces and water surfaces in the red and the NIR wavelengths. The proposed index was tested with PHI-2, HyMAP, and ROSIS hyperspectral images of Shanghai, Munich, and Pavia. The performance of the water index was compared with the NDWI, the NDWI applied to hyperspectral images (NDWI HIS), and the Mahalanobis distance classifier (MDC). From the experimental results with the three test sites based on the proposed HDWI method, several conclusions are drawn as follows:
1. HDWI is effective for the extraction of water bodies by the use of airborne hyperspectral images, especially for images that cover urban areas. It works on the reflectance from 650 to 850 nm, and it is suitable for any hyperspectral image with known center wavelengths. Atmospheric correction is suggested in the preprocessing step for the application of HDWI.
2. HDWI is shown to be capable of extracting water bodies by the use of different hyperspectral sensors. HDWI can be used to extract surface water with a high degree of accuracy, particularly in urban areas where high buildings cast shadows on water and nonwater surfaces. In all three test images, the accuracy of HDWI is significantly higher than that of NDWI, NDWI HIS, and MDC.
3. HDWI is particularly designed for distinguishing water shadows. The experimental results showed that HDWI correctly classifies water shadow pixels into water, and other shadow pixels into nonwater. In addition, fewer pixels are incorrectly classified by HDWI than by NDWI, NDWI HIS, and MDC.