Remote estimation of chlorophyll-a concentration in turbid water using a spectral index: a case study in Taihu Lake, China

Abstract Chlorophyll-a concentration (Chla) is a key indicator of water quality, and accurate estimation of Chla from remote sensing data remains challenging in turbid waters. Previous research has demonstrated the feasibility of retrieving Chla in vegetation using spectral indices, which may serve as a reference for Chla inversion in turbid waters. In this study, 106 hyperspectral indices, including vegetation, fluorescence, and trilateral indices, as well as combinations thereof, are calculated from in situ spectra collected from 2004 to 2011 in Taihu Lake, China, to explore their potential use in turbid waters. The results show that the normal chlorophyll index (NCI), (R690/R550 − R675/R700)/(R690/R550 + R675/R700), is optimal for Chla estimation, with a determination coefficient (R²) of 0.92 and a root mean square error (RMSE) of 14.36 mg/m³ for the data from July to August 2004, when Chla ranged from 7 to 192 mg/m³. Validation using the datasets of 2005, 2010, and 2011 shows that after reparameterization, the NCI model yields low RMSEs and is more robust than the three- and four-band algorithms. The results indicate that the NCI model can satisfactorily estimate Chla in multiple datasets without the need for additional band tuning.


Introduction
The water quality of inland lakes is a major concern of the public and the government given its importance in land use, eutrophication, global change, and regional biogeochemical cycles. 1 Chlorophyll-a concentration (Chla) is a major indicator of water quality and an important indicator of lake eutrophication. Characterizing the heterogeneity of, and temporal changes in, water quality across lake ecosystems is difficult when using conventional sampling methodologies. 2,3 Thus, detecting Chla in water through remote sensing has become the subject of intense research, given the efficiency, economy, and wide spatial coverage of this method. 1 Spectral reflectance above the water surface in the visible and near-infrared (NIR) spectra provides qualitative and quantitative information on optically active substances in the water. In the open ocean, where the optical properties of the water are determined by phytoplankton and their associated degradation products, the ratio of blue spectral reflectance to green spectral reflectance has been used to assess Chla. 4 However, in turbid inland waters, whose optical properties are determined by the combination of phytoplankton, total suspended matter, and colored dissolved organic matter (CDOM), 5 blue-green algorithms generally return inaccurate results owing to the strong overlapping absorption by nonalgal particles and CDOM in the blue spectral region.
The red and NIR spectral regions, where the absorption effects of nonalgal particles and CDOM are largely decreased, are often used to estimate Chla in turbid waters. Many algorithms have been developed based on these spectral regions, such as the ratio of the NIR peak reflectance to the red trough reflectance near 675 nm, 6,7 the position of the NIR reflectance peak, 6 and fluorescence line height (FLH). 8 In 2003, Dall'Olmo et al. 9 developed a semianalytical three-band algorithm to estimate Chla in turbid waters. This algorithm has also been validated for Chla estimation in other water bodies. 10,11 Subsequently, a modified four-band algorithm was proposed by Le et al. 12 to remove the effects of the absorption and backscattering caused by suspended solids in the NIR region and to suppress pure water absorption. However, the band combinations and parameters used by different algorithms may vary, given the great spatial and temporal changes in the biophysical characteristics of turbid waters. Models from different authors for the same water body may vary owing to differences in sampling times and positions. 10,13,14 The band combinations from different authors may also differ, even if the same model-building method was used. 10,15,16 An inversion model derived from a specific dataset therefore has to be refined and calibrated when applied to new datasets.
The basic process of building a Chla inversion model involves calculating the correlation between the constituent concentration and spectral reflectance and then determining the optimal band combination with high accuracy and robust performance. These combinations and region-specific algorithms for Chla inversion in turbid waters are still under development. 17 However, the number of possible combinations of hyperspectral reflectance is effectively unlimited, making it laborious and impractical to exhaust all possible combinations and expressions to find the optimal one. Many band combinations in existing inversion algorithms can produce satisfactory results, which motivates the mining of useful information from previous studies.
Chlorophyll-a is the primary photosynthetic pigment in terrestrial green plants and in phytoplankton in water. It absorbs strongly in the blue and red spectral regions and is highly reflective in the green and NIR spectral regions, indicating a similarity between the spectral reflectance of algae-containing water and that of terrestrial vegetation. The principle of the commonly used three-band conceptual algorithm for Chla inversion in turbid waters 9,10,15 originates from terrestrial vegetation. 18 Several inversion methods have been applied to both terrestrial and aquatic systems, such as derivatives of reflectance spectra, 7,19 the NIR/R ratio method, 6,20,21 and the normalized difference index, 22-24 demonstrating the potential of spectral indices from vegetation remote sensing for Chla estimation in turbid waters.
This study tested the potential application in turbid water of spectral indices obtained from vegetation remote sensing, including vegetation, pigment, fluorescence, and trilateral indices. Based on a collection of 106 typical spectral indices from the literature and their calculation using in situ spectra from Taihu Lake, China, during the period of 2004 to 2011, the objectives of this study are to (1) find the optimal spectral index from vegetation remote sensing by comparing the performance of these indices in Chla estimation in turbid waters and build an estimation model based on this index; (2) validate the model using the datasets of 2005, 2010, and 2011 from Taihu Lake, and test the robustness of the model by comparing it with the three- and four-band models; and (3) validate the application of the model to hyperspectral images using the band reflectance of EO-1/Hyperion and PROBA/CHRIS simulated from the in situ spectra.

Study Area
The study area is Taihu Lake, the second largest freshwater lake in China, located between 30°56′ to 31°33′ N and 119°55.3′ to 120°53.6′ E and having an area of 2427.8 km 2 and an average depth of 2.12 m. This lake is large and the water in the lake is highly turbid. The eutrophication of the lake is serious, with the hypereutrophic area mainly covering Meiliang Bay, Wuli Lake, and the Western Lake. 25 The Chla in the lake has obvious seasonal and annual variation, with the highest concentration in July to August when algae blooms occur.
Four datasets were used in this study, including July to August of 2004 and 2005, as well as September of 2010 and 2011, and the sampling distributions are shown in Fig. 1. In July to August of 2004 and 2005, water samples were collected at the Taihu Lake monitoring sites and the spectra were measured on the first 10 days of each month. On September 19, 2010, and September 3, 2011, the sampling positions mainly covered the hypereutrophic area of Meiliang Bay and the central lake, with spectral measurements taken near noon within 1 day.

Data Acquisition
Using an Analytical Spectral Devices Field Spectroradiometer (Analytical Spectral Devices Inc., Boulder, Colorado) with 512 bands ranging from 350 to 1050 nm with increments of 1.5 nm, the hyperspectral reflectance in the study area was measured at 0.5 to 1 m high above the water surface, with a probe field angle of 10 deg. The instrument was positioned at a specific viewing geometry to avoid the effects of direct solar radiation and prevent the ship from interfering with the water surface. 26 Ten curves were acquired for each location, and the median value of the repeated measurements was used to calculate the remote sensing reflectance. The reflectance of a standard gray plate is 30%, and the spectra with wavelengths shorter than 400 nm or longer than 900 nm were discarded owing to noise. The spectra were resampled to 1-nm interval and smoothed using the kernel regression smoothing method with a window width of 5 nm. 27 Discarding samples under windy or cloudy conditions, a total of 85 samples were finally kept, of which 24 samples in July to August of 2004 were used for model building and 20 samples in July to August of 2005, 25 samples in September of 2010, and 16 samples in September of 2011 were used for model validation.
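The preprocessing described above (median of repeated curves, then kernel smoothing) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the Gaussian kernel is an assumption (the text only names "kernel regression smoothing"), and treating the 5-nm window width as the kernel bandwidth is likewise an assumption.

```python
import math
from statistics import median


def median_spectrum(curves):
    """Per-wavelength median of repeated curves (equal-length lists)."""
    return [median(values) for values in zip(*curves)]


def kernel_smooth(wavelengths, values, bandwidth_nm=5.0):
    """Nadaraya-Watson kernel regression with a Gaussian kernel.

    The kernel type and the mapping of the 5-nm window to the
    bandwidth are assumptions made for this sketch.
    """
    smoothed = []
    for w0 in wavelengths:
        # Weight every sample by its spectral distance from w0.
        weights = [math.exp(-0.5 * ((w - w0) / bandwidth_nm) ** 2)
                   for w in wavelengths]
        total = sum(weights)
        smoothed.append(sum(k * v for k, v in zip(weights, values)) / total)
    return smoothed
```

Because the smoother is a weighted average, a flat spectrum is returned unchanged, while an isolated noise spike is pulled toward its neighbors.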
Chla was measured according to the Chinese national standard three-color spectrophotometry (SL88-1994). First, the water samples collected in the field were filtered through a Whatman GF/C membrane, after which the membrane was kept in darkness in a refrigerator overnight. After removal from the refrigerator, Chla was extracted using 90% acetone. The extract was centrifuged for 10 min, and the supernatant was analyzed using a Shimadzu UV-2550 UV-Vis spectrophotometer. Chla (mg/m³) was calculated from the absorbance at 750, 663, 645, and 630 nm, according to the standard formula.

Spectral Index
The spectral index is the mathematical combination of reflectance at the visible and NIR bands, 28 including vegetation index, fluorescence index, and trilateral index. A total of 106 spectral indices were collected in this study (Tables 1 and 2).

Vegetation index
The vegetation index (Table 1) is a mathematical expression of reflectance at specific bands (Nos. 1 to 73) or within band ranges (Nos. 74 to 87) that reflects the biochemical components of the vegetation. Given that the spectral response function was not available to simulate the reflectance of a broad band range (such as the blue, green, yellow, red, and NIR bands), the average reflectance within the wavelength range was used instead.

Fluorescence and trilateral index
The spectral fluorescence indices (Nos. 88 to 90 in Table 2) include the fluorescence peak (maximum fluorescence peak near 685 nm) and the normalized fluorescence height (fluorescence peak divided by the peak reflectance at 560 and 675 nm). 70 The spectral trilateral indices (Nos. 91 to 106 in Table 2) include the red-edge, green-edge, and trilateral parameters. The red-edge parameters were calculated using the inverted Gaussian model and include the reflectance at the red shoulder, the reflectance at the absorption valley, the position of the absorption valley, the position of the red edge, and the width of the red-edge absorption valley. The green-edge index includes the reflectance and position of the green peak. The trilateral parameters refer to the red, blue, and yellow edges and include the location, amplitude, and area of each edge. To find the spectral index most sensitive to Chla, combinations of the vegetation, fluorescence, and trilateral indices were calculated, including band ratio, band difference, and band normalization.
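The three combination types named above can be sketched as follows. The function names are hypothetical, and the normalized form (a − b)/(a + b) is taken from the form of the NCI given in the abstract; the study's exact expressions may differ for other index pairs.

```python
def band_ratio(a, b):
    """Ratio combination of two index values."""
    return a / b


def band_difference(a, b):
    """Difference combination of two index values."""
    return a - b


def band_normalization(a, b):
    """Normalized combination, (a - b) / (a + b)."""
    return (a - b) / (a + b)


def combine(index_a, index_b, how):
    """Apply one combination sample by sample to two index series."""
    return [how(a, b) for a, b in zip(index_a, index_b)]
```

For example, `combine(rgi_values, rarsa_values, band_normalization)` would yield one normalized value per sample, where `rgi_values` and `rarsa_values` are placeholder lists of index values.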

Model Building and Accuracy Assessment
Regression analysis was used to build the model between Chla and the spectral index. Both the three-band 9,10,15 and the four-band 12 algorithms were used for model comparison. The formulas of the three- and four-band models are as follows:

$$\mathrm{Chla} \propto \left[ R^{-1}(\lambda_1) - R^{-1}(\lambda_2) \right] \times R(\lambda_3), \tag{1}$$

$$\mathrm{Chla} \propto \frac{R^{-1}(\lambda_1) - R^{-1}(\lambda_2)}{R^{-1}(\lambda_4) - R^{-1}(\lambda_3)}, \tag{2}$$

where R(λ) is the reflectance at wavelength λ. Based on the study by Zimba and Gitelson, 75 the three bands were searched in the range of 450 to 800 nm and the initial iterative positions of λ1 and λ3 were 675 and 750 nm, respectively. The iteration stops when the root mean square error (RMSE) reaches its minimum. The optimal four bands were searched using the same iteration method, and the initial positions of λ2, λ3, and λ4 were 700, 720, and 750 nm, respectively. 12

The RMSE and average relative error (ARE) were used to evaluate the model accuracy, and their respective formulas are as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - y'_i \right)^2}, \tag{3}$$

$$\mathrm{ARE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| y_i - y'_i \right|}{y_i}, \tag{4}$$

where y is the measured Chla (mg/m³), y′ is the estimated Chla (mg/m³), and n is the sample size. The determination coefficient R² was used to evaluate the goodness of fit of the regression model, and the F-test and p value were used to evaluate the model significance.

To verify the model assumptions, residual plots were used to diagnose the regression model before it was used for prediction. 76 Two residual plots are important: quantile-quantile (Q-Q) plots and scatter plots between the residuals and the variable, both of which were used in this study. The former was used to check the normality of the residuals, whose scatter points should form an approximately straight line if the distribution is close to the standard normal distribution. The latter was used to check the residual heteroscedasticity: the model must be modified if the residual is a certain function of the estimated Chla. 77
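The accuracy metrics and the two comparison band combinations can be sketched as follows. The three- and four-band forms follow the expressions quoted later in the text for Taihu Lake, for example (1/R660 − 1/R692) × R740. The function names and the dictionary-based spectrum representation are assumptions made for illustration, and ARE is returned here as a fraction rather than a percentage.

```python
import math


def rmse(measured, estimated):
    """Root mean square error between measured and estimated Chla."""
    n = len(measured)
    return math.sqrt(
        sum((y - y_hat) ** 2 for y, y_hat in zip(measured, estimated)) / n
    )


def are(measured, estimated):
    """Average relative error, as a fraction of the measured value."""
    n = len(measured)
    return sum(abs(y - y_hat) / y for y, y_hat in zip(measured, estimated)) / n


def three_band(refl, l1, l2, l3):
    """(1/R(l1) - 1/R(l2)) * R(l3); refl maps wavelength (nm) to reflectance."""
    return (1.0 / refl[l1] - 1.0 / refl[l2]) * refl[l3]


def four_band(refl, l1, l2, l3, l4):
    """(1/R(l1) - 1/R(l2)) / (1/R(l4) - 1/R(l3))."""
    return (1.0 / refl[l1] - 1.0 / refl[l2]) / (1.0 / refl[l4] - 1.0 / refl[l3])
```

A band-tuning loop would evaluate `rmse` for each candidate wavelength while holding the others fixed and keep the position that gives the lowest value.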

Hyperspectral Data Simulation
To avoid the uncertainty of atmospheric effect, the band reflectance of hyperspectral sensors EO-1/Hyperion and PROBA/CHRIS that are currently in orbit was simulated using the in situ spectra and the spectral response function of each sensor so as to validate the application of the model in satellite images.
The spectral response function of Hyperion can be simulated by a Gaussian spectral response function since the full width at half maximum (FWHM, nm) of each band of this hyperspectral sensor is narrow. 78 Assuming the Gaussian peak value at the center wavelength is 1, the spectral response function of Hyperion can be calculated using Eq. (5):

$$\varphi_i(\lambda) = \exp\left[ -\frac{(\lambda - \lambda_i)^2}{2\sigma_i^2} \right], \tag{5}$$

where the subscript i represents the sensor band, λ_i is the center wavelength (nm), and σ_i is the band width (nm), which can be calculated from the FWHM.
The spectral response function of CHRIS is a strip function determined by the center wavelength and band width [Eq. (6)]:

$$\varphi_i(\lambda) = \begin{cases} 1, & \lambda_i - \Delta\lambda_i / 2 \le \lambda \le \lambda_i + \Delta\lambda_i / 2 \\ 0, & \text{otherwise} \end{cases} \tag{6}$$

where the center wavelength (λ_i, nm) and band width (Δλ_i, nm) of each channel can be found in the technical documentation of CHRIS. 79 Using the spectral response function and the in situ spectra at a wavelength interval of 1 nm, the reflectance of channel i of a hyperspectral sensor can be calculated according to Eq. (7):

$$r_i = \frac{\sum_{\lambda = \lambda_{si}}^{\lambda_{ei}} r(\lambda)\, \varphi_i(\lambda)}{\sum_{\lambda = \lambda_{si}}^{\lambda_{ei}} \varphi_i(\lambda)}, \tag{7}$$

where r_i represents the spectral reflectance of channel i, λ_si is the initial wavelength of channel i, λ_ei is the end wavelength of channel i, r(λ) is the in situ reflectance at wavelength λ, and φ_i(λ) is the spectral response factor at wavelength λ within channel i, which can be calculated from the spectral response function.
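The band simulation described above can be sketched as a response-weighted average over the 1-nm in situ spectrum. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the Gaussian width is derived from the FWHM with the standard relation σ = FWHM / (2√(2 ln 2)), and the strip function is taken as 1 inside the band and 0 outside.

```python
import math


def gaussian_srf(center_nm, fwhm_nm):
    """Gaussian response with peak value 1 at the center wavelength."""
    sigma = fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return lambda wl: math.exp(-((wl - center_nm) ** 2) / (2.0 * sigma ** 2))


def boxcar_srf(center_nm, width_nm):
    """Strip response: 1 inside the band, 0 outside."""
    half = width_nm / 2.0
    return lambda wl: 1.0 if abs(wl - center_nm) <= half else 0.0


def band_reflectance(spectrum, srf, start_nm, end_nm):
    """Response-weighted mean reflectance over 1-nm samples.

    spectrum maps integer wavelength (nm) to in situ reflectance.
    """
    numerator = 0.0
    denominator = 0.0
    for wl in range(start_nm, end_nm + 1):
        weight = srf(wl)
        numerator += spectrum[wl] * weight
        denominator += weight
    return numerator / denominator
```

Actual center wavelengths and band widths would come from the sensor documentation; any values passed to these functions in an example are placeholders.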

Spectral Reflectance and Constituent Concentration
The spectral magnitudes and shapes of the four datasets after smoothing (Fig. 2) showed characteristics similar to those of typical turbid water: 15 the relatively low reflectance in the blue range (400 to 500 nm) was a result of high absorption by water constituents; a slight reflectance trough at around 620 nm was formed by the absorption peak of phycocyanin in water containing blue-green algae; 80 a second reflectance trough at around 675 nm corresponded to Chla absorption; and a distinct peak at around 700 nm mainly resulted from both chlorophyll fluorescence and minimum absorption by optically active constituents and water. Table 3 shows the statistical characteristics of Chla and total suspended solids (TSS) across the four datasets. For July to August of 2004 and 2005, the TSS data from July to August of 1998 to 2003 can serve as a reference. CDOM is not considered in this study.

Spectral Index Sensitive to Chla
Correlation analysis was used to determine the sensitivity of spectral index to Chla. The workflow included determining the relationship between Chla and the spectral reflectance and calculating the correlation coefficient. 81 Given that Pearson's correlation coefficient only implies a linear relationship, the relation between spectral index and Chla has to be first verified using a scatter plot. A logarithmic relation was found in these datasets; therefore, Chla was transformed into its natural logarithm, denoted as lnChla.
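The sensitivity measure described above can be sketched as Pearson's correlation between a spectral index and the natural logarithm of Chla. The function names are hypothetical; the log transform follows the text.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_with_ln_chla(index_values, chla):
    """Correlate a spectral index with lnChla (Chla must be positive)."""
    ln_chla = [math.log(c) for c in chla]
    return pearson_r(index_values, ln_chla)
```

Ranking all candidate indices by the absolute value of this coefficient gives the kind of ordering reported in the correlation tables, since a strongly negative coefficient is as informative as a strongly positive one.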
The spectral indices and their correlation coefficients with lnChla in the datasets of 2004 and 2005 were calculated (Fig. 3). The two years showed the same tendency, indicating that the spectral indices changed consistently with Chla.
Correlation coefficients in 2004 were sorted from high to low and the top 10 are listed in Table 4, where p < 0.001. With regard to wavelengths, the reflectance sensitive to Chla was mainly located at or near 450, 550, 670 to 700, and 800 nm. In terms of spectral indices, RARSa had the highest sensitivity to Chla. The top three highly correlated spectral indices were all composed of the band ratio of the reflectance peak near 700 nm and the reflectance trough near 670 nm, demonstrating the effectiveness of these two bands for Chla estimation in turbid water.
The combinations of spectral indices, including band difference, band ratio and band normalization, were calculated based on the dataset of 2004.
The top five highly correlated difference and ratio combinations are shown in Table 5, where p < 0.001. Results show that the difference and ratio of spectral indices generally had a higher correlation with Chla than a single spectral index, and the performance of band ratio was slightly better than that of band difference. The ratio of the red/green pigment index 31 to the ratio analysis of reflectance spectra 56 (RGI/RARSa) had the highest correlation with lnChla (0.94), which is stronger in absolute value than that of RARSa (−0.91; Table 4).
The correlation coefficients between lnChla and the normalized combination of the spectral index pairs were calculated and the top three are listed in Table 6, denoted as NR1, NR2, and NR3. They were highly correlated with lnChla (correlation coefficient > 0.93, p < 0.001).

Estimation of Chla Based on Spectral Index
Based on the same dataset used in Sec. 3.2, the three normalized combinations, NR1, NR2, and NR3 (Table 6), as well as RARSa (Table 4) and RGI/RARSa (Table 5), were used for model building. Results are shown in Table 7.
In the comparison of the R² values and RMSEs of NR1, NR2, and NR3, NR1 had the best fitting accuracy and NR3 had the lowest RMSE (Table 7). The fitting accuracy of NR1 was slightly better than that of NR3, but its RMSE and residual variation were larger. The residual diagnostic plots of NR1 and NR3 (Fig. 4) indicated that the residuals of NR1 showed heteroscedasticity as Chla changed, and more points in NR1 deviated from the straight line of its Q-Q plot [Fig. 4(a)]. Comparatively, NR3 fulfilled the requirements of regression analysis better than NR1 [Fig. 4(b)]. The results of RARSa and RGI/RARSa were then compared with those of NR3 (Table 7). RGI/RARSa had a higher fitting accuracy (R² = 0.93) than NR3 (R² = 0.92). However, the scatter plots of RGI/RARSa and NR3 against lnChla (Fig. 5) showed that the values of RGI/RARSa were higher and varied more widely across samples, whereas the values of NR3 were lower and varied less. NR3 was therefore more stable with the change in lnChla; it is the index referred to as the NCI in the following sections.

Model Validation
Datasets of 2005, 2010, and 2011 were used for model validation. First, Chla was directly estimated using the model Chla = exp(7.6334 × NCI + 3.3325), and the resulting estimates for the three datasets were denoted as DV1, DV2, and DV3. Then, the model was reparameterized by rebuilding the regression models between lnChla and NCI derived from the three datasets, and Chla was estimated again. These models were denoted as RV1, RV2, and RV3. Figure 6 shows the scatter plots between the estimated and measured Chla of the three datasets. When the coefficients of the model were refitted using the calculated NCI and Chla in the new datasets, the estimated and measured Chla in the three datasets were consistent with each other (RV1, RV2, and RV3 in Fig. 6) and most samples were within the error line of 10 mg/m³. The RMSEs between the estimated Chla and the measured Chla for 2005, 2010, and 2011 were 10.39, 11.87, and 12.65 mg/m³, respectively. These results demonstrated the usability of NCI in the new datasets for the remote sensing mapping of Chla in a water body.
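The direct estimation and the reparameterization steps can be sketched as follows. The NCI expression and the coefficients 7.6334 and 3.3325 are those reported in the text; the function names are hypothetical, and ordinary least squares on lnChla is assumed for the refit since the text describes it only as rebuilding the regression model.

```python
import math


def nci(r550, r675, r690, r700):
    """(R690/R550 - R675/R700) / (R690/R550 + R675/R700)."""
    first = r690 / r550
    second = r675 / r700
    return (first - second) / (first + second)


def chla_from_nci(nci_value, slope=7.6334, intercept=3.3325):
    """Chla (mg/m^3) from the 2004 model: exp(slope * NCI + intercept)."""
    return math.exp(slope * nci_value + intercept)


def refit(nci_values, chla_values):
    """Least-squares fit of ln(Chla) = slope * NCI + intercept."""
    ln_chla = [math.log(c) for c in chla_values]
    n = len(nci_values)
    mean_x = sum(nci_values) / n
    mean_y = sum(ln_chla) / n
    sxx = sum((x - mean_x) ** 2 for x in nci_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(nci_values, ln_chla))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

Passing the refitted pair back through `chla_from_nci(value, slope, intercept)` corresponds to the reparameterized case, while calling it with the defaults corresponds to the direct case.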

Model Application
To verify the application of the NCI model to hyperspectral sensors, the Hyperion and CHRIS band reflectances were first simulated using the in situ spectra of 2004 and 2005 according to the method described in Sec. 2.5. Both Hyperion and CHRIS have four spectral channels consistent with the 550, 675, 690, and 700 nm bands of NCI, the basic parameters of which are shown in Table 8.
The NCI was first calculated using the Hyperion band reflectance simulated from the in situ data of 2004 and 2005; the 2004 data were used for model building and the 2005 data were used for model validation. Figure 7(a) shows the regression model between NCI and lnChla in 2004 and the model accuracy, including R², RMSE, and ARE. Figure 7(b) shows the validation results obtained by directly applying this model to the 2005 data, together with the validation RMSE and ARE calculated from the estimated and measured Chla. Similarly, Fig. 7(c) shows the NCI regression model using the CHRIS band reflectance simulated from the in situ data of 2004 and Fig. 7(d) shows the model validation result in 2005. The accuracy of the NCI model using hyperspectral data is satisfactory, with R² > 0.92 and RMSE < 17 mg/m³. The accuracy of the model built from the simulated hyperspectral data is better than that of the model built from the in situ spectra (Table 7). Meanwhile, the estimation results are also satisfactorily in line with the measured Chla in 2005 using simulation data from both sensors [Figs. 7(b) and 7(d)], with RMSE < 10 mg/m³.
These results indicated that the NCI model can be used to estimate Chla in Taihu Lake using Hyperion/EO-1 and CHRIS/PROBA spectra, provided that an accurate atmospheric correction over the water body is applied. The index is also expected to be usable with other hyperspectral satellite images with similar channel settings.

Model Performance of NCI
With regard to Taihu Lake, Le et al. 12 stated that the optimal positions for the three-band model are 660, 692, and 740 nm, and the optimal positions for the four-band model are 662, 693, 740, and 705 nm. When the three- and four-band models were directly used without reparameterization on the new datasets in this study, the performance was poor (not shown). This result is similar to the results of previous studies. 82,83 Based on the data obtained in July to August 2004, the three- and four-band combinations were compared with NCI. The regression parameters of (1/R660 − 1/R692) × R740 and (1/R662 − 1/R693)/(1/R740 − 1/R705) against Chla in 2004 were first refitted, and the optimal three- and four-band positions were then tuned to calibrate the model. Table 9 shows the results of the three- and four-band models after reparameterization and model calibration, and Fig. 8 shows their residual diagnostic plots. Table 9 shows that the models of the previous three- and four-band combinations after reparameterization (R² = 0.88; RMSE > 15.9 mg/m³) performed worse than the NCI model (R² = 0.92; RMSE = 14.36 mg/m³). Both the three- and four-band models achieved better results after model calibration by band tuning, and the calibrated four-band model (R² = 0.93; RMSE = 12.14 mg/m³) gave a better estimation than NCI.
However, the residual diagnostic plots (Fig. 8) showed that the residuals of the three- and four-band models were clearly non-normal and increased with Chla. The samples with lower and higher Chla had a larger deviation. Moreover, the error variance was correlated with Chla, indicating residual heteroscedasticity and the necessity for data transformation. 77 Comparatively, the residual diagnostics of the NCI model [Fig. 4(b)] were better, indicating its robust performance in Chla prediction.
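The Q-Q check on the residuals can be sketched numerically as follows. The function names are hypothetical, and the plotting positions (i + 0.5)/n are an assumption, since the text does not state which convention was used.

```python
from statistics import NormalDist, mean, pstdev


def residuals(measured, estimated):
    """Residuals of a fitted model, measured minus estimated."""
    return [y - y_hat for y, y_hat in zip(measured, estimated)]


def qq_points(res):
    """Pairs of (theoretical normal quantile, standardized residual).

    Points close to the line y = x suggest approximately normal
    residuals. Requires residuals that are not all identical.
    """
    n = len(res)
    mu = mean(res)
    sd = pstdev(res)
    ordered = sorted((r - mu) / sd for r in res)
    normal = NormalDist()
    # Assumed plotting positions: (i + 0.5) / n for i = 0..n-1.
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))
```

Plotting the same residuals against the estimated Chla gives the second diagnostic: a fan-shaped or trending cloud there points to heteroscedasticity.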
The robustness of the NCI model can be explained from two aspects. The first is the logarithmic transformation of Chla before model building, which improves data normality. After data transformation, the model performs better than a regression on untransformed Chla. Previous studies 83,84 demonstrated the same result.
Regression models between the three- and four-band combinations and lnChla were also constructed to further verify the effect of data transformation. The reparameterized and calibrated three- and four-band models had poorer residual diagnostic results than the NCI, except for the calibrated three-band model (not shown). However, in this calibrated model, (1/R664 − 1/R702) × R679, the third band λ3 found in the range of 450 to 800 nm did not conform to the assumption that it lies in the spectral region minimally affected by pigment absorption and can compensate for the variability in backscattering between samples. 9 If the search ranges of λ1, λ2, and λ3 were reset to 658 to 676 nm, 691 to 735 nm, and 723 to 780 nm, respectively, according to Gitelson et al., 10,15 the final optimal bands would be 664, 716, and 768 nm, but the residual diagnostic result of the calibrated model would still be poorer than that of NCI.
The second aspect that can explain the robustness of the NCI model is the stable wavelengths used in this integrated index. Using the characteristic bands obtained from the vegetation index   10,15 showed the necessity of model calibration of the three-band algorithm to make it suitable to new datasets. The three- and four-band models constructed from a specific dataset may not fulfill the model assumptions when applied to other water bodies with different optical properties, thus producing unfavorable model diagnostics and inapplicable validation in new datasets. The estimation model based on NCI had a better residual diagnostic result, indicating that it is robust enough to be used in multiple datasets in Taihu Lake.

NCI and Chla
The spectral reflectance of green plants is primarily affected by leaf color, cell structure, and plant moisture. The typical spectral curve of green plants has the following features: a small reflectance peak near 550 nm with a reflectance of 10% to 20%; two obvious reflectance troughs near 450 and 670 nm owing to strong pigment absorption; and a sharp increase of reflectance from 700 to 800 nm, leading to an obvious slope called the "red edge." Over a wide range of leaf greenness, the maximum sensitivity to Chla was found at 550 to 560 nm and 700 to 710 nm; 85,86 the correlation of reflectance with Chla at these two bands was stronger than that at 675 nm, and the reciprocal reflectance near 550 and 700 nm was proportional to Chla. 18,85 The reflectance at 670 nm decreased sharply when Chla increased up to 3 to 5 mol/cm². Thereafter, R670 was almost independent of pigment concentration, so R670 can be used as a reference. However, the reflectance peak in the NIR area (longer than 750 nm) was insensitive to Chla. 85 A ratio may amplify the differences between the spectra at specific bands due to the absorption maxima and minima of the photosynthetic pigments. 56 The band ratio of 675 and 700 nm (RARSa) was found to have a strong linear relationship with Chla (R² = 0.93) in soybean leaves. 56 The red and green pigment index RGI (R690/R550) was used to estimate the pigment concentration of the leaf. 31
The spectrum above the water surface is affected by the absorption and scattering of water and the particles in it, primarily phytoplankton, suspended sediment, and CDOM. The four characteristic bands related to chlorophyll in water color remote sensing are the blue, green, red, and NIR bands. 7,87-89 The ratio of the reflectance peak near 700 nm to the reflectance trough near 675 nm has been shown to be most sensitive to Chla in turbid waters of different trophic states, 6,17 which makes RARSa (R700/R675) reasonable.
Moreover, the R690 in NCI can be regarded as the fluorescence peak because the fluorescence peak is around 690 nm, but can vary from 685 to 695 nm, 90 or be constant at ∼685 nm. 91 Studies showed that normalizing the fluorescence peak near 700 nm to the reflectance value of the global maximum of the spectrum at green peak can accurately predict Chla, 6,70 making RGI (R690∕R550) reasonable.
Although RARSa and RGI are theoretically effective in Chla estimation, the normalized combination of these two indices, NCI, is optimal because of its robust performance as Chla changes (Fig. 5). The first advantage of the NCI is that the green band supplements information on low Chla that the red-NIR spectrum cannot fully express. Accurate estimates of low Chla using fluorescence algorithms, such as FLH and the maximum chlorophyll index, 92 or the red-NIR algorithms, 17 are unavailable in oligotrophic and some mesotrophic lakes. Thus, the green band was additionally used because the blue and green bands in the OC2-OC4 algorithms were successfully applied to retrieve 0 to 10 mg/m³ of Chla in optically complex waters. The second advantage of the NCI is that the feature wavelengths of 550, 675, 690, and 700 nm are all available in hyperspectral sensors, such as EO-1/Hyperion, HJ1/HSI, PHILLS/HICO, and so on. In contrast to the NCI, the commonly used three- or four-band models always require reflectance at wavelengths longer than 750 nm. However, obtaining reliable estimates of reflectance at these wavelengths is difficult with existing atmospheric correction schemes for turbid waters. 16 Moreover, unlike previous studies on Chla estimation based on the spectra obtained from July to August 2004 in Taihu Lake, including the NIR/R ratio, 93 the three-band combination, 94 and the band normalized combination, 84 the NCI model proposed in this study was tested by regression diagnostics and validated with multiple datasets, which supports its reliability.

Conclusion
Based on the 106 spectral indices used in vegetation remote sensing, this study first calculated the spectral indices from the in situ spectra of July to August 2004 in Taihu Lake, China. The sensitivity of the spectral indices and their band combinations to the logarithmic transformation of Chla (lnChla) was then analyzed and compared. The integrated spectral index, NCI [(R690/R550 − R675/R700)/(R690/R550 + R675/R700)], was found to be highly correlated with Chla, demonstrating its potential use in Chla estimation in turbid waters.
Based on the NCI, a new Chla estimation model was constructed from the 2004 data, namely Chla = exp(7.6334 × NCI + 3.3325), with R² of 0.92 and RMSE of 14.36 mg/m³. When the model was validated using the datasets of July to August 2005, September 2010, and September 2011, the model after reparameterization yielded low RMSEs between measured and estimated Chla, which were 10.39, 11.87, and 12.65 mg/m³, respectively. Compared with the three- and four-band models, the residual diagnostics of the NCI model were clearly better, indicating the robustness of the model and its satisfactory validation performance in multiple datasets. Using the Hyperion/CHRIS band reflectance simulated from the in situ spectra, the model results in 2004 and the validation results in 2005 were both satisfactory, showing good applicability of the NCI model.
This study indicates that the abundant results from vegetation remote sensing have a great potential for Chla estimation in turbid waters. The NCI proposed in this study with stable band positions and robust model performance can be preferably used for Chla estimation in turbid waters.
The Chla range suitable for NCI in this study is from 4 to 192 mg/m³. The usability of NCI outside this range will be discussed in future research. Given that the four datasets were all collected from Taihu Lake, one limitation of this study is that the calibration and validation datasets cover only part of the optical properties of natural turbid waters. We suggest calibration and validation of the algorithms based on more field-measured data.