Combining satellite ocean color and hydrodynamic model uncertainties in bio-optical forecasts

Abstract We describe an approach to produce short-term (1- to 3-day) forecasts of bio-optical properties by coupling moderate-resolution imaging spectroradiometer satellite (MODIS) ocean color imagery with a hydrodynamic model. The bio-optical property (chlorophyll in this case) is treated as a conservative tracer; the satellite distribution is advected forward in time using the current field from the hydrodynamic model. Uncertainties in both the satellite chlorophyll values and the currents from the circulation model impact the final forecast; we apply ensemble techniques to quantify the errors separately and in combination. For the ocean color imagery, we further apply ensemble techniques to partition the chlorophyll uncertainties into components due to atmospheric correction and bio-optical inversion, by applying noise to the near-infrared and visible band sets separately. The standard deviation for each ensemble suite provides an indication of uncertainty, or confidence in the satellite chlorophyll values and the hydrodynamic model current fields. By combining the two ensemble sets, we produce a final chlorophyll forecast field and associated uncertainty map that include both sets of uncertainties. We examine mean and individual forecast ensemble members (spread-skill statistics, RMS differences) to assess predictive value. This work represents a significant advancement in representing errors associated with satellite ocean color imagery and bio-optical forecasts.

Combining satellite ocean color and hydrodynamic model uncertainties in bio-optical forecasts

Introduction
Satellite ocean color imagery can provide synoptic, surface estimates of water bio-optical properties, which are used in a wide variety of applications that are directly impacted by the penetration of light in the water column. Thus, applications ranging from process-based ecological studies, to coastal management, and target detection/diver visibility for the military can all benefit from remotely sensed bio-optical property estimates. However, the bio-optical properties are not directly measured by the satellite spectroradiometer, they must be derived through a series of complicated steps, each of which has associated uncertainties. The satellite measures spectral radiances reflected from the surface layer of the ocean, after transmission upward through the atmosphere. Thus, these measured top-of-the-atmosphere (TOA) radiances must undergo an atmospheric correction procedure to remove the light scattered into the viewing cone of the sensor by the atmosphere, in order to retrieve the desired water-leaving radiances (Lw). 1 The water-leaving radiance or remote sensing reflectance (Rrs), following conversion from radiance, is the important parameter related to the in-water bio-optical constituents. 2 The Lw or Rrs estimates can then be incorporated into bio-optical inversion algorithms to subsequently derive estimates of the water bio-optical properties. [3][4][5][6] Typically, to address uncertainties in satellite-retrieved water reflectances and bio-optical properties (partitioned absorption coefficients, backscattering coefficients, chlorophyll concentration, suspended particulates, diffuse attenuation, and euphotic depth), the satellite values are compared to in situ measurements, 7 but this approach has limitations. Due to cloud cover in the satellite imagery, and the expense and spatial/temporal coverage limitations associated with in situ data collection, there are often very few match-ups between the satellite and in situ values, particularly for regional comparisons. [8][9][10][11] Satellite ocean color image products, such as the chlorophyll concentration, are typically provided without any indication of the uncertainty in the estimation, so the end-users (scientists, coastal managers, and military personnel) have very limited information on the reliability of the satellite retrievals for a specific image. Any property field, whether derived from in situ measurements, remotely sensed imagery, or models, has an associated uncertainty, and the property estimate is only as good as the knowledge of this uncertainty. Furthermore, if the satellite-derived products, such as chlorophyll concentration or inherent optical properties (IOPs), are subsequently used in primary production and/or electrooptical sensor performance models, or assimilated into ecosystem models, then the satellite uncertainties will propagate through the downstream products of those models as well.
To help address this shortcoming, we have extended an approach used by the environmental modeling community to satellite ocean color imagery. There are many errors and sources of uncertainty associated with operational forecasting, regardless of the forecast property. These include incomplete and/or inaccurate observations, uncertainties introduced through data assimilation, and unresolved dynamics and instabilities. Ensemble methods have been applied in meteorology and physical oceanography to predict model errors and to improve weather and ocean hydrodynamic forecasts, 12,13 by perturbing the dominant sources of uncertainties (e.g., initial and boundary conditions, resolution, physics, atmospheric forcing, bathymetry girds, included data sets, and coefficient values) in the forecast model. An ensemble, therefore, reflects known sources of forecast uncertainty and allows them to be evaluated. 14 The "spaghetti plots" of potential hurricane tracks are a familiar example. An ensemble forecast suite is generated; the ensemble mean represents the "best-guess" forecast track, and the ensemble variance or standard deviation represents a proxy estimate for the uncertainty in the forecast. Statistics and metrics can be utilized (ensemble mean, RMS error, and spread) to provide some estimate of how well the ensembles capture "reality," thereby providing insight into the underlying deterministic processes and aiding decision support.
Although there are also multiple error sources throughout the processing of satellite ocean color imagery, similar approaches have not been applied to satellite optical property estimates. At each step of the processing, from measurement of TOA radiances, through atmospheric correction and bio-optical inversion, uncertainties propagate and are intertwined. Thus, ocean color image processing should lend itself to an ensemble approach to address the error cascading through the various steps; we have developed such an approach to partition and assess the error sources.
In addition, we are interested in forecasting the bio-optical products (in this case, chlorophyll) by coupling the satellite images with a hydrodynamic model. There are two approaches to produce short-term (1-to 3-day) forecasts of bio-optical properties: case (i) treats the property as a conservative tracer and advect a satellite-observed distribution forward in time using current fields from a hydrodynamic model, and case (ii) uses a fully coupled biophysical process model that includes applicable sources and sinks. Here, we address only the first case, which does not include biology in the simulation (it only accounts for dynamical processes, such as winds, currents, and tides). We would like to propagate the uncertainties in the chlorophyll image values throughout the forecasting process, along with the uncertainties in the model current fields.
Therefore, a strong requirement exists to provide the users of the bio-optical properties, including coupled biophysical modelers who will assimilate the imagery and Navy mission planners and warfighters, with estimates of uncertainty in the derived satellite products and biooptical forecasts. New measurement systems are now available with optical instrumentation (AERONET-OC, gliders, scanfish, and moorings), new satellite sensors are online [(hyperspectral imager for the coastal ocean, (HICO); visible infrared imaging radiometer suites, (VIIRS)], and climatological image archives are all available to help constrain the data ranges and calibrate the ensembles, so the framework is in a place to develop these methodologies for ocean color. Application of ensemble techniques to satellite ocean color processing will greatly improve the value of the derived optical products to the end-users.
Our objectives are: 1) to apply noise to satellite TOA radiances in an ensemble approach to quantify uncertainties in satellite-derived ocean color chlorophyll estimates; 2) to determine whether the ensembles are realistic; 3) to generate ensembles using different wavelength sets to partition the uncertainties into components due to atmospheric correction and bio-optical inversion; 4) to generate a separate ocean hydrodynamic ensemble set to quantify uncertainties in the model currents; and 5) to produce short-term chlorophyll forecasts and associated uncertainty maps using a hydrodynamic model simulation that incorporates both sets of uncertainties (chlorophyll and currents).

Methods
To assess uncertainties in ocean color bio-optical properties, we repeatedly apply realistic noise to the satellite TOA radiances (L t ). This leads to an ensemble of radiance values at each image pixel (and subsequently an ensemble of chlorophyll images). We apply AE2% random noise to the TOA radiances at each wavelength (i.e., different noise applied at each wavelength), with the same noise applied at each pixel in an image. The 2% value was selected as reasonable because the spectral vicarious calibration gain coefficients used to correct the moderate-resolution imaging spectroradiometer (MODIS) Aqua TOA radiance values during standard NASA ocean color processing are generally in the range of AE2-3% (http://oceancolor.gsfc.nasa.gov/ VALIDATION/operational_gains.html).
The 2-3% gains derived through the MODIS vicarious calibration process are really "system gains" that are used to adjust the sensor L t radiances to force closer agreement between satellitederived and in situ bio-optical measurements. Thus, they represent the "mismatch" between the satellite L t values and those that would be required to yield the "correct" bio-optical property values; they do not strictly represent uncertainties in the L t radiances (which were about 3% prelaunch but are more on the order of 1.0% or less following vicarious calibration). 15 However, the gains do indicate the % change in L t that is required to improve the matchups. Hence, we use them as an indication of a reasonable magnitude of noise to apply to create the ensembles. The 2% level we used is just an example to demonstrate the process; perhaps the level could be adjusted with further testing/evaluation. However, the range of chlorophyll values that result from application of these uncertainty/noise levels yields standard deviations (which we use as a proxy for uncertainty) that are similar to generally accepted uncertainty values for satellite-derived chlorophyll estimates (35%-50%). 8,16 This lends further credence to our selection of 2% noise levels. If 2% was radically too high, our results would show unreasonably highchlorophyll standard deviations, which is not the case. We also compare the range of ensemble variability to observed satellite climatological variability, to further assess the appropriateness of this selection of 2% random noise (Sec. 3.2). Finally, keep in mind that the actual noise applied during the ensemble generation was generally less than 2% (that is the maximum amount applied, the random noise applied to the individual ensembles was between 0% and 2%).
To partition the error sources in a satellite-derived bio-optical product, we apply the noise to separate wavelength sets. For example, to examine the effect of sensor radiance measurement uncertainty on the atmospheric correction process, we apply noise only to the TOA radiances in the two near-infrared (NIR) MODIS channels used in the atmospheric correction routines to select the aerosol models (748-and 869-nm bands). To examine the effect of measurement uncertainty on the bio-optical inversion process, we apply noise only to the TOA radiances in the seven visible MODIS channels (412, 443, 488, 531, 547, 667, and 678 nm) used in the algorithms to estimate water bio-optical properties, such as chlorophyll, absorption, and backscattering coefficients (see Sec. 2.1 for further clarification on the chlorophyll algorithm used and the effects of the modified radiance values). To examine the combined uncertainty due to both the atmospheric correction and bio-optical inversion processes, we apply noise to both the NIR and visible band sets. Figure 1 schematic summarizes the ensemble approach applied to satellite ocean color imagery. Here, we only examine the effect of noise applied to the TOA radiances; we do not examine the effect of noise on the coefficients in the bio-optical inversion algorithms, which can be addressed separately.
For the ocean color error partitioning, the generation of the random TOA radiance noise is repeated 100 times to create an ensemble suite of 100 chlorophyll images for the analysis. For the bio-optical forecasting, only 20 ensemble members are generated; each will be advected forward in time by the hydrodynamic model currents to create 1-to 3-day forecasts of the chlorophyll distribution, as described below.
In addition to developing ensembles to examine uncertainties in the satellite products, we generate ensembles from a hydrodynamic model to examine uncertainties in the ocean currents. For the hydrodynamic ensembles, we perturb the model initial and boundary conditions and atmospheric forcing 32 times, to create an ensemble suite of 32 different model current fields.
We then use these ensemble sets to perform two sets of chlorophyll forecasts. For the first case, we do not include any uncertainty in the initial chlorophyll field; we simply use the original chlorophyll image without any noise applied. This means we are treating the satellite chlorophyll distribution as error-free and assuming no uncertainty in the values. The single initial image is advected separately 32 times using the ocean ensemble suite, resulting in 32 different chlorophyll forecast realizations. Thus, the variability in the chlorophyll forecast is only due to the uncertainty in the model currents. For the second case, we no longer assume that the chlorophyll field is perfectly known; we use the 20 chlorophyll ensemble members (with noise applied to both the visible and NIR band sets) as an indication of possible variability. Each chlorophyll ensemble member is advected forward in time using each hydrodynamic ensemble member to create a forecast ensemble suite with 640 ensemble members (i.e., 20 × 32 ¼ 640 separate chlorophyll forecast images). Thus, the variability in these chlorophyll forecasts is due to uncertainties in both the model currents and the initial chlorophyll image. From each ensemble set (with and without noise included in the initial chlorophyll image), we derive mean and standard deviation images for the chlorophyll forecasts. The standard deviation images serve as a proxy for uncertainty. Thus, by comparing the differences in the two forecasts, we can assess the separate effects of the chlorophyll and current uncertainties.

Satellite Imagery/Processing
The Naval Research Laboratory (NRL) at the Stennis Space Center (SSC) in Mississippi has developed an automated processing system (APS) that ingests and processes AVHRR, SeaWiFS, MODIS, MERIS, OCM, HICO, and VIIRS satellite imagery. 17 APS is a powerful, extendable, and image-processing tool. It is a complete end-to-end system that includes sensor calibration, Fig. 1 Schematic representation of the ensemble process applied to satellite ocean color imagery. The goal is to derive an uncertainty estimate for the bio-optical products (chlorophyll in this example), using the ensemble standard deviation as a proxy.
atmospheric correction (with NIR correction for coastal waters), and bio-optical inversion. APS incorporates the latest NASA MODIS code and enables us to produce the NASA standard MODIS products, as well as Navy-specific products using NRL algorithms. We can quickly reprocess many data files (hundreds of scenes/day), and we maintain compatibility with NASA/Goddard algorithms and processing code in the SeaWiFS Data Analysis System (SeaDAS); APS is updated as SeaDAS is updated.
All imagery was processed with consistent atmospheric correction and bio-optical algorithms using the NRL APS Version 4.6, which is consistent with SeaDAS Version 6.3 (both SeaDAS and APS have been modified and updated). The OC3M algorithm was used to estimate chlorophyll concentration; 4 it requires radiances at 443, 488, and 547 nm. Thus, adding noise to the other MODIS visible wavelengths will not directly impact the chlorophyll estimates (the other visible wavelengths are used in other bio-optical algorithms, but those are not presented here). However, an iterative, NIR atmospheric correction tuned for coastal waters was applied, 18 as was a correction for absorbing aerosols. 19 Perturbing the 667-nm radiances has an effect on the NIR atmospheric correction, and the absorbing aerosol correction can potentially affect radiance values at all the visible wavelengths [the absorbing aerosol algorithm increases normalized water-leaving radiances (nLw) (with greater increases at blue wavelengths) by reducing the aerosol radiance subtracted during the atmospheric correction]. Thus, application of these algorithms will indirectly affect the chlorophyll estimates.
With the NRL APS, the architecture is in place for the image ensemble analysis. From an initial MODIS image, we simply create an ensemble of new images by applying the AE2% random noise to the TOA radiance values. Each ensemble image is then reprocessed through APS to yield an ensemble of derived products, such as n L w and chlorophyll (that we examine here) among others.

Hydrodynamic Model
To advect the surface MODIS satellite chlorophyll field and produce 24-, 48-, and 72-h forecast simulations, we used currents derived from the Relocatable Navy Coastal Ocean Model (RELO-NCOM). RELO-NCOM is based on a standardized development and an efficient configuration management to facilitate transitions of new tools and real-time configurations of regional highresolution (order 1 km) ocean predictions. The physics and numerical procedures of NCOM are based on the Princeton Ocean Model and a Sigma/Z-level Model (SZM). It solves a three-dimensional (3-D), primitive equation, baroclinic, hydrostatic and free surface system using a Cartesian horizontal grid, a combination of σ∕z level (i.e., bottom-following/constant depth) vertical grid, and implicit treatment of the free surface. 20 It uses the Mellor-Yamada level 2.5 turbulence closure scheme, and the Smagorinsky formulation for horizontal mixing. 21 For mesoscale real-time applications, boundary conditions are taken from an operational run of the global NCOM (GNCOM). The domain of this particular experiment covered the entire Gulf of Mexico (18°N 98°W, 40°N 79°W), from April 1, 2011, to October 30, 2011. The atmospheric forcing was taken from the regional 15-km coupled ocean/atmosphere mesoscale prediction system run by the Fleet Numerical Meteorological and Oceanographic Center. Tides were introduced at the boundaries and through local tidal potentials. The horizontal grid spacing was set at 3 km and used 50 σ∕z levels in the vertical. The model assimilates local in situ observations along with satellite altimetry and sea-surface temperature data using a combination of model analysis and data; all available observations from global and local databases were assimilated over the full period.
For the chlorophyll forecasts, a "pseudo 3-dimensional" Eulerian advection scheme was used (without molecular or turbulent diffusion terms). With this approach, there are essentially two vertical layers, a 1-m-thick surface layer and a conceptual deep layer to preserve continuity (i.e., there is vertical flux between the two layers, but they move together horizontally). These simulations only include current advection and an assumed uniform vertical chlorophyll distribution based on the surface values. Future enhancements will include addition of diffusion terms, full 3-D vertical layering, and the capability to include more realistic vertical chlorophyll profiles. The forecast simulations do not include any assimilation of in situ chlorophyll data or additional satellite imagery, so currently the values are unconstrained. Also, with this approach, there is an implicit assumption that the bio-optical property (chlorophyll) is conservative. Although this is not strictly true, of course, it may be approximately valid over the short time scales (1-3 days) that we are examining, particularly in coastal areas where transport processes might be expected to dominate biological processes. Therefore, we consider the optical properties to be "pseudoconservative" tracers for our purposes. This allows us to ignore growth and grazing terms for this case and treats the distributional changes as though they are entirely due to dynamical processes. 22 3 Results Figure 1 illustrates the process we used to generate ensembles for an ocean color satellite image. As mentioned above, we generated a suite of 100 individual chlorophyll ensemble members from a single image by randomly applying AE2% noise to the TOA radiances in the visible and NIR bands. Different random noise was applied to each band, but there was no variation from pixel to pixel (i.e., it was constant across the scene). By perturbing the radiances, we are simulating measurement uncertainty or sensor degradation, which affects the atmospheric correction process and subsequently the downstream water-leaving radiance estimates (and eventually the derived bio-optical products, following application of the bio-optical inversion algorithms). Thus, from the single original image, we create a set of 100 separate chlorophyll estimates, each representing some deviation from the unperturbed chlorophyll estimate. We then calculate the standard deviation at each pixel using this image set, and we use this estimate of variability as a proxy for uncertainty. The assumption here is that the ensemble set is realistic, that we are applying a reasonable amount of noise to the radiances, and that the noise encompasses natural variability. We need to test this assumption, to ensure that the ensemble variability will represent accurate estimates of product uncertainties.

Generating Ensembles
As an example test case to demonstrate the methodology, we selected a clear MODIS image covering the northern Gulf of Mexico (October 14, 2011). Figure 2(a) shows the chlorophyll image from the unperturbed, original image following standard processing (consistent with NASA algorithms). The mean chlorophyll image from the 100-member ensemble suite is shown in Fig. 2(b), and the associated uncertainty image (the ensemble standard deviation) is shown in Fig. 2(c). Note that the ensemble mean image looks very similar to the unperturbed chlorophyll image, as we would expect if our ensemble suite is realistic.

Assessing Ensemble Representativeness
An initial step with the use of ensembles is to assess whether the ensemble suite adequately represents reality. In other words, is the ensemble variability representative of natural variability? For this assessment, since we applied noise to the image TOA radiances (L t ), we compared the ensemble L t radiances (mean, minimum, and maximum values) to the L t radiances from the original image (nearly identical; not shown), and to L t radiances derived from selected clear MODIS scenes covering the same area over a 2-year period from 2006 to 2007 (Fig. 3). We refer to the 2-year values as climatological values. The values in Fig. 3 are spatial averages across the Mississippi Bight region (i.e., the entire scene in Fig. 2), as well as temporal averages over the 2-year period for the climatology. For the most part, the ensemble mean and the minimum/ maximum values fall within the envelope of variability described by the climatology. Hence, the ensemble suite is not generating "unusual" variability outside the realm of observed natural variability, and our assessment is that the ensemble suite is realistic. However, the L t radiances for this image (original as well as ensemble mean values) are lower than the climatological means at all wavelengths (solid red versus solid blue lines, Fig. 3).
We also examined the effect of the L t noise on the derived n L w values, to verify that the radiance values following atmospheric correction were also realistic. Again, we compared the ensemble values to the original values [ Fig. 4(a)] and the climatological values [ Fig. 4(b)]. These are mean values averaged across the entire image, as in Fig. 3. The mean ensemble n L w values at the short wavelengths are slightly higher than the original values [ Fig. 4(a)]. Both the ensemble and original n L w values are significantly lower than the climatological values at 412 nm, but the mean and standard deviation values are generally similar at the other wavelengths [ Fig. 4(b)].
To more closely examine the n L w spectral variability of the ensemble suite, and to further assess the validity of the ensembles, we compared the individual ensemble spectra to the spectrum of the original, unperturbed MODIS image, at two locations (instead of looking at spatial and temporal averages, as in Figs. 3 and 4). The first location is a coastal aerosol robotic network  in situ measurements at this location. The ensemble and original spectra were extracted and averaged over 3 × 3 pixel boxes centered at the AERONET and open-ocean locations.
At the AERONET location, the ensemble mean and standard deviation [ Fig. 5(a), solid and dashed red lines, respectively] fall within the observed envelope of measured variability from the in situ data set (3-year mean and standard deviation, solid and dashed green lines, respectively), for most wavelengths. Note that the purpose of this comparison is not to validate the satellite data against the AERONET data, but simply to demonstrate that the generated ensembles are realistic and not outside the longer-term envelope of variability for the region. However, the mean ensemble n L w spectral shape differs from the in situ measured spectral shape; relative to the AERONET mean spectrum, the ensemble mean spectrum has higher n L w values at 443 nm and lower values at 531 nm (the individual ensemble spectra match this pattern as well). The n L w spectral shape from the original, unperturbed image (blue line), also follows this pattern; however, indicating that the differences in the ensemble versus in situ spectral shapes are not due to improper ensemble generation, but due to differences between the original versus in situ spectral shapes.
As at the AERONET site, the ensemble mean spectrum at the open-ocean site [ Fig. 5(b), solid red line] closely matches the original spectrum (blue line), although the ensemble values are slightly higher than the original spectrum at 412 and 443 nm. Note that at both sites, the spectral shapes of the individual ensemble members can differ significantly from the mean spectrum. In some cases, the individual spectra are not even realistic, particularly in the 443-to 547-nm wavelength range at the AERONET location, and in the 412-to 488-nm range at the open-ocean  location. These spectral shape differences will undoubtedly lead to widely varying and probably unrealistic chlorophyll values derived from the individual spectra (although the mean spectrum is fine) and point to the need for spectral constraints when generating the noise ensembles; this is discussed further below and it is a topic of our current research.
Since the ultimate goal is to derive water bio-optical properties from the ocean color imagery, we further compared the chlorophyll frequency distribution from the original, unperturbed image to the ensemble frequency distribution [log chlorophyll; Fig. 6(a)]. As a result of the slightly higher n L w radiances for the ensemble mean at blue wavelengths (412 and 443 nm) compared to the original values [ Fig. 4(a)], the ensemble chlorophyll values are slightly lower than the original chlorophyll values, i.e., the chlorophyll distribution is skewed to slightly lower values across the scene [ Fig. 6(a)]. This is also apparent in the frequency distribution of the percent differences between the original chlorophyll values and the ensemble chlorophyll values [Fig 6(b)]. The ensemble mean chlorophyll value is about 5% lower than the original value (ranges from about 15% lower to about 5% higher than the original chlorophyll).
These results indicate that although our mean L t and n L w ensemble variability falls within the range of variability observed in the climatology and in situ data sets, the "unconstrained" spectral noise on the L t values might be creating some unrealistic individual ensemble members. By "unconstrained," we mean that the noise applied to one wavelength is independent of the noise applied to another wavelength, which could potentially lead to the undesirable effect of creating L t spectral shapes (and subsequently n L w spectral shapes) that are not realistically observed in nature (e.g., see some of the individual ensemble spectra in Fig. 5). This effect would be particularly troublesome at blue and green wavelengths, since the spectral shape in this portion of the spectrum determines the blue-green ratio, which is used in the OC3M chlorophyll algorithm. Furthermore, a 2% increase in L t at 443 nm and a 2% decrease at 547 nm has a more significant impact on the chlorophyll retrieval than a 2% increase at both wavelengths. Thus, these altered spectral shapes could significantly change the derived chlorophyll value. We are currently developing a method to apply spectral constraints during the generation of the L t ensembles that should eliminate the erroneous individual spectral shapes and yield an ensemble chlorophyll frequency distribution that is more consistent with the original distribution.
In Fig. 6, we compared the ensemble mean chlorophyll to the original chlorophyll (from the unperturbed image) and observed that, on average, the ensemble chlorophyll was slightly less than the original chlorophyll. We have also compared the ensemble chlorophyll range to the climatological range observed in the 2-year MODIS climatology. Figure 7 shows the spatial distributions of the differences between the ensemble minimum chlorophyll value at each pixel and the climatological minimum chlorophyll value [ Fig. 7(a)], and the difference between the climatological maximum chlorophyll value and the ensemble maximum chlorophyll value [ Fig. 7(b)]. Pixels in yellow indicate areas where the ensemble range (minimum or maximum) falls outside of the climatological range. The minimum ensemble chlorophyll values fall slightly below the minimum climatological chlorophyll values at many of the offshore pixels [ Fig. 7(a)], suggesting again that perhaps the ensembles require adjustment (through application of spectral constraints to the L t noise values applied).

Partitioning Chlorophyll Differences and Uncertainties
We examined the spatial distributions of the differences between the mean ensemble and original chlorophyll values; Fig. 8 shows the percent differences between the two. We first apply noise to the L t values in both the visible and NIR channels [ Fig. 8(a)]. This demonstrates the noise impact on the complete processing. Then, by only applying noise to the L t values in the two NIR bands (748 and 869 nm), we can assess the effects of the noise only due to the atmospheric correction process [i.e., differences due to different aerosol selection models, Fig. 8(b)]. Similarly, by only applying noise to the visible channels (412, 443, 488, 531, 547, 667, and 678 nm), we can assess the effects of the noise on the bio-optical inversion algorithms [ Fig. 8(c)]. Figure 8(a) indicates that the mean ensemble chlorophyll values are generally lower than the original chlorophyll values across most of the image, by about 5-10% (mean difference, averaged across the entire image is −4.8% when noise is applied to all channels [which is consistent with Fig. 6(b)]. A much lower percent difference is observed when noise is applied to just the NIR channels [mean difference ¼ −0.5%, Fig. 8(b)]. When noise is applied to just the visible wavelengths [ Fig. 8(c)], the mean difference is −4.3%, indicating that a relatively larger proportion of  the observed chlorophyll differences is due to the bio-optical inversion algorithms, rather than the atmospheric correction.
Using the noise partitioning approach described above (for Fig. 8), we can also examine the separate effects of the atmospheric correction and bio-optical inversions on the uncertainty distributions (Fig. 9). Figure 9(a) shows the coefficient of variation (CV) expressed as a percentage (CV, the ratio of the standard deviation to the mean * 100) across the image, when L t noise is applied to all wavelengths (both visible and NIR). The mean CV across the image is 33.6%. Figure 9(b) shows the result when noise is applied only to the NIR bands (mean CV ¼ 11.5%), and Fig. 9(c) shows the results for noise applied only to the seven visible bands (mean CV ¼ 30.8%). As in Fig. 8, Fig. 9 indicates that most of the uncertainty is associated with the bio-optical inversion algorithms (expressed by adding noise to the visible bands), rather than with the atmospheric correction (expressed by adding noise to the NIR bands).

Forecasting Chlorophyll, Partitioning Uncertainties, and Assessing Skill
In addition to generating chlorophyll ensembles to create uncertainty maps for the chlorophyll product, we also generated a short-term (3-day) forecast of the chlorophyll distribution (along with an accompanying uncertainty image for the forecast), by coupling the satellite image with current estimates from a hydrodynamic model. To do this, we examined a clear, 3-day period (covering the Mississippi Bight region in the northern Gulf of Mexico) from October 14-17, 2011. This hydrodynamic approach treats the chlorophyll as a passive tracer and only accounts for dynamical processes (winds, currents, and tides) and does not include biogeochemical mechanistic processes (growth and grazing); it allows us to examine the effect of just current variability on the bio-optical forecasts.
We created two sets of ensembles, one set for the ocean hydrodynamic model currents and one set for the initial chlorophyll image to be advected using the currents; by examining them separately or in combination, we can assess the uncertainty in the 3-day forecast due to the uncertainty in the currents (top panel in Fig. 10), or due to the combined uncertainty in the chlorophyll estimate and the currents (bottom panel in Fig. 10). An ensemble of 20 chlorophyll images was generated for the initial October 14 MODIS scene by applying random noise (AE2%) to the L t radiances for all the visible (7) and NIR (2) channels. An ensemble of 32 ocean model members was generated by varying initial and boundary conditions, and atmospheric forcing. Thus, a total of 640ð32 × 20Þ chlorophyll forecasts were generated by advecting each of the 20 chlorophyll ensemble members from October 14 for 3 days with each of the 32 ocean current ensembles. Fig. 9 Chlorophyll coefficient of variation for noise applied to separate band sets (partitioned uncertainty) for the October 14, 2011 MODIS image (a) L t noise applied to both NIR and visible wavelengths, (b) L t noise applied to just NIR wavelengths, and (c) L t noise applied to only the visible wavelengths.
In Fig. 11(a), the unperturbed chlorophyll image for October 17 (end of the forecast period) is shown, for comparison to the forecast chlorophyll images. The mean ensemble chlorophyll forecast resulting from the 3-day simulation when only the hydrodynamic uncertainty was included (mean of 32 ensemble members) is shown in Fig. 11(b) for October 17. In this case, we are essentially assuming that the initial chlorophyll image is error-free. The mean ensemble chlorophyll forecast resulting from inclusion of both the hydrodynamic and chlorophyll uncertainties (mean of 640 ensemble members) is shown in Fig. 11   River plume and advected toward the east [high-chlorophyll feature near 29°N, 88°W in Fig. 11(a)].
The standard deviation image corresponding to Fig. 11(b), for the simulation including only the hydrodynamic uncertainty, is shown in Fig. 12(a). The highest standard deviation values (i.e., greatest uncertainty in the currents) are along the Mississippi River delta. The standard deviation image corresponding to Fig. 11(c), for the simulation including both the hydrodynamic and chlorophyll uncertainties, is shown in Fig. 12(b). Now, high-standard deviation values are observed along the Louisiana, Mississippi, and Alabama coastlines, in addition to the high values near the delta. Thus, the addition of the chlorophyll uncertainty has increased the forecast uncertainty along the coastlines [i.e., the standard deviation field that would result if image 12(a) was subtracted from image 12(b)]. Figure 12(c) shows the forecast mismatch (i.e., the difference between the forecast and observed chlorophyll distributions, without regard to sign, |Figs. 11(c)-11(a)|), for the forecast that includes both hydrodynamic and chlorophyll uncertainties. If we compare Fig. 12(c) to the corresponding forecast standard deviation [ Fig. 12(b)], we see that the patterns match quite well (except for the eddy-like feature mentioned above); the largest ensemble standard deviations correspond to the largest forecast observation mismatches. This indicates that the forecast mismatch is related to both the hydrodynamic and chlorophyll uncertainties and that the chlorophyll is behaving conservatively over most of the image area (i.e., our assumption that the chlorophyll is behaving as a passive tracer over this relatively short time period is valid). Thus, we can account for most of the mismatch by the uncertainties in the currents and in the satellite-derived chlorophyll estimates.
By subtracting Fig. 12(b) from Fig. 12(c), we can get an indication of the chlorophyll sources and sinks (Fig. 13). In this figure, the red pixels indicate a chlorophyll "sink," where observed values were lower than the forecast values, and the blue pixels indicate a chlorophyll "source," where the observed values were higher than the forecast values. White pixels indicate differences close to 0. These sources and sinks are due to errors in the model currents and/or processes that affect the chlorophyll concentrations that are not accounted for in the model, such as growth and grazing. This image is essentially an assessment of the forecast mismatch after trying to account for the uncertainty. Notice that, except for the eddy feature, the sources and sinks are located along the coast and near the Mississippi River delta. The eddy feature highlighted as a source is most likely due to erroneous current forecasts in that area and probably represents a separation of high-chlorophyll water from the Mississippi River plume, rather than in situ phytoplankton growth.
For a more quantitative evaluation of the forecast skill, we can examine spread-skill statistics. Spread-skill scatter plots of the standard deviation of the observed mismatch between the forecast and observed distribution versus the ensemble predicted standard deviation are shown in Fig. 14. Each point in Fig. 14 represents a bin of 1000 pixels. Good spread-skill (a linear distribution following the one-to-one line) indicates that the predicted standard deviation increases with the mismatch standard deviation. For the forecast with only hydrodynamic uncertainty included [ Fig. 14(a)], there is good spread-skill at standard deviations less than about 1.6, but deviation from the 1∶1 line are observed at standard deviations above this value. The points above the 1∶1 line indicate sources and sinks not accounted for in the model (i.e., the red and blue pixels in Fig. 13). For the forecast with both hydrodynamic and chlorophyll uncertainties included [ Fig. 14(b)], there is improved spread-skill, indicated by the more linear distribution over a larger range (higher coefficient of determination, R 2 ). The distribution of points below the 1∶1 line up to X values of about 2.5 indicates a slight overprediction of uncertainty from the ensembles, whereas the points above the 1∶1 line at higher values indicate an underprediction. In general, the forecast demonstrates good spread-skill with both properties increasing fairly linearly.  Finally, we examine the ensemble root-mean-square (RMS) difference to assess the spread of the ensemble variability and look for outliers in the ensemble suite (Fig. 15). This matrix represents the RMS difference between an individual ensemble forecast image and the reference image (the unperturbed October 17 image), for each of the 640 ensemble members. The vertical dimension represents the 20 image ensemble members and the horizontal dimension represents the 32 ocean current ensemble members. The horizontal banding at image ensemble members 7 and 18 (higher RMS differences compared to the other image ensemble members) indicates that these two ensemble members have chlorophyll values significantly different from the others. These two are outliers and could be removed (or the image ensemble suite should be adjusted using spectral noise constraints as discussed above) to improve the estimates of the forecast uncertainty.

Concluding Remarks
We developed, applied, and evaluated ensemble methodology to satellite ocean color imagery. We extended advances by the numerical modeling community to represent observational and algorithm error sources for ocean ensembles, and we evaluated the ensemble representations and assessed ensemble metrics. We investigated uncertainties (coupled and uncoupled) associated with satellite instrumentation errors, bio-optical algorithm processing errors, and advective model current errors. We recognize that these errors do not cover the complete uncertainty space associated with satellite-derived observations, but we consider them as most critical for representation of satellite ocean color forecast product uncertainties.
We first assessed whether our ensemble members were realistic, and whether they adequately captured observed environmental variability, by comparing the ensemble spectral radiances (L t and n L w ) to the original, unperturbed radiances, and to natural variability observed in climatological ocean color imagery and in situ radiance measurements from the same geographic region. We observed a slight (5%) bias in the mean ensemble chlorophyll values compared to the original chlorophyll values, which was due to the inclusion of spectral outliers in the ensemble set. We are currently developing a method to spectrally constrain the radiance ensembles to address this issue.
Next, we partitioned the chlorophyll variability into separate components due to the atmospheric correction and due to the bio-optical inversion, by applying noise to the NIR and visible wavelength band sets separately and together. We observed that most of the variation in the derived chlorophyll values was due to the bio-optical inversion algorithms. Also, based on our initial results, it seems that adding L t radiance noise to the NIR wavelengths might have a greater impact on chlorophyll uncertainty in clear, offshore waters than in the more turbid coastal waters [based on the pattern observed for the CV, in Fig. 9(b)]. However, our results are based on limited test cases and would require additional analyses to make any definitive comments regarding derived uncertainties in clear versus turbid waters.
Finally, we produced chlorophyll forecast images by advecting the satellite imagery forward in time using hydrodynamic model currents. We again applied an ensemble approach, to assess the uncertainties in the resulting forecast. We applied the ensembles in two ways, to partition the uncertainties. In the first case, we generated only hydrodynamic ensembles to assess forecast uncertainties only due to uncertainties in the model currents. In the second case, we generated both hydrodynamic and chlorophyll ensembles, to assess forecast uncertainties due to uncertainties in both the currents and in the satellite-estimated chlorophyll values. This approach allowed us to spatially separate the dominant sources of uncertainty in the forecast images. The results demonstrated that the regions of highest hydrodynamic uncertainty do not necessarily correspond to the regions of highest satellite uncertainty [Figs. 12(a) and 12(b)], and even when both sets of uncertainty are included, the forecast chlorophyll image can show mismatches with the observed image [ Fig. 12(c)]. The mismatches are due to incorrect advection by the currents in the hydrodynamic model and to the simplified modeling/advection approach that does not include chlorophyll source/sink terms. For example, the eddy-like feature observed to the east of the Mississippi River delta in the MODIS image from October 17 [ Fig. 11(a)] does not show up in either forecast [the one that includes only hydrodynamic uncertainty Fig. 11(b) or the one that includes both the hydrodynamic and satellite uncertainties Fig. 11(c)].
An ensemble approach allows quantitative error evaluations and error cascading to estimate uncertainties in satellite-derived surface bio-optical properties. This approach also enables us to gain a better understanding of the uncertainties at all levels of the processing of satellite ocean color imagery, as we demonstrated by partitioning the chlorophyll uncertainty into separate components due to the atmospheric correction and bio-optical inversion algorithms. In addition, by combining the chlorophyll ensemble suite with a hydrodynamic ensemble suite, we can produce chlorophyll forecasts that include both sources of uncertainty. Our analyses indicate that the ensemble standard deviation represented the forecast uncertainty, so that the standard deviation images can be used as a proxy for uncertainty. Thus, the ensemble approach can provide important new information in the form of uncertainty maps to accompany satellite ocean color image products, rather than providing only "best guess" optical products to end-users as has been done in the past. This work will also lead to improved bio-optical forecasts from data assimilative forecast systems, by providing uncertainty estimates to accompany the bio-optical fields assimilated into coupled biophysical ecological models.