# A Statistical Modeling Framework for Characterising Uncertainty in Large Datasets: Application to Ocean Colour

## Abstract

## 1. Introduction

#### 1.1. The Problem: Remote Sensing Uncertainty Accounting

_{rs}) spectrum, assigning a bias and standard deviation (SD) to each class [6]. A promising approach that is emerging across many branches of remote sensing uses metrological methods [7] to model contributions to uncertainty at each stage of processing and propagate it to the next stage [8]. This approach is highly rigorous, but application to fields with a complex processing chain such as ocean colour is a huge task that is likely to proceed gradually.

_{SAT}) and reference or validation data (in this case in situ chlorophyll-a, chl_{IS}), trained using a database of matchups (co-locations in time and space) between the two (in this case covering mostly the eastern North Atlantic and neighbouring seas). It should be noted here that we are presenting the method rather than the specific results, which are only intended to illustrate the method and are far from general. However, we present this example in detail in order to give the reader an idea of the issues involved in a practical application of the method. This method is purely statistical, with no attempt at error propagation. It can be seen as a final stage of processing: after all efforts to explicitly model a parameter (and possibly its uncertainty), this method estimates the residual uncertainty and its dependencies, and hence gives an indication of where the explicit models can be improved.

#### 1.2. Statistical Modeling

- location or central tendency (e.g., the mean);
- scale or range of variation (e.g., the SD);
- shape (e.g., skewness or kurtosis).

_{1} and v_{2}, with a mean described as the sum of a polynomial term in v_{1} and an exponential term in v_{2}, and a standard deviation described as the sum of a linear term in v_{1} and a nonlinear term in v_{2} characterised using cubic splines. The gamlss smoothing functions of most relevance to this work are pb, ps, and pbc, which respectively fit beta splines, cubic splines, and cyclic beta splines to the data, with the option of determining the number of degrees of freedom of the spline from the data. The pb function tends to produce more curved splines that follow variations in the data more closely, while ps produces smoother splines. gamlss finds the optimal parameters of the distribution by minimising the global deviance GD

## 2. Materials and Methods

#### 2.1. Data—The Matchups Database

_{IS} and satellite observed chl as chl_{SAT}.

_{IS} measurement, we searched for all overlapping MODIS-Aqua overpasses within ±12 h. This is a larger range than normally used for matchups, allowing us to investigate the effect of time difference on matchup uncertainty. For each overlapping overpass, we formed a 3 × 3 pixel grid of the nearest pixels to the in situ measurement, treating each pixel as an independent matchup. Note that due to the unusual geometry of the MODIS-Aqua sensor [14], these may not be from adjacent MODIS-Aqua scan lines. This gave a total of 2951 satellite-in situ comparisons (matchups), consisting of up to nine satellite pixels for each in situ measurement. Each pixel was stored as a separate line of the matchups database, recording the in situ date, time, latitude, longitude, and chl_{IS}, and the satellite granule ID, time difference, chl_{SAT}, sun and view zenith angles, wind speed, glint radiance at 869 nm, aerosol optical depth at 869 nm, R_{rs} in all visible and near infrared bands (see Table A1), and all Level 2 flags.

#### 2.2. Candidate Explanatory Variables

- We expect chl_{SAT} itself to be related to chl errors. The first indication that the chl retrieval has failed is usually the presence of outliers, or implausible values of chl_{SAT}. Also, the task of retrieving chl in oligotrophic waters is very different from that in highly eutrophic waters, so we expect retrieval uncertainty to vary with chl [15]. We therefore included ln(chl_{SAT}) as a candidate explanatory variable. Other derived geophysical products that could give insight into the performance of the chl algorithm, such as inherent optical properties, could also be used.
- The standard OC3 MODIS-Aqua chl algorithm is based on ratios of R_{rs} [13], so we expect R_{rs} at different wavelengths to affect errors in different ways. The R_{rs} spectrum is also indicative of water type [6], with different water types having different effects on atmospheric correction (e.g., highly scattering waters are bright in the near infrared) and on the quality of chl retrievals (e.g., highly absorbing waters can give erroneously high chl values) [15]. Here, we represented the R_{rs} spectrum by simply including all visible wavelength spectral bands of R_{rs}, though many other combinations such as R_{rs} ratios may also be useful. See Table A1 for a list of spectral bands used.
- The solar and view zenith angles affect retrievals through atmospheric and surface effects. When light passes obliquely through the atmosphere, it is attenuated more than a vertical beam, and it also becomes harder to predict the effect of interaction of the light with the atmosphere and the sea surface; hence, atmospheric correction uncertainty is expected to increase [16]. The solar zenith angle also affects the amount of light available to phytoplankton at the time of measurement. We represented solar and view zenith angles with 1/cos(zenith angle), a measure of how much atmosphere light has to pass through without being absorbed or scattered. We also represented the airmass as 1/cos(solar zenith angle) + 1/cos(view zenith angle), a measure of the total atmospheric path length from sun to satellite via the sea surface.
- Wind speed affects retrievals through disturbance of the sea surface, making it more difficult to predict its effect on incident light, particularly at high zenith angles [17]. Wind can also create wave breaking and surface foam, which appears bright in the near infrared, potentially causing problems for atmospheric correction [18].
- Sun glint is bright in the near infrared and can cause problems for atmospheric correction. It can be modeled as a function of wind speed, view geometry, and wavelength [17], and we would like to represent sun glint with the modeled glint at 869 nm. However, in the matchups database there were many pixels for which this product was missing, and we could find no satisfactory way of including an explanatory variable with so much missing data, so this variable was excluded from analysis.
- Aerosol haze makes retrieval of chl more difficult by scattering and absorbing light [19]. Thin or sub-pixel cloud that does not trigger the CLDICE flag is also interpreted as aerosol in MODIS-Aqua processing [20]. We would like to represent aerosol with the retrieved aerosol optical depth at 869 nm, but again there were many pixels for which this product was missing, so it was excluded. Although sun glint and aerosol are not used in this analysis, they are mentioned because they are likely to be important sources of uncertainty in satellite ocean colour.
- The date could influence chl uncertainty in two ways, through long-term changes and through seasonal variations. Long-term changes could be due to sensor degradation with time [21], or to climatological ecosystem changes, which could affect the quality of chl retrievals from space. We represented long-term changes with the satellite age in days to try to detect sensor degradation effects, since here we are only dealing with one satellite. If multiple sensors were used, we could try to distinguish changes due to sensor degradation from those due to ecosystem changes by looking for sensor-specific changes. Seasonal variations could be due to the seasonal cycle of phytoplankton, or to optical changes, the most obvious of which is the change in sun zenith angle. We represented this with the day of year, which should be represented as a cyclic function with a 365-day cycle. Our data are all in the northern hemisphere, but if data from both hemispheres are used, it may be necessary to create an interaction term between latitude and day of year.
- Latitude (together with day of year) determines the day length, a fundamental influence on phytoplankton ecology, as well as the solar zenith angle. Ocean circulation patterns also tend to segregate the oceans into zonal provinces (e.g., [22]). Our in situ data are geographically too sparse to reliably distinguish provinces, so instead of using latitude directly, we used day length calculated from latitude and day of year, as well as solar zenith angle (see above).
- The time difference between satellite and in situ measurements obviously has a potential impact on the quality of the retrieval, and this is traditionally accounted for by imposing a maximum time difference on matchups, e.g., ±6 h. This approach has the same drawback as the Level 2 flags: that a measurement slightly less than 6 h away from the in situ measurement, that almost exceeds several other mask thresholds, is given full weight in calibration or validation exercises, while one slightly more than 6 h away that is otherwise exemplary is given zero weight. We included time difference as an explanatory variable in order to try to quantify its effect. This might shed light on the choice of maximum time difference, as well as allowing us to weigh calibration or validation measurements according to time differences. We would actually expect this uncertainty to be dependent on how rapidly chl is changing at the point of measurement. Given sufficient data, it may be possible to distinguish different weightings or maximum time differences in different regions. Another possible way that time differences could influence retrieval quality is through diurnal changes in chl. Since a sun-synchronous satellite measures at approximately the same time of day everywhere, this effect would be seen as a bias due to time difference for a given satellite orbit, while we might expect differences due to non-diurnal changes to occur as often in one direction as the other, and so appear as an increase in SD.
- Level 2 flags are intended to inform users of the circumstances surrounding the pixel measurement. Some are simply informative (e.g., this is a land or shallow water pixel), some are warnings (e.g., suspected sun glint), and some denote errors (e.g., atmospheric correction or chl algorithm failure). The aim of this work is to replace flag-based approaches with continuously varying uncertainties, so we did not include Level 2 flags. However, we did examine the effectiveness of level 3 masking in eliminating pixels with large errors, and the results were not as expected. Histograms of δln(chl) = ln(chl_{SAT}) − ln(chl_{IS}) for both the whole dataset and only pixels masked at level 3 showed that the pixels masked at level 3 had a consistently lower bias, with a root mean squared δln(chl) of 0.871 for the whole dataset and 0.714 for the pixels masked (i.e., rejected as suspected low quality) at level 3. This is the opposite of the intended effect of masking.
- Chlorophyll can exhibit high variability on many spatiotemporal scales, and we would expect high variability on the scale of the satellite-in situ comparison (up to a few km and hours) or smaller to result in increased chl discrepancies. We represented spatial variability with the SD of the ln(chl_{SAT}) associated with each chl_{IS}, of which there can be up to nine. This combines two possible effects, spatial variability in chl, and effects causing changes in chl_{SAT} such as stray light. MODIS-Aqua data generally recur in a given location once a day at best except at high latitudes, so representation of temporal variability is not possible using these data. HPLC is time consuming and expensive, so in situ HPLC data with high spatiotemporal resolution are rare. However, if this analysis were repeated using in situ data with the appropriate resolution, such as fluorometric chl from autonomous sensors, the effect of in situ chl variability could be studied.
- The number of valid chl values in the 3 × 3 grid, either at level 2 or 3, is an indication of the presence or absence of features such as clouds or land that may increase uncertainty in the remaining values. Here, we used the number of valid chl values at level 2.
- If a more analytic uncertainty model is available, then the outputs of this (e.g., bias and SD, or just overall uncertainty) may be used as inputs to this model. If the prior model is perfectly successful, then, e.g., output bias will equal input bias with no other dependencies. It is far more likely that other dependencies exist, giving insight into how the prior model can be improved.
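The geometric variables in the list above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sine/cosine encoding of day of year is only one common way of making a variable cyclic; it is shown for illustration and is not the pbc spline used in this work.

```python
import math

def path_length_factor(zenith_deg):
    """Return 1/cos(zenith angle): relative path length through the atmosphere."""
    return 1.0 / math.cos(math.radians(zenith_deg))

def airmass(solar_zenith_deg, view_zenith_deg):
    """Total sun-to-surface-to-satellite path: 1/cos(solar) + 1/cos(view)."""
    return path_length_factor(solar_zenith_deg) + path_length_factor(view_zenith_deg)

def cyclic_day_of_year(day, period=365.0):
    """Map day of year onto a circle so that the end of the year joins the start."""
    angle = 2.0 * math.pi * day / period
    return math.sin(angle), math.cos(angle)
```

For example, a sun at 60° from the zenith viewed from directly overhead gives an airmass of about 3, compared with the minimum of 2 when both angles are zero.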

#### 2.3. Statistical Modeling

_{SAT} uncertainty to be δln(chl), which we assume to be normally distributed, neglecting chl_{IS} uncertainties [23]. Initial work focused on [δln(chl)]^{2} as a measure of the overall uncertainty, but this loses the distinction between bias and SD, and between positive and negative bias. Plotting δln(chl) or ln(chl_{IS}) as a function of ln(chl_{SAT}) shows the expected clear positive bias at high chl_{SAT} and a slight negative bias at low chl_{SAT} (Figure 2a, small circles and solid grey 1:1 line). Applying a traditional simple error analysis with globally constant bias and SD, δln(chl) has mean (bias) 0.37 and SD 0.79, with a root mean square deviation (RMSD = $\sqrt{\overline{[\delta\ln(\mathrm{chl})]^{2}}}$) of 0.87 in natural log space. Simple subtraction of the global bias would reduce the RMSD to 0.79, explaining 18% of the squared deviation, though it is clear from Figure 2a that significant biases would remain in the data, and this may increase bias at intermediate chl values.
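The relationship between the quoted bias, SD, and RMSD can be checked with a short Python sketch. It assumes the population form of the SD, so that RMSD² = bias² + SD² holds exactly; the function names are illustrative only.

```python
import math

def error_summary(delta):
    """Return (bias, SD, RMSD) of a list of delta-ln(chl) values."""
    n = len(delta)
    bias = sum(delta) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in delta) / n)  # population SD
    rmsd = math.sqrt(sum(d * d for d in delta) / n)
    return bias, sd, rmsd

def fraction_explained_by_bias(bias, sd):
    """Share of the mean squared deviation removed by subtracting a constant bias."""
    return bias ** 2 / (bias ** 2 + sd ** 2)

# Values quoted in the text: bias 0.37 and SD 0.79 in natural log space.
rmsd_from_text = math.hypot(0.37, 0.79)                 # about 0.87
explained = fraction_explained_by_bias(0.37, 0.79)      # about 0.18
```

With the rounded values from the text, this reproduces an RMSD of about 0.87 and a fraction of about 18% of the squared deviation attributable to the global bias.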

_{SAT}) (see Table A2 for detailed results). When we replace the basic model above with the simple but implausible choice of a linear mean model, i.e., modeling the mean of δln(chl) as a linear function of ln(chl_{SAT}) with a constant SD, the mean deviance reduces by 0.266 with gamlss and 0.256 with gamlssCV, and the difference between the two increases to 0.025. Using the GAMLSS ps (cubic spline) function for the mean but still with a constant SD, the mean deviance is reduced by 0.068 with gamlss and 0.044 with gamlssCV, and the difference increases to 0.05. This model is shown in Figure 2a as a solid red line. If we replace the ps function with the more responsive pb (beta spline) function, shown in Figure 2a as a solid blue line, the mean deviance is reduced by 0.063 with gamlss and 0.012 with gamlssCV, with a difference of 0.101. In this example, the pb function is not over-fitted in comparison to ps, but the improvement in mean deviance (0.012) is much less than that suggested by gamlss (0.063).

_{SAT})) to the pb mean model reduced the mean deviance by 0.123 with gamlss and 0.072 with gamlssCV, with a difference of 0.152. The increasing difference from simple to more complex models highlights the increased need for independent checking as the model complexity increases, but the gamlssCV mean deviance of the final model is the lowest found so far (by 0.072), so the new model is not over-fitted in comparison to the previous models. The final model has an RMSD of 0.68, explaining 40% of the deviation as bias. By investigating changes to the best model and choosing those changes that decrease the gamlssCV mean (or global) deviance most, we optimise our model.

_{SAT}). Note that SE as used here is not the same as the SE of a dataset, commonly evaluated as SD/$\sqrt{N}$. To evaluate the overall uncertainty, we use the square root of the sum of the squares of bias, SD, and SE (the root squared sum, henceforth RSS), i.e., we assume them to be uncorrelated. A more rigorous treatment would account for covariance between them, which could be calculated from the training data, but this approximation is sufficient to illustrate the method. This uncertainty can be evaluated for any combination of explanatory variables, allowing us to produce uncertainty maps for arbitrary satellite data. If the mean is subtracted from ln(chl_{SAT}) to give a bias-corrected estimate of ln(chl_{IS}), the remaining uncertainty is the RSS of SD and SE.
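The RSS combination described above amounts to the following, shown as a minimal Python sketch with hypothetical function names. It assumes, as in the text, that bias, SD, and SE are uncorrelated.

```python
import math

def rss(*components):
    """Root squared sum of uncorrelated uncertainty components."""
    return math.sqrt(sum(c * c for c in components))

def total_uncertainty(bias, sd, se):
    """Overall uncertainty in ln(chl) when no bias correction is applied."""
    return rss(bias, sd, se)

def residual_uncertainty(sd, se):
    """Uncertainty remaining after the modeled bias has been subtracted."""
    return rss(sd, se)
```

Accounting for covariance between the terms would require adding cross terms inside the square root, which this sketch deliberately omits.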

_{IS})) that also both have the same ln(chl_{SAT}). This can happen because the satellite product is stored digitally, so two pixels with the same digital number will be ascribed exactly the same chl value. The calculation of the mean is robust with respect to such duplication, but the calculation of SD is not. In the absence of other points nearby, the presence of two identical values implies a local SD of zero, forcing a responsive model of SD towards zero. When these points are at the tail of the distribution, as in this case, the result is that the model SD tends strongly towards zero as the tail is approached.

1. Start with a basic model (we used constant mean and SD);
2. Try adding each explanatory variable in turn to the mean and keep only the one with the lowest global deviance;
3. Repeat 2, adding further variables to the mean until no variable improves the global deviance;
4. Repeat 2–3, adding variables to the SD;
5. Try removing each variable in turn from the SD and keep only the removal that results in the lowest global deviance. Repeat until no variable removal improves the global deviance;
6. Repeat 5, removing variables from the mean.
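The selection procedure above can be sketched in Python as a greedy forward selection followed by backward elimination. This is an illustrative outline only: the `deviance(mean_vars, sd_vars)` callable is a hypothetical stand-in for fitting a model and returning its cross-validated global deviance, and all function names are assumptions rather than part of gamlss.

```python
def forward_select(candidates, score):
    """Add variables one at a time while the score (lower is better) improves."""
    chosen = []
    best = score(chosen)
    while True:
        trials = [(score(chosen + [v]), v) for v in candidates if v not in chosen]
        if not trials:
            break
        trial_score, var = min(trials)
        if trial_score >= best:
            break
        chosen.append(var)
        best = trial_score
    return chosen

def backward_eliminate(chosen, score):
    """Remove variables one at a time while the score improves."""
    chosen = list(chosen)
    best = score(chosen)
    while chosen:
        trials = [(score([u for u in chosen if u != v]), v) for v in chosen]
        trial_score, var = min(trials)
        if trial_score >= best:
            break
        chosen.remove(var)
        best = trial_score
    return chosen

def stepwise(candidates, deviance):
    """Run the six selection steps; deviance(mean_vars, sd_vars) returns a float."""
    # Steps 1-3: start from the constant model and grow the mean model.
    mean_vars = forward_select(candidates, lambda m: deviance(m, []))
    # Step 4: grow the SD model with the mean model fixed.
    sd_vars = forward_select(candidates, lambda s: deviance(mean_vars, s))
    # Step 5: prune the SD model.
    sd_vars = backward_eliminate(sd_vars, lambda s: deviance(mean_vars, s))
    # Step 6: prune the mean model.
    mean_vars = backward_eliminate(mean_vars, lambda m: deviance(m, sd_vars))
    return mean_vars, sd_vars
```

Because the search is greedy, it is not guaranteed to find the globally best combination of variables; it only guarantees that no single addition or removal improves the final model.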

#### 2.4. Application to Satellite Data

_{IS}, but it is possible that information is lost or distorted in this process. For example, if the spatial noise in the bias image is greater than its magnitude, bias subtraction will result in increased noise without meaningful improvement and could potentially obscure features visible in the uncorrected image.

_{SAT} with uncertainty equal to the combined bias, SD, and SE, or with bias corrected chl_{SAT} with uncertainty equal to combined SD and SE. If the uncertainty model is sufficiently accurate and comprehensive, this should result in a reduced incidence of outliers in composites, as well as giving per-pixel composite uncertainties and reducing, or perhaps even eliminating, the need for masking of suspect data.

## 3. Results

#### 3.1. Model Dependencies

_{SAT})) + ps(day length) + ps(R_{rs}(412)) + pbc(day of year) + ps(satellite age) + ps(R_{rs}(469)) + ps(R_{rs}(531)) + ps(time difference) + ps(airmass) + ps(1/cos(view zenith angle)) + ps(R_{rs}(547)) + ps(R_{rs}(555)). The best model found for the SD in step 4 consisted only of ps(R_{rs}(645)), noting that the failures of the gamlss function referred to in Section 2.3 mean that this is probably not an optimal SD model. It also exhibits the problem described in Section 2.3, that the SD tends to zero at extreme values, in this case high values. There were no changes made in steps 5–6. The apparent mean deviance using gamlss was 0.98, accounting for 76% of the squared deviation as bias, and the actual mean deviance using gamlssCV was 1.49, accounting for 67% of the squared deviation as bias. This model performs significantly better than the best model found using ln(chl_{SAT}) alone (mean deviances 1.84 and 1.99, actual squared deviation explained 40%).

_{PRED}) is used as a measure of the magnitude of the impact of the explanatory variable on the bias, and is shown in Table A1. Care should be taken to distinguish SD_{PRED} from the model prediction of the SD of δln(chl), and to distinguish the order of these impact values from the order in which variables were added to the model, which is a measure of the impact of inclusion of the variable on the model’s ability to represent the residual (defined as (measured value − mean)/SD) as normally distributed with mean 0 and SD 1. This is not the same as the model explaining the variance of the data, for example, and selection of a new explanatory variable can change the impact of the previously selected variables, so there is no guarantee that the impacts will decrease with order of selection.

_{rs}(469) (SD_{PRED} = 1.21), with lesser effects from ln(chl_{SAT}) (0.79), R_{rs}(412) (0.78), R_{rs}(547) (0.76), R_{rs}(555) (0.66), and R_{rs}(531) (0.65), and much lesser effects from day length (0.32), 1/cos(view zenith angle) (0.30), day of year (0.28), airmass (0.27), satellite age (0.14), and time difference (0.11). The explanations for these explanatory variables can be found in Table A1. It is interesting that R_{rs}(547) and R_{rs}(555) were the last to be selected but have among the highest impact on the bias, and that the two wavelengths are very close together but have opposite impacts, suggesting that the ratio of the two is an important factor in determining satellite chl bias. R_{rs}(547) is an important part of the MODIS-Aqua OC3 chl algorithm, forming the denominator of the band ratios used to calculate chl, but R_{rs}(555) is not used in the OC3 algorithm, because it has lower sensitivity than R_{rs}(547).

#### 3.2. Visualisation of Uncertainties

_{rs}(645) (see Figure 3b), SD values lower than those found by applying the SD model to the training data are set to the minimum of these values. To convert these to an estimate of overall uncertainty in chl, we multiplied their RSS by chl_{SAT} (Figure 4b). Comparison of Figure 4a,b shows high uncertainty in coastal zones, river plumes, and the Baltic Sea, all areas where satellite chl algorithms are known to have problems, especially where chl_{SAT} is implausibly high, so this initial uncertainty map looks plausible and has no obvious model artefacts such as banding or noise.
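The two per-pixel operations described above, flooring the modeled SD at the smallest value seen in training and converting log-space uncertainty to chl units, can be sketched as follows. The function names are hypothetical, and the conversion is the first-order approximation δchl ≈ chl × δln(chl), which is accurate only when the log-space uncertainty is small.

```python
import math

def floor_sd(sd, min_training_sd):
    """Replace SD values below the minimum found on the training data."""
    return max(sd, min_training_sd)

def chl_uncertainty(chl_sat, bias, sd, se):
    """Approximate uncertainty in chl units: chl_sat times the RSS in ln(chl)."""
    return chl_sat * math.sqrt(bias ** 2 + sd ** 2 + se ** 2)
```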

_{SAT} in Figure 4d. The distribution of chl shown in Figure 4c looks much more plausible than that in Figure 4a, and areas where it remains implausibly high, e.g., North Sea river plumes and the Baltic, have correspondingly high residual uncertainty in Figure 4d.

#### 3.3. Effect on Composites

^{−2}. It would also be desirable to generate maps of the composite uncertainty and produce combined maps similar to those in Figure 4. If only one overpass contributes to a map grid cell, the uncertainty is the same as in Figure 4. With more than one overpass, simple error propagation assuming uncorrelated errors gives us a composite uncertainty equal to $\sqrt{\sum (w\sigma)^{2}}/\sum w$, in which w is the weight of each overpass in the grid cell and σ is its uncertainty. Applying w = σ^{−2} gives composite uncertainty equal to $1/\sqrt{\sum w}$. Note, however, that uncertainties are very likely to be correlated in this case.
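A minimal Python sketch of the weighted composite at a single grid cell follows. It assumes uncorrelated errors, as in the propagation formula above, and evaluates both the general expression and its simplification for w = σ^{−2}; the function name is illustrative.

```python
import math

def weighted_composite(values, sigmas):
    """Inverse-variance weighted mean of one grid cell and its uncertainty.

    values: ln(chl) from each overpass; sigmas: their uncertainties.
    Returns (mean, general_uncertainty, simplified_uncertainty).
    """
    weights = [s ** -2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    # General form: sqrt(sum((w * sigma)^2)) / sum(w)
    general = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, sigmas))) / total
    # With w = sigma^-2 this reduces to 1 / sqrt(sum(w))
    simplified = 1.0 / math.sqrt(total)
    return mean, general, simplified
```

The two uncertainty expressions agree because, with w = σ^{−2}, each term (wσ)² equals w, so the numerator becomes $\sqrt{\sum w}$.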

_{c}, the uncertainty model predicts that each bias-corrected ln(chl) measurement would be distributed with measurement-independent mean μ_{c} and a measurement-dependent standard deviation σ_{c} equal to $\sqrt{\mathrm{SD}^{2}+\mathrm{SE}^{2}}$, assuming uncorrelated errors. If we assume μ_{c} to equal the bias-corrected weighted mean calculated above, we can subtract this from all measurements and divide each measurement by its σ_{c} to give a ‘model residual’, which should be distributed with mean 0 and standard deviation 1. However, there may also be natural variation in ln(chl) between measurements, a further source of composite uncertainty not included in the model.

_{r} of [ln(chl) − bias − weighted mean]/σ_{c} and, if this is greater than 1, attribute the excess to natural variation in ln(chl) between measurements with a standard deviation of σ_{n}. Assuming all terms to be uncorrelated, $\sigma_{c}^{2}$ then becomes $\mathrm{SD}^{2}+\mathrm{SE}^{2}+\sigma_{n}^{2}$. At each grid cell with σ_{r} greater than 1, we assume that σ_{n} is constant across overpasses, i.e., we allow it to vary spatially across the image but not temporally over the time range of the composite. We used Newton-Raphson root finding to estimate σ_{n}, at each iteration recalculating the weighted mean and all σ_{c} until σ_{r} converges to 1. The resulting uncorrected composite is shown in Figure 7a and its uncertainty $\sqrt{\mathrm{bias}^{2}+\sigma_{c}^{2}}$ in Figure 7b. The bias-corrected composite is shown in Figure 7c, with uncertainty σ_{c} shown in Figure 7d. σ_{n} is shown in Figure 8.
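The estimation of σ_{n} at one grid cell can be sketched in Python as below. Two assumptions are made for illustration: σ_{r} is taken to be the root mean square of the model residuals (dividing by N), and the root is found by bisection rather than the Newton-Raphson method used in this work, since bisection needs no derivative. Function names are hypothetical.

```python
import math

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean."""
    weights = [s ** -2 for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def residual_spread(values, sigma0, sigma_n):
    """sigma_r: RMS of (value - weighted mean) / sigma_c,
    with sigma_c^2 = sigma0^2 + sigma_n^2 and sigma0^2 = SD^2 + SE^2."""
    sigma_c = [math.sqrt(s * s + sigma_n * sigma_n) for s in sigma0]
    mu = weighted_mean(values, sigma_c)  # recalculated for every trial sigma_n
    z = [(v - mu) / s for v, s in zip(values, sigma_c)]
    return math.sqrt(sum(x * x for x in z) / len(z))

def estimate_sigma_n(values, sigma0, tol=1e-9):
    """Find sigma_n such that sigma_r = 1; return 0 if sigma_r is already <= 1."""
    if residual_spread(values, sigma0, 0.0) <= 1.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while residual_spread(values, sigma0, hi) > 1.0:  # expand until bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual_spread(values, sigma0, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Here `values` are the bias-corrected ln(chl) measurements contributing to the cell. The bracket always closes because σ_{r} tends to zero as σ_{n} grows large.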

## 4. Discussion

_{rs} band ratios as explanatory variables, particularly R_{rs}(547)/R_{rs}(555). Another would be to try to circumvent the problem of duplicate measurements causing problems for SD modeling by adding a random offset to each ln(chl_{SAT}) value with a range equal to the ln(chl_{SAT}) digitisation increment, which might allow a more complex and realistic SD model. A third would be to try creating weighted composites without applying Level 3 masks to test the extent to which outliers are de-weighted. However, our pursuit of this limited dataset and model thus far is sufficient to show the potential of the method and the types of issues that may arise in its application, so we leave these avenues unexplored for now.

_{rs}. It makes sense to start by creating models of R_{rs} uncertainty in different bands (and possibly band ratios); then, the R_{rs} uncertainties can be propagated into products that use R_{rs} or band ratios of R_{rs}, such as chl. This error propagation could be done explicitly, using standard error propagation methods, or implicitly, for instance, by including R_{rs} uncertainties as explanatory variables in a model of chl uncertainty.

_{rs}, even if a theoretical error model exists, this method could still be of use in identifying limitations of the error model, with the theoretical uncertainties being used as inputs to a statistical model of actual uncertainties. The main requirement for the creation of a model of the uncertainty in a satellite product is the existence or creation of a database of matchups of the satellite product with corresponding validation measurements, along with values of all the candidate explanatory variables.

_{rs}and consequently large chl errors. Improving our understanding of scenarios like this might prompt the inclusion of an interaction term between solar zenith angle and cloud proximity in the model, or an attempt to correct for stray light at high solar zenith angle. This is a further example of the generic nature of this approach and how new knowledge can easily be built into the uncertainty model.

^{−3} and another from the other side with chl 0.1 mg m^{−3}, with the two pixels having similar uncertainties in ln(chl). In this extreme case, the geometric mean is 1 mg m^{−3}, and the unweighted arithmetic mean is ~5 mg m^{−3}. Since the uncertainty of chl is 100 times greater for the larger value, and the weight is inversely proportional to the square of the uncertainty, in this extreme example a straightforwardly weighted arithmetic mean would be very close to 0.1 mg m^{−3}.
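The arithmetic of this example can be reproduced with a short Python sketch. The larger value of 10 mg m^{−3} is inferred here from the quoted geometric mean of 1 and arithmetic mean of about 5, and the common ln(chl) uncertainty of 0.5 is an arbitrary illustrative choice that cancels out of the weighted mean.

```python
import math

def geometric_mean(values):
    """Mean taken in log space, then transformed back."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def arithmetic_mean(values):
    return sum(values) / len(values)

def weighted_arithmetic_mean(values, sigma_ln):
    """Weight each chl by the inverse square of its chl-space uncertainty,
    approximated as chl * sigma_ln."""
    weights = [(v * s) ** -2 for v, s in zip(values, sigma_ln)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

chl = [10.0, 0.1]      # mg m^-3; the value 10 is inferred, see lead-in
sigma_ln = [0.5, 0.5]  # equal uncertainty in ln(chl), illustrative value

g = geometric_mean(chl)                      # about 1
a = arithmetic_mean(chl)                     # about 5.05
w = weighted_arithmetic_mean(chl, sigma_ln)  # about 0.101
```

The weights differ by a factor of 10,000 (the square of the 100-fold difference in chl uncertainty), which is why the weighted arithmetic mean is dominated almost entirely by the smaller value.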

^{−3}. In this case, the unweighted arithmetic mean is an unbiased estimate of chl in the composite pixel, even if the uncertainties in chl or ln(chl) are different. Hence, it is not clear how to calculate a weighted arithmetic mean that behaves like the unweighted mean when uncertainties in ln(chl) are uniform but reduces the weight of values with larger uncertainty. Weighting chl using the uncertainty in ln(chl) appears to have no statistical justification.

_{rs}followed by application of a chl algorithm, each step in the sequence would have its own uncertainty model that would be informed by the previous step. At each step, there would be the option of bias correction if such a correction were considered to be advantageous.

_{SAT}or to the time difference between satellite and in situ measurements. In practice, we know that in situ measurements are also uncertain, and this method could be extended to account for these. For instance, validation of satellite data commonly starts with quality control of both satellite and in situ datasets and criteria for comparison of the two, e.g., time and space separation, after which the two are compared. Here, we have described a method of modeling satellite errors instead of discarding uncertain data, but a similar approach could be taken with the in situ data. So, rather than discarding all in situ data with more than a threshold of some quality control measure, the measure could be included in the model of the satellite-in situ difference along with measures of satellite and comparison uncertainties.

## 5. Conclusions

## Appendix A

**Table A1.** Explanatory variables investigated with gamlssCV, functional forms in the mean and SD models, order of addition (1 = first) and SD_{PRED}. Variables with blank lines were not chosen by gamlssCV.

| Variable | Functional Form in Mean Model | Order of Addition to Mean Model | SD_{PRED} | Functional Form in SD Model |
|---|---|---|---|---|
| ln(chl_{SAT}) | pb | 1 | 0.79 | |
| Day of year | pbc | 4 | 0.28 | |
| Day length | ps | 2 | 0.32 | |
| Satellite age | ps | 5 | 0.14 | |
| 1/cos(solar zenith angle) | | | | |
| 1/cos(view zenith angle) | ps | 10 | 0.30 | |
| Airmass | ps | 9 | 0.27 | |
| Wind speed | | | | |
| Aerosol optical depth at 869 nm | | | | |
| Glint radiance at 869 nm | | | | |
| R_{rs}(412) | ps | 3 | 0.78 | |
| R_{rs}(443) | | | | |
| R_{rs}(469) | ps | 6 | 1.21 | |
| R_{rs}(488) | | | | |
| R_{rs}(531) | ps | 7 | 0.65 | |
| R_{rs}(547) | ps | 11 | 0.76 | |
| R_{rs}(555) | ps | 12 | 0.66 | |
| R_{rs}(645) | | | | ps |
| R_{rs}(667) | | | | |
| R_{rs}(678) | | | | |
| Time difference | ps | 8 | 0.11 | |

**Table A2.** Models of δln(chl) as a function of ln(chl_{SAT}), their mean deviance using gamlss and gamlssCV, and the percentage of squared deviation explained by the mean model.

| Model or Change from Previous Model | Mean Deviance, gamlss (Improvement over Previous Best Model) | Mean Deviance, gamlssCV (Improvement over Previous Best Model) | gamlssCV − gamlss | RMSD | Percentage of Squared Deviation Explained |
|---|---|---|---|---|---|
| Original data | - | - | - | 0.87 | - |
| Constant mean and SD | 2.362 | 2.377 | 0.016 | 0.79 | 17% |
| Linear mean | 2.095 (0.266) | 2.121 (0.256) | 0.025 | 0.70 | 37% |
| ps mean | 2.027 (0.068) | 2.077 (0.044) | 0.05 | 0.682 | 39% |
| pb mean | 1.964 (0.063) | 2.065 (0.012) | 0.101 | 0.68 | 40% |
| ps SD | 1.841 (0.123) | 1.993 (0.072) | 0.152 | 0.68 | 40% |

## Appendix B

**Figure A1.** RGB composites of absolute bias (red), SD (green), and SE (blue), scaled from zero to one. Light grey is missing data, white is bias, SD, and SE all greater than one. (**a**) 5 May 2013 at 12:10, mapped to the composite grid; (**b**) mean composite from 1 to 8 May 2013.

## References

1. Global Climate Observing System (GCOS). Systematic Observation Requirements for Satellite-Based Products for Climate 2011 Update, WMO GCOS Report 154; World Meteorological Organization (WMO): Geneva, Switzerland, 2011; p. 127.
2. O’Reilly, J.E.; Maritorena, S.; Mitchell, B.G.; Siegel, D.A.; Carder, K.L.; Garver, S.A.; Kahru, M.; McClain, C. Ocean color chlorophyll algorithms for SeaWiFS. J. Geophys. Res. Oceans **1998**, 103, 24937–24953.
3. Level 2 Ocean Color Flags. Available online: https://oceancolor.gsfc.nasa.gov/atbd/ocl2flags/ (accessed on 20 February 2018).
4. Hu, C.; Carder, K.L.; Muller-Karger, F.E. How precise are SeaWiFS ocean color estimates? Implications of digitization-noise errors. Remote Sens. Environ. **2001**, 76, 239–249.
5. Hu, C.; Feng, L.; Lee, Z.; Davis, C.O.; Mannino, A.; McClain, C.R.; Franz, B.A. Dynamic range and sensitivity requirements of satellite ocean color sensors: Learning from the past. Appl. Opt. **2012**, 51, 6045–6062.
6. Moore, T.S.; Campbell, J.W.; Dowell, M.D. A class-based approach to characterizing and mapping the uncertainty of the MODIS ocean chlorophyll product. Remote Sens. Environ. **2009**, 113, 2424–2430.
7. Joint Committee for Guides in Metrology. Evaluation of Measurement Data – Guide to the Expression of Uncertainty in Measurement; Joint Committee for Guides in Metrology: Paris, France, 2008.
8. Merchant, C.J.; Paul, F.; Popp, T.; Ablain, M.; Bontemps, S.; Defourny, P.; Hollmann, R.; Lavergne, T.; Laeng, A.; de Leeuw, G. Uncertainty information in climate data records from Earth observation. Earth Syst. Sci. Data Discuss. **2017**.
9. Stasinopoulos, D.M.; Rigby, R.A. Generalized Additive Models for Location Scale and Shape (GAMLSS) in R. J. Stat. Softw. **2008**, 23.
10. Akaike, H. Likelihood of a model and information criteria. J. Econom. **1981**, 16, 3–14.
11. Geisser, S. The predictive sample reuse method with applications. J. Am. Stat. Assoc. **1975**, 70, 320–328.
12. MODIS-Aqua Reprocessing 2012. Available online: https://oceancolor.gsfc.nasa.gov/reprocessing/r2012/aqua/ (accessed on 13 February 2018).
13. O’Reilly, J.E.; Maritorena, S.; Siegel, D.A.; O’Brien, M.C.; Toole, D.; Mitchell, B.G.; Kahru, M.; Chavez, F.P.; Strutton, P.; Cota, G.F. Ocean color chlorophyll a algorithms for SeaWiFS, OC2, and OC4: Version 4. In SeaWiFS Postlaunch Calibration and Validation Analyses Part 3; NASA: Washington DC, USA, 2000; pp. 9–23.
14. Wolfe, R.E.; Roy, D.P.; Vermote, E. MODIS land data storage, gridding, and compositing methodology: Level 2 grid. IEEE Trans. Geosci. Remote Sens. **1998**, 36, 1324–1338.
15. International Ocean-Colour Coordinating Group. Remote Sensing of Ocean Colour in Coastal, and Other Optically-Complex, Waters; IOCCG: Dartmouth, NS, Canada, 2000.
16. Yang, H.; Gordon, H.R. Remote sensing of ocean color: Assessment of water-leaving radiance bidirectional effects on atmospheric diffuse transmittance. Appl. Opt. **1997**, 36, 7887–7897.
17. Wang, M.; Bailey, S.W. Correction of sun glint contamination on the SeaWiFS ocean and atmosphere products. Appl. Opt. **2001**, 40, 4790–4798.
18. Gordon, H.R.; Wang, M. Influence of oceanic whitecaps on atmospheric correction of ocean-color sensors. Appl. Opt. **1994**, 33, 7754–7763.
19. Gordon, H.R.; Wang, M. Retrieval of water-leaving radiance and aerosol optical thickness over the oceans with SeaWiFS: A preliminary algorithm. Appl. Opt. **1994**, 33, 443–452.
20. Wang, M.; Shi, W. Cloud masking for ocean color data processing in the coastal regions. IEEE Trans. Geosci. Remote Sens. **2006**, 44, 3196–3205.
21. Xiong, X.; Sun, J.; Xie, X.; Barnes, W.L.; Salomonson, V.V. On-orbit calibration and performance of Aqua MODIS reflective solar bands. IEEE Trans. Geosci. Remote Sens. **2010**, 48, 535–546.
22. Longhurst, A.; Sathyendranath, S.; Platt, T.; Caverhill, C. An estimate of global primary production in the ocean from satellite radiometer data. J. Plankton Res. **1995**, 17, 1245–1271.
23. Campbell, J.W. The lognormal distribution as a model for bio-optical variability in the sea. J. Geophys. Res. Oceans **1995**, 100, 13237–13254.
24. SeaDAS. Available online: https://seadas.gsfc.nasa.gov (accessed on 26 February 2018).
25. Andersen, J.H.; Carstensen, J.; Conley, D.J.; Dromph, K.; Fleming-Lehtinen, V.; Gustafsson, B.G.; Josefson, A.B.; Norkko, A.; Villnäs, A.; Murray, C. Long-term temporal and spatial trends in eutrophication status of the Baltic Sea. Biol. Rev. **2017**, 92, 135–149.
26. Brewin, R.J.W.; Sathyendranath, S.; Müeller, D.; Brockmann, C.; Deschamps, P.Y.; Devred, E.; Doerffer, R.; Fomferra, N.; Franz, B.; Grant, M. The ocean colour climate change initiative: A round-robin comparison of in-water bio-optical algorithms. Remote Sens. Environ. **2012**.
27. Lee, K.; Tong, L.T.; Millero, F.J.; Sabine, C.L.; Dickson, A.G.; Goyet, C.; Park, G.H.; Wanninkhof, R.; Feely, R.A.; Key, R.M. Global relationships of total alkalinity with salinity and temperature in surface waters of the world’s oceans. Geophys. Res. Lett. **2006**, 33.
28. International Ocean-Colour Coordinating Group. Guide to the Creation and Use of Ocean-Colour, Level-3, Binned Data Products; Antoine, D., Ed.; IOCCG: Dartmouth, NS, Canada, 2004; Volume 4.

**Figure 1.** The distribution of data used in the matchups database. The map shows the number of matchups in each 1° × 1° cell.

**Figure 2.** A simple uncertainty model in which ln(chl_{SAT}) is the only explanatory variable. (**a**) The points are ln(chl_{IS}) plotted against ln(chl_{SAT}), and the black line is the 1:1 line. The blue line shows the best fitting pb(ln(chl_{SAT})) mean model, and the solid red line shows the best fitting ps(ln(chl_{SAT})) mean model. The remaining lines are offset above and below the latter. The solid pink line is offset by the standard error, the dashed pink line (almost overlapping with the dashed red line) by the best fitting ps(ln(chl_{SAT})) standard deviation model, and the dashed red line by the square root of the sum of squares of standard error and standard deviation. (**b**) Contributions to absolute uncertainty using the ps(ln(chl_{SAT})) mean model. The points are the absolute difference between ln(chl_{SAT}) and ln(chl_{IS}); the solid red line is the absolute bias; the solid pink line is the standard error; the dashed pink line is the standard deviation; the dashed red line is the square root of the sum of squares of standard error and standard deviation; and the solid black line is the square root of the sum of squares of bias, standard error, and standard deviation.

**Figure 3.** Dependencies of the best fitting model. Each graph shows the dependency of a model parameter (mean or standard deviation) on a single explanatory variable (the ‘partial dependency’), all others being held constant. The red line is the model prediction, the pink lines are one standard error either side of this, and the pale blue circles are the prediction plus the model residual at each data point. (**a**) Mean model dependencies, in the order that they were added by gamlss: (top row) pb(ln(chl_{SAT})), ps(day length), and ps(R_{rs}(412)); (second row) pbc(day of year), ps(satellite age), and ps(R_{rs}(469)); (third row) ps(R_{rs}(531)), ps(time difference), and ps(airmass); (bottom row) ps(1/cos(view zenith angle)), ps(R_{rs}(547)), and ps(R_{rs}(555)). All graphs have the same scale on the vertical axis. (**b**) Dependency of ln(standard deviation) on ps(R_{rs}(645)).

**Figure 4.** (**a**) Chl_{SAT} (mg m^{−3}) from a MODIS-Aqua overpass on 5 May 2013 at 12:10 (original satellite projection at ~1 km resolution, central section removed); (**b**) overall uncertainty in chl_{SAT}, $\mathrm{chl_{SAT}} \times \sqrt{\text{bias}^{2} + \text{standard deviation}^{2} + \text{standard error}^{2}}$, estimated using GAMLSS; (**c**) chl_{SAT} with bias subtracted; (**d**) uncertainty in bias-subtracted chl_{SAT}, $\mathrm{chl_{SAT}} \times \sqrt{\text{standard deviation}^{2} + \text{standard error}^{2}}$.
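The two uncertainty expressions in the Figure 4 caption can be sketched in a few lines of Python. This is an illustrative sketch only, assuming that bias, standard deviation, and standard error are all expressed in ln(chl) units, so that multiplying by chl_{SAT} gives an approximate uncertainty in mg m^{−3}. The function name, argument names, and the per-pixel values are hypothetical and are not taken from the paper.

```python
import math


def overall_uncertainty(chl_sat, bias, sd, se, bias_corrected=False):
    """Combine ln-space uncertainty components in quadrature, scaled by chl_sat.

    Hypothetical helper: mirrors the caption formulas
    chl_SAT * sqrt(bias^2 + SD^2 + SE^2)   (uncorrected, panel b)
    chl_SAT * sqrt(SD^2 + SE^2)            (bias subtracted, panel d)
    """
    # Drop the bias term when the bias has already been subtracted.
    terms = [sd, se] if bias_corrected else [bias, sd, se]
    return chl_sat * math.sqrt(sum(t * t for t in terms))


# Illustrative values for a single pixel (assumed, not from the paper).
uncorrected = overall_uncertainty(2.0, bias=0.3, sd=0.4, se=0.0)
corrected = overall_uncertainty(2.0, bias=0.3, sd=0.4, se=0.0, bias_corrected=True)
print(uncorrected, corrected)  # approximately 1.0 and 0.8
```

With these assumed inputs, removing the bias term lowers the combined uncertainty, which is the contrast that panels (b) and (d) are intended to show.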

**Figure 6.** (**a**) Unweighted composite of chl from 1 to 8 May 2013 using the mean of ln(chl), shown at the top of the scale bar in mg m^{−3}; (**b**) number of chl values contributing to each pixel, shown at the bottom of the scale bar.
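An unweighted composite of the kind described in the Figure 6 caption can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions: the scenes are already mapped to a common grid, missing pixels are stored as NaN, and the composite is reported as the exponential of the mean of ln(chl). The function name and array layout are hypothetical.

```python
import numpy as np


def unweighted_composite(chl_stack):
    """Per-pixel exp(mean(ln(chl))) and count of contributing values.

    chl_stack: array of shape (n_scenes, ny, nx), NaN where data are missing
    (assumed layout, for illustration only).
    """
    chl_stack = np.asarray(chl_stack, dtype=float)
    # Replace non-finite entries so the comparison and log below are safe.
    safe = np.where(np.isfinite(chl_stack), chl_stack, 0.0)
    valid = safe > 0
    count = valid.sum(axis=0)
    # ln(1) = 0, so invalid pixels add nothing to the sum of logs.
    log_sum = np.log(np.where(valid, safe, 1.0)).sum(axis=0)
    mean_ln = np.full(count.shape, np.nan)
    np.divide(log_sum, count, out=mean_ln, where=count > 0)
    # Pixels with no valid data stay NaN after exponentiation.
    return np.exp(mean_ln), count


# Two hypothetical 1 x 2 scenes; the second pixel is always missing.
stack = np.array([[[1.0, np.nan]], [[4.0, np.nan]]])
composite, n_values = unweighted_composite(stack)
```

Averaging in ln space makes the composite a geometric mean, which suits a quantity such as chl that is modeled on a log scale throughout the paper.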

**Figure 7.** Weighted composites and their uncertainty. (**a**) Uncorrected weighted composite of chl in mg m^{−3} from 1 to 8 May 2013 using the mean of ln(chl); (**b**) uncertainty in the uncorrected composite due to bias, standard deviation, standard error, and estimated natural variability of ln(chl); (**c**) bias corrected weighted chl composite; (**d**) uncertainty in the corrected composite due to standard deviation, standard error, and estimated natural variability of ln(chl).

**Figure 8.** Natural chl variation used to account for variability greater than the uncertainty predicted by the model. Light grey regions have fewer than two measurements; white regions are zero.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Land, P.E.; Bailey, T.C.; Taberner, M.; Pardo, S.; Sathyendranath, S.; Nejabati Zenouz, K.; Brammall, V.; Shutler, J.D.; Quartly, G.D. A Statistical Modeling Framework for Characterising Uncertainty in Large Datasets: Application to Ocean Colour. *Remote Sens.* **2018**, *10*, 695.
https://doi.org/10.3390/rs10050695
