Consistency of Radiometric Satellite Data over Lakes and Coastal Waters with Local Field Measurements

Abstract: The Sentinel-3 mission launched its first satellite, Sentinel-3A, in 2016, to be followed by Sentinel-3B and Sentinel-3C, to provide long-term operational measurements over Earth. Sentinel-3A and 3B are in full operational status, allowing global coverage in less than two days, usable to monitor optical water quality and provide data for environmental studies. However, due to limited ground truth data, the product quality has not yet been analyzed in detail with a fiducial reference measurement (FRM) dataset. Here, we use a fully characterized ground truth FRM dataset for validating Sentinel-3A Ocean and Land Colour Instrument (OLCI) radiometric products over optically complex Estonian inland waters and Baltic Sea coastal areas. As consistency between satellite and local data depends on the uncertainty in field measurements, the in situ data were filtered by their uncertainty for the final comparison. We have compared various atmospheric correction methods and found POLYMER (POLYnomial-based algorithm applied to MERIS) to be the most suitable for the optically complex waters under study in terms of product accuracy and amount of usable data, and also the least influenced by the adjacency effect.


Introduction
There is a growing constellation of satellite sensors providing Earth observation data for monitoring aquatic ecosystems. These ecosystems provide complex conditions for optical remote sensing in terms of optical water types. The adjacency to land also causes bias due to multiple scattering of the light detected by the sensor [1] which needs to be removed when the aim is to detect optical properties of the water.
Sentinel-3 (S3) is an ocean and land mission that consists of three satellites, S3A, S3B and S3C, providing environmental monitoring data under the Copernicus program until 2030 [2,3]. There are four different instruments onboard the S3 satellites, of which the Ocean and Land Colour Instrument (OLCI) is a medium-resolution imaging spectrometer aimed at monitoring optical water quality. OLCI fulfills many of the mission objectives, e.g., measuring ocean and land surface color, monitoring seawater quality and pollution and monitoring land use change. OLCI is also the main contributor to monitoring inland waters [4]. One of the S3 mission requirements is that the measurements and products shall include uncertainty estimates [4]. The uncertainties shall be within 5% for the radiometric data. Currently in OLCI-A open water products, the water-leaving reflectance ρ_wN partly meets the S3 mission requirements at averaged global and temporal scales [5], where bands at 490, 510 and 560 nm are within the 5% mission requirement uncertainty for all water types; bands 400, 412 and 442 are

In Situ Measurements
The above-water radiometric measurements were performed from a research vessel (approximately 6 m long) at a height of about 2 m above the water surface. At each station, the vessel was anchored and radiometric measurements were recorded for at least 15 min. The water-leaving reflectance spectra were calculated from the well-synchronized time series measured with three above-water TriOS-RAMSES hyperspectral radiometers following the REVAMP protocol [11]. Calculations included the following steps: firstly, all measured radiance and irradiance spectra were corrected for stray light [6,12]; secondly, spectral response functions of OLCI bands were used to convolve the spectra into OLCI band values; thirdly, the time series of water-leaving reflectance ρ_wN was calculated as

$$\rho_{wN}(\lambda) = \pi R_{rs}(\lambda) = \pi \, \frac{L_u(\lambda) - \rho(W)\, L_d(\lambda)}{E_d(\lambda)} \qquad (1)$$

where R_rs(λ) is the remote sensing reflectance, L_u(λ) is the upwelling radiance from the sea, L_d(λ) is the downwelling radiance from the sky, E_d(λ) is the downwelling irradiance and ρ(W) is the sea surface reflectance as a function of wind speed (W, m·s−1), calculated as [11]

$$\rho(W) = 0.0256 + 0.00039\,W + 0.000034\,W^2 \qquad (2)$$

Next, the Near-Infrared (NIR) similarity correction [14] was applied to the water-leaving reflectance according to Ruddick et al. [13]. The constant parameter α of the correction is determined in [13] and depends on the choice of wavelengths λ1 and λ2; α = 2.35 for λ1 = 720 nm and λ2 = 780 nm, the pair used here. After that, for each measurement station, the median of R_rs(560) (OLCI band value centered at 560 nm) was calculated. Any spectrum deviating from the median by more than ±10% was excluded from further analysis in order to eliminate outliers due to changing measurement and illumination conditions. Finally, the mean water-leaving reflectance with its uncertainty [15] was calculated for each measurement station.
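The per-spectrum part of this processing chain can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes R_rs = (L_u − ρ(W)·L_d)/E_d and ρ_wN = π·R_rs as the above-water relations consistent with the quantities defined above, and it assumes the NIR similarity offset has the form ε = (α·ρ(780) − ρ(720))/(α − 1), which follows from requiring ρ(720)/ρ(780) = α after subtracting a spectrally flat offset. All function names are hypothetical.

```python
import math
import statistics


def sea_surface_reflectance(wind_speed):
    """rho(W) from Equation (2); wind speed W in m/s."""
    return 0.0256 + 0.00039 * wind_speed + 0.000034 * wind_speed ** 2


def remote_sensing_reflectance(l_u, l_d, e_d, wind_speed):
    """Assumed above-water relation: R_rs = (L_u - rho(W) * L_d) / E_d."""
    return (l_u - sea_surface_reflectance(wind_speed) * l_d) / e_d


def water_leaving_reflectance(rrs):
    """Assumed relation rho_wN = pi * R_rs."""
    return math.pi * rrs


def nir_similarity_offset(rho_720, rho_780, alpha=2.35):
    """Assumed flat offset epsilon; subtract it from every band of the spectrum."""
    return (alpha * rho_780 - rho_720) / (alpha - 1.0)


def indices_within_median(rrs_560_series, tolerance=0.10):
    """Indices of spectra whose Rrs(560) lies within +/-10% of the station median."""
    med = statistics.median(rrs_560_series)
    return [i for i, value in enumerate(rrs_560_series)
            if abs(value - med) <= tolerance * abs(med)]
```

For example, `indices_within_median([0.010, 0.0102, 0.0098, 0.013])` keeps the first three spectra and drops the last one, which deviates from the median by more than 10%.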
Additionally, environmental parameters such as wind speed, cloudiness, sun condition, solar elevation angle, wave height, Secchi depth and concentrations of optically significant constituents (OSC, e.g., concentration of chlorophyll-a, concentration of total suspended matter (TSM) and absorption coefficient of colored dissolved organic matter at wavelength 442 nm, a_CDOM(442)) were measured. The wind speed was measured with a handheld mechanical anemometer. The overall sky cloudiness, the presence of clouds in front of the sun and the wave height were estimated by visual inspection. For cloudiness, a 100-point scale was used, from 0 for clear sky to 100 for fully covered. Sun condition was classified into four groups: clear, partially covered, through optically thin clouds and fully covered. For Secchi depth, a white disk with a 30 cm diameter was used and measurements were performed on the shaded side of the vessel. The solar elevation angle was calculated from the measurement time and geographic coordinates. For concentrations of OSC, the water samples were collected from the water surface (up to 0.5 m depth) and analyzed using the methods of Lindell et al. [16]. Chlorophyll-a (Chl a) was measured spectrophotometrically with a Hitachi U-3010 spectrophotometer and calculated according to the method of Jeffrey and Humphrey [17]. TSM was measured gravimetrically. Lastly, a_CDOM(442) was derived by filtering the water sample through a filter with a pore size of 0.2 µm and measuring the filtrate in a 5 cm optical cuvette against distilled water with a Hitachi U-3010 spectrophotometer.

Uncertainty Budget
Remote Sens. 2020, 12, 616
The methodology used for the uncertainty evaluation is consistent with the ISO Guide to the Expression of Uncertainty in Measurement (GUM) [18]. The evaluation is based on the measurement model, which describes the output quantity Y as a function f of input quantities Xi: Y = f(X1, X2, X3, ...). For example, for remote sensing reflectance Rrs(λ), Equations (1) and (2) are used. For every input quantity Xi, an estimate xi and a standard uncertainty u(xi) are evaluated, which are considered as parameters of the probability distribution describing Xi. The combined standard uncertainty uc(y) of the output estimate is calculated from the standard uncertainties associated with each input estimate xi, using a first-order Taylor series approximation of y = f(x1, x2, x3, ...). There are two types of standard uncertainties: Type A is of statistical origin; Type B is determined by other means. Both types of uncertainties are expressed as standard deviations, denoted correspondingly by s and u. In the calibration of array spectrometers, the uncertainty contribution arising from averaging a large number of repeatedly measured spectra is considered as Type A. Contributions from calibration certificates (standard lamp, diffuse reflectance panel, multimeter, current shunt, etc.), but also from instability and spatial non-uniformity of the lamp, are considered as Type B.
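For reference, the first-order Taylor propagation described above is usually written in the general GUM form below. The correlation coefficient r(x_i, x_j) is standard GUM notation rather than a symbol used elsewhere in this text; for uncorrelated inputs the second sum vanishes.

```latex
u_c^2(y) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^2 u^2(x_i)
         + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N}
           \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j}\,
           u(x_i)\, u(x_j)\, r(x_i, x_j)
```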
Radiometric calibration of the irradiance and radiance sensors and their uncertainty budgets are described in [19]. The uncertainty of radiometric calibration stated in [19] was successfully verified in an international comparison between four participants (Tartu Observatory of the University of Tartu, National Physical Laboratory, The Joint Research Centre, TriOS) in 2016, and since 2018, the respective calibration services have been accredited by the Estonian Accreditation Centre (EAK). Information about the use of calibrated sensors for laboratory and field measurements, and about the evaluation of corrections due to different effects and/or the respective uncertainty contributions without corrections applied, is given in [6,7]. Additional information about the long-term instability of sensors can also be found in [6].
In the three-radiometer system used for the determination of Rrs(λ) in this paper, the same standard lamp was used for the calibration of all three sensors measuring, respectively, Ed, Lu and Ld. Therefore, the system calibration accounts for mechanical alignment of the lamp, plaque and sensors, for inadequate baffling, for only the short-term instability of the irradiance standard, and for the uncertainty of the diffuse reflectance plaque. The contribution of the lamp calibration uncertainty cancels out almost fully.
The following components were included in the uncertainty budget of the in situ Rrs(λ): radiometric calibration of the three-radiometer system, responsivity drift of sensors after calibration, temperature effects, interpolation to the common wavelength scale, angular response of the irradiance sensor, wind speed uncertainty, uncertainty in the stray light correction and contribution due to polarization effects (all Type B estimates); and repeatability of the recorded time series corrected for lag-1 autocorrelation, and uncertainty in the NIR similarity correction (both Type A estimates).
Due to the unstable nature of natural illumination, a rather strong autocorrelation was often visible in individual time series recorded during field measurements. Consequently, besides white noise, a relatively high contribution of 1/f type noise can be expected, and the effective number of repetitions will be substantially reduced. Thus, in this case, the effective number of independent measurements has to be considered instead of the actual number of data points in the recorded series [20]:

$$n_e = n \, \frac{1 - r_1}{1 + r_1} \qquad (3)$$

where n is the actual number of data points and r_1 is the lag-1 autocorrelation coefficient of the analyzed time series. If the averaged values of different radiometers are used in calculations, then due to these random drifts in the time series, zero correlation between the signals of different radiometers cannot be expected, and the respective correlations shall always be estimated and accounted for. The calculation scheme of Rrs(λ) used in this article is based on example H2.4 of the ISO GUM [18], where the output time series is determined from three sets of simultaneously obtained observations. By using this approach, the combined uncertainty of Rrs(λ) is estimated from the time series of Rrs(λ) output spectra, and the evaluation of correlations between input quantities is not needed. As the values of the time series are statistically dependent, the effective sample size from Equation (3) is used for the uncertainty of the averaged value. The NIR similarity correction is calculated for every spectrum of water-leaving reflectance ρ_wN, and these values are then averaged. For the uncertainty of the averaged NIR similarity correction value, the effective sample size estimated from Equation (3) has also been used. Finally, the spectra of the relative uncertainty components were convolved to the OLCI band values used for comparison.
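The effect of autocorrelation on the Type A uncertainty of a station mean can be illustrated with a short Python sketch. It assumes the effective sample size takes the common form n·(1 − r1)/(1 + r1) and that the standard uncertainty of the mean is s/√n_e. The clipping of negative r1 to zero and the lower bound of one effective sample are choices made for this illustration only, not steps taken from the text, and the function names are hypothetical.

```python
import math
import statistics


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation coefficient r1 of a time series."""
    mean = sum(series) / len(series)
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        return 0.0  # constant series: treat as uncorrelated
    return sum(dev[i] * dev[i + 1] for i in range(len(dev) - 1)) / denom


def effective_sample_size(series):
    """n_e = n * (1 - r1) / (1 + r1), kept within [1, n] (illustrative choice)."""
    n = len(series)
    r1 = max(lag1_autocorrelation(series), 0.0)
    return max(n * (1.0 - r1) / (1.0 + r1), 1.0)


def standard_uncertainty_of_mean(series):
    """Type A standard uncertainty of the mean using the effective sample size."""
    return statistics.stdev(series) / math.sqrt(effective_sample_size(series))
```

For a slowly drifting series such as `[1, 1, 1, 1, 2, 2, 2, 2]`, r1 is 0.625 and the eight recorded points count as fewer than two independent measurements, so the uncertainty of the mean is roughly twice what the naive s/√n would give.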

Atmospheric Correction Processors for OLCI Data
S3A OLCI L1 and L2 Full Resolution Non Time Critical data were downloaded from databases CODAREP (period 2016-2017, baseline 2.23) and CODA (2018, baseline 2.42). Same-day match-ups were used and the distance from the shoreline and the time difference between the satellite overpass and in situ measurements were derived. A 1 × 1 pixel area was used as a satellite match-up point.
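A 1 × 1 pixel match-up of this kind can be sketched as a nearest-pixel search plus a time difference. The sketch below is a minimal illustration, assuming the pixel latitudes and longitudes are available as 2-D lists and using a spherical Earth of radius 6371 km; the function names and example values are hypothetical and do not come from the processing chain used in the study.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_pixel(station_lat, station_lon, pixel_lats, pixel_lons):
    """Return (row, col, distance_km) of the pixel closest to the station."""
    best = None
    for row, (lat_row, lon_row) in enumerate(zip(pixel_lats, pixel_lons)):
        for col, (lat, lon) in enumerate(zip(lat_row, lon_row)):
            dist = haversine_km(station_lat, station_lon, lat, lon)
            if best is None or dist < best[2]:
                best = (row, col, dist)
    return best


def time_difference_hours(overpass, in_situ):
    """Absolute time difference between satellite overpass and field measurement."""
    return abs((overpass - in_situ).total_seconds()) / 3600.0
```

The same `haversine_km` helper could be reused to estimate the distance from a station to the nearest shoreline point, given a list of shoreline coordinates.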
S3 OLCI L1 are geo-located top-of-atmosphere radiance products, which have passed quality checks and radiometric calibration with pixel classification, correction for atmospheric gases and smile effect correction. S3 OLCI L2 are atmospherically corrected products produced by using two different AC methods (Baseline Atmospheric Correction (BAC) and Alternative Atmospheric Correction) in parallel to ensure similarity and consistency with MERIS products. BAC is based on the AC previously developed for MERIS [21], which also includes a Bright Pixel Correction [22]. BAC is based on a coupled atmosphere-hydrological model using spectral optimization inversion to output water-leaving reflectances. It includes sun glint and whitecaps correction, with glint detected per pixel using certain thresholds. To select the correct band for the further AC procedure, a Case 2 NIR reflectance estimation is performed based on radiometry, which includes aerosol and Rayleigh correction. Uncertainties of L1 and L2 products do not contain the full uncertainty budget at the moment; therefore, the developers suggest them only for qualitative analyses [23]. Quality control was done by excluding pixels flagged as WQSF_lsb_CLOUD, WQSF_lsb_CLOUD_AMBIGUOUS, WQSF_lsb_CLOUD_MARGIN, WQSF_lsb_COSMETIC, WQSF_lsb_SUSPECT, WQSF_lsb_HISOLZEN, WQSF_lsb_SATURATED, WQSF_lsb_HIGHGLINT, WQSF_lsb_OCNN_FAIL or WQSF_lsb_AC_FAIL, while pixels flagged as WQSF_lsb_BPAC_ON were included.
POLYMER (POLYnomial-based algorithm applied to MERIS) is an AC processor originally developed for MERIS products to remove sun glint effects over ocean waters; further development made it applicable to optically complex waters. The AC procedure uses a spectral matching method based on a polynomial atmospheric model and a bio-optical water reflectance model, which use all the spectral bands in the visible spectrum. The parameters of both models are optimized to obtain the best spectral fit. Unlike other AC processors, POLYMER is not based only on NIR bands, which makes the processor able to derive the water-leaving reflectance in the presence of sun glint. For processing the data for the analysis, default parameters were used [24,25]. Quality control was done by including pixels where the layer "Bitmask" had values 0 and 1024.
Case 2 Regional CoastColor (C2RCC), originally developed by Doerffer and Schiller [26], is an atmospheric correction processor for optically complex Case 2 waters, which is trained for and able to work in extreme conditions of scattering and absorption. It is based on neural network inversion, which uses a large database of radiative transfer simulations of water-leaving reflectance and top-of-atmosphere (TOA) radiances, taking into account certain water parameters (temperature, salinity) and atmospheric conditions (ozone, air pressure) [27]. For processing the data for the analysis, default parameters were used, except for salinity, which was set to 0.0001 for inland waters.
Alternative Neural Net (ALTNN) is a combined AC processor of C2RCC and Case-2 Extreme. It is based on the same neural network system as C2RCC, but it has been improved and revised for more accurate results. It is a test version for OLCI and MERIS data with an extended training range and a larger number of training samples, which reduces noise in the results and gives the opportunity to derive more reliable results [27].
For both C2RCC and ALTNN, pixels flagged as Rhow_OOS, Cloud_risk, Rhow_OOR, Rtosa_OOR, Rtosa_OOS or quality_flags_sun_glint_risk were excluded from the analyses.
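The quality-control rules listed above reduce to simple membership tests once the flags of a pixel are known. The following Python sketch assumes the raised flags of each pixel have already been decoded into a collection of flag names; how the flags are read from the product files is outside its scope, and the function names are hypothetical. The flag names and the POLYMER "Bitmask" values are the ones given in the text.

```python
# Flags that disqualify a pixel of the standard (BAC) product.
# WQSF_lsb_BPAC_ON is deliberately absent: such pixels are kept.
STANDARD_EXCLUDE = {
    "WQSF_lsb_CLOUD", "WQSF_lsb_CLOUD_AMBIGUOUS", "WQSF_lsb_CLOUD_MARGIN",
    "WQSF_lsb_COSMETIC", "WQSF_lsb_SUSPECT", "WQSF_lsb_HISOLZEN",
    "WQSF_lsb_SATURATED", "WQSF_lsb_HIGHGLINT", "WQSF_lsb_OCNN_FAIL",
    "WQSF_lsb_AC_FAIL",
}

# Flags that disqualify a C2RCC or ALTNN pixel.
NEURAL_NET_EXCLUDE = {
    "Rhow_OOS", "Cloud_risk", "Rhow_OOR", "Rtosa_OOR", "Rtosa_OOS",
    "quality_flags_sun_glint_risk",
}

# POLYMER pixels are kept only for these "Bitmask" values.
POLYMER_VALID_BITMASK = {0, 1024}


def standard_pixel_ok(raised_flags):
    """True if none of the excluded standard-product flags is raised."""
    return STANDARD_EXCLUDE.isdisjoint(raised_flags)


def neural_net_pixel_ok(raised_flags):
    """True if none of the excluded C2RCC/ALTNN flags is raised."""
    return NEURAL_NET_EXCLUDE.isdisjoint(raised_flags)


def polymer_pixel_ok(bitmask_value):
    """True if the POLYMER Bitmask value is one of the accepted values."""
    return int(bitmask_value) in POLYMER_VALID_BITMASK
```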
Data Analysis
First, the in situ dataset was analyzed with Principal Component Analysis (PCA) to visualize the variation present in the in situ radiometric dataset in relation to the uncertainty budget. PCA analyses were performed with R software using the ggbiplot package and prcomp function. Inputs for the PCA were the concentrations of (1) Chl a (mg·m−3) ...
Second, the accuracy of the satellite-derived remote sensing reflectance Rrs(λ)olci,i was compared against the in situ measured Rrs(λ)insitu,i values. Mean Absolute Percentage Difference (MAPD) was applied to investigate dispersion and Mean Percentage Difference (MPD) to investigate bias:

$$\mathrm{MAPD} = \frac{100}{N} \sum_{i=1}^{N} \frac{\left| R_{rs}(\lambda)_{olci,i} - R_{rs}(\lambda)_{insitu,i} \right|}{R_{rs}(\lambda)_{insitu,i}} \qquad (4)$$

$$\mathrm{MPD} = \frac{100}{N} \sum_{i=1}^{N} \frac{R_{rs}(\lambda)_{olci,i} - R_{rs}(\lambda)_{insitu,i}}{R_{rs}(\lambda)_{insitu,i}} \qquad (5)$$

Here, Rrs(λ)insitu,i and Rrs(λ)olci,i are, respectively, the in situ and OLCI-derived values for the band λ and match-up i, and N is the number of match-ups.

In Situ Dataset in Terms of Associated Uncertainties
Figure 1 shows the in situ measured water-leaving reflectance processed to OLCI wavelengths (a), the corresponding uncertainty estimates (b) and the relationship between Rrs and uncertainty (c) at selected wavelengths. The uncertainties are highest in the short visible bands from 400 nm and decrease toward green wavelengths (560 nm). For bands starting from 753.75 nm, the uncertainties increase toward longer wavelengths. Table A1 shows that the median uncertainty is less than 10% for bands 490-708.75 nm, with the lowest median uncertainty of 3.9% for the 560 nm band. For bands 510-708.75 nm, 50% of the measurements were obtained with <5% uncertainty, whereas for bands 400-442.5 nm and 753.75-885 nm, less than 15% of the measurements were obtained with <5% uncertainty (Table A1). Figure 1c shows elevated uncertainty for some measurements in the case of a weaker signal from the water; however, there is still a cluster of measurements with lower uncertainties independent of the signal strength.
To show the contribution of various components to the full uncertainty budget, data from two stations with similar optical water quality but different environmental conditions (wind speed, cloudiness) were analyzed.
The left panel in Figure 2 shows spectra measured at one station in challenging conditions (upper panel): wind speed 2 m·s−1, overall cloudiness 90%, sun partially covered during the measurements (ID 839 in Table A2); and at another station in good conditions (lower panel): wind speed 1 m·s−1, overall cloudiness 5%, no clouds in front of the sun (ID 786 in Table A2). For both stations, the median hyperspectral spectra and the ones calculated at OLCI wavelengths (blue) have similar shape and magnitude, although their uncertainty estimates (gray line) differ greatly. The uncertainty budget (right panel in Figure 2) shows that, for the upper panel, the main contribution comes from environmental conditions due to the high standard deviation in the station spectra caused by changing and challenging conditions.
Figure 2. Spectra (left) and uncertainty budget (right) for panel (a) challenging conditions (ID 839 in Table A2) and panel (b) good conditions (ID 786 in Table A2).

PCA
To study the factors associated with different levels of uncertainty in the Rrs data (Figure 2a,b), a PCA was applied to the full in situ dataset. Ten input parameters were used for the PCA: TSM, Chl a, a_CDOM(442), Secchi depth, Rrs(560), solar elevation angle, overall sky cloudiness, wind speed, wave height and sun conditions. This resulted in 10 principal components, and the contribution from each parameter is shown in Table 2.
Based on the calculated uncertainty budget for the 442 nm band, four categories of uncertainty level were derived. The category was used as an additional layer of information for each point in interpreting the PCA results.

The first principal component (PC1) described 30.8% of the variance in the dataset and had the highest contribution from TSM, Chl a and Secchi depth (Table 2, Figure 3). Therefore, PC1, the component explaining the highest variation in the in situ data, can be associated with the optical properties of the water. Figure 3 shows how each in situ measurement is positioned in terms of principal components and the associated level of uncertainty estimated at the 442 nm band. There is no association between the level of uncertainty and the variables contributing most to PC1 (Figure 3). PC2, describing 26.4% of the variance, can be associated mainly with changes in the solar elevation angle and overall sky cloudiness (Table 2). Based on the PCA results, PC2 cannot be used either to differentiate points with different levels of measurement uncertainty.
Based on PC3 and PC4, different clusters were formed in terms of low (<5% uncertainty at the 442 nm band) and high (>70% uncertainty at the 442 nm band) measurement uncertainty (Figure 3). Both PC3 and PC4 are determined by environmental conditions. The main contribution to PC3 comes from wave height and wind speed, and to PC4 from sun conditions (whether or not there are clouds in front of the sun) and overall sky cloudiness (Table 2). Based on PC3 and PC4, measurements with a lower level of uncertainty are associated with lower wave height and wind speed (PC3) and also good illumination conditions (clear sky and no clouds in front of the sun) (PC4).
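The PCA itself was run in R with prcomp; purely as an illustration of the computation, the following NumPy sketch performs a PCA on standardized variables through a singular value decomposition. Standardizing each variable to unit variance is an assumption made here because the ten inputs have different units; the text does not state which scaling was used. The demo data are random numbers, not the study dataset, and the function name is hypothetical.

```python
import numpy as np


def pca_standardized(data):
    """PCA of an (n_samples, n_variables) array after centering and scaling.

    Returns the sample scores, the variable loadings (one column per
    principal component) and the fraction of variance explained by each PC.
    """
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u * s                    # coordinates of each sample on the PCs
    loadings = vt.T                   # contribution of each variable to each PC
    explained = s ** 2 / np.sum(s ** 2)
    return scores, loadings, explained


# Demo with synthetic data: 50 hypothetical stations, 10 input parameters.
rng = np.random.default_rng(0)
demo = rng.normal(size=(50, 10))
scores, loadings, explained = pca_standardized(demo)
```

Plotting the first columns of `scores` against each other, colored by the uncertainty category of each station, would give a biplot-style view comparable to the one described for Figure 3.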

Spatial and Temporal Effects on Combining Satellite and In Situ Data
The S3A OLCI image (Figure 4) coupled with the in situ sampling dataset from 14 June 2016 was analyzed for the performance of various AC processors in comparison with in situ data in terms of changing temporal and spatial conditions.

The reference measurements, the Rrs derived by the AC processors and the environmental conditions for each station are shown in Figure 5. Each measurement was performed in conditions where no clouds were in front of the sun (Figure A1). The overall sky cloudiness was higher at the first two stations (30% and 10%, respectively) but stayed constant at 5% for the following stations (Figure A1). The clouds were Cirrus and Cirrostratus for the first three stations, and later Cumulus and Cirrostratus. Wind speed changed from 0 to 5.5 m·s−1 and wave height from 0 to 0.3 m, both increasing toward the evening (except at station #5, at 11:14 UTC, Figure A1). The in situ measured Rrs(λ) with associated uncertainties compared to quality-controlled AC retrievals (Figure 5) show that, although the image was cloud-free (Figure 4), the number of retrievals varies station by station. The lower two panels (Figure 5) show the ratio of the AC processor-derived Rrs(λ) to the in situ measured Rrs(λ) with the respective uncertainty. The uncertainties are higher at the first station and the last three stations, while stations #2 to #5 have very low uncertainties, as the measurements were performed in conditions with low wave height and wind speed in combination with a solar elevation angle above 40 degrees. The optical properties of water are similar at the first two stations (Figure A1), where the main changes are in the overall sky cloudiness (decrease from 30% to 10%), solar elevation angle (from 36 to 43 degrees) and a slight increase in wind speed (from 0 to 2.5 m·s−1) and wave height (from 0 to 0.05 m).
At the first two stations, Chl a, TSM and a_CDOM absorption are the highest; they decrease at the following stations, where they stay fairly constant (Figure A1).
Figure 5. In situ measured Rrs(λ) with uncertainties and AC processor retrievals at each station (IDs in Table A2); the changes in the sky conditions, the temporal and spatial changes in the measurement conditions and the optical properties of water can be found in Figure A1.
The best performance of each AC processor is at station #5, measured at 11:14 UTC, where all processor-derived Rrs pass the quality control by flags. This station is about 13 km away from the coast. The most complex conditions were at station #2 of the day (at 7:14 UTC), where measurements were performed in a very narrow part of the lake (Figure 4), just 0.8 km from the nearest coast. Figure 5a shows the changes in the Rrs uncertainty during one measurement campaign. Uncertainty was higher during early morning and evening measurements, which can be linked with the solar elevation angle (Figure A1) and also with the wind speed in the evening measurements (3.5-5 m·s−1). At the midday stations, in good measurement conditions (Figure A1), the uncertainties stayed low over the whole spectrum, e.g., at station #4, where uncertainty was 1.1-1.5% for bands 490-708.75 nm, and between 2% and 6.5% for shorter and longer wavelengths. In contrast, at the second to last station, #7, the uncertainty was 3-5.9% for bands 490-708.75 nm, and between 12.1% and 33% for shorter and longer wavelengths, which can be explained by changes in the environmental conditions, e.g., lower solar elevation angle (36.7 degrees) and higher wind speed (5.5 m·s−1).
Although the uncertainties for the first three OLCI bands (up to 442.5 nm) are higher, only POLYMER derives Rrs within the limits of the in situ uncertainty, and only in a few cases; all other AC results tend to under- or overestimate. While the standard AC strongly underestimates bands up to 560 nm, its accuracy is comparable with other processors for longer wavelengths. For bands 665, 673.75 and 681.25 nm, AC-derived Rrs is slightly (about 20-30%) underestimated. While POLYMER tends to show the most accurate spectra, it has the highest inaccuracies at the 865 nm band, where all other AC processors perform better.

Validation of AC Processors on All Match-Ups
Visual observation of spectra from all match-up stations showed that in productive waters (Chl a > 20 mg·m−3), the products of C2RCC and ALTNN (Figure 6a) tend to give peak reflectance at 620 nm instead of 560 nm as measured in situ. C2RCC and ALTNN products are often flagged out over absorbing waters (a_CDOM > 1.5 m−1) with Chl a level < 20 mg·m−3 (Figure 6d, ID 2228-2288 in Table A2) regardless of the distance from the shore. It was also noted that the agreement between the processors increases farther from the shore (Figure 6b), except for the bands up to 510 nm in the case of the standard product. In the vicinity of land (Figure 6c), the discrepancies between the various AC processors become larger, although POLYMER tends to be least affected by the adjacency effect.

Figure 6. Rrs spectra derived by the AC processors at selected match-up stations, panels (a) to (d) (IDs in Table A2). Red denotes in situ measurements and corresponding uncertainty at every wavelength.
Table A2 shows the overview of bio-optical properties, environmental variables and adjacency to land for each match-up point with a reference to Rrs at 560 nm, measured in situ or derived by AC processors. Combining the results from regression analyses (Figures A2-A5) and Table A2, good conditions for satellite retrievals can be associated with a large distance from land (>7 km, e.g., ID 786, 787, 842, 1985 and 1986 in Table A2) in combination with good illumination conditions (no clouds in front of the sun, wind speed 0.5 to 5.5 m·s−1, wave height < 0.15 m). The uncertainties at 560 nm stayed under 14%. In these cases, the difference between the processors was usually < 10% and the deviation from the in situ data < 15% (Table A2). The highest errors between in situ and satellite-derived Rrs (ID 903, 901 in Table A2) can be associated with low solar elevation angle (<30 degrees) in combination with high Chl a (>34 mg·m−3), which results in high uncertainty for the in situ measurements (78.3% and 20.9% at 560 nm, respectively). Additionally, measurements performed with high wind speed (5 m·s−1) in combination with high wave height (0.4 m) are associated with higher uncertainties; the errors are high between the AC processor retrievals and in situ data (20% to 50%) but low between the processors (2-13%, ID 894, 898 in Table A2).

Filtering the In Situ Data Based on the Uncertainty
The difference between the in situ measured and AC-derived Rrs is highest for the 400 and 412.5 nm bands for every processor (Figure 7, Figures A2-A4 and Figure A5a). Based on the statistics, the POLYMER-derived Rrs is the most accurate for all bands except 865 nm. POLYMER-derived Rrs values are well aligned around the 1:1 line (Figure A2), with a smaller bias than the other processors.
Bands up to 510 nm are overestimated by ALTNN and C2RCC (MPD 114% and 107%, respectively, at 400 nm), slightly overestimated by POLYMER (MPD 57%) and strongly underestimated by the standard products (MPD −463%). All processors achieve better accuracy at longer wavelengths. For POLYMER, ALTNN and C2RCC, the band at 560 nm is derived with the highest accuracy: MAPD 21% (POLYMER), 32% (ALTNN) and 28% (C2RCC). In general, for all AC processors, the derived Rrs for bands from 753 nm onwards shows higher scatter than for the green bands, although the majority of the estimates are within the limits of the associated in situ measurement uncertainty (Figure 7, and in more detail for each processor in Figures A2-A5).
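The two statistics quoted above can be illustrated with a short sketch. The formulas are not given in this section, so the code assumes the common definitions: MPD as the mean of the signed percentage differences relative to the in situ value (indicating bias), and MAPD as the mean of their absolute values (indicating dispersion). The function names and the Rrs values are hypothetical and are not taken from the match-up dataset.

```python
def mpd(in_situ, satellite):
    """Mean percentage difference (signed), assumed definition; indicates bias."""
    diffs = [100.0 * (s - m) / m for m, s in zip(in_situ, satellite)]
    return sum(diffs) / len(diffs)


def mapd(in_situ, satellite):
    """Mean absolute percentage difference, assumed definition; indicates dispersion."""
    diffs = [100.0 * abs(s - m) / abs(m) for m, s in zip(in_situ, satellite)]
    return sum(diffs) / len(diffs)


# Illustrative Rrs values (sr^-1) for one band at three stations; not real data.
rrs_in_situ = [0.0040, 0.0050, 0.0020]
rrs_satellite = [0.0050, 0.0045, 0.0030]

print(f"MPD  = {mpd(rrs_in_situ, rrs_satellite):.1f}%")
print(f"MAPD = {mapd(rrs_in_situ, rrs_satellite):.1f}%")
```

Under these definitions, a strongly negative MPD of large magnitude, as reported for the standard product at 400 nm, is possible when the retrieved reflectance is negative while the in situ value is small and positive.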
As the median uncertainty of the in situ Rrs varies greatly with wavelength (Appendix A, Table A1), the data were filtered using the criterion of an uncertainty of up to 5% in at least one band. This eliminated the in situ data measured in suboptimal environmental or measurement conditions and improved the accuracy of the AC processor retrievals (Figures 7 and 8; for each processor separately, Figures A2-A5). The number of match-ups decreased by about 55% for each processor.
To identify the reason for outliers in the correlation plots (Figures A2-A4 and Figure A5b), the distance from the 1:1 line was measured for every point and analyzed against different variables. The outliers were most evident close to land, within the first five kilometers, and the errors between the AC-derived and in situ measured Rrs decrease with increasing distance from the shore (Figure 9). This is especially pronounced in the standard OLCI Rrs products, followed by C2RCC and ALTNN. In the case of POLYMER, adjacency has the lowest impact on the quality of the retrievals compared to the other processors. As the match-up dataset contains data pairs with a variable time difference between the in situ measurement and the satellite overpass, Figure 9 also shows that over these inland and coastal waters, time differences from −3 h to +6 h (in situ minus satellite overpass) can be associated with outliers in only a few cases.
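The outlier analysis can be sketched as follows. The text does not state how the distance from the 1:1 line was defined; this sketch assumes the perpendicular distance of each (in situ, satellite) pair from the line y = x and compares its mean magnitude within and beyond 5 km from the shore. The match-up records, field names and values are hypothetical.

```python
import math


def distance_from_one_to_one(x, y):
    """Signed perpendicular distance of the point (x, y) from the line y = x."""
    return (y - x) / math.sqrt(2.0)


# Hypothetical match-ups: distance to shore (km) and Rrs (sr^-1) at one band.
matchups = [
    {"shore_km": 0.4, "in_situ": 0.0040, "satellite": 0.0070},
    {"shore_km": 3.0, "in_situ": 0.0050, "satellite": 0.0062},
    {"shore_km": 8.0, "in_situ": 0.0045, "satellite": 0.0047},
    {"shore_km": 12.0, "in_situ": 0.0060, "satellite": 0.0058},
]


def mean_abs_distance(points):
    """Mean absolute distance from the 1:1 line for a list of match-ups."""
    dists = [abs(distance_from_one_to_one(p["in_situ"], p["satellite"])) for p in points]
    return sum(dists) / len(dists)


near = [p for p in matchups if p["shore_km"] <= 5.0]
far = [p for p in matchups if p["shore_km"] > 5.0]
mean_near = mean_abs_distance(near)
mean_far = mean_abs_distance(far)
print(f"within 5 km: {mean_near:.5f}, beyond 5 km: {mean_far:.5f}")
```

The same distance values could be analyzed against other variables, such as the time difference between the in situ measurement and the satellite overpass.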

Comparability between AC Processors
To avoid conclusions on AC performance being influenced by the different number of match-up points and by processor-based flagging, the statistics were calculated only for those stations where all four processors yielded quality-controlled retrievals (Table 3). This eliminated all the match-up stations representing aCDOM-rich waters, because none of the C2RCC and ALTNN results passed the flag-based quality control under these conditions.
Table 3 shows that POLYMER retrievals have the smallest dispersion over all OLCI bands except 865 nm and, compared to the other processors, substantially smaller errors in the blue bands (400-490 nm). For bands from 560 nm onwards, the retrievals of all processors are relatively close, while the C2RCC-derived Rrs has the highest errors. However, the products of C2RCC and ALTNN tend to give a peak in reflectance at 620 nm instead of at 560 nm in highly productive waters (strong absorption at 681 nm and scattering at 709 nm) and also overestimate Rrs at the 778.75 nm band. Although the standard L2 products give systematically negative reflectance at shorter wavelengths, they show accuracy comparable with the other processors for bands from 560 nm onwards (Table 3, Figure 10).
Table 3. Statistics on match-up data for the tested AC processors, calculated only for the 22 match-up points where each AC processor (Alternative Neural Net, ALTNN; Case 2 Regional CoastColor Processor, C2RCC) resulted in a quality-controlled retrieval.

MAPD (%) MPD (%)
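Restricting the statistics to stations where every processor produced a quality-controlled retrieval amounts to intersecting the sets of valid station IDs. The sketch below illustrates this with hypothetical station IDs and an assumed label "L2" for the standard product; it does not reproduce the 22 match-up points of Table 3.

```python
# Hypothetical station IDs with a quality-controlled retrieval per processor.
valid_ids = {
    "POLYMER": {101, 102, 103, 104},
    "C2RCC": {101, 102, 104},
    "ALTNN": {101, 102, 104, 105},
    "L2": {101, 102, 103, 104, 105},
}

# Keep only stations that passed quality control for all four processors.
common_ids = set.intersection(*valid_ids.values())
print(sorted(common_ids))

# Stations dropped for each processor by requiring a common set.
dropped = {name: sorted(ids - common_ids) for name, ids in valid_ids.items()}
print(dropped)
```

Computing the statistics on the common set keeps the comparison between processors from being driven by differences in flagging.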
As the uncertainty of OLCI products was not available at the time of writing, we included the standard deviation between the outputs of the different AC processors as an indication of the accuracy of the satellite-derived reflectance (Figure 10).
The highest discrepancies between the AC processors are in the blue wavelengths (up to 442.5 nm). In these bands, the POLYMER median Rrs has the closest match to the in situ Rrs and lies within the limits of the in situ measurement uncertainty, while C2RCC and ALTNN overestimate Rrs by about 50% and the standard L2 products strongly underestimate it (Figure 10). For the 490 and 510 nm bands, the consistency increases, and for bands from 560 to 778 nm, all AC processors derive comparable results and deviate by up to 30% from the in situ Rrs spectra. The Rrs at 665 to 681 nm is systematically underestimated by all processors.
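The inter-processor spread used as an accuracy indicator can be sketched as a per-band standard deviation across the processor outputs. The text does not state whether a sample or population standard deviation was used; the sketch assumes the sample form. The processor label "L2" and all Rrs values are illustrative.

```python
import statistics

# Illustrative Rrs (sr^-1) at one station for two OLCI bands (nm); not real data.
rrs_by_processor = {
    "POLYMER": {400: 0.0020, 560: 0.0060},
    "C2RCC": {400: 0.0030, 560: 0.0062},
    "ALTNN": {400: 0.0031, 560: 0.0061},
    "L2": {400: -0.0010, 560: 0.0058},
}


def inter_processor_spread(rrs):
    """Sample standard deviation across processors, computed per band."""
    bands = sorted(next(iter(rrs.values())))
    return {
        band: statistics.stdev([spectrum[band] for spectrum in rrs.values()])
        for band in bands
    }


spread = inter_processor_spread(rrs_by_processor)
for band, sd in spread.items():
    print(f"{band} nm: {sd:.5f}")
```

With these illustrative numbers the spread is larger at 400 nm than at 560 nm, mirroring the pattern described for the blue bands.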

Discussion
In order to improve the comparability between field-measured and satellite-derived radiometric data, the uncertainty budget associated with each dataset has to be considered. This makes it possible to assess the uncertainty originating from each component and therefore to identify the components with the highest uncertainty. This knowledge can direct future research towards decreasing the uncertainty of the components that impede a meaningful comparison between satellite and in situ data.

Sources and Variation of Uncertainties on the In Situ Measured Radiometric Data
SI-traceable radiometric calibration of sensors before the field measurements is indispensable in order to meet the S3 mission uncertainty requirement of 5% for the in situ results. Without such calibration, the relative variability alone between sensors of the same type measuring the same object ranged from 6% to 10%, as revealed during the laboratory inter-comparison of radiometers used for satellite validation [6]. The largest differences between sensors (about 10%) were evident for the blue bands. After uniform SI-traceable calibration, the variability between sensors measuring the same stable source in laboratory conditions was well under 1% over the whole range from 400 nm to 900 nm.
Unfortunately, the radiometric calibration of sensors may still be insufficient for producing firm SI-traceable in situ results, as the conditions during radiometric calibration and during actual field measurements can differ significantly and are therefore not entirely covered in the traceability chain [7,28]. Although traceability may be well in place under "normal" operating conditions, this may not be the case under the full range of realistic operating conditions that may prevail during field measurements [7,28,29]. Major differences between calibration and in-field use arise from different temperature conditions, from differences in the spectral and spatial distributions of the measured radiation, and from the much larger instability and variability of the recorded field signals. All these aspects contribute to a substantial increase in the uncertainty of the field results [7].
In this study, a large part of the in situ measurements did not meet the S3 mission uncertainty requirement. As seen from analyzing the uncertainty budget for the entire in situ dataset, Rrs at OLCI bands shorter than 442 nm and longer than 753.75 nm is obtained with less than 5% uncertainty in only about 10% of the cases. The smallest relative uncertainty of in situ results is usually achieved for the green bands, where the signal level during calibration and during field use is the largest. For the red and NIR bands, the contribution of temperature effects due to the temperature sensitivity of Si-based sensors may be rather large. At the same time, the expected signal level in this range is low. Therefore, relative effects due to different corrections (NIR similarity correction, stray light correction, etc.) can be substantial. For the blue bands, the signal level of the reference radiation source used for calibration is low, and as a consequence, calibration uncertainties are usually the largest. Relative effects due to different corrections are also rather large in this range. In addition, for above-water measurements, the high uncertainty in the blue bands can be due to contamination by sun glint, which should be corrected [30,31]. Therefore, it is a challenge to collect in situ data with low uncertainty that would be suitable for the validation of satellite data, especially for the blue, red and NIR bands. Large uncertainties in the blue bands are also present in the data collected by AERONET-OC stations, especially over the Baltic Sea area [32].
Laboratory and field intercomparison of ocean color radiometers [6,7] has clearly demonstrated that the uncertainty of in situ results cannot be reduced only with enhanced accuracy of radiometric calibration. For achieving better SI-traceability of field measurements, besides crucial radiometric calibration, additional characterization of individual sensors including temperature dependence, nonlinearity, angular response of all radiance and irradiance sensors, spectral stray light and wavelength scale effects is very important. The results of these tests enable considering all possible effects arising from specific field conditions during the in situ measurements.
The uncertainty budget of in situ measurements performed under changing environmental conditions showed a substantial increase in the contributions from the repeatability of the recorded time series and from the uncertainty in the NIR similarity correction; an increase in some systematic effects due to environmental conditions can also be expected. Therefore, it is crucial to perform measurements used for validation in optimal weather conditions.

Benefits of Including Uncertainty Budget for the In Situ Dataset
The PCA showed that the level of uncertainty in the in situ data is associated with various environmental conditions. Based on the dataset used here, it was possible to classify data as having either low or high uncertainty based on the wave height, the wind speed and the illumination conditions (visibility of the sun and overall sky cloudiness). It was shown that the solar elevation angle is an important factor, especially when performing measurements over phytoplankton-rich waters. This kind of analysis should be expanded to cover different optical water types in order to set thresholds for data acquisition and to have confidence in the in situ data that will be used to validate ocean color products.
In this study, we set an uncertainty threshold of 5% in at least one band to filter the in situ data. This filtering decreased the dispersion and bias for most of the AC processor-derived results, and it avoids drawing conclusions from in situ data collected in poor environmental conditions. In multiple cases, however, the performance of the AC processors was poor although the level of uncertainty in the in situ data was low. For these cases, neither the environmental conditions nor the time difference between the satellite overpass and the in situ measurement had an effect. Instead, for each processor and for every band, the distance from the shore was the parameter that most often explained the deviations between the in situ measured and satellite-derived radiometric data. The adjacency effect is an issue for each tested AC processor but was less pronounced in the POLYMER products.
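The filtering rule described above can be expressed compactly: a station is kept if its in situ Rrs uncertainty is at or below the threshold in at least one band. The sketch below is a minimal illustration under that reading of the criterion; the station labels, bands and uncertainty values are hypothetical.

```python
def passes_uncertainty_filter(uncertainty_by_band, threshold_pct=5.0):
    """True if the uncertainty is at or below the threshold in at least one band."""
    return any(u <= threshold_pct for u in uncertainty_by_band.values())


# Hypothetical relative uncertainties (%) of in situ Rrs per OLCI band (nm).
stations = {
    "A": {400: 12.0, 560: 3.5, 865: 20.0},
    "B": {400: 30.0, 560: 14.0, 865: 45.0},
    "C": {400: 5.0, 560: 6.0, 865: 18.0},
}

kept = [sid for sid, unc in stations.items() if passes_uncertainty_filter(unc)]
removed = [sid for sid in stations if sid not in kept]
print("kept:", kept, "removed:", removed)
```

A stricter variant could require the threshold to be met in a specific band or in several bands, which would further reduce the number of match-ups.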
Bulgarelli and Zibordi [1] showed that adjacency perturbations can reach up to 36 km from land in the case of OLCI data. The extent and degree are sensitive to land cover and slightly also to water type (at blue wavelengths), and they also depend on the signal-to-noise ratio of the sensor. It was shown that perturbations induced by the adjacency effect at NIR and visible wavelengths might compensate each other and that biases are not directly linked with the intensity of the reflectance of the nearby land. For our test sites, both poorly reflective (green vegetation) and highly reflective (sand beaches) land cover was present. Bulgarelli and Zibordi [1] found that the adjacency effect increases with water absorption in the case of highly reflective land covers (e.g., sand), and that over bare soil and green vegetation (which was the majority of land cover in this study), the impact might be limited to the first few kilometers for the visible wavelengths. We found that the outliers could be explained by the vicinity to land up to 5 km. Thus, the adjacency effect is an important factor contributing to the accuracy of the satellite-derived products over coastal and enclosed water bodies. Various optical water types, adjacent to a composite of multiple land covers, can result in various perturbations, which should be accounted for when deriving accuracy estimates for satellite-derived radiometric data.

Comparison of Atmospheric Correction Algorithms over Optically Complex Waters
Achieving accurate atmospheric correction over lakes and coastal waters is much more difficult than in a marine environment because of the larger non-uniformity and instability of atmospheric parameters (atmospheric water vapor, temperature, etc.), different water types, shallow water and/or the effects of the vicinity to land.
For the satellite-derived radiometric products, the highest deviations from the in situ data are in the blue bands. This is a challenge partly due to the higher uncertainties in the field data [33] and partly due to the parametrization of the AC processor [34,35]. The green bands are estimated most accurately, and the bands in the red and NIR wavelengths are often estimated within the limits of the in situ uncertainty. C2RCC and ALTNN products often showed an Rrs peak at 620 nm over phytoplankton-rich waters, and the majority of their data was flagged out over absorbing waters (high aCDOM in combination with low Chl a). The standard L2 product strongly underestimates Rrs up to 560 nm but shows results comparable with the other processors at longer wavelengths. This can be partly explained by the system vicarious calibration (SVC) gains, which mitigate the radiometric biases in L1b but are currently not optimal for optically complex waters. For the standard L2 products, there are many ongoing activities (bright pixel correction) and planned activities on SVC (recomputation of gains for OLCI-A, computation for OLCI-B), with the potential to derive SVC gains specific to complex water products, as well as additional improvements to cloud flags (cloud risk, cloud shadow, snow/ice cloud). All of this should increase the accuracy of the OLCI-A operational products for Case 2 waters.
In general, the flagging criteria for satellite data used in this study do not eliminate invalid pixels for the standard L2 products. For the products of POLYMER, ALTNN and C2RCC, the flagging criteria seem reasonable as the few clear outliers were associated with higher uncertainty in the in situ data or the vicinity of the land.
Currently, the POLYMER AC tends to give the most accurate results for the radiometric data over tested inland and coastal waters. It tends to account for the adjacency effect better compared to other processors. It shows the lowest bias compared to in situ values in all bands except 865 nm and performs equally well over different water types.

Conclusions
The Copernicus program has ensured a growing constellation of Sentinel satellites usable for monitoring optical water quality in regional to global applications. Consistency between radiometric satellite data and local field measurements has to be established in order to have confidence in the products derived from remote sensing. The traceability chain for both measurements provides knowledge about the components associated with the highest uncertainty, which should be accounted for and decreased, if possible, to improve the accuracy and precision of the measurements.
The calibration and characterization of the radiometers are essential to know the measurements' uncertainty in controlled conditions. However, the environmental conditions tend to differ from the laboratory conditions causing effects that cannot be properly accounted for. In situ radiometric measurements should be performed in optimal conditions to reduce noise in the data due to poor environmental conditions (high wind speed, wave height, low solar elevation angle). This allows obtaining in situ data with low uncertainty which can be further used to validate the satellite-derived radiometric data.
Of the tested AC processors, POLYMER radiometric products show the best agreement with in situ data and are least influenced by the adjacency effect. The C2RCC and ALTNN processors tend to depend on the level of OAS in water and are not suitable for highly absorbing waters with low Chl a content. OLCI-A operational L2 products strongly underestimate the blue wavelengths, although from the 560 nm band onwards the results from all AC processors are relatively similar.
Figure A5. Correlation between in situ measured (x-axis) and ALTNN-derived rho (y-axis) on OLCI bands. The upper panel (a) shows validation results on all data and the lower panel (b) the results after filtering the in situ data based on the 5% criterion in at least one band. For each match-up point, the color shows the distance from the shore.