Consistency between Satellite Ocean Colour Products under High Coloured Dissolved Organic Matter Absorption in the Baltic Sea

Abstract: Ocean colour (OC) remote sensing is an important tool for monitoring phytoplankton in the global ocean. In optically complex waters such as the Baltic Sea, relatively efficient light absorption by substances other than phytoplankton increases product uncertainty. Sentinel-3 OLCI-A, Suomi-NPP VIIRS and MODIS-Aqua OC radiometric products were assessed using Baltic Sea in situ remote sensing reflectance (R rs) from ferry tracks (Alg@line) and at two Aerosol Robotic Network for Ocean Colour (AERONET-OC) sites from April 2016 to September 2018. A range of atmospheric correction (AC) processors for OLCI-A were evaluated. POLYMER performed best, with a relative percentage difference of <23% at 443, 490 and 560 nm compared to in situ R rs and 28% at 665 nm, suggesting that using this AC for deriving Chl a will be the most accurate. Suomi-VIIRS and MODIS-Aqua underestimated R rs by 35, 29, 22 and 39% and 34, 22, 17 and 33% at 442, 486, 560 and 671 nm, respectively. The consistency between different AC processors for OLCI-A and MODIS-Aqua and VIIRS products was relatively poor. Applying the POLYMER AC to OLCI-A, MODIS-Aqua and VIIRS may produce the most accurate R rs and Chl a products and OC time series for the Baltic Sea.


Introduction
The Ocean and Land Colour Instrument (OLCI) onboard the Sentinel-3 satellite was launched in 2016 and is the latest mission to provide global maps of Chlorophyll a (Chl a) [1]. Chl a is estimated from reflectance at the sea surface, which is derived from the top-of-atmosphere (TOA) radiance after AC to remove absorption and molecular scattering by atmospheric aerosols, water surface glint and whitecaps, as well as signals from neighbouring land, cloud, snow or ice. Accurate AC is crucial in providing the highest quality ocean colour Chl a concentrations that can then be used operationally, to assess water quality and to quantify the role and dynamics of phytoplankton under the influence of climate change. The Copernicus Sentinel-3 mission, together with the NASA and NOAA ocean colour missions (MODIS-Aqua, VIIRS and PACE), will provide the principal platforms for monitoring changes in phytoplankton blooms and biomass from space over the next two decades. Determining the accuracy of R rs is paramount in providing precise Chl a concentrations from satellite ocean colour [2]. The accuracy of OLCI Chl a has been assessed for some open-ocean areas, where an underestimate in OLCI R rs has been reported [3]. In coastal areas, where the signal also comes from total suspended matter (TSM) and coloured dissolved organic matter (CDOM) as well as Chl a, a systematic underestimation of the second reprocessing

OLCI-A, VIIRS and MODIS-Aqua Processors
OLCI-A full-resolution data L1B and L2 products were downloaded from

The Alg@line data set is described in [31] and the methods for processing shipborne data are described in detail in [30]. In brief, the system consists of three RAMSES spectroradiometers (TriOS, Rastede, Germany) mounted near the bow of the ferries M/S Finmaid and Transpaper. The azimuth angle of the instruments was kept as close as possible to 135° and always >90°, using a stepper motor platform with GPS time and location to compensate for the vessel heading [32]. The fingerprint method was used to determine the reflectance of sky radiance at the air-water interface (ρ s) [32,33]. To eliminate spurious observations, the data underwent a rigorous screening procedure based on assumptions of the spectral shapes of reflectance in these highly absorbing and weakly scattering waters [30]. For these waters, we are also interested in the performance of OLCI in the NIR, which can be used in band ratios to estimate the Chl a concentration [16]. It is generally assumed in waters with low particle scattering that NIR reflectance is close to zero [34]. This assumption generally holds in the Baltic Sea, except during peak productivity periods or close to rivers and shallow areas, where there can be higher concentrations of phytoplankton, detrital material or sediment in surface waters. Additionally, residual surface water effects such as spray, sun glint and whitecaps will elevate R rs in the visible and NIR. Individual spectra were inspected to evaluate the shape of the NIR signal for signs of high particle scattering. When no elevated particle scattering was observed, as evidenced by a spectrally flat NIR signal, any offset observed in the NIR was assumed to be caused by spectrally neutral effects and was corrected for by subtracting the mean R rs (850-900) from the entire spectrum [30].
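The NIR offset correction described above can be sketched in a few lines. This is a minimal illustration only, assuming the spectrum is held as NumPy arrays of wavelength (nm) and R rs (sr −1); the function name `subtract_nir_offset` and its arguments are hypothetical and are not taken from the processing chain in [30]. The check that the NIR signal is spectrally flat, which was done by inspecting individual spectra, is not reproduced here.

```python
import numpy as np

def subtract_nir_offset(wavelengths, rrs, nir_range=(850.0, 900.0)):
    """Subtract the mean Rrs over the NIR window from the whole spectrum.

    Assumes the caller has already confirmed that the NIR signal is
    spectrally flat (no elevated particle scattering).
    """
    wl = np.asarray(wavelengths, dtype=float)
    spectrum = np.asarray(rrs, dtype=float)
    # Select the bands that fall inside the NIR window (inclusive bounds).
    in_nir = (wl >= nir_range[0]) & (wl <= nir_range[1])
    if not in_nir.any():
        raise ValueError("no bands inside the NIR window")
    # A spectrally neutral offset is estimated as the mean NIR value.
    offset = spectrum[in_nir].mean()
    return spectrum - offset
```

After the correction, the mean of the returned spectrum over 850-900 nm is zero by construction, so any residual structure in the NIR reflects spectral shape rather than a constant offset.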
The coupled ocean-atmosphere algorithm POLYMER v4.13 models the contribution to TOA reflectance as a polynomial and a forward bio-optical model is used for the water component [39]. The coastal aerosol model C2R-CC [40] uses coastal AERONET-OC measurements [41] and a parameterised version of the successive order of scattering technique to compute the atmospheric radiative transfer [42], which is implemented as a neural network (NN) regression. The latest NASA Ocean Colour Reprocessing (R2018.0) products for MODIS-Aqua and Suomi-VIIRS were used. Each of the AC processors applied to OLCI-A data uses a different system vicarious calibration (SVC). The OLCI pb 2.23-2.29 uses a climatological SVC [43], whereas OL_L2M.003.00, MODIS-Aqua and VIIRS implement SVC based on match-ups with MOBY. POLYMER uses an in situ based SVC designed for ocean-atmosphere coupled algorithms [44]. In order to exclude unreliable satellite measurements for each product, a set of recommended quality flags for each AC processor (Table 1) was applied as a mask to each pass.
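As an illustration of how a set of quality flags can be applied as a mask, the sketch below assumes that each pixel carries an integer bit field in which every flag occupies one bit. The flag names and bit positions in `FLAG_BITS` are hypothetical placeholders and do not correspond to the flags of any particular processor listed in Table 1.

```python
import numpy as np

# Hypothetical flag names and bit positions, for illustration only.
FLAG_BITS = {"CLOUD": 1 << 0, "LAND": 1 << 1, "HIGH_GLINT": 1 << 2}

def mask_flagged(rrs, flags, reject=("CLOUD", "LAND", "HIGH_GLINT")):
    """Return Rrs with NaN wherever any rejected flag bit is raised."""
    # Combine the rejected flags into a single bit mask.
    reject_mask = 0
    for name in reject:
        reject_mask |= FLAG_BITS[name]
    # A pixel is unreliable if it shares at least one bit with the mask.
    bad = (np.asarray(flags) & reject_mask) != 0
    return np.where(bad, np.nan, np.asarray(rrs, dtype=float))
```

Bits that are not listed in `reject` are ignored, so a product-specific flag list can be passed for each processor without changing the function.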

Match-Up Procedure and Statistics
The method used for match-up analysis follows [45], and was adapted for high-frequency data following [46]. Satellite overpasses were within ±1 h of the in situ Alg@line and AERONET-OC measurements. The in situ data (1 min bins) were matched to individual satellite pixels. From the 3 × 3 pixels, the centre pixel was used for the validation procedure. All in situ data within a specific pixel were averaged, so that each match-up has an independent set of in situ data and there were no overlapping in situ data between match-ups. The validation statistics were computed on the centre pixel, to ensure that each match-up uses an independent satellite pixel. Additionally, the standard deviation around the match-up (from the 3 × 3 box) was computed as an index of the homogeneity of the match-up. The number of granules for each sensor was 38 for OLCI-A, 40 for VIIRS and 7 for MODIS-Aqua. The minimum distance between match-ups in the satellite image is set by the resolution of each sensor, which is 300 m for OLCI-A, 1 km for MODIS-Aqua and 750 m for VIIRS. The satellite data were extracted from a 3 × 3-pixel box centred on the in situ observations and were excluded if the median coefficient of variation (CV) was >0.15 (from 412 to 555 nm) or when <50% of pixels were valid [45]. The CV criterion also removed data within 5 min of the satellite overpass, which contained the ship in one of the pixels. The following statistical metrics were used to evaluate algorithm performance following [47,48]: type-II regression slope (S), intercept (I), Pearson correlation coefficient (r), root-mean-square difference (RMSD-Ψ), the bias (δ), bias-corrected root-mean-square error (∆) and the relative percentage difference (RPD).

Table 1. Flags used to process each ocean colour R rs product. If any of the listed flags was raised for a product, the corresponding data were excluded.

Processor
Flags Implemented
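The match-up screening and the validation metrics described in the match-up procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of [45-48]: the function names are hypothetical, flagged pixels are assumed to be stored as NaN, the type-II regression is assumed to be a reduced major axis fit in linear space, and RPD is assumed to be the signed mean relative difference in percent.

```python
import numpy as np

def box_is_valid(box, max_cv=0.15, min_valid_fraction=0.5):
    """Screen a 3 x 3 extraction of shape (n_bands, 3, 3).

    The bands passed in are assumed to be those from 412 to 555 nm.
    """
    box = np.asarray(box, dtype=float)
    # A pixel counts as valid only if every band is finite.
    valid = np.isfinite(box).all(axis=0)
    if valid.mean() < min_valid_fraction:
        return False
    pixels = box[:, valid]  # shape (n_bands, n_valid_pixels)
    # Coefficient of variation per band, then the median across bands.
    cv = pixels.std(axis=1) / np.abs(pixels.mean(axis=1))
    return bool(np.median(cv) <= max_cv)

def matchup_stats(insitu, sat):
    """Compute S, I, r, RMSD, bias, bias-corrected RMSD and RPD."""
    x = np.asarray(insitu, dtype=float)
    y = np.asarray(sat, dtype=float)
    diff = y - x
    r = np.corrcoef(x, y)[0, 1]
    # Reduced major axis (one form of type-II regression): assumption.
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    intercept = y.mean() - slope * x.mean()
    bias = diff.mean()
    rmsd = np.sqrt(np.mean(diff ** 2))
    # Removing the bias leaves the scatter about the mean difference.
    unbiased_rmsd = np.sqrt(np.mean((diff - bias) ** 2))
    rpd = 100.0 * np.mean(diff / x)
    return {"slope": slope, "intercept": intercept, "r": r, "rmsd": rmsd,
            "bias": bias, "unbiased_rmsd": unbiased_rmsd, "rpd": rpd}
```

With these definitions the squared RMSD equals the sum of the squared bias and the squared bias-corrected RMSD, which is a convenient consistency check when tabulating Ψ, δ and ∆ together.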

Results
The hyperspectral R rs from the shipborne measurements covered a wide range of signal amplitudes, varying from <0.002 sr −1 at 560 nm during April, to >0.01 sr −1 in June and August (Figure 2). The change in R rs spectral shape reflected the seasonal variability in these highly absorbing, weakly scattering waters. In April and May, the low signal and flattening of the R rs spectra is a consequence of the Chl a and a CDOM concentrations during spring. The more prominent R rs peak at 560 nm during June to August corresponds to the period of summer phytoplankton blooms when the largest number of R rs spectra were available. The radiometric assessment of satellite products is therefore weighted towards late spring and summer, with the shipborne observations capturing a wider variety of phytoplankton blooms and degrees of vertical mixing (thermal stratification is common in summer). The match-up procedure yielded N = 208 in situ R rs coincident with OLCI-A pb 2, OL_L2M.003.00, C2R-CC and POLYMER (N = 199 for Alg@line, N = 9 for AERONET-OC), 475 with Suomi-VIIRS (N = 429 for Alg@line, N = 46 for AERONET-OC) and 177 (N = 122 for Alg@line, N = 45 for AERONET-OC) with MODIS-Aqua (Table 2, Figures 3 and 4). The AERONET-OC data consistently had lower R rs (λ) values than the Alg@line data (Figure 3), reflecting differences in biogeochemical water constituents between the coastal AERONET-OC sites and the wider range of environmental conditions encountered along the deeper water Alg@line ferry transects (Figure 1). OLCI-A pb 2.23-2.29, OL_L2M.003.00 and Suomi-VIIRS tended to underestimate R rs at all visible bands. The spectral shape of R rs for these processors was similar to that of the in situ R rs in the green and red, with some spectral differences in the blue with either an uncharacteristic peak at 412 nm or negative values (Figure 3).

Remote Sens. 2022, 14, 89
For OLCI-A pb 2.23-2.29 at 412 and 443 nm, there was a consistent underestimate at low values <0.001 sr −1 corresponding to both in situ AERONET-OC and Alg@line R rs and an overestimate at values >0.0015 sr −1 corresponding to the in situ Alg@line data only (Figure 4). At 560 nm with an increase in the R rs signal, pb 2.23-2.29 performed better, and had zero δ, but high Ψ and ∆, indicative of the high scatter that can be seen in Figure 4. At 665 and 709 nm, Ψ and ∆ were lower but δ increased, caused by the tendency to underestimate R rs at these bands (Figure 4), which is reflected in the negative I and δ (Table 2). Generally the scatter around the 1:1 for OL_L2M.003.00 was similar to pb 2.23-2.29, but the offset from the 1:1 was greater (Figure 4), especially at 412 and 443 nm, which resulted in the largest S, I, Ψ and ∆ and smallest r of all the AC processors. At 412 nm and 443 nm, the RPD for OL_L2M.003.00 was ~92 and 60%, respectively, which is the highest of all of the ACs evaluated. Similarly, for OL_L2M.003.00 at R rs (674) and R rs (709), the S was the lowest of all the AC processors (Table 2).
The closest match to in situ R rs from both AERONET-OC and Alg@line was OLCI-A POLYMER (Figure 3), which at 412, 442, 560, 665 and 709 nm was within 30% (Table 3). POLYMER also had the highest r, which is indicative of linear consistency between the in situ and satellite data (Figure 5). At R rs (412), OLCI-A POLYMER consistently outperformed the other ACs having the lowest Ψ, δ and ∆, which indicate low scatter around the 1:1, few outliers and no systematic bias, respectively (Table 3). OLCI-A POLYMER also had the lowest δ at R rs (490) and R rs (665), which is reflected in the tight fit and proximity of the points to the 1:1 (Figure 5). For all bands, however, there was an underestimate at high R rs values, which resulted in a low S (Table 3, Figure 5). POLYMER also exhibited an uncharacteristic peak in the red towards 700 nm (Figure 3). C2R-CC tended to overestimate R rs at all bands (Figure 5), and exhibited artefacts in the blue, often with another peak at 490 nm, which does not correspond to either of the in situ datasets; it also had a high offset from zero across the spectrum (Figure 3). This gave rise to the high S, I and Ψ and low r at blue and blue-green bands, resulting in RPD of between 66 and 80% (Table 3). The performance of C2R-CC improved at 560, 665 and 709 nm, where the S and r were closer to 1, I, Ψ, δ and ∆ were all lower and the RPD was between −1 and 37% (Table 3). At 560, 665 and 709 nm, even though the Ψ and ∆ were higher or similar for C2R-CC compared to POLYMER, the S was closer to 1 and I was lower (Table 3).

Table 2. Statistical results from the comparison between in situ and OLCI-A, pb 2.23-2.29 and OL_L2M.003.00 R rs (λ). The metrics were computed using type-II regression for the slope (S), intercept (I), Pearson correlation coefficient (r), root-mean-square difference (RMSD-Ψ), the bias (δ), bias-corrected root-mean-square error (∆) and the relative percentage difference (RPD). Metrics for the processors with the best performance at each band are given in bold. N is the number of match-ups, with TOT being the total, GDL is the AERONET-OC site the Gustav Dalen Lighthouse, HLT is the AERONET-OC site the Helsinki lighthouse and Ferry are the Alg@line data.

Suomi-VIIRS returned the highest number of match-ups with R rs (λ) values covering a higher range than the other ACs (Figure 6), but exhibited a consistent underestimate as indicated by the comparatively high Ψ, δ and ∆ (Table 4), especially at higher R rs values. The RPD for VIIRS varied from 22% at 560 nm to 38% at 671 nm and 67% at 412 nm (Table 4). For MODIS-Aqua the spectra reproduced the shape of the in situ AERONET-OC data well, but not for the Alg@line data (Figure 6). MODIS-Aqua both overestimated and underestimated R rs (412) and R rs (443), as conveyed by the high S, I and RPD (Table 4). MODIS-Aqua performed better at 488, 560 and 667 nm with comparatively low I, Ψ, δ and ∆ and S close to 1, especially at 560 and 667 nm (Table 4).

Composite ocean colour satellite images from OLCI-A using the four ACs, MODIS-Aqua and VIIRS for the Baltic Sea at 560 nm were processed for the period from 11 to 17 June 2016 (Figure 7). OLCI-A C2R-CC returned the highest R rs (560) followed by POLYMER and OL_L2M.003.00, especially in the southern Baltic Sea. C2R-CC also provided the greatest pixel coverage over the whole area during this period, whereas pb 2.23-2.29 and OL_L2M.003.00 had the lowest coverage (Figure 7), presumably due to differences in the cloud mask flags. The MODIS-Aqua image had the lowest R rs (560). For each processor, data were extracted for R rs (443), R rs (560) and R rs (665) at every 20 km from north-south and east-west transects (shown as red lines on the pb 2.23-2.29 image in Figure 7), to compare the different ACs over large spatial areas of the Baltic Sea. The pattern was the same for each waveband and transect: OLCI-A POLYMER and OL_L2M.003.00 were closest to the in situ R rs ferry data, whereas OL_L2M.003.00, MODIS-Aqua and VIIRS were closest to in situ AERONET-OC R rs . C2R-CC consistently returned the highest values, especially for R rs (443) in the southern part of the transect and for R rs (665) in the northern part of the transect. OLCI-A pb 2 and OL_L2M.003.00 had the lowest R rs along both transects and especially at R rs (443) for OLCI-A pb 2. MODIS-Aqua and VIIRS were similar and generally lower than OLCI-A POLYMER. The exception to this was for POLYMER R rs (665), which had a cluster of points at the northern-most part of the north-south transect that were lower than the in situ R rs (665).
Table 3. Statistical results from the comparison of in situ and OLCI-A C2R-CC vSnap8 and POLYMER v4.13 R rs (λ). The metrics were computed using type-II regression for the slope (S), intercept (I), Pearson correlation coefficient (r), root-mean-square difference (RMSD-Ψ), the bias (δ), bias-corrected root-mean-square error (∆) and the relative percentage difference (RPD). Metrics for the processors with the best performance at each band are given in bold. N is the number of match-ups, with TOT being the total, GDLT is the AERONET-OC site the Gustav Dalen Lighthouse, HLT is the AERONET-OC site the Helsinki lighthouse and Ferry are the Alg@line data.
Table 4. Statistical results from the comparison of in situ and Suomi-VIIRS and MODIS-Aqua R rs (λ). The metrics were computed using type-II regression for the slope (S), intercept (I), Pearson correlation coefficient (r), root-mean-square difference (RMSD-Ψ), the bias (δ), bias-corrected root-mean-square error (∆) and the relative percentage difference (RPD). Metrics for the processors with the best performance at each band are given in bold. N is the number of match-ups, with TOT being the total, GDLT is the AERONET-OC site the Gustav Dalen Lighthouse, HLT is the AERONET-OC site the Helsinki lighthouse and Ferry are the Alg@line data.

Discussion
From the dawn of SeaWiFS through the maturing ocean colour age of MODIS-Aqua and MERIS, these satellite sensors have provided accurate Chl a retrieval in open-ocean, shelf-seas and many coastal environments for the past two decades [49,50]. Regions with high CDOM however, pose a particular challenge due to low signal amplitude with overlapping absorption signatures of CDOM and Chl a [51]. In the Baltic Sea, previous studies showed that SeaWiFS and MODIS-Aqua underestimated L W N in the blue and red and that the uncertainties and bias were high [52], potentially resulting in an overestimate in Chl a. This is due to the low L W N signal under the influence of high CDOM and an overestimate in the aerosol optical depth. AC models need to capture and reproduce this large variation in both atmospheric and oceanic conditions. The atmospheric masses over the region are influenced by both land and marine aerosols, which are highly variable. In the central part of the Baltic Sea, the average aerosol optical thickness is 1.3 [53]. The burning of agricultural straw in northern Europe and Russia during April is thought to increase the aerosol optical thickness [54]. The surface water inherent optical properties of the Baltic Sea are dominated by CDOM with the secondary, seasonal spring-summer influence of phytoplankton [54]. The variability in the surface water conditions is reflected in the spectral shape of the in situ R rs (λ), with the high CDOM causing a flattening of the spectra at 412 nm (Figure 2), low values at 443 nm, a high peak at 560 nm indicative of the spring bloom and a smaller peak at 709 nm due to a higher backscatter from small particles, such as cyanobacteria or TSM (Figure 2). Spring bloom Chl a in the Baltic Sea can be as high as 10 to 120 mg m −3 whereas during summer the range in Chl a is typically 1 to 3 mg m −3 , but can increase to between 5 and 30 mg m −3 in July and August when cyanobacteria bloom [24]. 
Some estuarine locations around the Baltic Sea can be influenced by TSM [54]. The variability in CDOM and TSM at river mouths is expected to decrease R rs (λ) at blue and blue-green bands as TSM and the ratio of backscatter to absorption increase, which would produce a higher slope in the R rs (λ) spectra from the blue to the green. An increase in TSM loads and therefore backscatter would also be observed as an increase in the offset in R rs (λ) in the NIR (Figure 2). For the match-ups in this study, the ferry tracks mostly traverse the deeper waters of the Baltic Sea and the AERONET-OC sites are located away from major rivers (Figure 1). The AERONET-OC sites are located in the northern Baltic Proper and the Gulf of Finland where a CDOM (412) can be between 0.8 and 1.6 m −1 [55]. The AERONET-OC R rs (λ) provided 4% of the total match-ups with OLCI-A, 10% with VIIRS and 31% with MODIS-Aqua (Tables 2-4). The lower R rs (λ) in the blue and green at the AERONET-OC stations, especially at the HLT (Figure 3a,b), mirrors the high absorption by CDOM and low scattering at these sites [12,56]. The riverine input of CDOM is particularly high in the eastern Gulf of Finland [11,12,57], and follows a dilution gradient towards the northern Baltic Proper, with the HLT located approximately halfway. The highest a CDOM (412) in the Baltic Sea can be found further east in the Gulf of Finland, towards St Petersburg in Neva Bay (not shown in Figure 1), where values are generally >2.5 m −1 and can reach 15 m −1 [11,12].
The objective of deploying autonomous measurement systems on the ferries was to provide more R rs (λ) data at other sites, to be able to capture the variability in both atmospheric and water conditions from Bothnian Bay in the north to the Bornholm Basin in the south (Figure 1). The ferry data provided the largest number of match-ups with the ocean colour sensors that were assessed (Tables 2-4). Most of the match-ups were between 54 and 56° N in the Arkona and Bornholm basins (Figure 1), where a CDOM (412) is typically between 0.4 and 1.1 m −1 . In the deeper water, observed during the day-lit part of the Alg@line stations, the R rs (λ) signal in the green was far higher (Figure 3b), although the number of observations collected here was relatively small. Towards the southern Baltic Sea a CDOM (412) is lower (<0.6 m −1 ) while TSM and the associated scattering increase, causing a pronounced increase in R rs (560) [54]. There are large areas in the central part of the Baltic Sea where a CDOM (412) remains fairly homogeneous [56]. In the north in the Gulf of Bothnia, a CDOM (412) can be >1.8 m −1 [54,56], which corresponds with the lowest R rs (λ) in the Alg@line data (Figure 3b). Towards the southern Baltic Sea a CDOM (412) is lower (<0.6 m −1 ) while Chl a and the associated scattering increase, which causes a pronounced increase in R rs (560) [54] as seen in Figure 3b.
The precursor ocean colour satellite sensor to OLCI was MERIS, which had similar bands and characteristics. At the two AERONET-OC sites in the Baltic Sea, Zibordi et al. [27] reported differences of +15 to +42% for MERIS over the spectral range 443 to 555 nm compared to SeaPRISM L W N (N = 41). Using an updated version of the MERIS AC MEGS L W N product, Zibordi et al. [52] subsequently reported (N = 12 to 39) that the accuracy at 490, 560 and 665 nm was improved (ψ < 24%). Some studies reported that C2R-CC improved the performance of MERIS R rs [19,23,24]. For OLCI-A pb 2 in the Baltic Sea at the AERONET-OC sites (N = 42), Zibordi et al. [7] reported an underestimation in L W N at blue spectral bands due to an overestimate in the aerosol optical depth at 865 nm. In this study, we also observed underestimation by OLCI-A pb 2 in the blue at low R rs (λ) and an overestimate at the higher range of values (Figure 4). The updated OL_L2M.003.00 has been found to be more accurate than OLCI-A pb 2 in the oligotrophic waters of the Atlantic Ocean [3]. In the Baltic Sea, the OL_L2M.003.00 product underestimated R rs (λ) in the blue especially at low values, and also overestimated R rs (λ) in the green, red and NIR at the higher range of values (Figure 4), which resulted in a low linear regression S and an increase in the RPD (Table 2). OL_L2M.003.00 was therefore less accurate than OLCI-A pb 2 for the Baltic Sea. The OL_L2M.003.00 processing includes four changes: new system vicarious calibration gains based on the standard OC methodology [57]; an update to the bright pixel correction, removing any residual water reflectance in the NIR; a spectrally resolved white cap correction, which should improve the quality of the product at wind speeds between 6.3 and 12 m s −1 ; and an update to the cloud flags to remove pixels potentially contaminated by clouds (https://www.eumetsat.int/media/47794 (accessed on 24 October 2021)).
The poor performance of OL_L2M.003.00 for the Baltic Sea region is possibly due to three factors: difficulties with the AC in correctly reproducing the signal from the atmosphere and the highly absorbing water when R rs (λ) (particularly in the blue) is so low; the aerosol model library not being optimal for these waters; and the bright pixel scheme not converging properly, which caused the underestimate in the NIR (Figure 4). Assessing other AC processors for OLCI-A, we found that POLYMER is the most accurate with differences of 22, 17 and 28% at 443, 560 and 665 nm, respectively (N = 208). Similarly, off the south-east Canadian coast, where a CDOM (442) can reach 4 m −1 , OLCI-A POLYMER has also been found to be the most accurate AC [4]. Alikas et al. [25] also found that POLYMER is the most accurate AC for the Baltic Sea and Estonian lakes, but the difference was 57% at blue wavebands, skewed by the data from the lakes. The superior performance of the version of POLYMER developed for OLCI in the Baltic Sea is probably due to the following: (i) the polynomial atmospheric model reproduces well the scattering and absorption by the atmosphere under both sun glint and thin cloud conditions [39,58]; (ii) the quality of the atmospheric signal at high latitudes has been improved [44]; (iii) since the version of POLYMER developed for MERIS, there have also been updates to the water reflectance model for the derivation of both Chl a and the backscattering signal in both case 1 and 2 waters, which also include bidirectional effects [44]. For the Baltic Sea, this may be somewhat surprising since the OLCI version of POLYMER has been calibrated more for Chl a and TSM dominated waters and is expected to perform less well in areas dominated by CDOM [4,44]. The dispersion of the points at higher R rs (λ) values in the blue and the resulting low S show that improvements in OLCI-A POLYMER are still required for the Baltic Sea region.
Overall the results for OLCI-A POLYMER are encouraging and imply that this AC could also be accurate in regions with similar IOPs and range in a CDOM (λ) such as the Amazon River plume (a CDOM

C2R-CC exhibited the worst performance for the Baltic Sea at blue and green bands, with RPD >65% at 412, 443 and 490 nm and >35% at 560 nm. Accounting for uncertainties in the in situ R rs in the Baltic Sea, Alikas et al. [25] also found that the OLCI-A C2R-CC processor exhibits a large difference (~107%) compared to in situ R rs (N = 15). For OLCI-A, two studies in the Baltic found that the previous version of C2R-CC performed well in retrieving the R rs spectral shape (N = 29) [26,62]. In this study, the performance of C2R-CC improved from the green to red to NIR, with an S closer to the 1:1, though the scatter was still relatively high (Figure 4). This may suggest that C2R-CC could produce accurate Chl a values for the region when using red:NIR R rs (λ) band ratio algorithms. The calibration of the C2R-CC requires sufficient data to account for the effects of different aerosol types, cirrus clouds, sun and sky radiance, and the coupling between them and the air molecules [62]. Improvements to the C2R-CC have been made to both the atmospheric and in-water NNs, and for the water component, this includes a more extensive training range, which for a CDOM (442) is now from 0.001 to 22 m −1 (https://www.eumetsat.int/media/47794 (accessed on 24 October 2021)). There are two possible reasons why the C2R-CC does not perform well in blue bands for the Baltic Sea. The first is probably non-convergent or non-optimal solutions caused by the in-water NN, as has been observed in other studies using different ocean colour sensors [30,63]. For MERIS, it was found that adding further training data could lead to 'overtraining' by offering multiple solutions to retrieve R rs or IOPs, which may not necessarily result in accurate R rs [63].
The second reason is that the calibration data used in the atmospheric NN still do not cover the variability in atmospheric conditions that occur over the Baltic Sea.

Previous studies on MODIS-Aqua L W N at Baltic Sea AERONET-OC sites reported large uncertainties (~60%) and biases (~20%) at blue bands and an overestimate in τa at 869 nm of ~95%, indicative of errors in the AC aerosol model [18]. Extension of the aerosol model for MODIS-Aqua to include α values to 1.7 has been recommended [18], especially to capture the range in mixed continental-industrial type aerosols in summer [64]. The AC developed for MODIS-Aqua and VIIRS [65,66] applies a NIR correction that accounts for particle backscattering based on a variable slope plus an estimate for absorption at red and NIR bands [67]. There is often a negative bias across all wavelengths, which for MODIS-Aqua is most pronounced in the blue [50]. In highly absorbing waters, MODIS-Aqua can exhibit very high relative uncertainty at blue bands (up to 60%) where strong absorption makes the signal low, and in the red (up to 40%) where the signal is also low due to strong water absorption [50]. In the green, the uncertainty for MODIS-Aqua is ~20% [50]. For the Baltic Sea specifically, Goyens et al. [68] reported that the relative error for MODIS-Aqua at 412 nm was 55%, at 547 nm was 15% and at 667 nm was 25%. Similarly, our data indicated a small negative bias across all wavelengths except 412 nm and that the RPD was even higher in the blue (~84% at 412 nm) and within the range reported by Moore et al. [50] for green (~16% at 560 nm) and red bands (~33% at 667 nm). The cause of the negative bias is attributed to L w being non-zero in the NIR, which results in an overestimation of the aerosol optical thickness and an underestimation in L w [69]; when the aerosol contribution is extrapolated from the NIR across the bands, the error is greater in the blue [50].
Above coastal waters, light-absorbing aerosols can also contribute to the negative bias in L w . In the Baltic Sea, the negative bias in MODIS-Aqua R rs is due to a systematic underestimation in the Ångström coefficient and an overestimation of the optical thickness [68]. An overestimate in MODIS-Aqua aerosol optical thickness of 101% at the GDL and 91% at the HLT has been reported [69]. Due to these errors in the standard MODIS-Aqua AC, a multilayer neural network (MLNN) AC method for MODIS-Aqua was subsequently developed and compared with the SeaDAS NIR and NIR/SWIR algorithms and a C2R-CC version for MODIS-Aqua [70]. For the Baltic Sea, the MLNN algorithm reduced the L W N APD by more than 60% for blue bands compared to the standard SeaDAS AC. In our study, using the latest reprocessing (R2018) for MODIS-Aqua (N = 177 match-ups) and VIIRS (N = 475), the RPD at 443 nm was 34% for both sensors and was 17% for MODIS-Aqua and 22% for VIIRS at 560 nm, but at 667 nm the RPD increased to 33 and 38%, respectively.
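The match-up statistics quoted above can be illustrated with a short sketch. It assumes commonly used forms of the metrics (bias as the mean satellite minus in situ difference, RPD as the mean signed relative difference and APD as the mean absolute relative difference, both in percent); the exact definitions applied in this study are those given in its methods. The function name and the example values are hypothetical.

```python
import numpy as np


def matchup_stats(sat, insitu):
    """Summary statistics for paired satellite and in situ R rs match-ups.

    Assumed definitions (common conventions, not confirmed by the source):
      bias = mean(sat - insitu)
      RPD  = 100 * mean((sat - insitu) / insitu)
      APD  = 100 * mean(|sat - insitu| / insitu)
    """
    sat = np.asarray(sat, dtype=float)
    insitu = np.asarray(insitu, dtype=float)
    if sat.shape != insitu.shape:
        raise ValueError("sat and insitu must have the same shape")

    # Keep only pairs where both values are finite and the reference is positive.
    valid = np.isfinite(sat) & np.isfinite(insitu) & (insitu > 0)
    sat, insitu = sat[valid], insitu[valid]
    if sat.size == 0:
        raise ValueError("no valid match-ups")

    diff = sat - insitu
    rel = diff / insitu
    return {
        "n": int(sat.size),
        "bias": float(diff.mean()),
        "rpd_percent": float(100.0 * rel.mean()),
        "apd_percent": float(100.0 * np.abs(rel).mean()),
    }


# Hypothetical R rs values (sr^-1) at a single band, for illustration only.
stats = matchup_stats([0.0018, 0.0030, 0.0041], [0.0020, 0.0040, 0.0050])
print(stats)
```

A negative RPD with an APD of similar magnitude indicates a systematic underestimate rather than random scatter.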
As discussed above, the AERONET-OC and Alg@line data are located in different environments; both are predominantly influenced by CDOM, which is generally higher at the AERONET-OC sites, whereas the Alg@line tracks pick up a stronger signal from spring-summer phytoplankton blooms. Some trends in the validation plots for the different ACs may also partly arise from differences between the in situ AERONET-OC and Alg@line R rs due to the nature of the sites and the quality and processing of the data. The uncertainty of the TriOS-RAMSES system is greater in the blue (>6%) than in the green (3.5%) and red (4.5%), compared with CIMEL-SeaPRISM, which has uncertainties of 4.5, 4 and 10%, respectively [71]. The TriOS-RAMSES radiometers deployed on Alg@line have been inter-compared with CIMEL-SeaPRISM at the stable AERONET-OC platform of the Acqua Alta Oceanographic Tower. The differences between TriOS-RAMSES and CIMEL-SeaPRISM were 8% at 443 nm, 6% at 555 nm and 10% at 667 nm [72]. Future studies should evaluate differences between the sensors using a common processor and, for the TriOS-RAMSES, the sun-tracker stepper motor used in this study. There is one caveat: the methodology for calculating E d and the subsequent optimisation of R rs are not directly comparable between AERONET-OC and the automated TriOS-RAMSES (plus the fingerprint method), although both have been used in combination in previous studies without showing major biases [30,34].
Ocean colour is an Essential Climate Variable [73,74], the study of which requires long time-series data to assess climate-induced changes in phytoplankton. To this end, there have been a number of initiatives to merge multi-mission datasets [75,76]. The starting point for this is the best AC performance for a single sensor, followed by the application of the R rs product to multi-mission data, computation of Chl a and analysis of the reproducibility of patterns in the Chl a time series for a single sensor in the multi-mission data [77]. For this, the AC model needs to be as accurate as possible and have the largest number of data points in both space and time. In this study, we found that the consistency between different OLCI-A, MODIS-Aqua and VIIRS AC products was poor. We performed a comprehensive analysis of AC processors for OLCI-A, which showed that POLYMER is the most accurate over all bands and provides the largest number of complete and consistent data points for Baltic Sea images. The performances of Suomi-VIIRS in the blue, and of OLCI-A C2R-CC and MODIS-Aqua in the red and NIR, were also good. For the Baltic Sea, there is growing consensus that POLYMER for OLCI-A [25] is the most accurate AC. Future studies should evaluate the performance of MODIS-Aqua and VIIRS with POLYMER and evaluate whether it is the most accurate AC for generating ocean colour time series from multiple-satellite missions. The POLYMER AC may improve the consistency between OLCI-A, MODIS-Aqua and VIIRS.

Conclusions
The performance of four AC processors (pb 2, OL_L2M.003.00, C2R-CC and POLYMER) for OLCI-A and standard processors for MODIS-Aqua and VIIRS was assessed in the Baltic Sea using in situ R rs from AERONET-OC and ferries. OLCI-A with POLYMER performed well at 412, 442 and 560 nm with ψ <30% and a δ of between −0.0001 and −0.0004, but all bands exhibited an underestimate as R rs values increased. The other OLCI-A AC processors showed relatively poor performance in the blue (412 and 443 nm), red and NIR wavebands, but better performance at 560 nm where the signal was highest. Of the OLCI-A processors, C2R-CC exhibited the worst performance in the blue, and generally, ψ was >30% for all wavebands but showed better performance at 665 and 709 nm. VIIRS underestimated R rs across all bands; the underestimate was notably large at 412 nm, where the differences with in situ R rs were >65%, and also at higher R rs values in green and red bands. MODIS-Aqua was more accurate in the blue-green to red bands than in the blue, especially R rs (412), which had a difference of ~85% compared to in situ R rs. Of the OLCI AC processors tested, the results suggest that OLCI POLYMER will generate the most accurate water quality parameters for biogeochemical monitoring, though improvements in this AC are still required for the Baltic Sea.
Funding: GT and SPA were supported by S3-EUROHAB (Sentinel-3 products for detecting EUtROphication and Harmful Algal Bloom events) from the European Regional Development Fund through the INTERREG France-Channel-England, as part of the assessment of transferability of the project outputs. PQ was funded by a scholarship from the Chinese Scholarship Program. SS and NS received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 776480 (MONOCLE).