Phytoplankton Bloom Dynamics in the Baltic Sea Using a Consistently Reprocessed Time Series of Multi-Sensor Reﬂectance and Novel Chlorophyll-a Retrievals

: A relevant indicator for the eutrophication status in the Baltic Sea is the Chlorophyll-a concentration ( Chl-a ). Alas, ocean color remote sensing applications to estimate Chl-a in this brackish basin, characterized by large gradients in salinity and dissolved organic matter, are hampered by its optical complexity and atmospheric correction limits. This study presents Chl-a retrieval improvements for a fully reprocessed multi-sensor time series of remote-sensing reﬂectances ( R rs ) at ~1 km spatial resolution for the Baltic Sea. A new ensemble scheme based on multilayer perceptron neural net (MLP) bio-optical algorithms has been implemented to this end. The study documents that this approach outperforms band-ratio algorithms when compared to in situ datasets, reducing the gross overestimates of Chl-a observed in the literature for this basin. The R rs and Chl-a time series were then exploited for eutrophication monitoring, providing a quantitative description of spring and summer phytoplankton blooms in the Baltic Sea over 1998–2019. The analysis of the phytoplankton dynamics enabled the identiﬁcation of the latitudinal variations in the spring bloom phenology across the basin, the early blooming in spring in the last two decades, and the description of the spatiotemporal coverage of summer cyanobacterial blooms in the central and southern Baltic Sea. writing— original draft preparation, V.E.B., D.D., and M.S.; writing—review and editing, J.A., M.B., S.C., A.D.C., and T.K.; visualization, V.E.B. and M.S.; supervision, V.E.B. and D.D.; project administration, V.E.B.; funding acquisition, V.E.B. and D.D. All authors


Introduction
The status of eutrophication in the Baltic Sea has been a constant concern [1][2][3][4][5]. The Baltic Sea Action Plan [3] by the Baltic Marine Environment Protection Commission (Helsinki Commission, HELCOM) presents actions to improve the sea condition. One of the indicators relevant to the eutrophication assessment is the Chlorophyll-a concentration (Chl-a), based on a combination of data from station sampling, automated ship measurements, and remote sensing [4,6,7]. As an example of Baltic Sea countries, Finland already utilizes remote sensing Chl-a in its national monitoring program, along with some other water quality parameters [8]. Although Earth Observation programs have significantly improved the retrieval of ocean color (OC) data products on a global scale [9][10][11][12][13], the quality of regional results, and more specifically OC product maps in optically complex waters such as the Baltic Sea, still challenge remote sensing capabilities [13][14][15]. The case study addressed in this work is the improved Chl-a determination based on regional bio-optical algorithms in view of supporting the environmental assessment of the basin. The Chl-a retrieval in the Baltic Sea is hampered by two complementary difficulties [24][25][26][27]. The first is the optical complexity of the basin, where the colored dissolved organic matter (CDOM) dominates the absorption budget in the blue spectral range with a high variability over short spatial scales, leading to low reflectance especially in the blue part of the spectrum [20,[27][28][29][30][31]. The second is the reduced performance of the atmospheric correction, often producing erroneously low or even negative R rs values, consequently limiting the capability to retrieve accurate R rs spectral images [24,27,[32][33][34]. As a result, standard algorithms based on a blue-green band-ratio polynomial regression developed for SeaWIFS (Sea-viewing Wide-Field-of-view Sensor) and MODIS (Moderate Resolution Imaging Spectroradiometer) tend to largely overestimate Chl-a in the basin [25][26][27][35][36][37].
Based on a series of radiative transfer simulations aimed at representing the biooptical characteristics of the Baltic Sea, including the seasonal variability in phytoplankton dominance [20], Ligi et al. [26] tested several global and regional Chl-a algorithms. Their results showed that the band-ratio algorithms that use R rs values in the blue-green spectral regions were inaccurate due to the water optical complexity, confirming the previous findings by several authors (e.g., [24,25]). However, these studies also showed that bandratio algorithms in the red and NIR spectral regions, mostly using MERIS (MEdium spectral Resolution Imaging Spectrometer) spectral bands, yielded high correlation on simulated data, but their performance produced at best a r 2 of 0.4-0.6 when tested on in situ data (e.g., [23][24][25]) for Chl-a in the range between ∼1 and 10 µgL −1 (i.e., at the lower part of the retrievable concentrations by these algorithms [37]). Neural Network algorithms relying on SeaWIFs, MERIS and OLCI (Ocean and Land Colour Instrument) band-sets were also tested but reported limited retrieval accuracy [24,[38][39][40][41][42].
The need to enhance the accuracy of Chl-a maps in the Baltic Sea is recognized by international space programs, and specifically prioritized by the data products release undertaken by the Copernicus Marine Environment Monitoring Service (CMEMS) [43,44]. Two operational Chl-a data-streams are available in CMEMS for the Baltic Sea: a near real-time based on OLCI, and a fully reprocessed multi-sensor time series produced with consolidated and consistent input datasets [15].
Recent Chl-a retrieval improvements for the Baltic Sea multi-sensor time series are presented in this study. Based on the multilayer perceptron neural net (MLP) developed with the in situ data acquired in the study area [29,45], a new bio-optical ensemble scheme, named ENS-MLP, is conceived in the present work. Performance benchmarks with standard and regional band-ratio algorithms for the Baltic Sea are presented [11,36,46,47]. A core feature of the study is to use independent in situ measurements collected in the Baltic Sea for the analysis and assessment of results. Finally, the fully consistently reprocessed R rs and novel Chl-a time series  are analyzed to provide a quantitative description of the phytoplankton bloom dynamics for spring and summer seasons at basin and sub-basin scales.
The remainder of this document is organized as follows. Data and methods are presented in Section 2 by describing the space-borne R rs dataset, the MLP and ENS-MLP regional algorithms, the reference field measurements, and the indicators for spring and summer phytoplankton blooms in the basin. Section 3 accounts for (1) assessment of R rs and Chl-a estimates in reference to in situ measurements, and (2) occurrence of spring and summer blooms in the Baltic Sea over 1998-2019. Study remarks and conclusions are finally addressed in Section 4.

Space-Borne R rs
Space-borne radiometric observations include R rs at 412, 443, 490, 510, 555, and 670 nm collected over the Baltic Sea for more than twenty years (1997 to 2019). A consistent data processing scheme has been adopted to produce and release this time series. Namely, R rs spectra acquired by MERIS/ENVISAT, MODIS/AQUA, SeaWiFS/SEASTAR, and VIIRS/SUOMI-NPP space sensors were merged into a unique CMEMS dataset with 1 km resolution utilizing the processor Ver. 4.2 of the Ocean Color Climate Change Initiative (OC-CCI) [12].
Space-borne R rs values have been derived by applying atmospheric corrections specific to the processing of data from different sensors. The NASA standard atmospheric correction [48] was applied to MODIS, SeaWiFS, and VIIRS data (R2018.0 reprocessing) [49][50][51]. The POLYMER atmospheric correction [52] was instead adopted for MERIS (R2012.0 reprocessing). To account for the different sensor spectral resolutions, a band-shifting procedure based on the inverse and direct application of the QAA algorithm [53][54][55] was utilized to recompute results at the SeaWiFS bands [56]. Finally, in the multi-sensor merging scheme, the band-shifted R rs spectra from the available sensors have been individually bias-corrected with respect to a reference sensor mission [12,15]. Notable is the resulting enhanced product map coverage, which can be twice that of a single-sensor image when three different space missions are combined [53]. Thisincreased number of valid pixels is of utmost relevance to lessen the constraint of severe cloud cover in the Baltic Sea.

The Ensemble of MLP Bio-Optical Algorithms
The regional MLP of reference for deriving Chl-a estimates in the Baltic Sea was developed within the BiOMaP program of JRC/EC [29,45]. The MLP allows using R rs at individual wavelengths as input to compute Chl-a values. In this respect, the MLP can outperform band-ratio regression algorithms, which account for spectral R rs slopes rather than R rs values. However, the MLP suffers from poor extrapolation capabilities and its validity range needs to be supported by a novelty detection scheme to verify the statistical consistency between training data and R rs input spectra from which the Chl-a is operationally computed [57,58].
The validity range of MLPs based on different sets of input bands can also vary depending on the spectral heterogeneity of uncertainties induced by the atmospheric correction process on R rs values. Former analyses [57] have shown that R rs (670) is required to improve the Chl-a retrieval in the Baltic Sea, and that the best performance is achieved utilizing all six SeaWiFS bands. However, the same study has also identified a substantial reduction of the MLP performance when the set of input R rs is affected by spectral perturbations statistically consistent with those introduced by the atmospheric correction.
To address this performance degradation, an ensemble solution, inspired by the socalled "mixture of expert" approach in statistical pattern recognition [59,60], is proposed in the present work. Rather than adopting only a single MLP based on all CCI spectral bands (or a fixed subset of wavelengths), the idea is to rely on several MLPs using different R rs band-sets as input. Specifically, the Chl-a ensemble estimate (Chl-a ENS ) is retrieved gathering individual MLPs that use different R rs spectral subsets by weighting their contribution through the corresponding novelty index ( Figure 2). The sets of R rs bands evaluated in this work as MLP input are: • The new ensemble scheme proposed in this study combines different MLP results through a dynamical determination of the input bands' relevance based on the novelty index. The mathematical expressions to implement the MLPs, to compute the novelty index, and to derive the ensemble Chl-a are reported next.

MLP Scheme
Based on previous analyses [57], the MLP architecture to retrieve Chl-a from input R rs has one hidden layer with ten neurons (see also Figure 2). The MLP algorithm development (a process referred to as MLP training) was based on 100 epochs using the Scaled Conjugate Gradient as a method for weights' optimization and applying a weight decay regularization coefficient equal to 0.05. The numerical MLP implementation relied on the Netlab neural net toolbox for MATLAB [61,62]. The MLP performance with respect to the BiOMaP data was defined with three-fold cross-validation [29,45,57]. All available data were then used to train the operational input band-specific MLPs used in this study to target the highest Chl-a retrieval accuracy. All parameters to compute the Chl-a MLP and the novelty index were determined with the BiOMaP measurements, building upon the approach detailed in [29,45,57] (see Supplementary Tables S1-S4). Computational steps including data pre-and post-processing as well as the definition of the MLP novelty index are summarized next. Data pre-processing: Input R rs values are firstly log-transformed, l = log10(R rs ), and then z-score scaled x = (l − µ l )/σ l , where µ l and σ l indicate respectively the mean and standard deviation of log-transformed R rs values used for the MLP training.
Data regression: The MLP scheme consists of successive layers of units, with each unit in one layer connected to each unit of the next layer. Let y = f (x) denote the MLP function, where the input vector x has entries x i for i = 1, . . . , n λ , where n λ indicates the number of input R rs wavelengths. The value of the hidden unit j, denoted a j , is a linear combination of the input quantities: where w (1) ji represents the weight linking the input unit i to the hidden unit j, and b (1) j is the bias adaptive parameter. The activation of the hidden unit j, identified with z j and representing the input for the next layer, is obtained as: with g indicating the hyperbolic tangent activation function. The output y is computed as: where M indicates the number of hidden units with weight w (2) j , and b (2) is the bias coefficient.
Data post-processing: Data post-processing converts the y value computed by the MLP into the final data product as Chl-a = 10 (y·σ c +µ c ) , where µ c and σ c are respectively the mean and the standard deviation of log-transformed Chl-a values used for the MLP training.

Novelty Index and MLP Applicability Range
The MLP applicability range has been identified by means of the novelty index η [57,58], defined as follows: where l is the logarithm of the input R rs values, A η and γ η array are respectively the eigenvectors and eigenvalues from the principal component analysis (PCA) of the logtransformed R rs data used for MLP training, ζ is the Euclidean norm of ζ, and n λ is the number of selected R rs wavelengths.

Ensemble Scheme
The ensemble scheme is described building on the equations reported above to derive Chl-a values with the MLP and determine the novelty index η ( Figure 2). Namely, the MLP configuration that uses the i-th bands set is denoted as MLP i , and corresponding Chl-a values and novelty index are indicated as Chl-a MLP i and η i , respectively. A weight w i = 1/η i is then assigned to Chl-a MLP i to compute the ensemble result: To combine Chl-a estimates obtained with different MLPs based on R rs at specific wavelengths' subsets, the weight definition is based on the assumption that the lower the novelty index η i , the more suitable the computed Chl-a MLP i value [57,58]. Among the possible MLP ensemble configurations, this study will focus on the performance of the ensemble solutions based on four MLPs (Chl-a ENS4 : Chl-a MLP_6b , Chl-a MLP_5b , Chl-a MLP_4b , Chl-a MLP_3b ) and on three MLPs (Chl-a ENS3 : Chl-a MLP_5b , Chl-a MLP_4b , Chl-a MLP_3b ).

The Band-Ratio Bio-Optical Algorithms
The band-ratio algorithms considered in this work to determine chlorophyll concentration from R rs values are variants of the maximum band-ratio (MBR) algorithm presented in [63]: (1) NASA's standard OC4v6 (Chl-a OC4v6 [47]), (2) a regional parametrization of OC4 based on field measurements (Chl-a OC4_Dar05 [46]), (3) a recalibration of the OC4v6 achieved with a bootstrapping-like approach with in situ data previously used in CMEMS (Chl-a OC4_Pit16 [36]), and (4) the latest version of NASA's global algorithm OC6v7 (Chl-a OC6v7 [11]).
The algorithm equation to determine the recalibrated Chl-a OC4_Pit16 value from the OC4v6 standard results is: with m = 0.5884 and n = 0.3751.

In Situ Automated Radiometry
Reference data for the assessment of space-borne R rs spectral values were derived from field measurements acquired by the Aerosol Robotic Network-Ocean Color (AERONET-OC) at three sites in the Baltic Sea  21.723 • E) were used for independent validation. Specifically, AERONET-OC Level-2 measurements of normalized water-leaving radiance, L wN , corrected for bi-directional effects and quality-controlled as part of the operational data processing [64][65][66], were divided by the extra-solar irradiance spectrum to calculate the in situ R rs [67]. These data were then band-shifted to the CCI spectral wavelengths by means of inverse and direct application of the QAA algorithm [54][55][56]. The QAA was modified to avoid retrievals of negative phytoplankton absorption at any band that could otherwise lead to malfunctioning of the band-shift procedure.
As the AERONET-OC radiometric systems automatically operate multiple times a day, in situ and satellite potential R rs match-up spectra were chosen as the median of records collected within two hours of local noon [32]. For the satellite data, median values were extracted from a 3 × 3 box centered on the AERONET-OC sites only when at least 6 valid pixels were available, and the coefficient of variation was smaller than 20%. This led to a match-up dataset of 674 space-borne R rs spectra with both multi-sensor CCI data and corresponding in situ measurements from 2005 to 2019.

In Situ Data for the Assessment of the Chl-a Retrieval
Two independent in situ datasets acquired to monitor the Baltic Sea water quality ( Figure 1) are utilized in the present work as a reference to assess the performance of the MLPs, their ensemble scheme, and the band-ratio regional algorithms for Chl-a retrieval.
The first set has been collected by the Finnish Environment Institute (SYKE) relying on the Alg@line acquisition system installed on ferries operating in the Helsinki-Travemünde, Helsinki-Stockholm, and Kemi-Travemünde transects [68,69]. Water samples were acquired with a flow-through setup (nominal inlet depth at 5 m), and then filtered with glass fiber filters (Whatman GF F). Chlorophyll-a was then extracted with ethanol to determine Chl-a fluorometric values [69]. The Alg@line data acquired from 1997 to 2017 were selected for this study.
The second set of records is the COMBINE database hosted by the International Council for the Exploration of the Sea (ICES), that includes Chl-a measurements collected by several institutions within the HELCOM environmental monitoring activities. Chl-a measurements were performed with techniques spanning from fluorimetry to spectrophotometry and HPLC, from the 1970s to date. All data and the different analytical protocols are subject to a well-established quality assurance policy regulated under the HELCOM COMBINE marine monitoring program [6,7]. Data acquired in the Skagerrak and Kattegat regions were not included in this analysis as the physical and optical water properties of these basins are deemed different from the actual Baltic Sea [20,26].
Space-borne Chl-a estimates for comparison with the reference field data were derived from the CCI dataset as the median of the 3 × 3 pixels box centered to the location of the in situ data point. Additional match-up criteria are: (1) a minimum of six valid pixels for the space-borne data, and (2) measurement execution for the field data between 9:00 and 15:00 (local time, i.e., 10-16 UTC) to account for the various acquisition times of the multi-sensor satellite dataset. The application of these co-location criteria resulted in 1735 records (658 and 1077 for Alg@line and COMBINE, respectively) well-distributed across the Baltic Sea ( Figure 1). Due to differences in sampling strategies and techniques used for the pigment quantification, the COMBINE measurements in coastal locations cover a wider dynamic range (average Chl-a value of 5.7 µgL −1 , 5th and 95th percentiles of 0.8 and 18.1, Table 1) than the data acquired along the Alg@line transects (average Chl-a value of 3.4 µgL −1 , 5th and 95th percentiles of 1.1 and 7.3).
The Chl-a variability in the in situ observations (Table 1) conforms with Chl-a ranges summarized for the Baltic Sea and its sub-basins in [27]. These co-located in situ Chl-a values, utilized as a reference for the assessment of the MLP and the band-ratio bio-optical algorithms surveyed in this study, are denoted as Chl-a REF .

Statistical Figures
Statistical figures to evaluate the quality of results are the determination coefficient (r 2 ), the absolute percent differences (APD), the relative percent differences (RPD), and the bias parameter between the expected x and the computed y values: where x and y are the mean of the expected and computed values, respectively. These parameters are applied for the assessment of Chl-a estimates, as well as R rs spectral values at each individual band. A good agreement between two datasets is achieved when r 2 is close to 1 and all the other parameters are close to 0.

Indicators of the Spring and Summer Phytoplankton Bloom Dynamics
The phytoplankton abundance and succession in the Baltic Sea are characterized by dinoflagellate-and diatom-dominated spring blooms and cyanobacterial subsurface and surface summer blooms [19,23,70]. Upon incorporating the processing improvements described in Sections 2.1 and 2.2, the consistently reprocessed time series from 1998 to 2019 of R rs and Chl-a are used to provide an up-to-date description of Baltic Sea eutrophication state. To this aim, the indicators of the spring and summer phytoplankton bloom dynamics adopted in the CMEMS Ocean State Report [71,72] were selected.
The spring bloom dynamics are described both in terms of spatiotemporal coverage of the subsurface blooms and with the statistics for the bloom onset proposed by Groetsch et al. [73]. The occurrence of spring bloom conditions (from February to early June, i.e., over days 31-160) is thus detected on a daily basis for each pixel in the basin using a spatially variable threshold set at 5% above the median Chl-a concentration of the spring observations [73,74]. The statistics for the bloom onset, i.e., the distribution of the start as well as the peak and the end days, were calculated for each pixel in the basin and summarized using the HELCOM sub-basins [75].
For the cyanobacterial summer blooms, the spatiotemporal coverage is aggregated from estimates of daily subsurface and surface bloom, similarly to the HELCOM environmental reporting [76,77]. Summer blooms are detected on daily R rs images from June to September (i.e., days 161 to 270) by applying the thresholds on R rs at the wavelength of 555 and 670 nm defined in [78] based on visual inspection of pseudo true-color composites of MODIS imagery. These thresholds were set at R rs (555) > 4.25 × 10 −3 sr −1 and R rs (670) > 1.22 × 10 −3 sr −1 for the subsurface and surface blooms, respectively. This R rs -based approach to detect cyanobacterial blooms allows overcoming the difficulty of retrieving accurate Chl-a values in summer bloom conditions, as highlighted in [79][80][81]. Moreover, as cyanobacterial blooms are extremely patchy and the surface scum is unlikely to totally cover a 1 km resolution pixel [79][80][81], it is possible for both thresholds to be exceeded concurrently.
Following Hansson and Håkansson [82], cumulative maps were computed from the daily subsurface and surface images for each year and then summarized as a spatiotemporal coverage (day·km 2 ) for both spring and summer.

Results
The comparisons of the space-borne R rs and the associated Chl-a estimates with respect to the in situ reference data are presented in Sections 3.1 and 3.2, respectively. Then, the application of the time series reprocessed with the regional algorithms for environmental reporting in the Baltic Sea is demonstrated in Section 3.3.

R rs Validation
The match-up scatterplots between the co-located multi-sensor satellite and in situ R rs values are presented in Figure 3. The achieved fitness is notable in such optically complex waters for the 490-670 nm spectral range (r 2 of 0.70-0.87, RPD of 1.3-18.0), while the 412 and 443 nm bands show higher dispersion (r 2 of 0.05 and 0.34, RPD of 86.9% and 14.1%). For the 412-555 nm spectral range, the multi-sensor satellite R rs are overall centered on the 1:1 line, as shown by the low bias in the range 1-9 10 −5 sr −1 , while a larger bias (i.e., 1.36 10 −4 sr −1 ) is observed for 670 nm. As a comparison, the global match-up summary statistics for the OC-CCI R rs are r 2 of 0.79-0.88 and bias between −7 × 10 −5 and −2 × 10 −4 sr −1 [12]. It should be noted that these global match-ups were based on disparate instrumentation and quality standards [12,83] featuring lower accuracy in comparison to the AERONET-OC data used as a reference in this study [84]. The satisfactory performance in the 490-670 nm spectral range and the lower accuracy reported for the blue spectral bands confirm the findings with other radiometric validation exercises carried out in the Baltic Sea [24,[32][33][34]42,64,65]. Specifically, the observed bias values for R rs (555) and R rs (670) (8.33 × 10 −5 and 1.36 × 10 −4 sr −1 , respectively) were an order of magnitude lower than the thresholds defined in [78] (i.e., 4.25 × 10 −3 and 1.22 × 10 −3 sr −1 for the 555 and 670 nm bands, respectively). The OC-CCI R rs were hence deemed adequate for the detection of summer subsurface and surface blooms, which will be presented in Section 3.3.2.   An additional element to consider is that the ℎ -_ statistical figures shown in Table 2 are not directly comparable with those reported in [36] due to the differences in the processing version (i.e., OC-CCI v3.1) of input values, as well as heterogeneities in the match-up database. In fact, that former study [36] did not include measurements from the Alg@line transects, although it included samples from the COMBINE database acquired in the Skagerrak and Kattegat regions with physical and optical water properties that differ from those typical of the Baltic Sea (i.e., they are more similar to oceanic conditions and hence more amenable to ℎretrieval with OC4-like algorithms [26,36]).

Chl-a Match-Ups with Alg@line Transects and COMBINE Dataset
The match-up analysis for two ensemble configurations ( ℎand ℎ -) shows a high density of points close to the 1:1 line for the 1.0-10.0 μgL ℎrange, and a better retrieval of ℎvalues in the 0.1-1.0 μgL range, than the MLP candidates ( Figure 6). The correlation coefficients were 0.238 and 0.229, the RPD were 40.9% and 42.9%, and the bias values were −0.78 and -0.87 μgL −1 ( Table 2). When considering the two in situ data sources separately, for both ensemble configurations, RPD was −24.1% and −27.4% for the Alg@line match-ups and 80.6% and 85.8% for the COMBINE dataset.
Based on the validation statistics associated with the analysis, the three-element ensemble ( ℎ -) was the chosen algorithm for further analyses, as it was characterized by the lowest RPD and APD compared to the individual MLP candidates, the MBR algo-  All proposed MLP regional algorithms present a high density of points close to the 1:1 line for the 1.0-3.0 µgL −1 Chl-a REF range, but they cannot accurately retrieve Chl-a REF values in the 0.1-1.0 µgL −1 range (Figure 4). Furthermore, Chl-a MLP_6b and Chl-a MLP_4b show some spurious retrievals of values in the 0.1-0.5 µgL −1 range, and Chl-a MLP_4b shows a tendency to saturate at 2 µgL −1 . Chl-a MLP_6b and Chl-a MLP_5b are characterized by the highest dispersion (RPD of 48% and 51% and correlation coefficient of 0.137 and 0.109). Chla MLP_4b and Chl-a MLP_3b provide better results: an APD of 90% and correlation coefficient of 0.190 are reported for Chl-a MLP_4b , while a bias of 0.15 µgL −1 and correlation coefficient of 0.204 characterize Chl-a MLP_3b .
The differences in the validation metrics of the four MLP candidates ( Table 2) depend on the combined effect of the heterogeneous uncertainties of the satellite R rs input spectra with the applicability range of each MLP. Recalling that the reference data to evaluate Chl-a retrievals from multi-sensor R rs spectra are the Alg@line and COMBINE measurements, it is reported for comparison that the ideal performance of Chl-a MLP_6b assessed with in situ measurements of Chl-a and R rs from the BiOMaP dataset attest RPD and APD values of~9% and~34% respectively, and a bias of −0.36 µgL −1 [45,57]. The use of R rs (412) and R rs (443) as part of the input bands for Chl-a MLP_6b and Chl-a MLP_5b is likely the major cause of performance degradation (note that the 412 and 443 nm bands show higher dispersion in the match-up analyses, with r 2 = 0.05 and r 2 = 0.34, respectively-see also Figure 3. When analyzing results from the two in situ datasets separately, higher uncertainty is estimated in the match-ups with the COMBINE measurements in comparison to the water samples collected along the Alg@line transects (Table 2). For the MLP candidates, APD ranged~42-50% for the Alg@line match-ups, while for the COMBINE dataset, it ranged 115-158%. The correlation coefficients were lower for the Alg@line (~0.05-0.12) than the COMBINE match-ups (0.095-0.193). The apparent incongruence of the opposite behavior of APD, RPD, and r 2 can be explained by different sampling strategies and the smaller dynamic range of the Alg@line observations reported in Table 1.
Among the MBRs, the match-up analyses for NASA's standard OC4v6 (Chl-a OC4v6 [47]) and OC6v7 (Chl-a OC6v7 [11]) show a high density of points close to the 1:1 line for the 1.0-3.0 µgL −1 Chl-a REF range, and a tendency to largely overestimate Chl-a REF for values higher than 3.0 µgL −1 , leading to a RPD of 152% and r 2 of 0.27 for the former, and a RPD of 130% with a r 2 of 0.23 for the latter (for details, see Figure 5 and Table 2). The regional parametrization of OC4 (Chl-a OC4_Dar05 [46]) underestimated Chl-a REF in the 1.0-3.0 µgL −1 range, showing a RPD of 21.6%, a bias of -1.36 µgL −1 , and an r 2 of 0.28. The regionally recalibrated OC4v6 previously used in CMEMS (Chl-a OC4_Pit16 [36]) showed the highest bias (6.61 µgL −1 ) as well as RPD and APD (243% and 292%, respectively). Similar to the MLP candidates, the match-up statistics for the MBRs also show higher APD but better correlation for the COMBINE dataset (APD = 110-437%, r 2 = 0.19-0.27), in comparison to the Alg@line transects (APD = 42-55%, r 2 = 0.08-0.13, Table 2).
An additional element to consider is that the Chl-a OC4_Pit16 statistical figures shown in Table 2 are not directly comparable with those reported in [36] due to the differences in the processing version (i.e., OC-CCI v3.1) of input R rs values, as well as heterogeneities in the match-up database. In fact, that former study [36] did not include measurements from the Alg@line transects, although it included samples from the COMBINE database acquired in the Skagerrak and Kattegat regions with physical and optical water properties that differ from those typical of the Baltic Sea (i.e., they are more similar to oceanic conditions and hence more amenable to Chl-a retrieval with OC4-like algorithms [26,36]).
The match-up analysis for two ensemble configurations (Chl-a ENS3 and Chl-a ENS4 ) shows a high density of points close to the 1:1 line for the 1.  (Table 2). When considering the two in situ data sources separately, for both ensemble configurations, RPD was −24.1% and −27.4% for the Alg@line match-ups and 80.6% and 85.8% for the COMBINE dataset. Based on the validation statistics associated with the analysis, the three-element ensemble (Chl-a ENS3 ) was the chosen algorithm for further analyses, as it was characterized by the lowest RPD and APD compared to the individual MLP candidates, the MBR algorithms, and Chl-a ENS4 both for the whole match-up dataset and for the match-ups computed separately for the Alg@line transects and the COMBINE dataset ( Figure 6, Table 2). The selection of the three-element ensemble relies on the full independence of the in situ Chl-a values utilized as a reference, i.e., the Alg@line and COMBINE measurement sets, with respect to the data used for the development of the MLPs as well as the weighting process underpinning the ensemble scheme. In fact, if only BiOMaP in situ data were considered for the algorithm selection, Chl-a MLP_6b , i.e., the MLP with all 6 bands, would be the most performing configuration (as reported in previous investigations, e.g., [57]). However, as this study confirms, the uncertainty affecting R rs in the blue and red spectral region tends to be significantly larger for the space-borne than for the in situ data (see details in Section 3).
As shown in Figure 4, restricting the MLP input bands to a subset of the space-borne R rs spectral range (i.e., the central spectral region of the visible domain in Chl-a MLP_4b and Chl-a MLP_3b ) can lead to better Chl-a estimates in the Baltic Sea. The ensemble scheme has been specifically designed to account for the need for a dynamical determination of the input bands' relevance and for spatial and temporal uncertainty of space-borne R rs values. The proposed solution automatically gives more credit, on a pixel-by-pixel basis, to the space-borne R rs spectrum (full or at a reduced number of wavelengths) more compatible with the BiOMaP in situ training data (i.e., low novelty index).

Application to Environmental Reporting
In this section, the consistently reprocessed time series of R rs and Chl-a taking-up the processing improvements presented earlier is used for environmental reporting. The same indicators of the spring and summer phytoplankton bloom dynamics adopted in the CMEMS Ocean State Report [71,72] were calculated from 1998 to 2019, thus providing an up-to-date description of the Baltic Sea eutrophication state based on previous versions of the multi-sensor time series.

Spring Bloom Dynamics
This section presents the spring bloom dynamics in terms of spatiotemporal coverage (Figures 7 and 8, and Figure S1) and timing (Figure 9) obtained from analyzing the Chl-a ENS3 time series from February to early June. The method for spring bloom detec-tion is based on a spatially explicit relative threshold derived from the Chl-a time series itself [9,73,74]. Hence, by design, the Chl-a retrieval uncertainities discussed above do not propagate to the spring bloom indicators presented in this section.
As shown in Figure 7, the bloom spatiotemporal coverage was characterized by values lower than 1.5 × 10 6 day km 2 for the years 1998-2001, then there was a general tendency to increase from 2002 to the high bloom coverage of years 2007 and 2008 (6.2 and 8.4 × 10 6 day km 2 , respectively), followed by fluctuations in the range 2.5-4.9 × 10 6 day km 2 . The overall behavior in spatiotemporal coverage is similar to the one reported for 1998-2017 using the same method in [71,72], but the intensities differ from the previous estimates, reflecting the Chl-a time series changes due to the improvements in the processing chain described above. The lower values observed for the spring bloom coverage in 1998-2000 compared with the coverage from 2002 onwards are consistent with the findings in [68] for GoF and NBP bloom intensity based on Alg@line Chl-a measurements. These lower estimates may also depend on the use of only SeaWiFS in 1998-2001, while two or three satellite sensors were available for the multi-sensor products merging from 2002 onwards, as a higher number of sensors could imply that a larger fraction of the basin with bloom events is captured.    As observed in previous satellite-based studies (e.g., [21,85,86]), the bloom spatiotemporal coverage varies across the sub-basins in the various years. Figure 8 Figure S1. Figure 9 provides an overview of the spring bloom timing in terms of start, peak, and end days for the Baltic Sea as a whole, and for the main sub-basins. The spatial distribution of the bloom onset within the basin for each year can be inferred by the variability in the occurrence in the start, peak, and end days, quantified by the length of the interquartile ranges (IQR) of the three dates.
Across the whole Baltic Sea (Figure 9a After dividing at the sub-basin level, the starting bloom date shows clear latitudinal variations, occurring most often in February-March (days  in the central and southern Baltic Sea, while the blooms start later in the two northern basins: March-April (days 60-120) in BoS and March-May (days 60-150) in BoB, due to the longer duration of the ice cover. Overall, there was a general trend for all sub-basins of earlier phytoplankton blooms by 1-3 weeks across the two decades in terms of start and peak dates that was clearly visible for the distribution of the bloom onset calculated for the whole basin. This confirms the lengthening of the bloom period observed in [73] for GoF, NBP, and SBP based on Alg@line Chl-a measurements, and the early blooming reported in [19] based on an intensive sampling of phytoplankton and hydro-chemical parameters at selected sites in GoF and NBP. The earlier appearance of spring blooms in the last decades has been linked to climate warming [19,68], although spatially explicit studies across the whole Baltic Sea are still missing. Figure 9 shows that 2008 was an anomalous year also in terms of the bloom timing, particularly for BoB, where the bloom started on day 60-90, almost a month earlier than usual. Furthermore, the peak day IQR for the whole Baltic Sea for 2008 was small as the bloom peaked almost simultaneously within most sub-basins, as shown by the individual IQRs. The division at the sub-basin level also highlights the early onset and peak of the 2015 bloom in the BoB: the starting bloom date was around day 70 and it was followed by an early peak (median peak day 77), consistent with the sea-ice coverage anomaly that occurred in the winter of 2014/2015, when BoB, for the first time, remained partially ice-free [87].

Surface and Subsurface Summer Blooms
In the Baltic Sea, the phytoplankton blooms occurring during summer are dominated by nitrogen-fixing cyanobacteria. These blooms can have subsurface and/or surface ac-cumulations, depending on the age of the bloom. This section presents the dynamics of cyanobacterial summer blooms with the spatiotemporal coverage estimated from the reprocessed time series of multi-sensor R rs , thus avoiding the possible low-accuracy retrievals for Chl-a in cyanobacterial bloom conditions.
The time series of total coverage of surface and subsurface blooms in summer is presented in Figure 10 for the whole basin. Overall, the summer bloom coverage increased from 1998 to 2005, decreased from 2005 to 2012, and then increased from 2012 to date, showing oscillations without a consistent decadal trend, as observed from 1979 to date [23,86,88].   At the sub-basin scale ( Figure 12 and Figure S1), several summer bloom events (e.g., in 2002, 2003, and 2005) occurred in the central regions (WGB, EGB, NBP), as also reported in [21,85]. In these central basins, the surface and subsurface summer blooms were of similar extents and the areas where both thresholds were exceeded ranged on average 10-20% of the total coverage. The extensive bloom that occurred in 2005, as shown by our analysis, is reinforced by the HELCOM report, that shows similar nutrient concentrations in 2004 and 2005, with double phosphate levels in the surface [89]. Nevertheless, the weather conditions (cold temperature and wind) of 2004 were not suitable for the onset of a cyanobacterial bloom. On the contrary, the warmer temperature, light conditions of summer 2005, in addition to the excess dissolved inorganic phosphorus, resulted in an intense bloom, covering the central area as WGB, EGB, and NBP [77].
The summer bloom events in GoF, GoR, and SBP differed from those observed in the central regions in terms of timing and of the surface and subsurface extents. The GoF summer blooms were dominated by the surface expression and occurred mostly in the first decade, with a peak in 2002. In GoR, summer blooms were characterized by a larger

Discussion and Conclusions
This study presented recent improvements for the 1997-2019 Chl-a time series determined for the Baltic Sea from space-borne multi-sensor data. The accuracy of OC-CCI R rs for the Baltic Sea was assessed with AERONET-OC data acquired from 2005 to 2019 at three sites located in different sub-basins [64][65][66]. In agreement with other studies [64,65], results from the analysis of R rs match-up data showed that the external bands (412 and 670 nm) are affected by larger uncertainties: R rs (412) was characterized by a high dispersion, while R rs (670) had the largest bias. The OC-CCI R rs were deemed adequate for the detection of summer surface blooms as the observed bias values for R rs (555) and R rs (670) were an order of magnitude lower than the adopted thresholds.
The proposed ensemble scheme represents a new paradigm with respect to the a priori classification of the input R rs and associated Chl-a retrieval algorithms [12,15,41,90,91]. Namely, the study considered that: (1) the R rs match-up analysis has shown that the atmospheric correction process can induce larger uncertainties in the blue spectral region, (2) when accurate, input bands in the blue spectral region can improve the MLP performance, but (3) degradation of results and reduction of the MLP applicability range occur when uncertainties increase. The ensemble solution was thus devised to exploit the information embedded in the blue spectral bands only when the overall space-born input R rs spectrum is compatible withthe in situ data used for MLP training, taking the agreement with the field measurements as an indication of well-performing atmospheric correction on a pixel-by-pixel basis. For low values of the novelty index, the MLPs based on the extended spectral range will contribute with high weights to define the ensemble result.
The ensemble scheme hence permits to account for the temporal and spatial variation of uncertainties induced by the atmospheric correction process. The result is a dynamical relevance determination of Chl-a values derived from R rs at different sets of bands. The effectiveness of the proposed scheme is heuristically endorsed by the higher quality of the retrieved Chl-a values, as assessed with the reference Alg@line and COMBINE measurements not considered for the MLP training. Results based on these two independent reference datasets have shown that the proposed ensemble scheme consistently outperformed the Chl-a retrievals by alternative schemes (i.e., standard and regional band-ratio algorithms, as well as individual MLPs) [11,36,46,47]. The matchup statistics achieved in this study with the ensemble scheme (correlation coefficient of 0.238, RPD of 40.9%, and bias value of −0.78 µgL −1 ) may appear as unsatisfactory, but represent a considerable reduction of the gross Chl-a overestimates induced by the high CDOM in the Baltic Sea documented by studies based on SeaWiFS and MODIS band-sets [25][26][27][35][36][37]. Raising the accuracy of Chl-a retrievals in the Baltic Sea above the level attained in this work with the ensemble scheme largely depends on atmospheric correction improvements in the blue and red spectral regions. Since the reprocessed multi-sensor time series of R rs spectra are determined at the wavelengths of the SeaWiFS sensor, the C2RCC and ONNS neural nets could not be applied in this work due to differences in input spectral bands [24,40,41]. Note also that the Chl-a assessments in the present study considered the whole basin, while several previous studies addressed a sub-basin level, possibly enhancing the reported performance at those scales [25,39,40,42,46].
The optical variability over short spatial scales, as well as the patchiness in Chl-a concentrations and of cyanobacterial blooms in the basin, are the likely sources of the high dispersion reported in the match-up analyses with the 1 km resolution pixel [30,31,42,[79][80][81]. The heterogeneity of statistical results is documented by the higher uncertainty estimated with the COMBINE dataset in comparison to the water samples collected along the Alg@line transects. An analogous finding was reported for OLCI imagery by Toming et al. [42], with different uncertainties between coastal waters and the Baltic Proper. A further factor contributing to the reported dispersion of the match-ups' data points is the large co-location time window due to the lack of exact acquisition time for the multi-sensor dataset.
The 1997-2019 time series of the R rs and Chl-a are freely available at the CMEMS portal. The time series will be extended as part of the CMEMS operational processing on a six-month basis. The use of a combined MERIS and OLCI time series at a spatial resolution of 300 m would most probably lead to increased accuracy on the Chl-a retrieval, as several algorithms make use of the extended spectral band-set [24,26,41]. However, this time series would span 2002-2012 and then 2016 onwards and would not benefit from the increased spatial coverage brought by the multi-sensor merging.
The consistently reprocessed time series of Chl-a and R rs presented in this study allowed for a comprehensive description of the phytoplankton bloom evolution for spring and summer seasons. The overall magnitude and the interannual variability of the spring and summer blooms described in this work is coherent with the literature [19,21,68,73,85,86]. The analysis of the phytoplankton dynamics at sub-basin levels enabled detailing the latitudinal variations in the spring bloom phenology across the basin, and offered the first documentation at basin and sub-basin scales of the earlier onset and the lengthening of the spring bloom over the last two decades. These spatially explicit findings confirm and extend the results based on in situ observations [19,68,73].
The quantification of the Baltic Sea surface and subsurface summer blooms was complemented for the first time by an assessment of the blooms with concurrent surface and subsurface expression. The summer blooms occurred more frequently in the central Baltic Sea, differing from Gulf of Finland, Gulf of Riga, and Southern Baltic Proper, in terms of timing as well as surface and subsurface extents. These findings are in overall accordance with previous satellite-based studies and the HELCOM results [21,76,77,85,86,89], even if due to changes in the input satellite imagery and detection method, the HELCOM remote sensing indexes computed from 2010 onwards cannot be directly compared with the 1997 to 2009 values [76,77,82].
Nowadays, satellite sensors provide data for extensive spatial and temporal monitoring of phytoplankton variability. In the context of the Good Environmental Status actions, the present study demonstrates how the eutrophication assessment in the Baltic Sea can be deeply reinforced by the use of Earth Observation data. The combined use of space-borne and in situ information (monitoring stations and field activities) has to be aimed to achieve a good status in all regions of this basin and prevent negative effects of eutrophication.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/rs13163071/s1. Figure S1: Spring and Summer Bloom spatio-temporal coverage from 1998 to 2019. First column: Spring bloom; second column: subsurface Summer bloom; third column: concurrent subsurface and surface summer bloom; fourth column: Surface summer bloom. Table S1: Parameters set for Chl-a MLP_6b .  Acknowledgments: R rs spectra at 1 km resolution were produced by the Plymouth Marine Laboratory (PML) using the OC-CCI processor within the Ocean Colour Thematic Assembly Centre activities. The AERONET Team is acknowledged for the continuous effort in supporting the AERONET-OC sub-network. Giuseppe Zibordi from the Joint Research Center of the European Commission is acknowledged for establishing and maintaining the Irbe Lighthouse, Gustav Dalen Tower, and Helsinki Lighthouse AERONET-OC sites, and for providing access to the BIOMAP dataset. ICES and HELCOM along with all single contributors are thanked for the COMBINE in situ dataset. Scientists and technical personnel at SYKE are thanked for the acquisition of the Alg@line in situ dataset. We are grateful to Vega Forneris and Flavio Lapadula for maintaining the satellite data processing and the satellite data archive at CNR-ISMAR. We thank the three reviewers and the academic editor for their insightful and constructive comments that helped strengthening the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.