Consistent Multi-Mission Measures of Inland Water Algal Bloom Spatial Extent Using MERIS, MODIS and OLCI

: Envisat’s MERIS and its successor Sentinel OLCI have proven invaluable for documenting algal bloom conditions in coastal and inland waters. Observations over turbid eutrophic waters, in particular, have beneﬁted from the band at 708 nm, which captures the reﬂectance peak associated with intense algal blooms and is key to line-height algorithms such as the Maximum Chlorophyll Index (MCI). With the MERIS mission ending in early 2012 and OLCI launched in 2016, however, time-series studies relying on these two sensors have to contend with an observation gap spanning four years. Alternate sensors, such as MODIS Aqua, offering neither the same spectral band conﬁguration nor consistent spatial resolution, present challenges in ensuring continuity in derived bloom products. This study explores a neural network (NN) solution to ﬁll the observation gap between MERIS and OLCI with MODIS Aqua data, delivering consistent algal bloom spatial extent products from 2002 to 2020 using these three sensors. With 14 bands of MODIS level 2 partially atmospherically corrected spectral reﬂectance as the NN input, the missing MERIS/OLCI band at 708 nm required for the MCI is simulated. The resulting NN-derived MODIS MCI (NNMCI) is shown to be in good agreement with MERIS and OLCI MCI in 2011 and 2017 respectively over the western basin of Lake Erie (R 2 = 0.84, RMSE = 0.0032). To overcome the impact of MODIS sensor saturation over bright water targets, which otherwise renders pixels unusable for bloom detection using R-NIR wavebands, a variant NN model is employed which uses the 9 MODIS bands with the lowest probability of saturation to simulate the MCI. This variant NN predicts MCI with only a small increase in uncertainty (R 2 = 0.73, RMSE = 0.005) allowing reliable estimates of bloom conditions in those previously unreported pixels. The NNMCI is shown to be robust when applied beyond the initial training dataset on Lake Erie, and when re-trained on different geographic areas (Lake Winnipeg and Lake of the Woods). Despite differences in spatial, temporal, and spectral resolution, MODIS algal bloom presence/absence was correctly classiﬁed in >92% of cases and bloom spatial extent derived within 25% uncertainty, allowing the application to the 2012–2015 time period to form a continuous and consistent multi-mission monitoring dataset from 2002 to 2020.


Introduction
Several satellite ocean color missions have been in operation over the last few decades (e.g., SeaWiFS, MODIS, MERIS, VIIRS, OLCI) forming an invaluable contiguous record of aquatic color radiometry since 1997. Although not necessarily optimized for inland water applications [1], these sensors have delivered widespread advances in lake water quality monitoring, particularly from the perspective of detecting potentially harmful algal blooms (HABs) [2][3][4]. Many freshwater systems have seen an increase in the severity of HABs, attributed in large part to cultural eutrophication, and climate change impacts on key in-lake and watershed drivers [5][6][7][8]. Lake Erie, for example, is the most biologically productive of North America's Laurentian Great Lakes and has suffered a resurgence of HABs in the last two decades [9] resulting in a well-documented ecosystem, public health, and socio-economic impacts [10][11][12]. Satellite remote sensing is now routinely used to complement in-lake monitoring programs to report on algal bloom conditions on Lake Erie, offering clear benefits of low-cost, near-real-time synoptic observations, and objective time-series records. The science maturity is such that several operational remotely sensed algal bloom products are now available for Lake Erie and distributed to the research and water-resource management stakeholder community [2,13,14]. Environment and Climate Change Canada's (ECCC) EOLakeWatch, for example, delivers near-real-time mapped bloom products and a suite of derived bloom indices documenting the spatial extent, intensity, duration, and severity of Lake Erie's blooms [2,15].
Quantifying temporal trends in bloom conditions is key to guiding water resource management [16], monitoring the effectiveness of implemented mitigation measures, and further understanding the impact of climate change on bloom occurrences. A central requirement for a robust long-term ecological indicator is a consistent and continuous time series of observations, to which the frequency and scale of satellite Earth Observation (EO) lends itself very well. However, obstacles exist in deriving robust continuous remote sensing products for time-series analyses due to the need to merge datasets from multiple satellite missions with significant differences in sensor characteristics. The lack of consistency in spectral and spatial resolution as well as radiometric sensitivity between remote sensing platforms means that products require continuity assessment before multisensor time-series can be relied upon. Otherwise, the potential for artifacts propagated into combined multi-mission data sets may ultimately affect the conclusions drawn from a time-series analysis [17,18].
Existing aquatic color algorithms for the detection of HABs typically retrieve Chlorophyll-a concentrations as an indicator of phytoplankton biomass [19,20], or some other property of the bloom such as phycocyanin concentration (as a specific indicator of cyanobacteria [21]) or cyanobacterial cell counts [22]. Algorithms include simple blue/green band ratios [23][24][25], red-near-infrared (R-NIR) band ratios [26] and line-height approaches [20,27,28] as well as semi-analytical [29] and neural network/machine learning approaches [30]. While some of the required wavebands are consistent among all sensors, others are not, creating difficulties for algorithm transferability between missions. Line-height algorithms such as the maximum chlorophyll index [MCI, [27]], cyanobacteria index [CI, [22]], and maximum peak height [MPH, [20]] are often favored for studying eutrophication and algal/cyanobacterial blooms in optically complex coastal and inland waters due to their relative insensitivity to the ongoing challenge of atmospheric correction over turbid waters. All three of those algorithms were defined originally for the spectral configuration of Envisat's MEdium Resolution Imaging Spectrometer (MERIS), relying on the waveband at 708 nm; the CI for its baseline anchor, MCI for its peak reflectance measure, and MPH for either the peak or the baseline depending on the position of the reflectance peak. MERIS operated from 2002 until April 2012, and there was a 4-year delay before its successor, the Ocean and Land Color Instrument (OLCI on Sentinel 3) was launched in February 2016. Continuity in the spectral bands previously available on MERIS [31] means those line-height approaches have the potential to provide consistent algal bloom products. There exists, however, a significant observation gap for which no remote sensing products are available with the same spectral configuration to enable those line-height algorithms to be derived as originally defined. NASA's Moderate Resolution Imaging Spectroradiometer (MODIS), on the Aqua and Terra platforms, provides alternate ocean color observations over this time period. However, the sensor has neither the same spectral configuration (lacking the key band near 708 nm) nor the same spatial resolution (1 km compared with the 300 m of MERIS and OLCI). Further challenging the use of MODIS imagery in turbid waters is the fact that MODIS R-NIR bands are prone to saturation over bright water targets [32,33], leaving large areas of intense algal blooms without valid retrievals.
ECCC's remote sensing algal bloom products were developed using the MCI applied to MERIS imagery and rendered operational for three turbid eutrophic Canadian lakes using the same approach with OLCI imagery [2,34]. It would be beneficial, therefore, to fill the observation gap for the bloom seasons of 2012 to 2015 between the MERIS and OLCI missions in order to more reliably determine temporal trends and lake responses to anthropogenic and environmental drivers. A number of strategies have been adopted to develop and assess continuity in multi-mission Chl-a and algal bloom products for Lake Erie. In a time-series assessment of bloom extent for western Lake Erie, Sayer, et al. [14] investigated the coherence of SeaWiFS and MODIS-based cyanoHAB products during their overlap years 2002-2007. Wynne, et al. [35] reconfigured the MERIS CI for MODIS, using the MODIS band at 759 nm to replace the MERIS 708 nm band [33]. Beyond Lake Erie, various initiatives have created global ocean colour records spanning successive missions [36][37][38][39][40][41]. Under the framework of the Ocean Colour Climate Change Initiative (OC-CCI) [42], product merging was conducted at the remote sensing reflectance level through a band-shifting scheme [43]. Melin et al. [18] further assessed temporal trends among multi-mission OC-CCI Chl-a products and found overall good agreement between single mission and the merged data products. Similarly, the Lakes-CCI project aims to deliver a consistent and homogenous data set of lake products for the Lakes Essential Climate Variable. However, acknowledging that the number of wavebands differs per sensor and each sensor has variations in where the bands are centered, the Lake Water-Leaving Reflectance product has not yet been harmonized (as of CDRP v1.0, [44]). Additional approaches including spectral matching techniques [45] and Quasi-Analytical Algorithm (QAA)-based techniques [46] have shown promise in delivering consistent multi-mission products, while Pahlevan, et al. [47] concluded that deep neural network (NN) approaches outperformed other band matching methods, particularly in highly eutrophic and turbid waters.
In this study, we address the gap in spectral resolution and product continuity between MERIS and OLCI by developing a NN approach to build spectral band consistency between MODIS and the MERIS/OLCI sensors in order to provide continuous multi-sensor algal bloom products with a focus on reporting bloom spatial extent. Specifically, this study: (a) presents a NN method applied to MODIS to simulate the 708 nm band required for the MCI, (b) addresses algal bloom detection for saturated MODIS pixels, (c) analyzes the performance of the NN model in determining algal bloom spatial extent on Lake Erie, (d) demonstrates the transferability of the method to other Canadian lakes, namely Lake Winnipeg and Lake of the Woods, and (e) applies the approach to deliver a continuous time-series of algal bloom spatial extent from 2002 to 2020 for all three lakes.

Satellite Imagery and Processing Steps
MODIS-Aqua (hereafter simply MODIS), MERIS, and OLCI, provide multispectral images for the periods 2002-present, 2002-2012, and 2016-present, respectively. The three sensors have a revisit frequency of~1-3 days, which adequately captures the temporal variability of algal blooms and is suitable for continuous time-series analyses. Their relative spectral responses (RSR) are shown in Figure 1; a critical difference is that MODIS lacks a band near 708 nm which is the key band used for line-height algorithms such as the MCI. While MERIS and OLCI have a spatial resolution of 300 m, MODIS bands consist of both "Land" bands at 250-500 m and "Ocean" bands at 1 km spatial resolution. These "Land" bands are designed to measure reflectance over land, clouds, and ice and so have a much larger radiometric range than the "Ocean" bands, which are designed to be more sensitive to optically darker aquatic signals [32]. The MODIS L2 products contain these higher resolution bands aggregated to 1 km resolution to correspond with the ocean bands.
MODIS L1A imagery was downloaded from NASA's Ocean Color Web [48] and processed to level-2 (L2) spectral remote sensing reflectance (R rs (λ)) and partially atmospherically corrected spectral reflectance (ρ s (λ)) products using the SeaDAS v.7.4. L2 processing variables included the multi-scattering with 2-band model for aerosol definition (aer_opt = 3), and cloud reflectance wavelength of 1240 nm with threshold of 0.04. MERIS L2 FRS imagery were downloaded from ESA's MERIS Catalogue and Inventory (MERCI) [49] and Sentinel 3A OLCI L2 Marine products were downloaded from EUMETSAT's Copernicus Online Data Access (CODA) data portal. Accurate atmospheric correction over turbid inland waters remains a challenge and requires better representation of aerosols, improvements in corrections for sky glint, and adjacency effects [50]. Here the standard L2 reflectance products were used, but the model could equally be trained on any one of a number of state-of-the-art atmospherically corrected products such as those assessed in Pahlevan et al. [50].
Derived Chl-a maps were mosaicked and cropped by the study area boundary to produce a daily Chl-a product. Potential false positive blooms due to extreme sediment scattering events were masked using the approach of [52]. Due to the coarser spatial resolution of MODIS, occasionally anomalous retrievals were found along the lake shorelines, where there may be contributions from either mixed land-water pixels or land adjacency effects. Such erroneous nearshore pixels were identified using a simple 3 × 3 nearshore pixel assessment, removing bloom pixels where less than 1/3 of their neighboring pixels were also flagged as in bloom. In order to obtain whole lake cloud-free imagery, daily products were composited over 14 days to create 14-day rolling-average products following [2]. This approach minimizes cloud-related data gaps and resulting uncertainties in bloom extent measures, but also results in temporal smoothing of high-frequency variability therefore some of the day-to-day bloom dynamics can be lost while retaining the low-frequency seasonal bloom progression. The bloom spatial extent was determined as all pixels with Chl-a > 10 µg/L, consistent with the World Health Organization (WHO) guidelines [53] of low to moderate risk under the assumption of cyanobacteria dominance.

Training and Validation Datasets
Training and validation of the NN used MODIS, MERIS, and OLCI data extracted from the Western basin of Lake Erie (WLE) (Figure 2). WLE exhibits a wide range of Chl-a and sediment concentrations during the year [51,52] making it well suited to the study of algorithm solutions for turbid eutrophic inland waters. To compile training data for the NN, grid points were generated at 5 km intervals within WLE as illustrated in Figure 2, with points very close to the shoreline removed manually. The interval distance of 5 km was selected as a compromise to minimize adjacent pixel correlations and maximize the training sample size. In total 143 grid points were selected for which MERIS/OLCI level 2 reflectance rw(λ) and MODIS R rs (λ) and s(λ) were extracted from all daily imagery available in 2011 and 2017. MERIS/OLCI and MODIS data were then matched by date and location. MODIS Aqua has an equatorial crossing time of 1:30 p.m. local time whereas both Envisat and Sentinel 3A have descending node equatorial crossing times of 10:00 a.m. local therefore coincidental images were typically less than four hours apart. The data extraction from all grid points obtained 39,086 and 40,226 MODIS spectra for 2011 and 2017 respectively, and 16,099 and 12,091 spectra for 2011 MERIS and 2017 OLCI respectively. MERIS and OLCI datasets were smaller due to the lower revisit frequency of those sensors. Extracted spectra from MODIS were pre-processed by masking for cloud and ice (CLDICE) or invalid band retrievals whereas MERIS and OLCI spectra were masked when suspect, invalid, cloud, high, or medium glint flags were raised. The extracted MODIS R rs and ρ s products highlighted notable differences in the data available for analysis ( Figure 3). Data availability across the spectrum (before quality flags are applied) is driven by the failure of R rs atmospheric correction under extremely turbid conditions, as well as the wider radiometric range of the land bands leading to less saturation over cloud and bright water targets. For example, if the bands used for aerosol selection in the NIR range (e.g., 748 nm and 869 nm) are saturated, then spectral R rs is reported as invalid. In contrast, the ρ s product only applies the Rayleigh scattering correction and still contains aerosol contributions, consequently retaining observations over those same turbid water pixels. The number of valid ρ s retrievals over water at the extracted grid points were significantly higher than R rs retrievals, by as much as 19% at 412 nm ( Figure 3). In order to maintain maximum coverage of training data especially over the full range of highly turbid eutrophic waters, we, therefore, chose MODIS ρ s as the input for the NN training.

Neural Network Parameterization and Workflow
The overarching goal of the NN was to use years where both MODIS and either MERIS or OLCI data was available as training input data, in order to output the MERIS or OLCI equivalent MCI bands from MODIS and a consistent MODIS MCI-derived bloom spatial extent product. MODIS spectral ρ s and coincident MERIS and OLCI rw, for the years 2011 and 2017, respectively, were used to train the NN. When MODIS ρ s was valid across all wavebands, the 14 bands centered from 412 nm to 1240 nm were used as the NN input. Due to sensor saturation in high biomass conditions, however, many spectra lacked valid ρ s for certain bands especially in the R-NIR range (e.g., 667 nm and 678 nm). In an effort to overcome the impact of band saturation to provide bloom detection over those bright water targets, the nine bands (including five 'land' bands) with the lowest probability of saturation were used as NN input instead. The NN model with 14 bands as input is denoted as NN14B, while that using nine bands is denoted as NN9B. For rare cases (<5%) where less than nine bands remained unsaturated, the pixel was not trained/processed. These processing components together make a NN model denoted as NNMCI. Following independent training of NN14B and NN9B, the NNMCI was then applied to MODIS ρ s imagery for years when MERIS/OLCI observations are absent, to simulate the 3 MCI bands, calculate the MCI-derived Chl-a, and report on algal bloom spatial extent according to the aforementioned steps.
A schematic describing the multiple steps of the neural network workflow is shown in Figure 4, with the key NN operating variables, as optimized through trial and error, defined in Table 1. Using the collected training dataset of coincident spectra from MODIS and MERIS/OLCI sensors, the input to the NN was the MODIS level 2 ρ s bands. The NN14B and NN9B were trained independently using 14 and 9 spectral bands as input respectively. The default training method in Matlab ('trainlm': Levenberg-Marquardt backpropagation) was used, along with its default training parameters. We tested the varied hidden layer settings (1 to 3 hidden layers with nodes ranging from 10 to 50) and found that one hidden layer with 10 hidden nodes performed the best ( Figure 5) in further performance evaluation. For the output layer, both the MCI and the three specific bands used to calculate the MCI were tested. For both NN14B and NN9B 70% of the paired spectra were assigned randomly for training purposes, while the remaining 30% were used for validation. Training steps were run five times, each time with different random samples, and the performance was found to be consistent. The NN was prepared and trained on MODIS and MERIS/OLCI data using Matlab, whereas the trained model was subsequently exported to run using Python.   Because our primary interest is in delivering a consistent measure of bloom spatial extent, rates of correct classification of bloom presence or absence are reported, as delineated by the thresholds Chl-a > 10 µg/L or Chl-a < 10 µg/L, respectively, according to the WHO guidelines under cyanobacteria dominance [53]. In addition, four metrics were used to evaluate the model performance: root mean square error (RMSE), error bias (BIAS), the median absolute percent error (MAPE), and the coefficient of determination (R 2 ) as defined in Equations (3)-(6).
where x and y are the observed and predicted/modelled values, respectively.

NN14B Performance
The performance of NN14B is demonstrated in Figure 5, both for individual simulated MODIS bands at 681, 708, and 753 nm, and the derived MODIS-MCI. NN14B reliably predicted the MERIS/OLCI equivalent MCI bands from MODIS ρ s (λ) and subsequently produced acceptable MODIS-MCI (Figure 5b). Importantly, with the bias close to zero, the NN MODIS band predictions and MODIS-MCI fall broadly along the 1:1 line compared with the MERIS/OLCI equivalents. Results also showed similar performance of the NN14B with MODIS level 2 R rs bands as input (R 2 = 0.81), but the application of R rs based NNMCI would naturally omit highly turbid or eutrophic water pixels due to atmospheric correction failure when N-NIR bands are saturated. Experiments with L1 top-of-atmosphere radiance products as the NN input failed to reach similar levels of retrieval accuracy (R 2 < 0.1 for the NN14B derived MCI). Poor agreement between MODIS and MERIS/OLCI L1 products could be due to changing weather, illumination conditions, viewing angles, and resulting sun and sky glint, within the few hours between satellite overpasses (MODIS typicallỹ 18:30 UTC, MERIS/OLCI typically~16:30 UTC).

Pixel Saturation and NN9B Performance
Since turbid or highly productive waters are often prone to saturation in R-NIR bands, this study explored solutions to address the saturated pixels and where possible retain them in the processing work-flow to maximize the applicability of methods to retrieve bloom extent over highly reflective waters. In addressing MODIS saturated pixels, ref. [33] employed a linear conversion between the CI and the height of the MODIS 859 nm band relative to bands at 1240 and 469 nm (Figure 6a). Here we tested a similar approach for MCI, but this produced a low level of agreement (R 2 < 0.2) between the MCI and the same 859 nm peak (Figure 6b), suggesting the MODIS 859 nm peak does not carry sufficient information to represent variability in the MCI peak.
Instead, here we propose the use of the reduced waveband NN9B to determine the MCI and extract bloom extent statistics over pixels where the R-NIR bands are saturated. Using the same sample dataset, the NN9B simulated MCI achieved an R 2 of 0.731 and RSME of 0.005 (Figure 7b). The NN9B uses 9 MODIS bands, ingesting more spectral information than the approach of [33] and therefore providing a more reliable estimation of the MERIS/OLCI MCI. It is reasonable that NN14B would perform better than NN9B, with more bands and associated spectral input information. For the WLE sample dataset used here, the number of valid retrievals increased from 2816 with NN14B ( Figure 5) to 3220 with NN9B (Figure 7). While NN14B shows a slight underestimation of MCI at high MCI (>~0.04), NN9B shows a more noticeable underestimation for MCI >~0.03, suggesting MODIS-MCI from NN9B may result in a potentially conservative Chl-a estimate under high bloom conditions. The output of NN14B and NN9B could be used, more simply, as a presence/absence bloom indicator rather than a quantitative chlorophyll retrieval, and is all that is required for a measure of bloom spatial extent. A two-class accuracy assessment (bloom or no-bloom, with L2MCI~= 0.0049, or Chl-a = 10 µg/L) resulted in a successful bloom classification (relative to MERIS or OLCI) in 93.4% of cases for the NN14B, with remaining cases resulting in 2.8% false-positive bloom reports and 3.8% falsenegative bloom reports. Likewise, the NN9B resulted in successful bloom classification for saturated pixels in 92.1% of cases, with 5.0% false positives and 2.9% false negatives ( Table 2). Results suggest that the NNMCI serves as a very reliable solution for detecting WLE blooms with MODIS, including images containing saturated pixels. As such this approach contributes directly to improving bloom spatial extent retrievals where previously large areas of saturated pixels would be masked.  An example of the MODIS saturated pixel processing for a single scene (5 October 2011) is shown in Figure 8. Differences in cloud cover between the two overpass times (MODIS: 18:55 UTC; MERIS: 16:17 UTC) are noted in the MERIS and MODIS imagery. The MODIS scene contained 503 pixels that were saturated such that the NN14B could not be applied, of which 437 pixels were routed to the NN9B and 66 remained unprocessed. As such, 87% of pixels that would otherwise be masked due to saturation, were here processed for algal bloom detection, resulting in the extracted MODIS daily bloom spatial extent increasing from 2294 to 2518 km 2 . The remaining difference between MODIS-derived bloom extent and MERIS extent can be attributed to cloud cover in the NE of the scene. Overall, the NN9B processing step to fill saturated pixels provides considerable improvements in the agreement between MODIS and MERIS/OLCI observations on daily scenes. Figure 8 also clearly demonstrates the impact of MODIS coarser spatial resolution in delineating these highly dynamic bloom events, particularly along the bloom edges.

Alternate Chlorophyll-Retrieval Algorithm Solutions
An alternate solution to the NNMCI would be the adoption of a different chlorophyllretrieval algorithm for the MERIS/OLCI observation gap using only the available MODIS bands. Should a different algorithm be adopted (one in which the simulation of the 708 nm band is not required), a consistent relationship between that algorithm and the MCI would be necessary in order to provide confidence in the continuity between the MCI-derived bloom products of MERIS and OLCI era observations and those from MODIS during the gap years 2012-15. It is therefore imperative to assess consistency in retrievals between the MCI and those algorithms. Here we explore a select number of algorithm approaches applied to MODIS to determine if they would produce retrievals consistent with the MERIS/OLCI MCI and superior to the NNMCI. Coincident MERIS/OLCI and MODIS spectra from Lake Erie in 2011 and 2017 (N = 9524) were used to assess the agreement between MERIS/OLCI MCI and: (1) a blue to green band ratio consistent with the standard OC4 MODIS Chlorophyll product [24,54], (2) a NIR/Red band ratio as a frequently proposed algorithm solution for turbid eutrophic waters [26], and (3) the Cyanobacteria Index (CI) as the approach adopted by Stumpf et al. (2012) for HAB detection on Lake Erie. The band ratio 488/555 is shown to be more sensitive to lower Chlorophyll concentrations than the MCI (Figure 9b), but at high chlorophyll concentrations, the variability between MCI and 488/555 is large and so suggests OC4 Chlorophyll would not provide consistent results under intense bloom conditions. This is in agreement with previous observations on Lake Erie, which showed saturation of the B/G index in highly eutrophic waters [55]. A NIR/Red band ratio suggests a good relationship with MCI for a subset of the dataset (as circled in Figure 9c) but uncertainties in atmospheric correction lead to a significant number of anomalous chlorophyll retrievals. The benefits of lineheight algorithms over band ratios with respect to minimizing the propagated errors in atmospherically corrected reflectance have been recognized elsewhere [56]. As shown in Figure 6, the MODIS CI is different from the CI originally defined for MERIS [35], measuring the depth of the reflectance trough at 678 nm relative to a linear baseline between 667 nm and 748 nm. Under intense bloom conditions, the CI and MCI have been previously shown to be in good qualitative agreement on Lake Erie [2], however, this analysis highlights some differences in the anticipated retrieval consistency. Mathematically, the MODIS CI is the same as the MODIS Fluorescence Line Height (FLH), which in low trophic conditions is positively correlated with chlorophyll due to the fluorescence peak at 678 nm. In eutrophic conditions, and under cyanobacteria dominance where fluorescence yield is lower [57], the index is driven more by chlorophyll absorption, and therefore the relationship is reversed. This is in agreement with the findings of Gons, et al. [58], who documented the effective use of the MERIS FLH product in oligotrophic waters of the Laurentian Great Lakes, but with FLH diminishing and becoming negative in mesotrophic and eutrophic waters. Figure 9d, therefore, captures that bimodal relationship with MCI, and so while there may be good agreement between the two indices under intense bloom conditions, there is significant divergence at lower biomass. Figure 9 shows that none of these alternate algorithms deliver a superior agreement with the MERIS/OLIC MCI compared with the MODIS NNMCI. Results suggest that adopting alternate algorithm solutions utilizing only those bands available on MODIS to retrieve chlorophyll and in turn bloom extent during the observation gap between MERIS and OLCI will likely introduce greater uncertainty than implementing the proposed NNMCI approach to simulate the MODIS bands required for consistent MCI retrievals.

Performance of Seasonal Bloom Extent Products
When applying the trained NNMCI to MODIS imagery over different years, MERIS/OLCIequivalent MCI bands are predicted, MCI and Chl-a are derived, daily and 14-day rolling average Chl-a products are generated, and algal bloom spatial extent is computed. Three years were chosen to validate the performance of the bloom extent retrievals; 2011 and 2017 as years on either side of the observation gap, and 2008 as a randomly selected fully independent validation year ( Figure 10). The NNMCI was trained using spectra extracted only from the 143 specific grid points in Figure 2 in 2011 and 2017, therefore the rest of the pixels over those two years, especially the vast central and east part of the lake, remain untrained, and as such it is still considered acceptable to validate the approach on imagery from those two years. The bloom extent reported from MODIS NNMCI is generally consistent with that of MERIS/OLCI in Figure 10, especially considering the spatial resolution and saturated pixel differences between the sensors. Despite the significant impact of sensor saturation during intense blooms captured in daily scenes (Figure 8), the impact on 14-day rolling average products will be limited to only those pixels persistently saturated over the composite period. Bloom extent in 2011 and 2017 agree very well, tracking the seasonal progression of the bloom and capturing the peak bloom extent of~5000 km 2 and~1500 km 2 , respectively. Deriving bloom extent for 2008 the NNMCI performs well despite the absence of this year in the training dataset. Results suggest that bloom spatial extent derived from MODIS can be reliably predicted within <25% of the reference extent captured by MERIS and/or OLCI.
With the higher revisit frequency of MODIS compared with MERIS and OLCI, the MODIS bloom extent time-series produces a smoother bloom progression. For example in 2017 there is a peak in OLCI bloom extent on 23 September which follows no observations between 16 and 20 September, whereas the more frequent observations afforded by MODIS act to smooth out that peak in the 14-day rolling average bloom extent. With the addition of Sentinel 3B, and future data streams to the image processing workflow, image frequency will be increased to further improve product consistency, and to allow a reduction in the number of days over which a rolling-average product is processed while minimizing cloud-related artifacts. Reducing the rolling average period will also better capture the high-frequency day-to-day dynamics of bloom variability, which are inherently smoothed using the current 14-day averaging.
Looking more closely at the bloom retrievals, Figure 11 captures the derived chlorophyll at the peak bloom extent for each of the 2008, 2011, and 2017 bloom seasons, and the corresponding masked bloom area for both MERIS/OLCI and MODIS. Derived Chlorophyll products and bloom masked areas are in excellent agreement. Much of the disagreement between the two retrievals is isolated to the very nearshore and the bloom edge due to differences in the spatial resolution. The larger discrepancy in 2008 can be attributed to temporal differences in cloud cover artifacts between the MERIS and MODIS overpasses, but also a stretch of bloom along the southern west basin which appears narrowly above the 10 µg/L threshold in MERIS and narrowly below that threshold in MODIS retrievals, leading to underestimation of the peak bloom extent observed in Figure 10c.

NNMCI Regional Transferability
In order to assess the applicability of the NNMCI beyond the Lake Erie system, the approach was tested further on two similarly turbid eutrophic Canadian lakes monitored for algal blooms by EOLakeWatch; Lake Winnipeg and Lake of the Woods. Lake Winnipeg (LW) is Canada's sixth-largest lake and experiences severe algal blooms due to excessive nutrient loading from its watershed [59]. Lake of the Woods (LoW) is a transboundary lake (bordering Ontario, Manitoba, and the U.S. state of Minnesota) that drains into Lake Winnipeg via the Winnipeg River. LoW is hydrologically complex, containing thousands of islands, and like LW experiences recurring severe algal blooms [51]. We again collected gridded data from coincident MODIS ρ s and MERIS/OLCI level 2 imagery: 6870 points in 2011 and 5011 points in 2017 for LW and 3937 in 2011 and 3706 in 2017 for LoW. These data trained lake-specific models for both NN14B and NN9B in the same manner as for Lake Erie. Using a randomly selected 30% of the sample dataset to evaluate performance, the LW model showed similar retrieval accuracy for the MODIS simulated MCI (R 2 = 0.785, RMSE = 0.0042 for NN14B and R 2 = 0.738, RMSE = 0.0047 for NN9B) resulting in successful bloom classification 90.3% of the time. Figure 12 shows the resulting LW algal bloom extent derived using the combined NN14B and NN9B, for a year from which the training dataset was extracted (2011) and a year with no training (2016), confirming satisfactory retrievals in both cases. Figure 12 does suggest, however, an apparent overestimate in the bloom spatial extent in early spring particularly in 2016. Binding, Zastepa, and Zeng [55] documented variable MCI per unit chlorophyll with varying phytoplankton community composition in Lake Erie. Here, on Lake Winnipeg, cyanobacteria biomass was greatest in 2011, comprising almost 83% of the total phytoplankton biomass, whereas diatoms constituted > 40% of the mean annual phytoplankton biomass in 2016 [60]. The difference shown in Figure 12 may therefore be driven by the full-spectrum impacts of diatom-dominated waters on the NNMCI following training on cyanobacteria-dominated observations, and could perhaps be corrected by extending the conditions over which the NNMCI is trained. The LoW model resulted in somewhat higher uncertainty (R 2 = 0.663, RMSE = 0.0042 for NN14B and R 2 = 0.638, RMSE = 0.0043 for NN9B) which is not surprising given the challenges in resolving this complex lake brought about by the coarse 1 km resolution and image distortions due to off-nadir sensor viewing geometry. Despite less than optimal spatial resolution the NNMCI was able to correctly classify bloom presence/absence on LoW 86.8% of the time. Figure 13 captures the satisfactory retrieval of the seasonal progression of blooms on LoW, with MODIS-derived bloom extent within <20% uncertainty compared with MERIS and OLCI. In 2011, MERIS suggests a dramatic drop-off in bloom extent towards the end of October, which in fact is due to an extended period with no valid imagery (10)(11)(12)(13)(14)(15)(16)(17)(18)(19)(20)(21). In contrast, the higher revisit frequency of MODIS captured a cloud-free image on 15 October with a significant bloom extent. Therefore over this particular time interval, it is the MODIS time-series that is the more reliable.

Consistent Long-Term Time-Series of Bloom Spatial Extent
The strong agreement between MODIS NNMCI-derived bloom extent and the true MCI-derived bloom extent for both MERIS and OLCI suggests we can assume good continuity in the bloom extent measures across all three sensors. With the demonstration of reliable reproduction of bloom products in the overlapping years of MERIS/OLCI and MODIS, it is reasonable then to use the NNMCI to fill the gap in observations between 2012 and 2015. Figure 14, therefore, combines observations from MERIS, MODIS, and OLCI for a robust continuous indicator of bloom spatial extent from 2002-2020 for Lake Erie, Lake Winnipeg, and Lake of the Woods. Annual reporting metrics (June-October seasonmaxima and season-average bloom extent) consistent with those delivered operationally through EOLakeWatch [2] are also presented. On Lake Erie, the bloom of 2011 is the largest bloom of the time record, seen in imagery to expand well into the central basin into October that year and reaching a maximum extent of 5000 km 2 , in agreement with that reported by Stumpf, et al. [9]. In most years, bloom severity is explained well by phosphorus loads from the Maumee River in the west basin of Lake Erie [9,14,61]. The 2015 season saw the largest spring/early summer loading of total bioavailable phosphorus [13] and here is documented as resulting in the second-largest bloom of the time series. On Lake Winnipeg, bloom severity has been shown to depend on both nutrient loads and summer lake temperatures [19] and here blooms are shown at their peak to extend well over 20,000 km 2 , representing over 80% of the lake surface area. LoW has been under significant water quality pressures due to historical nutrient over-enrichment from industrial pulp and paper and domestic sewage facilities, but despite reductions in nutrient loads in the last few decades, severe annual cyanobacteria blooms persist [34,51]. Exacerbating factors such as warming temperatures and legacy nutrients stored in sediments are postulated to have, so far, limited a full recovery of the lake [62,63] although satellite-derived products are suggestive of recent declines in bloom intensity [2]. Moving forward, this consistent record of bloom spatial extent, among other bloom indices reported by EOLakeWatch, provides a valuable metric for monitoring and reporting on the effectiveness of implemented nutrient management practices and supporting adaptive management frameworks to reduce bloom occurrences on these three lakes.

Summary and Conclusions
Satellite Earth observations are proving essential for documenting temporal trends in inland water algal bloom occurrences. Reliable long-term records allow assessments of the roles of a myriad of climate and watershed stressors driving blooms, as well as further understanding the impact of blooms on biogeochemical processes and lake ecosystems. However, finite satellite mission life-cycles mean long-term datasets consist of data from multiple sensors which may or may not have consistent specifications in terms of spectral bands, spatial resolution, and revisit frequency. Sensor discontinuities and subsequent artifacts introduced when merging data from multiple sensors may therefore create uncertainty in metrics for algal bloom assessment and ultimately in estimated long-term trends. Efforts to account for sensor discontinuities in global ocean multi-sensor chlorophyll-a datasets (e.g., ESA's OC-CCI products) typically account for these differences using band-shifting and bias-correction techniques [64].
This study explored solutions to fill the observation gap between the MERIS and OLCI sensors to provide continuity in assessments of algal bloom spatial extent in three turbid eutrophic Canadian lakes. While MODIS Aqua provides reliable ocean color observations during the 2012-2015 gap between MERIS and its successor OLCI, MODIS does not provide the same spectral configuration to ensure consistency in chlorophyll retrievals using algorithm approaches requiring a waveband at~708 nm (e.g., MCI, CI, MPH). A NN solution is proposed, which uses 14 MODIS bands to simulate the required bands utilized in the MCI to derive a MODIS MCI-derived chlorophyll and subsequent algal bloom spatial extent metric consistent with MERIS and OLCI. The challenge of dealing with saturated MODIS pixels, known to obstruct retrievals over bright targets such as intense algal blooms, was addressed by a second NN model using those nine bands with the lowest probability of saturation. This added functionality allowed reliable bloom retrievals for pixels in MODIS scenes that would otherwise be rendered invalid due to saturation of R-NIR bands. Results combining the NN14B and NN9B for MCI-derived chlorophyll and subsequent bloom spatial extent, demonstrate reliable retrievals that allow for consistent multi-mission time-series analyses. The developed NNMCI resulted in excellent agreement between the MERIS/OLCI and MODIS bloom presence/absence classification required to document bloom spatial extent. Nevertheless, some differences remained in the presented time-series of seasonal bloom progression, which can be attributed to the differences in sensor spatial resolution, revisit frequencies, the level of image distortion due to viewing geometry, differences in observation times of the two sensors, as well as variability in phytoplankton community composition and resulting inherent optical properties. The coarser spatial resolution of MODIS results in reduced bloom-edge definition and contamination of nearshore pixels, whereas the higher revisit frequency results in a smoother time-series of bloom spatial extent when using rolling composite products.
The approach presented here provides a sound best-estimate of bloom conditions to deliver continuous observations from 2002 to 2020 that agree well with known environmental drivers on each lake. Results suggest that the agreement is equally strong between MODIS-MERIS and MODIS-OLCI, lending confidence to the assumption of continuity between the MERIS and OLCI sensor products. The same approach could be adapted to derive a SeaWiFS-MCI product to allow a retrospective assessment of bloom extent dating back to 1997. This method may be more broadly applicable to inland water and coastal ocean applications to combine data products from MERIS, MODIS, and OLCI by allowing continuity in algorithms requiring the 708 nm band (MCI, FLH, MPH, band ratios). The model may provide satisfactory results in other regions based on the existing training but it would probably be advantageous to re-train the NN if optical water types vary significantly from those reported here. Retraining the model for regional applications would reduce uncertainties due to the impact of, for example, regionally variable IOPs or complicating factors such as atmospheric conditions or adjacency effects, particularly if the application is over small water bodies or complex landscapes/hydrology where the differences between MODIS and MERIS/OLCI products may be more pronounced. Moving forward, it is anticipated that future generations of ocean color sensors will deliver improved continuity in sensor spectral configuration to enable consistent algorithm solutions for reliable contiguous observations of inland water quality.