Evaluation and Refinement of Chlorophyll-a Algorithms for High-Biomass Blooms in San Francisco Bay (USA)

: A massive bloom of the raphidophyte Heterosigma akashiwo occurred in summer 2022 in San Francisco Bay, causing widespread ecological impacts including events of low dissolved oxygen and mass fish kills. The rapidly evolving bloom required equally rapid management response, leading to the use of near-real-time image analysis of chlorophyll from the Ocean and Land Colour Instrument (OLCI) aboard Sentinel-3. Standard algorithms failed to adequately capture the bloom, signifying a need to refine a two-band algorithm developed for coastal and inland waters that relates the red-edge part of the remote sensing reflectance spectrum to chlorophyll. While the bloom was the initial motivation for optimizing this algorithm, an extensive dataset of in-water validation measurements from both bloom and non-bloom periods was used to evaluate performance over a range of concentrations and community composition. The modified red-edge algorithm with a simplified atmospheric correction scheme outperformed existing standard products across diverse conditions, and given the modest computational requirements, was found suitable for operational use and near-real-time product generation. The final version of the algorithm successfully minimizes error for non-bloom periods when chlorophyll a is typically <30 mg m − 3 , while also capturing bloom periods of >100 mg m − 3 chlorophyll a .


Introduction
San Francisco Bay (Bay) is the largest estuary system on the west coast of North America, draining about 40% of California's land area [1], while the surrounding population of ~7 million people has the potential to put considerable stress on the ecosystem through nutrient pollution [2].As a result, monitoring water quality is a vital aspect of Bay-wide management.One emerging concern is the proliferation of potentially harmful algal blooms (HABs).HABs have great potential to impact estuaries such as the Bay, because estuarine systems are both highly populated by humans and highly productive.Estuaries provide numerous valuable ecosystem functions and are sites of intensive aquaculture, subsistence, and commercial fisheries, all of which are threatened by HABs [3].
As described in [2], nutrient over-enrichment has led to ecosystem impairments in the majority of the world's estuaries [4][5][6][7].This impairment often includes the presence or expansion of HAB organisms responding directly to nutrient enrichment [8].The Bay has largely been resistant to proliferation of HABs, in part due to vigorous tidal mixing and flushing and light limitation that precludes biomass accumulation [2,9].Despite the historical resistance to large HAB events in the Bay, multiple HAB threats exist.In particular, the dinoflagellates Heterocapsa (fish-killing; [10]), and Akashiwo sanguinea (harmful to birds; [3,11]) have on occasion produced expansive blooms in the Bay [12].The raphidophyte Heterosigma akashiwo (fish-killing; exact toxic mechanism is presently unknown; [13]) is also present in the Bay but has not bloomed significantly since summer 2002 [14,15].Heterosigma akashiwo re-emerged in the Bay in summer 2022, and again in summer 2023, suggesting that recent conditions may be favoring more frequent blooms.

Remote Sensing of Coastal Waters
Several water quality indicators including turbidity, dissolved organic matter [16], and chlorophyll a (chla) directly affect the optical properties of water, and can therefore be estimated using satellites.However, satellite remote sensing has thus far generally been under-utilized by resource managers in the Bay region.This is due in part to the optical complexity of estuarine waters and the need for sensors with high spectral and spatial resolution to adequately disentangle the remote sensing signal and achieve quantifiable results (e.g., [16,17]).The most commonly used algorithms for retrieval of chla, including standard ocean color (OC) products from the National Aeronautics and Space Administration (NASA) and the European Space Agency (ESA), generally rely on the ratio of blue to green reflectance spectrum and were developed for the open ocean [18].Coastal waters exhibit unique challenges for these algorithms because of elevated levels of colored dissolved organic matter (CDOM) that absorbs strongly in blue reflectance bands but does not necessarily covary with chla.Additionally, elevated concentrations of suspended solids in coastal waters tend to elevate reflectance in all wavelengths, but particularly longer wavelengths, like green and red bands.
To address these issues for coastal and inland waters, chla algorithms of varying complexity (e.g., [17,[19][20][21][22]) have been developed, and several have been proposed specifically for the identification of harmful algal blooms (e.g., [23][24][25][26]).A recent analysis [22] directly compared multiple models grouped loosely into blue-green band ratios, such as the standard OC products from NASA and ESA and Red-Near-Infrared approaches from various authors.That analysis demonstrated that blue-green models perform poorly in turbid (e.g., coastal) waters, while the family of "red-edge" algorithms exhibits similar performance after tuning with in situ matchups.

Red-Edge Algorithms
One particularly effective method for retrieving chla in coastal waters relies on the use of the red-edge feature, a peak in reflectance occurring around 700 nm, which is insensitive to CDOM and slightly sensitive to suspended solids [20,21].The red part of the visible spectrum is also less sensitive to issues with atmospheric correction, making these algorithms easier to implement operationally while reducing potential sources of error.A significant disadvantage is that water strongly absorbs in the red and near-infrared part of the spectrum, so in the absence of significant chla biomass, there can be very little signal, resulting in poor performance below about 1 mg m −3 chla [20].
With this background in mind, we target one specific algorithm, referred to as Red Edge 10 (RE10), proposed by [20] and modified for Chesapeake Bay [21] to adapt for use in the Bay.This algorithm was chosen because it can be adjusted for the optical properties of specific blooms, it is computationally inexpensive, and it is amenable to operational use [21].For comparison, we provide results from the standard OC-type algorithms [18], and also evaluate three atmospheric correction methods commonly used for operational products: the Case 2 Regional CoastColour (C2RCC) atmospheric correction, as implemented in the Sentinel Applications Platform (SNAP) toolbox, the standard neural network approach used by ESA, and a simple black pixel assumption [27][28][29].We utilize ocean color data from the Ocean and Land Color Instrument (OLCI) aboard the Sentinel-3 series (S3A and S3B) of satellites.OLCI has reasonable spatial and temporal resolution (nominally 300 m, ~daily imagery from S3A and S3B), good signal-to-noise ratios, and appropriate bands for red-edge algorithms.Our primary goals are to demonstrate the efficacy of a modified RE10 algorithm tuned for the Bay and to highlight potential for operational use to support management decisions.

Study Area
The Bay can be simplified into a series of connected study areas: North Bay, comprising San Pablo and Suisun Bays to the north, Central Bay, which provides direct connection to the Pacific Ocean through the Golden Gate, and South Bay (Figure 1).The Bay is connected to the Sacramento-San Joaquin Delta (Delta) to the northeast, and the Pacific Ocean to the west.North Bay is characterized by a strong salinity gradient, and river water is the primary regulator of bay-wide salinity [1].North Bay is river-dominated, with salinities ranging from 0 to 15, while South Bay is a marine lagoon, with salinities ranging from 5 to 35.North Bay also tends to have the most riverine-derived suspended sediment.RE10 algorithm tuned for the Bay and to highlight potential for operational use to support management decisions.

Study Area
The Bay can be simplified into a series of connected study areas: North Bay, comprising San Pablo and Suisun Bays to the north, Central Bay, which provides direct connection to the Pacific Ocean through the Golden Gate, and South Bay (Figure 1).The Bay is connected to the Sacramento-San Joaquin Delta (Delta) to the northeast, and the Pacific Ocean to the west.North Bay is characterized by a strong salinity gradient, and river water is the primary regulator of bay-wide salinity [1].North Bay is river-dominated, with salinities ranging from 0 to 15, while South Bay is a marine lagoon, with salinities ranging from 5 to 35.North Bay also tends to have the most riverine-derived suspended sediment.The North, Central, and South Bay regions can be approximately defined by the location of bridges crossing the Bay; North Bay is the region of the Bay north of the Richmond Bridge, Central Bay is the region between the Richmond and Bay Bridges, and South Bay is the region south of the Bay Bridge.The Bay overall is a shallow wetland system, with most of the Bay being less than 3 m deep, except for a dredged shipping channel which runs through the middle of the Bay from north to south, reaching depths greater than 10 m.Based on field observations, microscopic identification, and remote sensing imagery, the onset of the 2022 H. akashiwo bloom occurred in Central Bay and spread into South and North Bay.The North, Central, and South Bay regions can be approximately defined by the location of bridges crossing the Bay; North Bay is the region of the Bay north of the Richmond Bridge, Central Bay is the region between the Richmond and Bay Bridges, and South Bay is the region south of the Bay Bridge.The Bay overall is a shallow wetland system, with most of the Bay being less than 3 m deep, except for a dredged shipping channel which runs through the middle of the Bay from north to south, reaching depths greater than 10 m.Based on field observations, microscopic identification, and remote sensing imagery, the onset of the 2022 H. akashiwo bloom occurred in Central Bay and spread into South and North Bay.

Data Overview
To evaluate whether remote sensing could provide reasonable estimates of chla during extreme events in an operational context, we focus on the H. akashiwo bloom that occurred from about August to September 2022.Satellite data were obtained from the European Space Agency as full-resolution L1A (no atmospheric correction) or L2 (atmospherically corrected, chla-derived) data products and processed in SNAP version 8.0 (see Supplementary Table S1).The ground-based data used for validation and algorithm tuning included data from discrete samples collected at the stations indicated in Figure 1, and high-resolution underway mapping data from the U.S. Geological Survey (USGS) collected for June, August, and September 2021, and July, August, and September 2022 [30,31].

Model Tuning Data
High-resolution mapping data include continuous (1 hz) underway chlorophyll fluorescence (fChl; YSI EXO2) collected from near the surface (~1 m depth) through a pressurecompensated manifold [26].The in situ fChl measurements were median-filtered (20 s) and compared to results from discrete laboratory measurements of >0.7 micron chla (collected from ~1 m depth with a submerged centrifugal pump; ref. [31]) (Supplementary Figure S1; Supplementary Table S2).High-resolution mapping data were further median-binned to 300 m spatial resolution, and the corresponding point location to the satellite overpass was used for spatial matchups with a ±15 min window applied.Pixels overlapping land in the satellite imagery were discarded, as well as pixels with Rayleigh-corrected top-ofatmosphere (TOA) reflectance (ρ, dimensionless) >0.5, indicative of pixels too bright to be water (e.g., physical structures).A total of 426 potential matchups from 2021-2022 were used in this analysis (see Supplementary Table S3).For any individual algorithm, some matchup points were removed due to various failures, typically related to atmospheric correction (see Supplementary Figure S2).

Model Validation Data
Water quality measurements used for qualitative model validation included fChl collected at regular point stations at 1 m depth for the USGS water quality cruises, corrected with discrete chla [32].Quality control data from the USGS are available for public use [32].

Chla Algorithms
Three atmospheric correction schemes and four chla algorithms were evaluated.For the atmospheric correction, the standard ESA processing was used for Level 2 water products (OL_2_WFR), which uses a neural network approach as part of the "Alternative Atmospheric Correction".Level 1 files (OL_1_EFR) for the same satellite overpasses were separately processed using the Case 2 Regional CoastColour (C2RCC) atmospheric correction as implemented in the Sentinel SNAP toolbox.SNAP was also used to provide Rayleigh-corrected TOA reflectances (ρ, dimensionless) for bands at 665, 708.75, and 885 nm from the C2RCC-processed imagery.The ρ(885) values were subtracted from ρ(665) and ρ(708.75)prior to use in the red-edge algorithms described below.Chla was extracted from ESA OL_2_WFR standard products using the OC4Me maximum band ratio semianalytical algorithm [33], and from ESA OL_1_EFR files using the C2RCC neural networkderived chla [28].Variations of the red-edge algorithms were used to estimate chla with the reflectance data from the C2RCC processor.
For the red-edge algorithms, following [20], we started with the following equation: where R 2 is the ratio of dimensionless water reflectance (λ 2 /λ 1 ), using the red bands λ 1 = 665 nm and λ 2 = 708.75nm, and the reflectances are based on the TOA, dark-pixel corrected data.We refer to this original formulation as RE10.Wynne et al. [21] adjusted RE10 by treating the offset correction (19.3 in RE10) as a tunable parameter and proposed 14.30 as optimal for Chesapeake Bay: [chla] = [35.75× R 2 − 14.30] 1.124  (2) We refer to this version of the algorithm as RE22.For this analysis, we treated the offset as a tunable parameter, and also considered the exponential term (1.124 in Equations ( 1) and ( 2)) as tunable based on the theoretical description provide by [20].In the original formulation, the power function was derived from the relationship between the phytoplankton-specific absorption coefficient, a*ph (m 2 (mg chl) −1 ), and the measured chla concentration (mg m −3 ).Those authors noted that the RE10 algorithm is sensitive to a*ph, and used representative spectra from various algal functional groups to derive the exponential term.
We directly measured a*ph using standard methods [34] for H. akashiwo cultures that were isolated from the bloom in 2022 (Figure 2).Lower a*ph values result in a proportionally higher exponential term.We therefore allowed the exponential to vary up to a value of 1.375.For the final algorithm, based on the optimization of multiple statistical criteria, we chose a switching version of the red-edge algorithm that applies different tuning parameters to chla <30 mg m −3 and >30 mg m −3 , with a smoothing function to avoid a discontinuous transition when 28 < chla < 32 mg m −3 : We refer to this algorithm as red-edge-San Francisco Bay (RE-SFB).

Algorithm Performance
The accuracy of each tested algorithm was evaluated by calculating a series of commonly used statistical methods.These include the coefficient of determination (R 2 ) for a linear fit, which also provides the slope and intercept for the relationship, plus five other performance parameters.The other methods are the root mean square error (RMSE), the mean absolute error (MAE), the mean bias (MBIAS), the median absolute error (MedAE), and the median bias (MedBIAS).These methods were selected following recommenda-

Algorithm Performance
The accuracy of each tested algorithm was evaluated by calculating a series of commonly used statistical methods.These include the coefficient of determination (R 2 ) for a linear fit, which also provides the slope and intercept for the relationship, plus five other performance parameters.The other methods are the root mean square error (RMSE), the mean absolute error (MAE), the mean bias (MBIAS), the median absolute error (MedAE), and the median bias (MedBIAS).These methods were selected following recommendations outlined by [35] to evaluate satellite and in situ matchups.As noted by [35], MAE is preferable to RMSE, as MAE is less sensitive to the distribution of error magnitudes and sample size; we report RMSE primarily for comparison with previously published methods.The MedAE and MedBIAS capture the typical error, while the MAE and MBIAS indicate whether a method has outlier errors.
The RMSE follows standard statistical formulation, while the MAE and MBIAS were derived following Seegers et al. 2018 [35]: where X and Y are the fitted and independent variables, respectively (i.e., the derived chla and the field validation chla values).MedBIAS and MedAE were calculated using the same equations as for MBIAS and MAE, with the substitution of median values for mean values.
Since the RE-SFB algorithm includes a 2-step fit, statistics were calculated for the ranges 0-30 mg m −3 and the full chla datasets.

Field chla and Spectral Absorption Data
The H. akashiwo bloom resulted in considerably higher chla levels than typically observed in SFB, which led to the development of the RE-SFB algorithm.Figure 2 provides log-transformed histogram distributions of data from 2021, when there were no blooms, and 2022, during the anomalous bloom event.
The data from 2022 show a clear bimodal distribution with "typical" chla values < 30 mg m −3 , similar to 2021, and strongly elevated chla concentrations associated with the bloom, exceeding 280 mg m −3 in surface waters.A value of 30 mg m −3 was therefore chosen as the breakpoint in the RE-SFB algorithm, allowing tuning for the unusually high bloom concentrations separate from the more typical chla concentrations recorded in the region.
The breakpoint (Equation ( 3)) allows tuning of the algorithm for high biomass that was dominated by a single species during these events.Gilerson et al. [20] optimized RE10 by adjusting the exponential term by ~6.5% from empirically fit data and provided confidence intervals that result in ~50% variability at high (>10 mg m −3 ) chla concentrations.Those authors noted that the exponential term is sensitive to the chlorophyll-specific absorption (a*ph) of the dominant biomass.Figure 3 shows the a*ph spectra from the 2022 H. akashiwo bloom versus the spectra used in the original RE10 algorithm (refer to [20] Figure 9 for more details).H. akashiwo cultures exhibited much lower a*ph (675) absorption, which would result in a proportionally higher exponential term when fitted to those data.We therefore used the published exponential term from RE10 below the transition, and the optimized exponential term for chla > 30 mg m −3 .

Matchup between Field chla and OC4Me, RE10, RE22, RE-SFB
The matchups between field data (2021-2022) for the Bay and previously published algorithms are shown in Figure 4 (observed versus modeled data) and Figure 5 (quantilequantile plots of the same data).Three of the algorithms, OC4Me, C2RCC, and RE10, exhibit saturation at high chla concentrations.RE22 is better but tends to underestimate values higher than ~30 mg m −3 chla.OC4Me shows the least coherence (more random distribution about the 1:1 line) compared to the other algorithms.

Matchup between Field chla and OC4Me, RE10, RE22, RE-SFB
The matchups between field data (2021-2022) for the Bay and previously published algorithms are shown in Figure 4 (observed versus modeled data) and Figure 5 (quantilequantile plots of the same data).Three of the algorithms, OC4Me, C2RCC, and RE10, exhibit saturation at high chla concentrations.RE22 is better but tends to underestimate values higher than ~30 mg m −3 chla.OC4Me shows the least coherence (more random distribution about the 1:1 line) compared to the other algorithms.
The quantile-quantile plots (Figure 5) present the data with observed and modeled data sorted from lowest to highest values.These plots provide a graphical representation of the fidelity of the satellite algorithms across the range of observed chla.Values falling below the 1:1 line represent underestimates, while values above the 1:1 line represent overestimates.This graphical representation highlights the saturation of OC4Me, C2RCC, and RE10, i.e., above a certain concentration, the data flatline.RE22 and C2RCC show the best performance with good coherence (close to the 1:1 line) but with overestimates at low concentrations and underestimates at high concentrations for RE22.C2RCC is robust at low concentrations (up to ~10 mg m −3 chla) but underestimates high concentrations, while OC4Me consistently underestimates chla across the range of observed values, and RE10 is reasonably accurate (but offset high) up to about 30 mg m −3 chla.The quantile-quantile plots (Figure 5) present the data with observed and modeled data sorted from lowest to highest values.These plots provide a graphical representation of the fidelity of the satellite algorithms across the range of observed chla.Values falling below the 1:1 line represent underestimates, while values above the 1:1 line represent overestimates.This graphical representation highlights the saturation of OC4Me, C2RCC, and RE10, i.e., above a certain concentration, the data flatline.RE22 and C2RCC show the best performance with good coherence (close to the 1:1 line) but with overestimates at low concentrations and underestimates at high concentrations for RE22.C2RCC is robust at low concentrations (up to ~10 mg m −3 chla) but underestimates high concentrations, while OC4Me consistently underestimates chla across the range of observed values, and RE10 is reasonably accurate (but offset high) up to about 30 mg m −3 chla.Figure 6 provides the same plots for the optimized RE-SFB algorithm.Compared to the other algorithms, the fit is qualitatively better, with no saturation at high chla and with the closest correspondence to the 1:1 line for the quantile-quantile plots.There is slight underestimation at the lowest chla values and slight to moderate overestimation for values > 30 mg m −3 .

Error Metrics for the Algorithms
Summary statistics (described in Section 2.5) are provided in Tables 1-3 for each algorithm, with Table 1 including the full dataset, Table 2 using data up to 30 mg m −3 , and Table 3 using data greater than 30 mg m −3 .For the full dataset, RE-SFB consistently outperforms the other algorithms, with low MBIAS and MedBIAS (values close to 1.0).The median bias describes the central tendency of the data, while the mean bias describes whether matchups with large errors are unevenly distributed.The MAE and MedAE provide metrics of overall algorithm performance (MAE for the full range, MedAE for typical chla concentrations).The MAE and MedAE can be interpreted as approximately the percent error associated with a particular algorithm; for example, the value of MAE = 1.469 for RE-SFB (Table 1) implies that a typical matchup has ~46.9% error when all matchups are considered.An algorithm used operationally should show good performance during typical (non-bloom) conditions as well as during bloom conditions.Table 2 shows that RE-SFB outperforms the other algorithms for this dataset over the typical range (Figure 2) of <30 mg m −3 chla.MBIAS and MedBIAS are close to unity, while the overall fit (MAE) improves slightly compared to the full dataset (Table 1).
Table 3 again shows the best performance at high chla using RE-SFB, with considerably improved RMSE and bias compared to the other algorithms.Overall, the error metrics provide quantitative evaluation of what can be observed qualitatively in Figures 4-6.The unoptimized algorithms suffer from saturation at high chla, and both over-and underestimation across the full range of values.RE-SFB minimizes overall error and performs well at both moderate and elevated chla.Further improvements could be made to optimize any specific subset of the error metrics, but we elected to generally optimize RE-SFB so that it is not over-fitted to a specific bloom event.

Discussion
Development of an accurate chla algorithm for San Francisco Bay is a critical requirement for event response to unusual bloom events, such as the H. akashiwo bloom of summer 2022, and routine monitoring of non-bloom conditions.More broadly, chla is commonly used to assess potential impairment of the system [2], and satellite imagery is routinely used for monitoring and assessment of other large estuaries (e.g., [36][37][38][39]).Given the optical complexity of estuaries, it is perhaps not surprising that red-edge algorithms generally outperform MBR algorithms such as OC4Me, which are optimized for the global oceans [18].One caution in using the red-edge family of algorithms is that they perform poorly below chla concentrations of ~1 mg m −3 chla [21].The lowest observed chla value used for this analysis was 2.73 mg m −3 , so no potential matchups were removed, but the RE-SFB algorithm does show increasing bias at low chla levels (Figure 6).
A second potential bias in RE-SFB is that red-edge algorithms are sensitive to the specific chla absorption term (a*ph).Gilerson 2010 [20] and Wynne et al. 2023 [21] used a fixed exponential term that was generally optimized for coastal organisms, while RE-SFB adjusted that term specifically for the H. akashiwo bloom.This may result in non-optimal performance if large blooms of other organisms occur in the future (e.g., Akashiwo sanguinea, [12]), although published absorption spectra suggest that A. sanguinea exhibits similar a*ph(670) as H. akashiwo [40].This potential bias is somewhat mitigated with the use of a two-step algorithm, as the values for chla < 30 mg m −3 are not optimized to a specific organism.

Discussion
Development of an accurate chla algorithm for San Francisco Bay is a critical requirement for event response to unusual bloom events, such as the H. akashiwo bloom of summer 2022, and routine monitoring of non-bloom conditions.More broadly, chla is commonly used to assess potential impairment of the system [2], and satellite imagery is routinely used for monitoring and assessment of other large estuaries (e.g., [36][37][38][39]).Given the optical complexity of estuaries, it is perhaps not surprising that red-edge algorithms generally outperform MBR algorithms such as OC4Me, which are optimized for the global oceans [18].One caution in using the red-edge family of algorithms is that they perform poorly below chla concentrations of ~1 mg m −3 chla [21].The lowest observed chla value used for this analysis was 2.73 mg m −3 , so no potential matchups were removed, but the RE-SFB algorithm does show increasing bias at low chla levels (Figure 6).
A second potential bias in RE-SFB is that red-edge algorithms are sensitive to the specific chla absorption term (a*ph).Gilerson 2010 [20] and Wynne et al. 2023 [21] used a fixed exponential term that was generally optimized for coastal organisms, while RE-SFB adjusted that term specifically for the H. akashiwo bloom.This may result in nonoptimal performance if large blooms of other organisms occur in the future (e.g., Akashiwo sanguinea, [12]), although published absorption spectra suggest that A. sanguinea exhibits similar a*ph(670) as H. akashiwo [40].This potential bias is somewhat mitigated with the use of a two-step algorithm, as the values for chla < 30 mg m −3 are not optimized to a specific organism.
A major difference between OC4Me, C2RCC, and the red-edge algorithms was the atmospheric correction.The standard ESA processing for OC4Me uses a neural network for the atmospheric correction but then switches to the standard MBR algorithm to derive chla (there is also a neural network-derived chla product which was not evaluated).C2RCC is different in that both the atmospheric correction and derived chla product are based on neural networks.Past evaluations of atmospheric correction schemes have demonstrated C2RCC to perform well in optically complex coastal waters [41], as well as in San Francisco Bay [42].It was therefore surprising that it performed relatively poorly in this analysis with the worst MAE values of the algorithms tested.All of the red-edge algorithms used a variation of the black pixel assumption [29], which subtracts the radiance from the darkest nearby pixel or subtracts the radiance at a presumed zero water-leaving radiance band in the near-infrared from the other bands.As noted by Wynne et al. 2022 [21], this atmospheric correction scheme tends to retrieve the most pixels in complex coastal and estuarine waters and is fairly insensitive to residual aerosol error in the red and near-infrared bands because ratios of spectrally close bands are used.This is evident in the variable number of successful matchups, with both OC4Me and C2RCC exhibiting failed matchups due to atmospheric correction issues (Table 1).A significant advantage for an operational algorithm is that this atmospheric correction is fairly simple compared to many other algorithms, reducing the processing burden and therefore speeding up data production.More computationally intense processing associated with typical Level 2 ocean color products often results in delays for operational products.
While RE-SFB was developed and tested with a relatively small dataset and two years of data, it is based on the successful application of red-edge algorithms to complex coastal waters.Future work could examine the sensitivity of the a*ph exponential term (Equation ( 3)) and/or addition of more matchup data.However, the results presented are robust with a reasonably sized matchup dataset (n = 425) across a wide dynamic range of chla concentrations.The final algorithm provides typical accuracy of ±~46-47% based on MAE.While this certainly leaves room for improvement, it is comparable to other operational algorithms in both the Chesapeake Bay (MAE for RE22 is 1.57; [21]), and for analysis of global datasets using standard algorithms (MAE of 1.62-2.05for eutrophic waters; [35]).

Conclusions
This analysis demonstrates the efficacy of red-edge algorithms for the detection of extreme bloom (chla) events in optically complex coastal waters.Tuning for San Francisco Bay improves performance compared to previous versions [20,21] and benefits from a simplified atmospheric correction scheme, allowing for rapid data processing and dissemination.The use of OLCI provides a reasonable compromise between satellite return rate (~daily) and spatial resolution (300 m) for San Francisco Bay.
While higher resolution sensors, such as the Operational Land Imager aboard Landsat-8/9 [43] and the Multi Spectral Instrument (MSI) aboard Sentinel-2, provide greatly improved spatial resolution [43,44], the return rate (~7-14 days) and lack of appropriate bands for OLCI preclude operational use for events such as the H. akashiwo bloom, which evolved over a period of days.Wynne et al. [21] argued for the inclusion of RE10 or similar algorithms as a standard product for the operational NOAA CoastWatch program; this analysis provides further evidence that the broad application of red-edge algorithms to complex coastal waters would greatly benefit end-users.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/rs16061103/s1, Figure S1: discrete vs. underway chla; Figure S2: map showing underway mapping stations used for matchups; Table S1: satellite imagery used in this analysis; Table S2: discrete matchup data; Table S3: High resolution in-water 300 m median binned matchup chla.Reference [45] is cited in the Supplemental Material.

Figure 1 .
Figure 1.Map of the study site, showing San Francisco Bay (Bay) and the three major basins as defined in the text.Major U.S. Geological Survey (USGS) water quality monitoring stations are shown as filled circles, while USGS underway mapping data are shown as blue segments.The direction of the upstream location Sacramento-San Joaquin Delta is indicated by the arrow labeled "Delta".

Figure 1 .
Figure 1.Map of the study site, showing San Francisco Bay (Bay) and the three major basins as defined in the text.Major U.S. Geological Survey (USGS) water quality monitoring stations are shown as filled circles, while USGS underway mapping data are shown as blue segments.The direction of the upstream location Sacramento-San Joaquin Delta is indicated by the arrow labeled "Delta".

Figure 2 .
Figure 2. Distribution of field chla data from 2021 and 2022 (plotted on a log10 x-axis).The cutoff of 30 mg m -3 is shown as a vertical dashed line.

Figure 2 .
Figure 2. Distribution of field chla data from 2021 and 2022 (plotted on a log10 x-axis).The cutoff of 30 mg m −3 is shown as a vertical dashed line.

Figure 3 .
Figure 3. Specific absorption coefficient spectra of cryptophytes (black), diatoms (red), and green algae (green) compared to the raphidophyte H. akashiwo (blue), the species that bloomed in the Bay in 2022.The absorption for H. akashiwo beyond 750 nm was zero or negative, resulting in a discontinuous line.

Figure 3 .
Figure 3. Specific absorption coefficient spectra of cryptophytes (black), diatoms (red), and green algae (green) compared to the raphidophyte H. akashiwo (blue), the species that bloomed in the Bay in 2022.The absorption for H. akashiwo beyond 750 nm was zero or negative, resulting in a discontinuous line.

Figure 6
Figure6provides the same plots for the optimized RE-SFB algorithm.Compared to the other algorithms, the fit is qualitatively better, with no saturation at high chla and with the closest correspondence to the 1:1 line for the quantile-quantile plots.There is slight underestimation at the lowest chla values and slight to moderate overestimation for values >30 mg m −3 .

Figure 6
Figure6provides the same plots for the optimized RE-SFB algorithm.Compared to the other algorithms, the fit is qualitatively better, with no saturation at high chla and with the closest correspondence to the 1:1 line for the quantile-quantile plots.There is slight underestimation at the lowest chla values and slight to moderate overestimation for values > 30 mg m −3 .

Figure 7
Figure7provides an example of the spatial variability of the bloom, as observed using RE22 and RE-SFB.The differences in spatial patterns are consistent with algorithm performance, i.e., RE-SFB reduces chla relative to RE10 and RE22 below 30 mg m −3 , while better representing extreme bloom concentrations for values > 30 mg m −3 .

Figure 7
Figure 7 provides an example of the spatial variability of the bloom, as observed using RE22 and RE-SFB.The differences in spatial patterns are consistent with algorithm performance, i.e., RE-SFB reduces chla relative to RE10 and RE22 below 30 mg m −3 , while better representing extreme bloom concentrations for values > 30 mg m −3 .

Figure 7 .
Figure 7. RE22 chla (A) for August 7, 2022, plotted in log chla, RE-SFB (B), and RE-SFB minus RE22 (C) plotted in a linear chla scale.RE-SFB (C) reduces the chla in North Bay, while providing greater delineation (e.g., nearshore intensification) of the initiation site.The bloom was first reported near the "bloom initiation" (B) site in July 2022.The very high chla in the far south is associated with the South Bay salt ponds.

Figure 7 .
Figure 7. RE22 chla (A) for August 7, 2022, plotted in log chla, RE-SFB (B), and RE-SFB minus RE22 (C) plotted in a linear chla scale.RE-SFB (C) reduces the chla in North Bay, while providing greater delineation (e.g., nearshore intensification) of the initiation site.The bloom was first reported near the "bloom initiation" (B) site in July 2022.The very high chla in the far south is associated with the South Bay salt ponds.

Table 1 .
Summary statistics for the five algorithms plotted in Figures4-6using the full dataset.
* Variable n is a result of (typically) atmospheric correction failure.

Table 2 .
Summary statistics for the five algorithms plotted in Figures 4-6 using the data < 30 mg m −3 .
* Variable n is a result of (typically) atmospheric correction failure.

Table 3 .
Summary statistics for the five algorithms plotted in Figures4-6using the data > 30 mg m −3 .
* Variable n is a result of (typically) atmospheric correction failure.