Multi-Spectral Remote Sensing of Phytoplankton Pigment Absorption Properties in Cyanobacteria Bloom Waters: A Regional Example in the Western Basin of Lake Erie

: Phytoplankton pigments absorb sunlight for photosynthesis, protect the chloroplast from damage caused by excess light energy, and inﬂuence the color of the water. Some pigments act as bio-markers and are important for separation of phytoplankton functional types. Among many efforts that have been made to obtain information on phytoplankton pigments from bio-optical properties, Gaussian curves decomposed from phytoplankton absorption spectrum have been used to represent the light absorption of different pigments. We incorporated the Gaussian scheme into a semi-analytical model and obtained the Gaussian curves from remote sensing reﬂectance. In this study, a series of sensitivity tests were conducted to explore the potential of obtaining the Gaussian curves from multi-spectral satellite remote sensing. Results showed that the Gaussian curves can be retrieved with 35% or less mean unbiased absolute percentage differences from MEdium Resolution Imaging Spectrometer (MERIS) and Moderate Resolution Imaging Spectroradiometer (MODIS)-like sensors. Further, using Lake Erie as an example, the spatial distribution of chlorophyll a and phycocyanin concentrations were obtained from the Gaussian curves and used as metrics for the spatial extent of an intense cyanobacterial bloom occurred in Lake Erie in 2014. The seasonal variations of Gaussian absorption properties in 2011 were further obtained from MERIS imagery. This study shows that it is feasible to obtain Gaussian curves from multi-spectral satellite remote sensing data, and the obtained chlorophyll a and phycocyanin concentrations from these Gaussian peak heights demonstrated potential application to monitor harmful algal blooms (HABs) and identiﬁcation of phytoplankton groups from satellite ocean color remote sensing semi-analytically. the MuPI model was validated for obtaining the peak heights of Gaussian curves ( a Gau ( λ )) from multi-spectral satellite remote sensing data. The model performance was validated in the retrieval accuracy of a Gau ( λ ) with R rs ( λ ) of six multi-spectral band conﬁgurations, and the spectral requirements were discussed. Less than 35% of mean unbiased absolute percentage differences were achieved for a Gau ( λ ) from R rs ( λ ) spectra with OLCI, MERIS and MODIS bands, and less than 45% for VIIRS, MSI and OLI bands. Using data from the western basin of Lake Erie as an example, the R rs ( λ ) obtained from HICO and MODIS satellites were validated with in situ data over cyanobacterial bloom waters in Lake Erie, and the spatial distributions of a Gau ( λ ) and the concentrations of chlorophyll a and phycocyanin were obtained, where the patches of cyanobacteria bloom were clearly presented. A seasonal distribution of pigment absorption coefﬁcients was obtained from MERIS seasonal composed imagery of 2011 for Lake Erie. These results demonstrate that, with MuPI, it is possible to analytically retrieve information of not only Chl-a, but also PC and potentially other pigments, which will signiﬁcantly enhance our capability to characterize and evaluate the status of phytoplankton blooms and discriminate phytoplankton groups using satellite ocean color remote sensing.


Introduction
Spectral light absorption and backscattering are the two inherent optical properties directly controlling the light field in water and further influencing water color. Optical properties of phytoplankton, specifically the absorption coefficients of the pigments inside them, play a key role in determining both the penetration of radiant energy in water and the use of this radiant energy for photosynthesis. These pigment absorption coefficients and their concentrations are important for understanding photosynthetic rate [1,2], identifying and quantifying phytoplankton functional groups [3] and determining size class distributions ( [4,5] and references therein). These properties of phytoplankton and their associated backscattering, along with colored dissolved organic matter absorption and non-algal particle absorption and scattering directly control the light field of water. Lake Erie dataset (LE): This dataset is composed of 36 Rrs(λ), ad(λ), ag(λ), and aph(λ) samples, and 20 Chl-a and PC concentrations. Cyanobacteria blooms dominated by Microcystis aeruginosa have been seasonally reoccurring in the western basin of Lake Erie since the mid-1990s [19]. The field measurements were collected during cruises from 20-23 August 2013 and 18-21 August 2014 in western Lake Erie at 16 and 20 stations, respectively, during the peak cyanobacteria bloom period [20,21], with chlorophyll concentrations ranging from >1 to 100s of mg•m −3 .
Since the ponds are too small for the MODIS and MERIS 1 km spatial resolution imagery, Lake Erie was selected as a local example for application of the method to HICO, MODIS and MERIS imageries to obtain pigment concentrations and Gaussian absorption distribution. The cyanobacteria bloom which occurred in Lake Erie in August 2014 was one of the most serious, causing disruption to the drinking water supply in Toledo, U.S.A. The 2014 cruise was conducted during this event and the sampling locations along with the Chl-a and PC concentrations at each station, are shown in Figure 1, where PC concentrations as high as ~200 mg•m −3 were noted in the southern portion of Lake Erie.

Remote Sensing Reflectance
The remote sensing reflectance of the MS dataset was acquired in the range of 400-900 nm with a sampling interval of 0.3 nm by deploying a dual sensor system with two inter-calibrated ocean optics spectroradiometers (Ocean Optics Inc., Dunedin, FL, USA). The in situ Rrs(λ) from LT and LE were measured with a hand-held ASD spectroradiometer (Analytical Spectral Device, Inc. Boulder, Colorado, U.S.A.), from 350 to 1050 nm at 1.4 nm intervals.

Absorption Coefficients
Surface water samples were collected and filtered immediately after the Rrs(λ) measurements and analyzed on the same day in the laboratory. The particulate absorption coefficient was quantified utilizing the transmittance or transmittance/reflectance method of Tassan and Ferrari [22] along with the NASA Ocean Optics Protocols, Revision 4, Volume IV protocol [23]. Percent transmittance and reflectance were measured for each sample filter to calculate the particulate absorption spectrum. Filters were then bleached, and the transmittance and reflectance measured to calculate the absorption coefficients of non-algal particles. The phytoplankton absorption coefficients (aph(λ)) at each station were calculated by subtraction of non-algal particle absorption coefficients from total particle absorption coefficients. The Gaussian peak heights (aGau(λ)) were obtained from aph(λ) using the Gaussian decomposition scheme described in Wang et al. [9], which used a least-square curve fitting technique in Matlab to obtain the aGau(λ). These aGau(λ) were used as ground truth to validate the outputs from Rrs(λ) inversions.

Remote Sensing Reflectance
The remote sensing reflectance of the MS dataset was acquired in the range of 400-900 nm with a sampling interval of 0.3 nm by deploying a dual sensor system with two inter-calibrated ocean optics spectroradiometers (Ocean Optics Inc., Dunedin, FL, USA). The in situ R rs (λ) from LT and LE were measured with a hand-held ASD spectroradiometer (Analytical Spectral Device, Inc., Boulder, CO, U.S.A.), from 350 to 1050 nm at 1.4 nm intervals.

Absorption Coefficients
Surface water samples were collected and filtered immediately after the R rs (λ) measurements and analyzed on the same day in the laboratory. The particulate absorption coefficient was quantified utilizing the transmittance or transmittance/reflectance method of Tassan and Ferrari [22] along with the NASA Ocean Optics Protocols, Revision 4, Volume IV protocol [23]. Percent transmittance and reflectance were measured for each sample filter to calculate the particulate absorption spectrum. Filters were then bleached, and the transmittance and reflectance measured to calculate the absorption coefficients of non-algal particles. The phytoplankton absorption coefficients (a ph (λ)) at each station were calculated by subtraction of non-algal particle absorption coefficients from total particle absorption coefficients. The Gaussian peak heights (a Gau (λ)) were obtained from a ph (λ) using the Gaussian decomposition scheme described in Wang et al. [9], which used a least-square curve fitting technique in Matlab to obtain the a Gau (λ). These a Gau (λ) were used as ground truth to validate the outputs from R rs (λ) inversions.
For the MS dataset, a Perkin Elmer Lambda 850 Spectrophotometer (Perkin Elmer Inc., Waltham, MA, USA) was used to measure the absorption coefficient of phytoplankton, detrital matter, and gelbstoff in the 380-750 nm range at 1 nm spectral resolution. Detailed information regarding the environmental characteristics and measurement methods can be found in Mishra et al. [18]. A Shimadzu UV-2401 Spectrophotometer was used for LT. More information about LT and details about water sample collection, measurement protocols, and processing methods can be found in Duan et al. [24] and Ma et al. [25]. Samples from LE were measured with a Perkin-Elmer Lambda 35 UV/Vis Dual-beam Spectrophotometer from 300-800 nm at 1 nm resolution as described in Mouw et al. [26].

Pigment Concentrations and Group Composition
Water samples for Chl-a and PC concentration from Lake Erie were filtered through 0.7 and 0.2 µm glass fiber filters (Whatman GF/F, 25-mm and 47-mm diameter), respectively. The Chl-a was estimated using a Turner Designs 10-AU Fluorometer in the laboratory following the NASA Ocean Optics Protocols, Revision 5, Volume V [23]. Following NASA protocols [27], filters for phycocyanin concentration determination were extracted in phosphate buffer (Ricca Chemical, Arlington, TX, U.S.A. pH 6.8) using two freeze-thaw cycles, followed by sonication. Relative fluorescence was measured on a Turner Aquafluor fluorometer and converted to PC using a series of dilutions of a commercial standard (Sigma-Aldrich, Saint Loise, MO, USA) [21]. Phytoplankton populations for the Lake Erie stations in 2013 were also used in this study which were identified and counted using standard light microscopy [21].

Satellite Imagery
The Hyperspectral Imager for the Coastal Ocean (HICO) was the first space-borne hyperspectral imaging spectrometer designed to sample the coastal ocean. HICO covered selected coastal regions at 90 m spatial resolution with full spectral coverage (380 to 960 nm sampled at 5.73 nm intervals). HICO imagery of the same time frame as in situ measurements in Lake Erie in 2014 was downloaded from the Oregon State University website (http://hico.coas.oregonstate.edu/), and following the steps provided on the website the level 1B (L1B) imagery was atmospherically corrected to level 2 (L2) data using the online version of Tafkaa_6s model, followed by georectification. The HICO R rs (λ) data have about the same spectral resolution as the in situ measured reflectance (5.73 nm for HICO and 5 nm for original R rs (λ)). These data were used as an example for MuPI application in hyperspectral satellite imagery.
The Moderate Resolution Imaging Spectroradiometer onboard the Aqua satellite (MODIS-Aqua) launched in 2002 is one of the contemporary satellite ocean color missions. The L1B imagery with 1 km spatial resolution was downloaded from the National Aeronautics and Space Administration (NASA) website (https://ladsweb.nascom.nasa.gov/). The L2 imagery (with 748 nm included in R rs (λ)) from the same time frame as in situ measurements in Lake Erie in 2014 was obtained using the data processing L2gen from the SeaWiFS Data Analysis System (SeaDAS) software using the standard near infrared (NIR) scheme [28][29][30] for atmospheric correction. MODIS imagery was used as an example of MuPI application to multi-spectral satellite remote sensing data.
MEdium Resolution Imaging Spectrometer (MERIS) on board the European Space Agency's Envisat platform, was launched in 2002 and ended its mission in May 2012. The L3 seasonal composed imagery and L2 reduced resolution imagery (1 km spatial resolution) was obtained from NASA ocean color website (https://oceancolor.gsfc.nasa.gov/). Lacking coincidence with in situ measurements, the same day imagery of MODIS and MERIS on 3 September 2011 was used for the comparison of the retrieving results. The MERIS seasonal composed imagery in Lake Erie was used to obtain a general seasonal variation of Gaussian absorption coefficients.

MuPI Model
To obtain a Gau (λ) from remote sensing reflectance, a multi-pigment inversion method [9] was used. The corresponding functions used in this method are included in Table 1. In this method, the Gaussian scheme, which uses 13 Gaussian curves to reconstruct the spectrum of phytoplankton absorption coefficient, was incorporated into the relationship in which R rs (λ) is a function of total spectral absorption (a(λ)) and backscattering (b b (λ)) coefficients [31,32]. With an R rs (λ) spectrum as the only input, the function was solved with a spectral optimization method which returns the values of unknowns (a Gau (λ), absorption of non-algal particles and gelbstoff: a dg (λ), and backscattering coefficient of particles, b bp (λ)), that minimizes the difference between the estimated and input R rs (λ) spectra. A summarized flowchart of the method is shown in Figure 2. The parameters used for the 13 Gaussian curves in this study are shown in Table 2, which includes the peak locations, widths and empirical relationships between the peak heights. Table 1. Summary of the functions and symbols used in multi-pigment inversion model (MuPI).

Index
Function Description References Remote sensing reflectance as a function of a(λ) and b b (λ) Cost function for spectral optimization " R rs (λ) and R rs (λ): Estimated and measured remote sensing reflectance [36][37][38][39] Remote Sens.  Utilizing satellite or in situ measured remote sensing reflectance (Rrs(λ)), a spectral optimization method is used to minimize the differences between estimated and measured Rrs(λ) spectra, output values for Gaussian curves (aGau(λ)), non-algal particles and gelbstoff (adg(λ)) and particulate attenuation coefficients (cp) are obtained.

Gaussian Parameters
As with other semi-analytical algorithms (e.g., HOPE [38]; GSM [40], GIOP [36]), aGau(λ) in MuPI were retrieved using a spectral optimization method. The Gaussian scheme used in MuPI plays a critical role in aGau(λ) estimation, and the Gaussian peak heights, locations, and widths are very important factors in the Gaussian scheme. To determine the peak locations and widths of the Gaussian curves, the peaks and troughs of aph(λ) spectra were used as the initial inputs in the optimization process. As shown in Table 2 and Figure 3, a total of 13 Gaussian curves, corresponding to peaks or shoulders of measured aph(λ) spectra, were used to represent the absorption of phytoplankton pigments at different wavelengths: chlorophyll a: 386.6, 414, 435, 677, and 693.5 nm; chlorophyll b: 653 nm; chlorophyll c: 451.7, 584.4, and 636 nm; carotenoids: 484 and 515.6 nm; phycoerythrin: 548.8 nm; and phycocyanin: 617.6 nm. These peak locations and widths (the second Figure 2. Flowchart of the inversion method. Utilizing satellite or in situ measured remote sensing reflectance (R rs (λ)), a spectral optimization method is used to minimize the differences between estimated and measured R rs (λ) spectra, output values for Gaussian curves (a Gau (λ)), non-algal particles and gelbstoff (a dg (λ)) and particulate attenuation coefficients (c p ) are obtained.

Gaussian Parameters
As with other semi-analytical algorithms (e.g., HOPE [38]; GSM [40], GIOP [36]), a Gau (λ) in MuPI were retrieved using a spectral optimization method. The Gaussian scheme used in MuPI plays a critical role in a Gau (λ) estimation, and the Gaussian peak heights, locations, and widths are very important factors in the Gaussian scheme. To determine the peak locations and widths of the Gaussian curves, the peaks and troughs of a ph (λ) spectra were used as the initial inputs in the optimization process. As shown in Table 2 and Figure 3, a total of 13 Gaussian curves, corresponding to peaks or shoulders of measured a ph (λ) spectra, were used to represent the absorption of phytoplankton pigments at different wavelengths: chlorophyll a: 386.6, 414, 435, 677, and 693.5 nm; chlorophyll b: 653 nm; chlorophyll c: 451.7, 584.4, and 636 nm; carotenoids: 484 and 515.6 nm; phycoerythrin: 548.8 nm; and phycocyanin: 617.6 nm. These peak locations and widths (the second and third columns of Table 2) were potentially influenced by pigment composition. Upon further analysis, we found that using fixed values for peak locations and widths in Table 2 cause no more than 5% differences in a ph (λ) estimation versus when these peak locations and widths were set as free variables. Comparatively, the peak height of Gaussian curves influenced by pigment concentrations play a critical role in a ph (λ) estimation. Thus, in the MuPI method, the peak locations and widths presented in Table 2 were fixed as constants, and the peak heights of these 13 Gaussian curves, containing bio-optical information of phytoplankton pigments, are the parameters to be obtained from remote sensing reflectance.
Remote Sens. 2017, 9,1309 7 of 21 and third columns of Table 2) were potentially influenced by pigment composition. Upon further analysis, we found that using fixed values for peak locations and widths in Table 2 cause no more than 5% differences in aph(λ) estimation versus when these peak locations and widths were set as free variables. Comparatively, the peak height of Gaussian curves influenced by pigment concentrations play a critical role in aph(λ) estimation. Thus, in the MuPI method, the peak locations and widths presented in Table 2 were fixed as constants, and the peak heights of these 13 Gaussian curves, containing bio-optical information of phytoplankton pigments, are the parameters to be obtained from remote sensing reflectance. It is impractical to retrieve 13 peaks simultaneously from an Rrs(λ) spectrum via spectral optimization because more variables in the function usually means much higher uncertainties in the outputs. In determining the number of independent aGau(λ) to be retrieved from the MuPI model, a series of tests were conducted, and the mean, median, maximum and minimum unbiased absolute percentage difference (UAPDs, Equation (1)) were compared for different datasets when the number of independent aGau(λ) vary among 13, 5 (Peak 2, 5, 7, 9 and 12), 3 (Peak 3, 5 and 9) and 2 (Peak 3 and Peak 9). The combinations of independent aGau(λ) were determined based on the highest correlation coefficients (R 2 ) with other aGau(λ). As expected, an obvious decrease in the percentage differences are noticed when the number of independent aGau(λ) are reduced from 13 to 2 for all three in situ datasets, as shown in Table 3, although the retrieval accuracy varies from dataset to dataset because of the different contributions of phytoplankton to the total absorption (100% × aph(λ)/a(λ), Figure 4). Thus the height of two peaks, here aGau(435) and aGau(617.6), instead of 13, were retrieved as free variables from an Rrs(λ) spectrum to achieve the highest retrieval accuracy from MuPI. This also controls the number of unknowns in the model to maximize its application to different satellite remote sensing data and improve computation efficiencies.

( )
where Ŝ is the estimated value and S is the measured value. (C) Correlation matrix shows the co-variances among peaks (R 2 ).
It is impractical to retrieve 13 peaks simultaneously from an R rs (λ) spectrum via spectral optimization because more variables in the function usually means much higher uncertainties in the outputs. In determining the number of independent a Gau (λ) to be retrieved from the MuPI model, a series of tests were conducted, and the mean, median, maximum and minimum unbiased absolute percentage difference (UAPDs, Equation (1)) were compared for different datasets when the number of independent a Gau (λ) vary among 13, 5 (Peak 2, 5, 7, 9 and 12), 3 (Peak 3, 5 and 9) and 2 (Peak 3 and 9). The combinations of independent a Gau (λ) were determined based on the highest correlation coefficients (R 2 ) with other a Gau (λ). As expected, an obvious decrease in the percentage differences are noticed when the number of independent a Gau (λ) are reduced from 13 to 2 for all three in situ datasets, as shown in Table 3, although the retrieval accuracy varies from dataset to dataset because of the different contributions of phytoplankton to the total absorption (100% × a ph (λ)/a(λ), Figure 4). Thus the height of two peaks, here a Gau (435) and a Gau (617.6), instead of 13, were retrieved as free variables from an R rs (λ) spectrum to achieve the highest retrieval accuracy from MuPI. This also controls the number of unknowns in the model to maximize its application to different satellite remote sensing data and improve computation efficiencies.
Remote Sens. 2017, 9, 1309 8 of 22 whereŜ is the estimated value and S is the measured value.   Once aGau(435) and aGau(617.6) were obtained from Rrs(λ), the other peaks could be estimated from these two values through a set of relationships shown in Table 2. Whether these relationships can represent data from different regions over different seasons presents uncertainty and a potential error source for aGau(λ) retrieval from MuPI. In our analysis of three datasets, we found that the empirical relationships among aGau(λ) to be very stable. The data from different regions over different seasons follow the trend predicted by the regression line based on one dataset. Figure 5 shows two examples in which all the data points from different datasets scattered tightly around the regression lines. Once a Gau (435) and a Gau (617.6) were obtained from R rs (λ), the other peaks could be estimated from these two values through a set of relationships shown in Table 2. Whether these relationships can represent data from different regions over different seasons presents uncertainty and a potential error source for a Gau (λ) retrieval from MuPI. In our analysis of three datasets, we found that the empirical relationships among a Gau (λ) to be very stable. The data from different regions over different seasons follow the trend predicted by the regression line based on one dataset. Figure 5 shows two examples in which all the data points from different datasets scattered tightly around the regression lines. In contrast with the parameters determined in Wang et al. [9], to reduce potential errors updates were made in this study regarding the independent peak heights and their empirical relationships with other peaks. These modifications were made for the following reasons: (1) when further checking the co-variances among aGau(λ) from the MS and LT datasets, it was found that, sometimes, power-law relationships among the peak heights were more robust and representative than the previously adopted linear relationships ( Figure 3); and (2) to make the retrievals more meaningful, aGau(435) and aGau(617.6) instead of aGau(515.6) and aGau(584.4) were used as the two independent variables. This change was made because aGau(435) showed a robust relationship with chlorophyll a concentration, and aGau(617.6) with phycocyanin. Further, their absorption coefficients or concentrations directly retrieved from Rrs(λ) can be more widely used in the estimation of phytoplankton biomass, productivity and cyanobacteria bloom detection and monitoring. Further, because of the high co-variation among the peak heights (R 2 > 0.92), the selection of different peaks as free variables caused no more than 10% difference in the retrieved results.

aGau(λ) Spectra Shape
In exploring the aGau(λ) application in HABs detection, a spectral shape analysis was conducted, in which aGau (435) where the index N represents the total number of wavelengths, and λi corresponds to the wavelengths of 435, 584.4 and 617.6 nm. The naGau(λ) spectra vary over the range between 0 and 1, while it retains the "shapes" pertaining to the original aGau(λ) spectra, i.e., the band ratios of naGau(λ) retain the same as aGau(λ).

Multi-Spectral Rrs(λ)
To characterize the model performance with multi-spectral Rrs(λ) data as inputs, a set of different spectral band configurations were used (Table 4). This included the existing sensors of Ocean and Land Color Instrument (OLCI), MERIS, MODIS, Visible Infrared Imaging Radiometer Suite (VIIRS), Sentinal-2 Multispectral Instrument (MSI), Landsat-8 Operational Land Imager (OLI) and an Allband configuration that combined all of the existing spectral bands of the different sensors. The inclusion of the All-band configuration was intended to test the sensitivity of the algorithm to specific spectral bands of the existing sensors. The Rrs(λ) of these multiple spectral bands were obtained by applying their spectral response functions (RSF) (https://oceancolor.gsfc.nasa.gov/docs/rsr/ In contrast with the parameters determined in Wang et al. [9], to reduce potential errors updates were made in this study regarding the independent peak heights and their empirical relationships with other peaks. These modifications were made for the following reasons: (1) when further checking the co-variances among a Gau (λ) from the MS and LT datasets, it was found that, sometimes, power-law relationships among the peak heights were more robust and representative than the previously adopted linear relationships ( Figure 3); and (2) to make the retrievals more meaningful, a Gau (435) and a Gau (617.6) instead of a Gau (515.6) and a Gau (584.4) were used as the two independent variables. This change was made because a Gau (435) showed a robust relationship with chlorophyll a concentration, and a Gau (617.6) with phycocyanin. Further, their absorption coefficients or concentrations directly retrieved from R rs (λ) can be more widely used in the estimation of phytoplankton biomass, productivity and cyanobacteria bloom detection and monitoring. Further, because of the high co-variation among the peak heights (R 2 > 0.92), the selection of different peaks as free variables caused no more than 10% difference in the retrieved results.

a Gau (λ) Spectra Shape
In exploring the a Gau (λ) application in HABs detection, a spectral shape analysis was conducted, in which a Gau (435), a Gau (584.4), and a Gau (617.6) were normalized by their respective root of sum of squares, where the index N represents the total number of wavelengths, and λ i corresponds to the wavelengths of 435, 584.4 and 617.6 nm. The na Gau (λ) spectra vary over the range between 0 and 1, while it retains the "shapes" pertaining to the original a Gau (λ) spectra, i.e., the band ratios of na Gau (λ) retain the same as a Gau (λ).

Multi-Spectral R rs (λ)
To characterize the model performance with multi-spectral R rs (λ) data as inputs, a set of different spectral band configurations were used (Table 4). This included the existing sensors of Ocean and Land Color Instrument (OLCI), MERIS, MODIS, Visible Infrared Imaging Radiometer Suite (VIIRS), Sentinal-2 Multispectral Instrument (MSI), Landsat-8 Operational Land Imager (OLI) and an All-band configuration that combined all of the existing spectral bands of the different sensors. The inclusion of the All-band configuration was intended to test the sensitivity of the algorithm to specific spectral bands of the existing sensors. The R rs (λ) of these multiple spectral bands were obtained by applying their spectral response functions (RSF) (https://oceancolor.gsfc.nasa.gov/docs/rsr/rsr_ tables/, https://earth.esa.int/web/sentinel/user-guides/sentinel-2-msi/document-library/-/asset_ publisher/Wk0TKajiISaR/content/sentinel-2a-spectral-responses) to the in situ hyperspectral data.

a Gau (λ) from Multi-Spectral R rs (λ)
The MuPI performance was validated for multi-spectral R rs (λ) at six different band configurations. Table 5 shows the mean UAPD of 13 a Gau (λ) for all three in situ datasets with the spectral bands of different sensors, respectively. Table 6 provides detailed information for LE 2014 data regarding the variation of the mean UAPD in the a Gau (λ) retrieval with R rs (λ) at different spectral configurations. OLCI and MERIS produced similar mean UAPD, which is around 35% for all three datasets and 28% for the LE 2014 data. Compared with MODIS, VIIRS showed slightly higher mean UAPD (All data: VIIRS: 36%, MODIS: 34%; LE 2014: VIIRS: ≤40%, MODIS: ≤37%), which is potentially caused by the lack of a spectral band around 665 nm in VIIRS. MSI and OLI produced the highest uncertainties for a Gau (λ) at wavelength longer than 550 nm, but, surprisingly, a ≤50% mean UAPD was achieved for the four spectral bands of the Landsat-8 OLI sensor with the data used here, which demonstrates the potential of applying MuPI to these high spatial resolution satellite remote sensing data.
By removing and adding specific bands such as 645 and 748 nm for MODIS, and 754 nm in MERIS, differences in the a Gau (λ) accuracy were observed. A 10% increase in mean UAPD was observed when the 748 nm band was removed from MODIS, and a 3% decrease in the mean UAPD when adding the 745 nm band to MERIS. However, with the existence of the 748 nm band, the adding and removing of the 645 nm band did not show a large influence (<3% mean UAPD variation). In addition, due to the existence of the 746 nm band in VIIRS, reasonable results were obtained in a Gau (λ) retrievals.
Such a result suggests, at least for this dataset of bloom waters, with MuPI there is little impact on the retrieval of a Gau (λ) when hyperspectral R rs (λ) is degraded to multiple-spectral measurements such as MERIS and MODIS-like sensors. This is consistent with conclusions from previous studies that R rs (λ) data do not need to be as fine as 1 nm in spectral resolution to obtain reliable retrievals of inherent optical properties [41][42][43].

Chl-a and PC from a Gau (λ) for Lake Erie
As one of the most widely used indices for phytoplankton, concentration and light absorption coefficients of chlorophyll a (Chl-a) have been the focus of many studies [33,[44][45][46][47]. Previous studies [9,10,17] have indicated that some Gaussian peak heights (a Gau (λ)) obtained from a ph (λ), such as peaks around 390, 413, 435 and 675 nm, represent the absorption properties of chlorophyll a. Phycocyanin (PC), the bio-marker of the blue-green cyanobacteria, is an important indicator for cyanobacteria biomass [6,7,48]. Based on the regression analysis between a Gau (λ), Chl-a and PC from the LE dataset, the Gaussian peaks at 386.6, 414, 435 and 677 nm showed high correlation coefficients with Chl-a concentration with R 2 of 0.95, 0.96, 0.97, and 0.98, respectively; and the Gaussian peaks at 617.6 nm showed a 0.93 correlation coefficient with PC concentration. The empirical relationships for Chl-a and PC estimation from a Gau (677) and a Gau (617.6) were obtained, as shown in Figure 7. The above results further highlight the values of using a Gau (λ) as a proxy to obtain pigment absorption or concentration, which may be used to map cyanobacteria bloom waters from ocean color imagery.

Chl-a and PC from aGau(λ) for Lake Erie
As one of the most widely used indices for phytoplankton, concentration and light absorption coefficients of chlorophyll a (Chl-a) have been the focus of many studies [33,[44][45][46][47]. Previous studies [9,10,17] have indicated that some Gaussian peak heights (aGau(λ)) obtained from aph(λ), such as peaks around 390, 413, 435 and 675 nm, represent the absorption properties of chlorophyll a. Phycocyanin (PC), the bio-marker of the blue-green cyanobacteria, is an important indicator for cyanobacteria biomass [6,7,48]. Based on the regression analysis between aGau(λ), Chl-a and PC from the LE dataset, the Gaussian peaks at 386.6, 414, 435 and 677 nm showed high correlation coefficients with Chl-a concentration with R 2 of 0.95, 0.96, 0.97, and 0.98, respectively; and the Gaussian peaks at 617.6 nm showed a 0.93 correlation coefficient with PC concentration. The empirical relationships for Chl-a and PC estimation from aGau(677) and aGau(617.6) were obtained, as shown in Figure 7. The above results further highlight the values of using aGau(λ) as a proxy to obtain pigment absorption or concentration, which may be used to map cyanobacteria bloom waters from ocean color imagery.

aGau(λ) versus Group Composition
In analysis of the aGau(λ) and cell counts from LE 2013 measurements, we found the spectral shapes formed by aGau(435), aGau(584.4), and aGau(617.6) vary with the composition of cyanobacteria species at different locations in western basin of Lake Erie, as shown in Figure 8. As shown in Table  2, the aGau(435), aGau(584. 4), and aGau(617.6) contain absorption properties of pigments Chl-a, PE, and PC, respectively. The variation of pigment compositions, especially the different intracellular Chla:PE:PC ratios of the cyanobacteria at the species level is the main reason of the spectral shape variation. This can potentially be used in separating different cyanobacteria species in the bloom waters.

a Gau (λ) versus Group Composition
In analysis of the a Gau (λ) and cell counts from LE 2013 measurements, we found the spectral shapes formed by a Gau (435), a Gau (584.4), and a Gau (617.6) vary with the composition of cyanobacteria species at different locations in western basin of Lake Erie, as shown in Figure 8. As shown in Table 2, the a Gau (435), a Gau (584.4), and a Gau (617.6) contain absorption properties of pigments Chl-a, PE, and PC, respectively. The variation of pigment compositions, especially the different intracellular Chl-a:PE:PC ratios of the cyanobacteria at the species level is the main reason of the spectral shape variation. This can potentially be used in separating different cyanobacteria species in the bloom waters.

Validation of Satellite Remote Sensing Data
Before applying MuPI to satellite imagery for aGau(λ) retrieval, the Rrs(λ) data from HICO and MODIS imagery for Lake Erie were first assessed with in situ measurements.
Three criteria were employed to find matchups: (1) Within ±2 days; (2) median of a 3 × 3 box, with no masks of land or clouds; and (3) coefficients of variance smaller than 0.15. The spectral response function was applied to the in situ Rrs(λ) spectrum before comparing it with satellite data. Most of the bands have a mean difference within 35% for Rrs(λ) for both sensors. At the shorter wavelengths (≤500 nm for HICO, and 412 and 443 nm for MODIS), the mean UAPD was as high as 65% and some satellite Rrs(λ) were even negative values because of the poor atmospheric correction for optically complex inland waters [49,50]. For MODIS, the mean UAPD of Rrs(λ) at 748 nm was 52%, possibly due to low Rrs(λ) values at this band and the influence from residual and likely uncorrected oxygen and water vapor absorption in the atmosphere.
Since the main focus of this work is the retrieval of aGau(λ) from Rrs(λ), we took advantage of the aGau(λ) information from the Rrs(λ) spectrum in the longer wavelengths (>500 nm for HICO and >480 nm for MODIS) in the model application, to minimize the influence of ineffective atmospheric correction on the shorter bands of satellite Rrs(λ). To validate this adjustment, MuPI was first applied to in situ measured Rrs(λ) with HICO and MODIS bands but without data from the shorter wavelengths (≤500 nm for HICO and 412 and 443 nm for MODIS). The retrieved aGau(λ) from such Rrs(λ) agreed quite well with aGau(λ) from in situ aph(λ) decomposition, with the mean UAPD ≤38% for HICO, and ≤48% for MODIS bands ( Table 7). The estimated Chl-a concentration has a mean UAPD of 28% which is better than the results from the standard products [45] and those shown in Pan et al. [8], and the mean UAPD for the estimated PC concentration is 32% which is consistent or even better than that reported in the literature [6,7,48]. Figure 9 presents one match-up of satellite and in situ Rrs(λ) spectra and the estimated Rrs(λ) spectra from MuPI for both HICO and MODIS, as well as, the

Validation of Satellite Remote Sensing Data
Before applying MuPI to satellite imagery for a Gau (λ) retrieval, the R rs (λ) data from HICO and MODIS imagery for Lake Erie were first assessed with in situ measurements.
Three criteria were employed to find matchups: (1) Within ±2 days; (2) median of a 3 × 3 box, with no masks of land or clouds; and (3) coefficients of variance smaller than 0.15. The spectral response function was applied to the in situ R rs (λ) spectrum before comparing it with satellite data. Most of the bands have a mean difference within 35% for R rs (λ) for both sensors. At the shorter wavelengths (≤500 nm for HICO, and 412 and 443 nm for MODIS), the mean UAPD was as high as 65% and some satellite R rs (λ) were even negative values because of the poor atmospheric correction for optically complex inland waters [49,50]. For MODIS, the mean UAPD of R rs (λ) at 748 nm was 52%, possibly due to low R rs (λ) values at this band and the influence from residual and likely uncorrected oxygen and water vapor absorption in the atmosphere.
Since the main focus of this work is the retrieval of a Gau (λ) from R rs (λ), we took advantage of the a Gau (λ) information from the R rs (λ) spectrum in the longer wavelengths (>500 nm for HICO and >480 nm for MODIS) in the model application, to minimize the influence of ineffective atmospheric correction on the shorter bands of satellite R rs (λ). To validate this adjustment, MuPI was first applied to in situ measured R rs (λ) with HICO and MODIS bands but without data from the shorter wavelengths (≤500 nm for HICO and 412 and 443 nm for MODIS). The retrieved a Gau (λ) from such R rs (λ) agreed quite well with a Gau (λ) from in situ a ph (λ) decomposition, with the mean UAPD ≤38% for HICO, and ≤48% for MODIS bands ( Table 7). The estimated Chl-a concentration has a mean UAPD of 28% which is better than the results from the standard products [45] and those shown in Pan et al. [8], and the mean UAPD for the estimated PC concentration is 32% which is consistent or even better than that reported in the literature [6,7,48]. Figure 9 presents one match-up of satellite and in situ R rs (λ) spectra and the estimated R rs (λ) spectra from MuPI for both HICO and MODIS, as well as, the 13

Chl-a and PC from HICO and MODIS Imagery
We further explored the aGau(λ) distribution obtained from HICO and MODIS of the western basin of Lake Erie. The MuPI scheme was applied without using Rrs(λ) data at the shorter spectral Figure 9. (A) Example spectra of in situ and HICO measured (R rs (λ)), and their corresponding estimated spectra from MuPI; (B) example spectra of in situ and MODIS measured (R rs (λ)), and their corresponding estimated spectra from MuPI; and (C,D) the 13 a Gau (λ) derived from in situ, HICO and MODIS R rs (λ) were plotting against the a Gau (λ) from decomposition of in situ measured a ph (λ).

Chl-a and PC from HICO and MODIS Imagery
We further explored the a Gau (λ) distribution obtained from HICO and MODIS of the western basin of Lake Erie. The MuPI scheme was applied without using R rs (λ) data at the shorter spectral bands (those <500 nm for HICO and <480 nm for MODIS). The power-law relationships obtained in Section 3.4 between a Gau (677) and Chl-a, and a Gau (617.6) and PC concentration were applied to the obtained a Gau (λ) images to map the spatial distribution of Chl-a and PC concentration ( Figure 10).
The estimated spatial distributions of Chl-a and PC from MODIS showed a similar pattern with those from HICO. A two-day difference exists between the HICO (15 August 2014) and MODIS (13 August 2014) observations due to the availability of satellite image, which explains the slightly different locations of the high biomass patches as shown in Figure 9. The non-value patches with in HICO image was a result of the poor R rs (λ) quality due to the failing of the atmospheric correction.

MODIS and MERIS Imagery over Lake Erie
MERIS imagery did not coincide with the in situ observations used in this study. Thus, to provide an example of MERIS imagery, aGau(λ) were retrieved from MERIS imagery on 3 September 2011. MODIS imagery of this day was also considered.
As shown in Figure 11, similar patterns for high and low absorption patches were noticed for the two sensors, but the two images showed different magnitudes of aGau(λ). In further analysis of the Rrs(λ) from these two sensors at the same locations, good agreement was observed, as shown in Figure  12. The existence of 709 nm band in MERIS could be the main reason for the differences in the retrieval results. The patch of high values in the MERIS (marked with solid star: ★) imagery was not included in MODIS because the standard L1B to L2 processing in SeaDAS masked those pixels as clouds. Further validation and evaluation of MERIS and MODIS Rrs(λ) with in situ data are necessary to have a better understanding of the differences in the results, which is beyond the scope of the current work.

MODIS and MERIS Imagery over Lake Erie
MERIS imagery did not coincide with the in situ observations used in this study. Thus, to provide an example of MERIS imagery, a Gau (λ) were retrieved from MERIS imagery on 3 September 2011. MODIS imagery of this day was also considered.
As shown in Figure 11, similar patterns for high and low absorption patches were noticed for the two sensors, but the two images showed different magnitudes of a Gau (λ). In further analysis of the R rs (λ) from these two sensors at the same locations, good agreement was observed, as shown in Figure 12. The existence of 709 nm band in MERIS could be the main reason for the differences in the retrieval results. The patch of high values in the MERIS (marked with solid star: ) imagery was not included in MODIS because the standard L1B to L2 processing in SeaDAS masked those pixels as clouds. Further validation and evaluation of MERIS and MODIS R rs (λ) with in situ data are necessary to have a better understanding of the differences in the results, which is beyond the scope of the current work.

MODIS and MERIS Imagery over Lake Erie
MERIS imagery did not coincide with the in situ observations used in this study. Thus, to provide an example of MERIS imagery, aGau(λ) were retrieved from MERIS imagery on 3 September 2011. MODIS imagery of this day was also considered.
As shown in Figure 11, similar patterns for high and low absorption patches were noticed for the two sensors, but the two images showed different magnitudes of aGau(λ). In further analysis of the Rrs(λ) from these two sensors at the same locations, good agreement was observed, as shown in Figure  12. The existence of 709 nm band in MERIS could be the main reason for the differences in the retrieval results. The patch of high values in the MERIS (marked with solid star: ★) imagery was not included in MODIS because the standard L1B to L2 processing in SeaDAS masked those pixels as clouds. Further validation and evaluation of MERIS and MODIS Rrs(λ) with in situ data are necessary to have a better understanding of the differences in the results, which is beyond the scope of the current work.

The Seasonal Variation of aGau(λ) from MERIS Imagery
We further explored the seasonal variation of aGau(435) and aGau(617. 6) in Lake Erie retrieved from MERIS seasonal composed 4 km Rrs(λ) imagery ( Figure 13). The differences in the spatial distribution of high pigment absorption patches in Spring, Summer, Autumn and Winter were captured. An obvious summer bloom in western basin of Lake Erie is shown in the figure.

The Seasonal Variation of a Gau (λ) from MERIS Imagery
We further explored the seasonal variation of a Gau (435) and a Gau (617.6) in Lake Erie retrieved from MERIS seasonal composed 4 km R rs (λ) imagery ( Figure 13). The differences in the spatial distribution of high pigment absorption patches in Spring, Summer, Autumn and Winter were captured. An obvious summer bloom in western basin of Lake Erie is shown in the figure.

The Seasonal Variation of aGau(λ) from MERIS Imagery
We further explored the seasonal variation of aGau(435) and aGau(617. 6) in Lake Erie retrieved from MERIS seasonal composed 4 km Rrs(λ) imagery ( Figure 13). The differences in the spatial distribution of high pigment absorption patches in Spring, Summer, Autumn and Winter were captured. An obvious summer bloom in western basin of Lake Erie is shown in the figure.

Spectral Requirements for a Gau (λ) Retrieval
In the retrieval of a Gau (λ) from in situ R rs (λ) data, we found that a band around 695-715 nm is important for accurate a Gau (λ) estimation in bloom waters. This is consistent with previous studies for the estimation of Chl-a in turbid productive waters [51][52][53][54]. Fundamentally, for phytoplankton bloom waters, the reflectance at wavelengths 695-715 nm can be augmented in the same way as occurs with terrestrial plants [51][52][53]. Lacking a proper spectral band within this spectral region for MODIS, the near-infrared band at 748 nm was included when inverting MODIS R rs (λ). Although higher uncertainties at 748 nm band were noticed (Section 3.4.1), the inclusion of the band at 748 nm results in much more reasonable a Gau (λ) retrieval from MODIS-Aqua measured R rs (λ) (Figure 9, Table 7) versus without R rs (λ) at this band. The same results were noticed in Section 3.1 for the a Gau (λ) retrieval from R rs (λ) at VIIRS and MSI spectral bands.

The seasonal a Gau (λ) Variation in Lake Erie
The MERIS derived seasonal a Gau (435) and a Gau (617.6) variation in Lake Erie follows the pattern recorded in the literature [55,56]. In the central basin of Lake Erie, the different patterns, as shown in Figure 13, are due to the greater nutrients in the spring and autumn, and lower availability in the summer as a result of water stratification. The nutrient inputs due to agricultural activities as well as an expanding non-native mussel population, along with the light and temperature changes in different seasons form the main drivers for the different algal bloom patterns in Lake Erie during the four seasons [55][56][57][58][59]. As discovered in Moon [55], the biomass and taxonomic composition and the dominant taxa of surface assemblages varied in different seasons, which can be explained by the light, temperature and nutrient combinations in different seasons, and the strong spatial variability associated with mesoscale physical processes such as upwelling and basin-scale circulation [59].

Pigment Retrieval and HABs Detection
In this study, MuPI as a semi-analytical inversion scheme was applied to retrieve multiple Gaussian curves from satellite remote sensing data. As demonstrated in previous study [10], these Gaussian curves are related to the different phytoplankton pigments.
This phytoplankton pigment information has been used in the quantification of phytoplankton community composition, at least to a functional group level ( [5] and the references therein), because many pigments are particular to specific taxonomic groups or even species [1]. However, only limited work has been conducted to obtain these phytoplankton pigments from satellite remote sensing data. Pan et al. [8] and Moisan et al. [60], as two of them, attempted to obtain 12 and 18 different phytoplankton pigments from satellite remote sensing data, respectively, but both of them are empirical approach based. Pan et al. [8] proposed using empirical relationships of 12 different phytoplankton pigments with the same band ratios of satellite R rs (λ) around 490 nm to 550 nm. However, using 490 and 550 nm alone is not a good strategy for multiple pigments, as different pigments have different absorption peaks and troughs at different wavelengths [1]. Compared with Pan et al. [8], Moisan et al. [60] directly used the Chl-a product of satellite remote sensing to estimate a ph (λ), then decompose this a ph (λ) to 18 pigments based on their specific absorption coefficients. However, this satellite Chl-a product used similar empirical algorithm and spectral bands as in Pan et al. [8].
As both works were focused on coastal waters, another large uncertainty comes from non-algal particles and gelbstoff in these waters, which have a big influence on wavelengths shorter than 550 nm, and this influence cannot be eliminated by the band-ratio based algorithm. Compared with these works, MuPI not only considers the contribution of non-algal particles and gelbstoff in the coastal or inland waters, but also the different absorption properties of pigments, as shown in Tables 1 and 2. Another application of the pigment information is in HAB detection. The algorithms for HAB detection are usually based on Chl-a and its anomalies [47], or marker pigments, such as phycocyanin (PC) for cyanobacteria [6,7,18,48,61]. With MuPI, a successful discrimination of phycocyanin (PC) and Chl-a was achieved in this study. Limited by the dataset, the estimation of other pigment concentrations (phycoerythrin, chlorophyll b and c, carotenoids) from a Gau (λ), as demonstrated in Hoepffner and Sathyendranath [10], could not be conducted here.
Cyanobacteria dominated HABs are increasing globally and presenting a major environmental and human health issue. Extensive Microsystis blooms with toxin production occur during summer and fall in different regions around the world and Microcystis contamination has been documented at many regions including Pinto Lake (California), Lake Erie (U.S.A.) and Lake Taihu (China) [6,7,21,61]. Other common bloom-forming pelagic genera include Aphanizomenon, Anabaena, Rhodomonas and Planktothrix. However, since toxicity is primarily associated with Microcystis, these other cyanobacteria blooms are generally considered nuisance blooms which will not cause acutely dangers to humans and wildlife [62], but they are frequently present in impacted water bodies ( Figure 8). However, the ability to discriminate the different bloom-causing species is one of the challenges that existing algorithms are facing. In our analysis with a Gau (λ) in Section 3.3, the possibility of discriminating different cyanobacteria species was shown. The variation of the spectral shape defined by a Gau (435), a Gau (584.4), and a Gau (617.6) was found vary with different species and their composition in the water body as a result of the variation in pigment ratios [6]. This result showed the potential of MuPI in the application of separating species in HAB waters, which will be useful in detecting and monitoring potential toxin producers. However, because of data limitation, this potential is not fully addressed in this study, but it will be further explored with a larger dataset in the future.

Conclusions
In this study, the MuPI model was validated for obtaining the peak heights of Gaussian curves (a Gau (λ)) from multi-spectral satellite remote sensing data. The model performance was validated in the retrieval accuracy of a Gau (λ) with R rs (λ) of six multi-spectral band configurations, and the spectral requirements were discussed. Less than 35% of mean unbiased absolute percentage differences were achieved for a Gau (λ) from R rs (λ) spectra with OLCI, MERIS and MODIS bands, and less than 45% for VIIRS, MSI and OLI bands. Using data from the western basin of Lake Erie as an example, the R rs (λ) obtained from HICO and MODIS satellites were validated with in situ data over cyanobacterial bloom waters in Lake Erie, and the spatial distributions of a Gau (λ) and the concentrations of chlorophyll a and phycocyanin were obtained, where the patches of cyanobacteria bloom were clearly presented. A seasonal distribution of pigment absorption coefficients was obtained from MERIS seasonal composed imagery of 2011 for Lake Erie. These results demonstrate that, with MuPI, it is possible to analytically retrieve information of not only Chl-a, but also PC and potentially other pigments, which will significantly enhance our capability to characterize and evaluate the status of phytoplankton blooms and discriminate phytoplankton groups using satellite ocean color remote sensing.