A Remote Sensing Algorithm of Column-Integrated Algal Biomass Covering Algal Bloom Conditions in a Shallow Eutrophic Lake

Column integrated algal biomass provides a robust indicator for eutrophication evaluation because it considers the vertical variability of phytoplankton. However, most remote sensing-based inversion algorithms of column algal biomass assume a homogenous distribution of phytoplankton within the water column. This study proposes a new remote sensing-based algorithm to estimate column integrated algal biomass incorporating different possible vertical profiles. The field sampling was based on five surveys in Lake Chaohu, a large eutrophic shallow lake in China. Field measurements revealed a significant variation in phytoplankton profiles in the water column during algal bloom conditions. The column integrated algal biomass retrieval algorithm developed in the present study is shown to effectively describe the vertical variation of algal biomass in shallow eutrophic water. The Baseline Normalized Difference Bloom Index (BNDBI) was adopted to estimate algal biomass integrated from the water surface to 40 cm. Then the relationship between 40 cm integrated algal biomass and the whole column algal biomass at various depths was built taking into consideration the hydrological and bathymetry data of each site. The algorithm was able to accurately estimate integrated algal biomass with R2 = 0.89, RMSE = 45.94 and URMSE = 28.58%. High accuracy was observed in the temporal consistency of satellite images (with the maximum MAPE = 7.41%). Sensitivity analysis demonstrated that the estimated algal biomass integrated from the water surface to 40 cm has the greatest influence on the estimated column integrated algal biomass. This algorithm can be used to explore the long-term variation of algal biomass to improve long-term analysis and management of eutrophic lakes.


Introduction
Inland freshwater ecosystems play an important role in the economic, cultural, aesthetic, scientific and educational aspects of their local and regional populations [1]. However, due to the sharply increasing human demands for freshwater in the past century, inland freshwater ecosystems are globally undergoing increased modification [2,3]. An increased anthropogenic input of nitrogen and phosphorus from agriculture, industrial waste and sewage has caused widespread eutrophication [4-7]. Affected locations include the Baltic Sea [8], Lake Garda in Italy [9,10], the East China Sea [11], Lake Taihu in China [12], Lake Bogoria in Kenya [13,14], Lake Victoria in Africa [15,16], Lake Erie in the United States [17,18], Lake Michigan in the United States [19,20], Lake Columbia in Canada [21,22], Bahía Blanca Estuary in Argentina [23,24] and the Great Barrier Reef in Australia [25,26]. Inland eutrophication has led to increasing occurrences of cyanobacteria blooms, with impacts on water, fisheries and biodiversity [27,28].
The limited relationship between surface and total column algal biomass is evidenced by the rapid changes seen in remotely sensed surface observations over short periods. For instance, the bloom area has been observed to double within 6 h in the East China Sea in Geostationary Ocean Color Imager (GOCI) observations [11]. Neither algal growth rate nor environmental conditions would allow bloom areas to expand at such a rapid pace, indicating that vertical migration of algae is a likely cause [46]. Thus, column-integrated algal biomass is considered to be a more robust indicator of eutrophication.
Empirical estimation of heterogeneous column-integrated algal biomass is often conducted using in situ measurements, such as the vertical profile of Chl-a. For Case I waters, for example the open sea, where the optical signal is dominated by phytoplankton and related materials, a Gaussian profile is often used to describe the vertical distribution [47]. Specific parameters describing the vertical profiles are retrieved from their linkage with surface Chl-a concentrations, remote sensing reflectance and other environmental parameters, such as wind speed [48-50]. However, these algorithms are appropriate for Case I waters only, where the water-leaving reflectance signal is mainly determined by phytoplankton and related breakdown products. Additionally, a large sampling dataset is needed to obtain the profile parameters [8,51]. For Case II waters, that is, coastal and inland waters, the optical signal is more complicated and the influences of phytoplankton, total suspended matter and Colored Dissolved Organic Matter (CDOM) need to be considered [52]. In eutrophic Case II waters, Chl-a concentration can be extremely high, further complicating the vertical distribution. Past studies have shown that surface information and local physical/hydrologic conditions, such as water depth, influence the vertical profile in Case II waters [53,54]. However, this relationship has been derived only for non-bloom conditions. During algal blooms, higher surface Chl-a concentrations occur, with a sharp decrease of Chl-a concentration with depth, especially below 1 m. An exponential or hyperbolic vertical profile type has been observed, making the use of a Gaussian profile inappropriate because it overestimates total column-integrated algal biomass.
Estimations of column-integrated algal biomass under algal bloom conditions commonly assume homogeneous mixing within a nutrient-rich upper layer [55]. Integrating surface Chl-a concentrations over a homogeneous layer bounded by a "critical depth" allows for algal biomass estimation. This "critical depth" replaces the mixed layer depth used in the classic assumption (suitable for most non-algal bloom conditions). The critical depth can be estimated from surface irradiance and the diffuse attenuation coefficient, which can in turn be retrieved from satellite observation [55-57]. However, the homogeneous layer assumption is not suitable for eutrophic inland lakes, where high variability can occur in the upper 0.1 m [58]. A more comprehensive approach is needed to better estimate column-integrated algal biomass under these conditions.
This study focuses on developing a new remote sensing-based algorithm to determine the column-integrated algal biomass during algal bloom conditions. Validation was conducted using in situ measurements and image inversion. Sensitivity analysis and comparison with other algorithms were also conducted. The limitations and further applications in column-integrated algal biomass estimation are also discussed.

Study Site
Lake Chaohu is located in Anhui province, in the southeast part of China (Figure 1). It is the fifth largest freshwater lake in China, with an average area of 770 km² and a mean water depth of 3.3 m [59,60]. Thirty-three rivers discharge into Lake Chaohu. These rivers belong to seven river systems: the Hangbuhe, the Fenglehe Rivers, the Paihe River, the Nanfei River, the Dianbuhe River, the Baishishanhe River, the Zhegaohe River, the Yuxihe River and the Zhaohe River (Figure 1). The first four rivers contribute more than 90% of the total runoff volume to Lake Chaohu. Finally, the Yuxi River discharges into the Yangtze River [61]. The lake serves as an essential water resource for two large cities, Hefei and Chaohu, with nearly 1 million people between them [62]. The lake also plays an important role in tourism and recreation.
The lake has been heavily polluted and has suffered from increasing harmful algal blooms since the 1990s. Although several algae species have been observed, cyanobacterial blooms commonly occur from May to November of each year throughout the lake [63,64]. The lake is commonly divided into three parts based on their eutrophication levels; the western part is severely eutrophic, while the central and eastern parts are considered to be mesotrophic [65,66].

In Situ Data
Five surveys were conducted in Lake Chaohu during May, July and October 2013, May 2015 and November 2017 (Figure 1). For each survey, field measurements focused on the vertical distribution of Chl-a concentration and remote sensing reflectance. Water samples to retrieve Chl-a concentrations were collected at nine different depths (surface, 0.1, 0.2, 0.4, 0.7, 1.0, 1.5, 2 and 3 m) using a customized profiling instrument [54]. The instrument consists of four parts: a graduated profiling sampling wire, a small vacuum pump, a connective tube and a portable battery. The graduated profiling sampling wire was used to collect water at specific depths. Surface water was collected directly with the sampling bottle. Water samples were filtered on board using 25-mm Whatman GF/C glass fiber filters (pore size of 1.2 µm) and then stored in darkness at 4 °C until laboratory analysis. In addition to water sampling, wind speed and direction were measured with a handheld anemometer, and cloud conditions were recorded at the same time. In total, 54 water samples were collected at specific locations (Figure 1).
Chl-a concentrations (µg/L) were measured according to NASA recommended protocols [67]. Filtered samples were extracted with 90% acetone for 24 h in the dark at 4 °C before analysis. Chl-a concentrations were calculated using absorbance at 630, 645, 663 and 750 nm measured by a UV-2600 spectrophotometer (Shimadzu Corp., Kyoto, Japan) [68].
Column-integrated algal biomass was calculated based on the nine discrete Chl-a concentrations. For each sampling site, the Chl-a concentrations at the nine depths were evaluated using regression analysis against three profile shapes: Gaussian, exponential and hyperbolic. The root mean square error (RMSE) and coefficient of determination (R²) of each fitted model were recorded and compared using the Curve Fitting Toolbox of MATLAB R2016a (The MathWorks, Inc., Natick, MA, USA). The fitted model with the lowest RMSE and highest R² was selected to describe the corresponding vertical profile of Chl-a. Column-integrated algal biomass was calculated through the integration of each determined vertical profile. The result was validated against in situ measurements, with a mean absolute error of less than 5%.
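As a rough illustration of the integration step, the following sketch integrates a discrete Chl-a profile over depth. It is a simplification, not the procedure used in this study: it applies the trapezoidal rule directly to the sampled values instead of integrating a fitted Gaussian, exponential or hyperbolic profile. The function name and the example profile values are hypothetical. Because 1 µg/L equals 1 mg/m³, integrating over depth in metres yields mg/m².

```python
def integrate_profile(depths_m, chla_ug_per_l):
    """Trapezoidal integral of Chl-a over depth.

    depths_m: strictly increasing sampling depths in metres.
    chla_ug_per_l: Chl-a at those depths (ug/L, numerically equal to mg/m^3).
    Returns column-integrated biomass in mg/m^2.
    """
    if len(depths_m) != len(chla_ug_per_l) or len(depths_m) < 2:
        raise ValueError("need at least two matching depth/concentration pairs")
    total = 0.0
    for i in range(1, len(depths_m)):
        dz = depths_m[i] - depths_m[i - 1]
        if dz <= 0:
            raise ValueError("depths must be strictly increasing")
        # Area of one trapezoid between two neighbouring samples.
        total += 0.5 * (chla_ug_per_l[i] + chla_ug_per_l[i - 1]) * dz
    return total


# The nine sampling depths listed in the text (surface = 0 m).
SAMPLING_DEPTHS_M = [0.0, 0.1, 0.2, 0.4, 0.7, 1.0, 1.5, 2.0, 3.0]

# Hypothetical bloom-like profile (ug/L), for illustration only.
example_chla = [140.0, 110.0, 90.0, 70.0, 55.0, 45.0, 38.0, 34.0, 30.0]

# Biomass from the surface to 0.4 m uses the first four samples only.
bio_40 = integrate_profile(SAMPLING_DEPTHS_M[:4], example_chla[:4])
whole_column = integrate_profile(SAMPLING_DEPTHS_M, example_chla)
```

Slicing the inputs, as in the last lines, gives the biomass integrated to any of the sampled depths.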

Remote Sensing Data
In situ remote sensing reflectance (Rrs) was measured by a FieldSpec Pro Dual VNIR (Analytical Spectral Devices, Boulder, CO, USA) following NASA ocean optics protocols. The instrument has a spectral range from 350 to 1050 nm. Its two probes have a 25-degree field of view. Water-leaving radiance (Lt), the radiance of the gray reference panel (Lg) and sky radiance (Ls) were measured with this instrument. Under extreme bloom conditions, the algae can be easily disturbed by boat movement. To overcome this difficulty, only large and homogeneous lake areas with algal bloom were selected for field measurement. At these sites, the Rrs measurements were repeated 5 times to obtain the average value.
Moderate Resolution Imaging Spectroradiometer (MODIS) 250-m and 500-m resolution Level-0 data from 2003 to 2016 were downloaded from the U.S. NASA Goddard Space Flight Center (http://oceancolor.gsfc.nasa.gov). Level-0 data were processed to calibrated radiance (Level-1B) with SeaDAS software (version 7.0). Rayleigh correction, which corrects for gaseous absorption and Rayleigh scattering effects, was applied to the Level-1B data. This partial atmospheric correction was used to avoid the incorrect data masking that occurs when SeaDAS performs a full atmospheric correction [69]. Over highly turbid coastal and inland waters, saturation of the MODIS ocean bands is possible [70]; therefore, we used the terrestrial bands instead of the ocean bands.

Algorithm Approach (Flow Chart)
The general approach consisted of a multistep process (Figure 2). MODIS L1B data were processed to determine the Rayleigh-corrected reflectance Rrc. The floating algae index (FAI) was applied to each satellite image to identify algal bloom conditions for each pixel. FAI has been successfully applied to identify surface blooms in coastal and inland waters under various environmental and observation conditions. For MODIS, FAI is described as:
FAI = Rrc(859) − R′rc(859)
R′rc(859) = Rrc(645) + [Rrc(1240) − Rrc(645)] × (859 − 645)/(1240 − 645) (1)
where Rrc(645), Rrc(859) and Rrc(1240) stand for the Rayleigh-corrected reflectance in the bands centered at 645, 859 and 1240 nm, respectively, corresponding to band 1, band 2 and band 5 of MODIS. Following an earlier study, the time-independent threshold FAI = 0.0006 was adopted to identify algal bloom conditions in Lake Chaohu.
Then, column-integrated algal biomass estimation was applied only to the pixels identified as algal bloom. An algorithm was built based on an empirical relationship between the surface signal and depth-integrated algal biomass. Parameters describing the relationship were retrieved from the regression model. Finally, the algorithm was used in combination with the bathymetry data and remote sensing data to estimate column-integrated algal biomass. The algorithm building process is described in more detail in the following section.
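The bloom-masking step can be sketched per pixel as follows. This is a minimal illustration that assumes the standard baseline-subtraction form of FAI for the MODIS bands named above (645, 859 and 1240 nm) and treats a pixel as bloom when FAI exceeds the 0.0006 threshold; the strict inequality, the function names and the example reflectances are assumptions.

```python
FAI_BLOOM_THRESHOLD = 0.0006  # time-independent threshold quoted in the text


def fai(rrc_645, rrc_859, rrc_1240):
    """Floating algae index from Rayleigh-corrected reflectance.

    The NIR reflectance at 859 nm is compared against a linear baseline
    drawn between the red (645 nm) and SWIR (1240 nm) bands.
    """
    baseline_859 = rrc_645 + (rrc_1240 - rrc_645) * (859.0 - 645.0) / (1240.0 - 645.0)
    return rrc_859 - baseline_859


def is_bloom(rrc_645, rrc_859, rrc_1240, threshold=FAI_BLOOM_THRESHOLD):
    """Flag a pixel as algal bloom (assumed rule: FAI strictly above threshold)."""
    return fai(rrc_645, rrc_859, rrc_1240) > threshold


# Hypothetical pixels: a strong NIR peak versus a smoothly declining spectrum.
bloom_pixel = is_bloom(0.03, 0.10, 0.02)
water_pixel = is_bloom(0.05, 0.04, 0.03)
```

Only pixels flagged in this way would be passed on to the column-integrated biomass estimation.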

Depth-Integrated Algal Biomass Retrieval from the Surface Investigation
(1) Principles of the algorithm
Based on the field observations, Rrs under algal bloom conditions has two minima, around 440 and 625 nm. These are due to absorption by Chl-a and phycocyanin, respectively. Additionally, significantly higher Rrs was observed in the NIR range compared to non-algal bloom conditions. This is caused by the dense surface phytoplankton under algal bloom conditions. Therefore, the blue, red and NIR bands were considered to characterize the column-integrated algal biomass under algal bloom conditions.

The BNDBI index uses a combination of wavebands that has been shown to accurately estimate algal biomass under algal bloom conditions. The index was built based on the normalized difference of Rrc at 555 nm and a corrected Rrc at 645 nm, derived from the baseline formed by Rrc at 469 nm and 859 nm [71]. In the index definition (Equation (2)), Rrc(469), Rrc(555), Rrc(645) and Rrc(859) are the Rayleigh-corrected reflectances in the wavebands centered at 469 nm, 555 nm, 645 nm and 859 nm, corresponding to band 3, band 4, band 1 and band 2 of MODIS, respectively.
The relationship between the BNDBI index and Chl-a is nonlinear, and the column-integrated algal biomass algorithm was therefore developed with a nonlinear relationship.
(2) Depth-integrated algal biomass algorithm procedure
Based on sampling results under algal bloom conditions, the relationship between surface information and column-integrated algal biomass was retrieved. A high correlation was observed between the Baseline Normalized Difference Bloom Index (BNDBI) and the algal biomass present in the surface layers (from the surface to 10, 20 and 40 cm, respectively) (Figure 3).
Algal biomass integrated to 40 cm had the highest correlation with the BNDBI index. Hence this layer, surface to 40 cm, was selected to retrieve column algal biomass at other depths. The relationship between the BNDBI index and algal biomass integrated to 40 cm (Bio(40)) is described by Equation (3). As the water depth of Lake Chaohu is less than 6 m, the algorithm was built to include algal biomass up to 6 m [54]. A good linear relationship was observed between Bio(40) and column-integrated algal biomass at different depth layers, with R² = 0.81 and NRMSD = 0.43. The specific relationships are illustrated in Figure 4a-f. For instance, from the surface to a depth of 3 m (Figure 4c), the relationship can be quantified as y = 1.702x + 16.740 with R² of 0.82. Therefore, the relationship between algal biomass integrated from the surface to 40 cm and column-integrated algal biomass can be described as follows:
AI = a × Bio(40) + b (4)
where AI (mg·m⁻²) is the column-integrated algal biomass at different water depths, and a and b are empirical coefficients related to water depth.
(3) Parameters retrieval
The coefficients a and b were obtained from in situ measurements. For each sampling depth, the corresponding a and b were estimated following Equation (4) using MATLAB (R2015a). For depths not covered by in situ data, linear interpolation was used to estimate the column algal biomass and then derive coefficients a and b (Figure 5). Multiple regression models were tested and the one with the maximum R² was adopted. A logarithmic model was selected to derive coefficient a from water depth. The good fit was ascribed to the shape of the vertical Chl-a distribution under algal bloom conditions. A linear model was used to describe coefficient b, which is closely related to water depth. The good correlation was attributed to a relatively stable background Chl-a concentration at different depth intervals. The specific relationships between coefficients a and b and water depth can be described as:
a = 0.337 × ln(z) + 1.312
b = 6.453 × z − 2.028 (5)
where z refers to water depth. It was calculated based on hydrological data and DEM data of Lake Chaohu. More details on the water depth calculation are given in the previous paper [54].
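The depth-dependent coefficients and the resulting column estimate can be sketched as below. The coefficient formulas are those of Equation (5); the linear form AI = a × Bio(40) + b is taken from the linear fits described for Figure 4, and water depth is assumed to be in metres. Function names are illustrative only.

```python
import math


def coefficient_a(depth_m):
    """Slope coefficient a as a logarithmic function of water depth (Equation (5))."""
    return 0.337 * math.log(depth_m) + 1.312


def coefficient_b(depth_m):
    """Intercept coefficient b as a linear function of water depth (Equation (5))."""
    return 6.453 * depth_m - 2.028


def column_biomass(bio_40, depth_m):
    """Column-integrated algal biomass (mg/m^2) from Bio(40) and water depth.

    bio_40: algal biomass integrated from the surface to 0.4 m (mg/m^2).
    depth_m: water depth in metres (assumed unit); must be positive for ln(z).
    """
    if depth_m <= 0:
        raise ValueError("water depth must be positive")
    return coefficient_a(depth_m) * bio_40 + coefficient_b(depth_m)
```

At z = 3 m these formulas give a ≈ 1.68 and b ≈ 17.33, close to the fitted line y = 1.702x + 16.740 reported for the 3 m layer.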


Performance Evaluation
The accuracy of algal biomass estimation was evaluated using four statistical indicators: mean absolute percentage error (MAPE), root mean square error (RMSE), unbiased RMSE (URMSE) and normalized root-mean-square deviation (NRMSD). In their definitions, x_i and y_i refer to the measured and estimated values for the ith sample and N is the number of samples. URMSE was chosen to avoid deviations caused by skewed error distributions.
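The four indicators can be computed as in the sketch below. MAPE and RMSE follow their standard definitions. The forms used here for URMSE (relative error normalised by the mean of the measured and estimated values) and NRMSD (RMSE divided by the range of the measured values) are common conventions and are assumptions, since the exact definitions are not reproduced in the text above.

```python
import math


def mape(measured, estimated):
    """Mean absolute percentage error (%), relative to the measured values."""
    n = len(measured)
    return 100.0 * sum(abs((y - x) / x) for x, y in zip(measured, estimated)) / n


def rmse(measured, estimated):
    """Root mean square error, in the units of the inputs."""
    n = len(measured)
    return math.sqrt(sum((y - x) ** 2 for x, y in zip(measured, estimated)) / n)


def urmse(measured, estimated):
    """Unbiased RMSE (%): assumed form, errors normalised by the pair mean."""
    n = len(measured)
    total = sum(((y - x) / (0.5 * (x + y))) ** 2 for x, y in zip(measured, estimated))
    return 100.0 * math.sqrt(total / n)


def nrmsd(measured, estimated):
    """Normalised RMSD: assumed form, RMSE divided by the measured range."""
    return rmse(measured, estimated) / (max(measured) - min(measured))
```

All four functions take two equal-length sequences, with the measured values first.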

Field Observation of Column-Integrated Algal Biomass under Algal Bloom Conditions
Chl-a and column-integrated algal biomass were retrieved based on the methods described in Section 2.2. Table 1 presents the statistics of the Chl-a concentration and the corresponding algal biomass at each depth layer. For instance, the Chl-a concentration at 0.2 m is retrieved directly from field measurement, while algal biomass at 0.2 m refers to algal biomass integrated from 0.15 to 0.25 m. Both Chl-a and column-integrated algal biomass exhibited large vertical variability. For Chl-a, the maximum concentration was located in the surface layer, with an average value of 136.40 µg/L and a large range of 55.43 to 313.85 µg/L. The Chl-a concentration generally decreased with water depth. For algal biomass, the maximum average value was located around 0.7 m. This demonstrated that higher Chl-a concentration had only a weak correlation with increasing algal biomass. Additionally, the CV value of both Chl-a and algal biomass increased with water depth, indicating that the relationship between surface concentrations and those of each layer should be carefully considered. The column-integrated algal biomass was used in algorithm validation.
Note: Algal biomass stands for a 0.1 m integration at the specific depth layer. For instance, algal biomass at 0.2 m refers to algal biomass integrated from 0.15 to 0.25 m.

Algorithm Validation
The algorithm validation was carried out in two ways. One is based on water column validation and the other is based on the temporal consistency of remote sensing images from different time periods. For water column validation, in situ water samples in algal bloom conditions were selected to examine the algorithm's accuracy. The temporal consistency of satellite-derived algal biomass is another effective method of algorithm validation; the total algal biomass should remain consistent over short time periods.

Water Column Validation
Water column validation was conducted by comparing in situ measurements with estimations from satellite images. The in situ validation dataset (N = 24) contains column-integrated algal biomass and measured water depth under algal bloom conditions. Synchronous satellite data were used to calculate the BNDBI index and estimate column algal biomass (Figure 6a). A significant correlation was observed, with R² = 0.84, RMSE = 45.94 and URMSE = 28.58%. For extremely high algal biomass conditions, an underestimation was observed.
When the validation dataset under non-algal bloom conditions was included, a high correlation was found, with R² = 0.89, RMSE = 14.71 and URMSE = 9.31% (Figure 6b). The result demonstrated that the developed algorithm is quite reliable.

Area-Integrated Validation
This validation method was based on the temporal consistency of satellite-derived algal biomass: estimates from satellite images acquired on consecutive days were compared.
Observations from MODIS data on consecutive days demonstrated good consistency (Figure 7a,b). Algal biomass changed from 25.91 t to 23.99 t in February 2004, giving a satisfactory MAPE of 7.41%. Similar results were derived from another pair of consecutive images. Algal biomass on consecutive days ranged from 26.76 t to 28.63 t in May 2016. The variation was 1.87 t, with a MAPE of 6.98%. A lower error was observed in area-integrated algal biomass compared to in situ points. Overall, the high correlation and temporal consistency show that the algorithm is applicable and reliable for both in situ and satellite data.
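For a single pair of images, the consistency check reduces to a percentage difference between the two lake-wide totals. The sketch below assumes the difference is expressed relative to the first day's estimate; under that assumption the February 2004 pair reproduces the quoted 7.41%. The function name is illustrative.

```python
def percent_change(first_day_t, second_day_t):
    """Absolute change between two biomass totals, as a percentage of the first."""
    return 100.0 * abs(second_day_t - first_day_t) / first_day_t


# Lake-wide totals (t) quoted in the text for the consecutive days in February 2004.
feb_2004 = percent_change(25.91, 23.99)
```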

Sensitivity Analysis
A sensitivity analysis was conducted to determine which parameters have the greatest impact on the results. A simulation was conducted with the HydroLight radiative transfer model with highly variable integrated column algal biomass while keeping all other input parameters constant. Based on the in situ datasets (see Table 1), two input parameters, Bio(40) and depth, were modified by ±5%, ±10% and ±20% relative to their average values. Bio(40) was found to have a greater influence than depth on the algal biomass calculation under algal bloom conditions (Table 2). When Bio(40) was altered by ±20%, the calculated algal biomass changed by ±16.41%, while in terms of depth, the corresponding variation was only ±4.07%. Thus, the algorithm was less sensitive to depth than to algal biomass integrated from the surface to 40 cm. This differs from the results under non-algal bloom conditions, where the surface investigation was less sensitive to water depth [54].
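A one-at-a-time perturbation of this kind can be sketched directly on the algorithm's closed form. The example below is a simplified stand-in for the analysis described above, not a reproduction of it: it perturbs Bio(40) and depth in AI = a × Bio(40) + b with the coefficients of Equation (5), and the baseline values are hypothetical placeholders, not the averages from Table 1.

```python
import math


def column_biomass(bio_40, depth_m):
    """AI = a * Bio(40) + b, with a and b from Equation (5); depth assumed in metres."""
    a = 0.337 * math.log(depth_m) + 1.312
    b = 6.453 * depth_m - 2.028
    return a * bio_40 + b


def sensitivity(bio_40, depth_m, fraction):
    """Percent change in AI when each input is scaled by (1 + fraction) in turn.

    Returns (change due to Bio(40), change due to depth), both in percent.
    """
    base = column_biomass(bio_40, depth_m)
    by_bio = 100.0 * (column_biomass(bio_40 * (1.0 + fraction), depth_m) - base) / base
    by_depth = 100.0 * (column_biomass(bio_40, depth_m * (1.0 + fraction)) - base) / base
    return by_bio, by_depth


# Hypothetical baseline: Bio(40) = 50 mg/m^2 at a water depth of 3 m.
for frac in (0.05, 0.10, 0.20):
    d_bio, d_depth = sensitivity(50.0, 3.0, frac)
    print(f"+{frac:.0%}: Bio(40) -> {d_bio:.2f}%, depth -> {d_depth:.2f}%")
```

With this hypothetical baseline the response to Bio(40) is larger than the response to depth, which matches the direction of the result in Table 2; the magnitudes depend on the baseline chosen.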

Comparison with Other Algorithms
Three previously proposed algorithms were evaluated and compared with our algorithm using the in situ validation dataset. The comparison results are shown in Table 3. For the Morel and Uitz algorithms, the in situ surface Chl-a concentration data were used directly to estimate the column integrated algal biomass at each site. For the Chriswell algorithm, the "critical depth" is derived from the diffuse attenuation coefficient (Kd), surface irradiance and equivalence irradiance (where depth production equals losses), which were not measured in our field surveys. Therefore, we used the radiative transfer model HydroLight to simulate these parameters. The input parameters (including Chl-a vertical concentration, SPIM, CDOM, absorption coefficient, etc.) were kept the same as in the field measurements.
Morel's algorithms (hereafter MB89) were built on nearly 4000 vertical profiles of Chl-a in ocean water only [72]. In this dataset, the maximum Chl-a concentrations were typically located in the subsurface and could differ from the surface layer by more than two orders of magnitude. This is not consistent with the vertical profile in lake water under algal bloom conditions, where the maximum Chl-a value is in the surface layer. As a result, MB89 overestimated the biomass, with RMSE = 278.9908 and URMSE = 0.8503. Nevertheless, a good correlation was observed when applying MB89 to our dataset (R2 = 0.6204), because the MB89 algorithm is based on the theory that the integrated algal biomass and the near-surface Chl-a concentration are highly but nonlinearly related, and our algorithm is based on similar observations. Results were also retrieved following the Uitz algorithm. The overestimation from the Uitz algorithm results from a depth-variable background Chl-a value that replaces the constant background value. This depth-variable background value was derived from the surface concentration through a linear relationship. Because surface Chl-a is much higher in lake water than in ocean water, a larger background Chl-a was estimated, resulting in a large overestimation.
For the critical depth-based algorithm, surface Chl-a concentration is a key input parameter. The much higher surface Chl-a concentration commonly leads to an extremely high algal biomass value for each water column. This also made the estimated values more scattered and resulted in the lowest R2.
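For readers who wish to reproduce this kind of comparison, the three statistics can be computed as in the sketch below. The definitions are assumptions, because they are not restated in this section: R2 is taken as the squared Pearson correlation, and URMSE as the root mean square of the differences normalized by the mean of each measured/estimated pair, expressed in percent. The sample values are invented for illustration and are not data from this study.

```python
import math
from typing import Sequence


def rmse(measured: Sequence[float], estimated: Sequence[float]) -> float:
    """Root mean square error, in the units of the input."""
    n = len(measured)
    return math.sqrt(sum((e - m) ** 2 for m, e in zip(measured, estimated)) / n)


def urmse_percent(measured: Sequence[float], estimated: Sequence[float]) -> float:
    """Assumed URMSE: each difference is divided by the pair mean, result in %."""
    n = len(measured)
    total = sum(((e - m) / (0.5 * (e + m))) ** 2 for m, e in zip(measured, estimated))
    return 100.0 * math.sqrt(total / n)


def r_squared(measured: Sequence[float], estimated: Sequence[float]) -> float:
    """Squared Pearson correlation (undefined if either series is constant)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov * cov / (var_m * var_e)


# Invented example values (mg per square metre), for illustration only.
in_situ = [120.0, 250.0, 400.0, 610.0]
model_a = [130.0, 240.0, 430.0, 580.0]

print("RMSE  =", round(rmse(in_situ, model_a), 2))
print("URMSE =", round(urmse_percent(in_situ, model_a), 2), "%")
print("R2    =", round(r_squared(in_situ, model_a), 4))
```

Note that a systematic overestimation can coexist with a high R2, as observed for MB89: the correlation measures co-variation, while RMSE and URMSE measure the size of the differences.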

Analysis of Variation in Spatial Distribution
The seasonal changes in spatial distribution are attributed to the succession of dominant algae species. The three main algae species observed in Lake Chaohu were Cyanobacteria, Bacillariophytes and Chlorophytes. A high abundance of Cyanobacteria was observed in summer, which contributes to the heavy cyanobacterial bloom [62]. Starting in the autumn, cyanobacterial blooms degrade and are replaced by Bacillariophytes and Chlorophytes. Bacillariophytes are dominant in late winter and spring. Chlorophytes are dominant in spring and autumn. Moreover, the abundance of Bacillariophytes and Chlorophytes is observed to be higher in the east segment of Lake Chaohu [73].
According to Figure 6a,b, a nearly homogeneous spatial distribution was observed in Lake Chaohu. This is because the dominant algae species during this time, Bacillariophytes, was evenly distributed in the lake. In May, higher algal biomass was observed in the east lake, driven by Chlorophytes. In Figure 6c,d, a growing trend could be observed in the west lake, showing increasing Cyanobacteria concentrations, which lead to high spatial variation between the west lake and the other segments.

Extending Application to other Lakes
When applying this algorithm to other lakes, such as Lake Taihu, two points should be noted. First, the FAI threshold used to identify algal bloom conditions should be recalculated. This threshold is based on long-term statistics and can differ from one lake to another [30,54]. Because the algorithm is applied to algal bloom conditions only, the accuracy of the threshold may influence the inversion result. Second, the specific coefficients in the algorithm should be checked. Even if the form of the algorithm in Equations (2) and (4) can be applied to other lakes, the specific coefficients need to be adjusted (or validated) using locally sampled data, as the coefficients in this study were determined by empirical regression on the Lake Chaohu dataset. The composition of in-water optical constituents may vary between lakes, which will lead to different vertical distributions of phytoplankton [38].
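To make the role of the lake-specific threshold concrete, the sketch below computes the Floating Algae Index (FAI) in its usual baseline-subtraction form for MODIS (red, near-infrared and shortwave-infrared bands centred near 645, 859 and 1240 nm) and flags bloom pixels. The threshold value and the reflectance values are hypothetical placeholders; a real threshold would have to be derived from long-term statistics for the lake in question, as noted above.

```python
from typing import List, Sequence, Tuple

# Approximate MODIS band centres in nm (red, NIR, SWIR).
LAMBDA_RED, LAMBDA_NIR, LAMBDA_SWIR = 645.0, 859.0, 1240.0


def fai(r_red: float, r_nir: float, r_swir: float) -> float:
    """NIR reflectance minus a linear baseline drawn between red and SWIR."""
    weight = (LAMBDA_NIR - LAMBDA_RED) / (LAMBDA_SWIR - LAMBDA_RED)
    baseline = r_red + (r_swir - r_red) * weight
    return r_nir - baseline


def bloom_mask(
    pixels: Sequence[Tuple[float, float, float]], threshold: float
) -> List[bool]:
    """True where FAI exceeds the lake-specific threshold (bloom pixel)."""
    return [fai(red, nir, swir) > threshold for red, nir, swir in pixels]


# HYPOTHETICAL threshold; it must be recalculated for each lake.
EXAMPLE_THRESHOLD = 0.0

# Invented (red, NIR, SWIR) reflectances, for illustration only.
pixels = [
    (0.03, 0.02, 0.01),  # NIR below the baseline
    (0.03, 0.12, 0.02),  # strong NIR peak above the baseline
]
print(bloom_mask(pixels, EXAMPLE_THRESHOLD))
```

Only pixels flagged by such a mask would be passed to the bloom-condition algorithm, which is why a shifted threshold changes the set of pixels, and therefore the lake-wide biomass estimate.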

Algorithm for Data from Other Sensors
The most significant strength of the algorithm comes from its use of the red and green MODIS bands (band 1 and band 4). These broad wavebands allow the algorithm to be applied to other satellite systems (such as Landsat, Sentinel and SPOT). In theory, the algorithm could be applied to datasets with different temporal or spatial resolutions, provided that an appropriate atmospheric correction is conducted. In particular, the algorithm can be applied to ocean and coastal sensors, such as ESA's Sentinel-3 OLCI and the Ocean Color Instrument (OCI), the primary sensor of NASA's planned hyperspectral Plankton-Aerosol-Cloud-ocean Ecosystem (PACE) mission. All of these missions will provide opportunities to further develop, refine and validate the algorithm and to retrieve more accurate inversion results.

Algorithm Limitations
Our algorithm provided robust estimates of column-integrated algal biomass under algal bloom conditions. However, the algorithm may have problems under extremely dense bloom conditions. In the validation process, points with a surface Chl-a concentration over 700 µg/L were removed. Usually, under such conditions, cyanobacteria float on the surface and accumulate to form scums [74]. The corresponding remote sensing reflectance resembles that of terrestrial plants rather than water [32]. Large overestimation may occur if the algorithm is applied in this case. For this extreme condition, additional measurements and algorithm development are needed.
Another challenge in applying the algorithm is the difference in spatial resolution and the temporal gap between satellite and field measurements. A satellite pixel covers an area (e.g., MODIS has 250 m resolution) that may not be completely covered by algal bloom. Differences in local environmental factors may also lead to differences in the vertical migration of algae. Furthermore,

Figure 2. Schematic of the processing procedure of column-integrated algal biomass estimation.

Figure 3. The relationship between the BNDBI index and algal biomass integrated from the surface to different water depths. (a-c) refer to depths of 10, 20 and 40 cm, respectively (e.g., in (c), the Y-axis refers to algal biomass integrated from the surface to 40 cm).

Figure 4. The relationship between algal biomass integrated from the surface to 40 cm and total algal biomass integrated from the surface to various depths, based on in situ measurements. (a-f) represent depths from 1 to 6 m, respectively (e.g., in (f), column-integrated algal biomass refers to algal biomass integrated from the surface to 6 m).

Figure 5. The relationship between (a) coefficient a, (b) coefficient b and water depth. The black line refers to the fitting line. The corresponding formulas are shown in each figure.

Figure 6. Validation results between in situ measured algal biomass and estimated algal biomass from MODIS observation. (a) Results under algal bloom conditions; (b) results under all bloom and non-bloom conditions.

Figure 7. Area-integrated validation results based on satellite images. Estimated algal biomass from MODIS observations on consecutive days: (a) 15 February 2004; (b) 16 February 2004. The variation between (a,b) is 1.08 t, with MAPE = 7.41%. Comparison of MODIS-derived algal biomass on adjacent dates: (c) 15 May 2016; (d) 19 May 2016. The variation is 1.87 t, with MAPE = 7.41%.

Table 1. Measurements from sampling campaigns in Lake Chaohu during May, July and October 2013, May 2015 and November 2017: chlorophyll concentration (Chl-a, µg/L) and algal biomass (mg·m−2) at each water depth.

Table 2. Sensitivity analysis of the total algal biomass algorithm under algal bloom conditions.

Table 3. Comparison of the performance of the column integrated algal biomass algorithms applied to the in situ dataset from the present study.