Hybrid Chlorophyll-a Algorithm for Assessing Trophic States of a Tropical Brazilian Reservoir Based on MSI/Sentinel-2 Data

Using remote sensing for monitoring trophic states of inland waters relies on the calibration of chlorophyll-a (chl-a) bio-optical algorithms. One of the main limiting factors of calibrating those algorithms is that they cannot accurately cope with the wide chl-a concentration ranges in optically complex waters subject to different trophic states. Thus, this study proposes an optical hybrid chl-a algorithm (OHA), which is a combined framework of algorithms for specific chl-a concentration ranges. The study area is Ibitinga Reservoir, characterized by high spatiotemporal variability of chl-a concentrations (3–1000 mg/m³). We took the following steps to address this issue: 1) we defined optical classes of specific chl-a concentration ranges using Spectral Angle Mapper (SAM); 2) we calibrated/validated chl-a bio-optical algorithms for each trophic class using simulated Sentinel-2 MSI (Multispectral Instrument) bands; and 3) we applied a decision tree classifier in MSI/Sentinel-2 images to detect the optical classes and to switch to the suitable algorithm for the given class. The results showed that three optical classes represent different ranges of chl-a concentration: class 1 varies from 2.89 to 22.83 mg/m³, class 2 from 19.51 to 87.63 mg/m³, and class 3 from 75.89 to 938.97 mg/m³. The best algorithms for trophic classes 1, 2, and 3 are the 3 bands (R² = 0.78; Mean Absolute Percentage Error (MAPE) = 34.36%), slope (R² = 0.93; MAPE = 23.35%), and 2 bands (R² = 0.98; MAPE = 20.12%), respectively. The decision tree classifier showed an accuracy of 95% for detecting SAM’s optical trophic classes. The overall performance of OHA was satisfactory (R² = 0.98; MAPE = 26.33%) using in situ data but reduced in the Sentinel-2 images (R² = 0.42; MAPE = 28.32%) due to the temporal gap between matchups and the variability in reservoir hydrodynamics. In summary, OHA proved to be a viable method for estimating chl-a concentration in Ibitinga Reservoir, and the extension of this framework allowed a more precise chl-a estimate in eutrophic inland waters.


Introduction
Reservoirs are transitional systems between rivers and lakes created from the damming of a river. Being complex functioning ecosystems, they are subject to rapid changes in their biotic and abiotic variables, which are due to the natural variability of the environment, to drainage basin land cover and land-use changes, as well as to water resource demands [1,2]. Their main functions are power generation, domestic supply, flood control, and irrigation [3].
Industrial development alongside intense and disorganized population growth has affected reservoir ecosystem processes, especially water-quality preservation [4]. Anthropogenic activities have promoted water eutrophication due to nutrient enrichment caused by domestic and industrial sewage discharges from urban centers and by the runoff from agricultural regions [5]. Such high inputs of nutrients over time impair water quality, leading to increases in primary production, frequency of cyanobacteria blooms, and fish mortality [5,6]. Moreover, the interaction between climate change and eutrophication might soon increase the dominance of cyanobacteria and decrease phytoplankton biodiversity [7]. Thus, reservoirs might lose their multiple uses and become severe threats to public health. Therefore, for better governance of inland water resources and in order to meet the millennium goals, it is mandatory to improve the monitoring of water quality in reservoirs [8].
This water-quality assessment through remote sensing (RS) relies on the interaction between Optically Active Constituents (OACs) and electromagnetic radiation (EMR) within the visible and near-infrared (NIR) regions [9].In association with in situ measurements, RS can be an essential tool for monitoring environmental threats such as eutrophication and harmful algal blooms (HABs), providing, therefore, information to support new strategies for sustainable management of these ecosystems [10,11].It also offers substantial advantages over traditional monitoring methods, mainly because of the data synoptic coverage and temporal consistency [12].Several studies monitoring aquatic systems through RS focus on estimating chlorophyll-a concentration (chl-a; a photosynthetic pigment present in all phytoplankton species [13]) [14][15][16].Chl-a is the most commonly derived parameter in RS water quality mainly because it is the most comprehensive index of aquatic system trophic status [17][18][19].Moreover, chl-a concentration can be used as a proxy for phytoplankton biomass [20,21].
Recently, studies have shown that bio-optical algorithms for chl-a estimation perform better when calibrated to specific ranges of chl-a concentration [22][23][24]. In water color remote sensing, algorithms based on the relationship between blue and green reflectance (blue-green ratio) are predominantly restricted to oligotrophic ocean waters, where chl-a concentration reaches up to 10 mg/m³ and where colored dissolved organic matter (CDOM) does not mask chl-a absorption features in the blue region. In contrast, algorithms based on the relationship between red and near-infrared bands (red-NIR ratio) are more suitable for chl-a concentration above 10 mg/m³ in eutrophic environments [22,23]. In tropical Brazilian inland waters, studies using a single algorithm to estimate chl-a concentration have provided satisfactory results in a specific chl-a range [25][26][27][28][29]. As an example, Augusto-Silva et al. [25] used red-NIR bio-optical algorithms to estimate chl-a concentration in Funil Reservoir (Rio de Janeiro). However, the algorithms diverge from the 1:1 line as chl-a concentration increases towards 20 µg·L⁻¹ (equivalent to 20 mg/m³), with a more significant error beyond this value. These results led them to conclude that two algorithms are needed to estimate chl-a concentration accurately, one for what they call "normal conditions" and one for "blooming conditions". Therefore, a single algorithm cannot accurately cope with the wide concentration range of chl-a in optically complex waters subject to different trophic states. This may be because those systems are characterized by a complex mixture of OACs and a wide range of optical properties [30].
Owing to the constant changes in environmental conditions caused by natural or anthropic factors, the optical properties of an aquatic system may present a high spatiotemporal variability [31].These significant optical variabilities make optically complex waters present different spectral behaviors resulting in high variability of spectral shapes and intensity of remote-sensing reflectance (Rrs).Thus, various optical water types may be simultaneously present either in the same or in distinct water systems, containing waters dominated by phytoplankton, total suspended matter, or CDOM or by their several possible combinations [32][33][34][35][36].As a result of all those optical complexities, precise estimates of chl-a concentration by RS in inland waters are challenging [37].Thus, to overcome this challenge, the abovementioned studies support the establishment of a hybrid (i.e., conditional) approach for estimating chl-a concentration in inland waters with changing trophic levels over time.The hybrid approach switches among specific algorithms for different concentration ranges of chl-a.Therefore, effective monitoring of chl-a concentration in water systems that show various trophic states would benefit by using a dynamic approach that chooses suitable algorithms for the dominant optical properties and phytoplankton biomass variations [24].
Recently, Matsushita et al. [22] have proposed a hybrid algorithm for estimating chl-a concentration in five Asian lakes subject to several trophic levels.Their hybrid algorithm comprises three algorithms for the estimation of chl-a concentration in clear waters (blue-green ratio [21]), turbid waters (2 band ratios using red-NIR ratio [38]), and extremely turbid waters (semi-analytic algorithm [39]).They applied the maximum chlorophyll index (MCI [40]) corresponding to the respective concentration ranges to select the best algorithm for each range (chl-a ≤ 10 mg/m³, 10 < chl-a ≤ 25 mg/m³, and chl-a > 25 mg/m³, respectively).Likewise, Smith et al. [24] developed a hybrid approach that blended and switched between two different algorithms for estimating chl-a concentration along the west coast of South Africa, where the trophic state can vary from oligotrophic in oceanic waters to hypertrophic in coastal waters.An algorithm using blue-green ratio [41] was employed in waters with less than 10 mg/m³, an algorithm using the red-NIR ratio [38] was employed in waters with a chl-a concentration above 25 mg/m³, and both algorithms were blended in the range between 10 and 25 mg/m³.For switching and blending the algorithms, the reflectance ratios of 708 and 665 nm were applied, where the percentiles from 25% to 75% of in situ data corresponded to the 10-25 mg/m³ range.Thus, the blue-green ratio algorithm was chosen when the reflectance ratios were R(708)/R(665) ≤ 0.75, the red-NIR ratio was selected when R(708)/R(665) > 1.15, and both algorithms were blended when 0.75 < R(708)/R(665) ≤ 1.15 by using a weighting factor.
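The switching and blending rule described for Smith et al. [24] can be sketched as follows. This is only an illustration: the thresholds 0.75 and 1.15 come from the text above, whereas the linear form of the weighting factor and the function names are assumptions, not the published formulation.

```python
def blend_weight(ratio, low=0.75, high=1.15):
    """Weight given to the red-NIR estimate (0 = blue-green only, 1 = red-NIR only).

    Assumption: the weight rises linearly between the two thresholds.
    """
    if ratio <= low:
        return 0.0
    if ratio > high:
        return 1.0
    return (ratio - low) / (high - low)


def hybrid_chla(r708, r665, chla_blue_green, chla_red_nir):
    """Switch or blend two chl-a estimates using the R(708)/R(665) ratio."""
    w = blend_weight(r708 / r665)
    return (1.0 - w) * chla_blue_green + w * chla_red_nir
```

With a ratio halfway between the thresholds, both estimates contribute equally; outside the thresholds, only one algorithm is used.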
The hybrid approach was also applied without being named as such, that is to say, optically classifying coastal or inland waters and developing specific algorithms for estimating chl-a concentration for each optical class [33,34,36,42].An example is the normalized difference index (bands 705 and 665 nm; bands 560 and 442 nm) for two optical types of Mediterranean lakes, allowing different algorithms to be employed for each optical type [42].Moore et al. [33] applied an optical water type (OWT) structure for switching and blending two algorithms for chl-a concentration (blue-green ratio and three-band ratio using red-NIR).Firstly, from in situ Rrs of American and Spanish lakes and from coastal Rrs obtained in the SeaBASS (SeaWiFS Bio-optical Archive and Storage System), they defined seven OWTs using a fuzzy c-means algorithm.After defining the clusters, Moore et al. [33] classified Rrs spectra into their subsets.The membership functions were developed for each OWT using the Mahalanobis distance between the observation and the class mean vector, which was then put into a chi-square probability function.The class memberships were used as the basis for the weighting factors used to blend algorithm's retrievals into a single product when applied to satellite imagery.
In all the studies mentioned, the development of the hybrid algorithm structure, which switches amongst algorithms used for each specific chl-a concentration range, was built without using training or test datasets, and there was no evaluation of how such structures divided the data.Thus, there was no guarantee that the trophic classes would be suitable for the hybrid algorithm to be applied in satellite imagery pixel by pixel or that the hybrid algorithms would be reproducible in other aquatic systems with similar optical conditions.Furthermore, the existing approaches are not designed for covering a wide range of chl-a concentrations [33,34,36,42] and are not applicable in aquatic systems with high trophic state variability.Hence, it is essential to enhance the hybrid algorithm structure introducing classification methods that can guarantee its reproducibility performance.Likewise, the need to develop a hybrid structure able to suit this type of environment is reinforced [43].
The hybrid approach should be appropriate for estimating chl-a concentration in tropical reservoirs since the trophic state widely changes in space and time in the same or among distinct aquatic systems produced by either weather, hydrologic, hydrodynamic, or anthropogenic causes.The monitoring of trophic state by RS in this type of aquatic system would be sufficient by using three types of trophic classes that relate to low (oligotrophic/mesotrophic-class 1), medium (eutrophic-class 2), and high (hypertrophic-class 3) eutrophication conditions.Thus, the objective of this study was to create a structure for developing and validating a hybrid algorithm suitable for tropical reservoirs characterized by changing trophic states in space and time.In order to do so, chl-a algorithms were based on large in situ datasets collected in 2005 as well as between 2013 and 2018 in Ibitinga Reservoir, which is the third in a reservoir cascade system along the Tietê River in São Paulo state (SP -Brazil) characterized by high spatiotemporal variability of chl-a concentration.To accomplish our goal, we established the following specific objectives: i) to test an optical classification method for identifying classes of specific chl-a concentration ranges; ii) to calibrate, validate, and select adequate specific algorithms for the identified range classes; and iii) to develop a structure that allows switching between the selected algorithms according to the range classes.

Materials and Methods
The research methodology comprises the following steps: i) organization of a database composed of chl-a concentrations up to 1000 mg/m³ and glint-corrected Rrs obtained during eight field campaigns at Ibitinga Reservoir (São Paulo/Brazil), which took place in 2005, 2013, 2014, and 2018; ii) optical classification of chl-a concentration range classes based on the Spectral Angle Mapper (SAM) algorithm; iii) calibration and validation of bio-optical algorithms in each specific chl-a concentration range using Monte Carlo simulation from in situ Rrs data simulated for Sentinel-2 MSI (Multispectral Instrument) bands; iv) development of a hybrid algorithm based on a decision tree classifier; v) validation of the hybrid algorithm using in situ data; and vi) application of the hybrid algorithm to atmosphere- and glint-corrected MSI/Sentinel-2 images, followed by algorithm validation comparing the estimated chl-a with in situ values.

Study Area
The Ibitinga Reservoir (Figure 1) is the third reservoir in a cascade system built along the Tietê River in the region of São Paulo state (21°45'S, 48°59'W). It is a regulation reservoir in which the dam's activity controls hydraulic and trophic dynamics, thus determining variations in optical and limnological properties.
Since the 2000s, the Ibitinga drainage basin has undergone a fast expansion of sugar cane agriculture [44]. In eleven years, from 2003 to 2014, the region around the Ibitinga reservoir drainage basin (cities of Ibitinga, Iacanga, Itaju, and Arealva) saw the sugar cane area expand by roughly 300%, from 15,061 ha to 63,510 ha, at a growth rate of around 4500 ha/year [45]. This rapid change might have aggravated the occurrence of cyanobacterial blooms during the Brazilian summer, causing the reduction of dissolved oxygen and the mortality of tons of fish [46]. The increase of temperature in the summer, associated with higher nutrient export to the reservoir related to more frequent and intense rainstorms in the reservoir drainage basin, stimulates phytoplankton growth, favoring the eutrophication of the aquatic system [13].
Additionally, the reservoir water quality is also affected by untreated sewage disposal.The drainage basin of the Tietê River is one of the most industrialized and highly populated areas in Brazil, with critical water-quality issues [47][48][49][50][51].Moreover, the Tietê River receives crude sewage from São Paulo city, the largest metropolitan city of South America.Considering the two main tributaries of Ibitinga reservoir (Figure 1A), Jacaré-Guaçu River basin, which receives Ibitinga sewage, is more populated and less preserved than Jacaré-Pepira River.Jacaré-Pepira River basin is located in an Environmental Protection Area (EPA-Ibitinga) [52,53].
Past studies of Vieira et al. [54] and Guimarães Jr et al. [55] showed that land use in the reservoir drainage basin contributes to the increasing trophic level due to urban sewage and nutrients derived from agriculture.Tundisi et al. [56] identified many issues in Ibitinga drainage basin, such as the increase of nitrogen and phosphorus sources related to the expansion of sugar cane and urban river degradation associated with population growth.Novo et al. [17], based on remote-sensing methods, reported that the reservoir could be classified as mesotrophic and hypertrophic in different regions, depending on the sampling location.These results corroborate the previous study of Luzia [57], who classified the reservoir as hypertrophic in the rainy season of January 2006 and as eutrophic in the dry season of June 2005.It is worth mentioning that previous studies at Ibitinga reservoir [17,58,60] were based on a single chl-a concentration algorithm for the entire reservoir.The algorithm was calibrated and validated using data acquired in only one survey campaign.Therefore, the algorithm does not capture seasonal changes in the reservoir trophic state.The present study focuses on surpassing this limitation by combining bio-optical algorithms and criteria for algorithm selection by using a database composed of measurements acquired in different seasons and hydrological years, thus representing a wide range of optical and limnological conditions of Ibitinga reservoir.

In Situ Dataset
This study utilized data acquired in eight survey campaigns (Table 1; Figure 1), specifically chl-a concentration and radiometric data. The chl-a concentration was obtained by the methodology of Nush [61], and radiometric data were acquired using different methods and protocols. Londe [58] measured radiometric data using an ASD FieldSpec Hand Held, while in the Cairo [59] and Aug/2018 datasets, the radiometric measurements were carried out using a TriOs radiometer, following the method described by Cairo et al. [62]. The remote-sensing reflectance measured above water (Rrs + ) was computed using the equation described by Kirk [9] for the Cairo [59] and Aug/2018 datasets. Londe [58] data were available as reflectance factors and were converted into Rrs + by dividing them by π. Moreover, to remove noise, we applied an average filter with a window size of nine to the in situ Rrs + spectra of Londe [58].
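The two preprocessing steps applied to the Londe [58] spectra (division by π and a nine-point average filter) can be sketched as below. The function names are illustrative, and the way the window is handled at the spectrum edges (shrinking it) is an assumption, since the text does not specify it.

```python
import math


def reflectance_factor_to_rrs(reflectance_factor):
    """Convert a reflectance-factor spectrum to Rrs (sr^-1) by dividing by pi."""
    return [r / math.pi for r in reflectance_factor]


def moving_average(values, window=9):
    """Centered moving average; the window shrinks near the edges (assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed
```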

[Table 1: survey campaigns; recoverable column headers: total sampling stations; chl-a concentration variation (mg/m³).]

Literature analysis helped to identify a series of algorithms previously proven to be useful for estimating chl-a concentration. In this study, those algorithms with satisfactory performance in optically complex waters were then assessed, including those already tested in a hybrid approach [22,24].

[Table 2: bio-optical algorithms; recoverable column headers: algorithm types; equations. First row: simple band ratios.]

Chl-a Concentration Ranges
The partition of the whole range of chl-a concentration values at adequate intervals for algorithm parametrization, considering up to 1000 mg/m³ values, was carried out using an optical classification method, taking into account the shape of in situ Rrs + spectra. For that, the Spectral Angle Mapper (SAM) technique [73] with supervised classification was used. In situ spectral Rrs + ranging from 600 to 825 nm was utilized as input into SAM's classifier for three reasons: i) this is the spectral range in which the largest change in spectrum shape occurs as the chl-a concentration increases; ii) most of the bio-optical algorithms tested focus on wavelengths inside this spectral range; and iii) studies that used optical classification methods to estimate chl-a concentrations in optically complex waters also focus on wavelengths inside this spectral range [34,36].
Furthermore, the SAM classifier used three reference spectra (Figure 2) representing low, medium, and high chl-a concentration values. The class 1 reference spectrum is equivalent to a chl-a concentration of approximately 3 mg/m³, class 2 is equivalent to that of 40 mg/m³, and class 3 is equivalent to that of 380 mg/m³, roughly corresponding to oligotrophic, eutrophic, and hypertrophic waters, respectively. Lastly, SAM grouped the Rrs + spectra in three subsets according to the reference spectra, and the specific chl-a concentration ranges were determined considering the minimum and maximum chl-a values of each subset.
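A minimal sketch of this classification step is given below, assuming each spectrum is a list of Rrs values sampled at the same wavelengths (600 to 825 nm) as the reference spectra. The spectral angle is the standard SAM quantity; the function names and the toy inputs used to exercise them are illustrative and not part of the study.

```python
import math


def spectral_angle(spectrum, reference):
    """Angle (radians) between two spectra treated as vectors."""
    dot = sum(s * r for s, r in zip(spectrum, reference))
    norm_s = math.sqrt(sum(s * s for s in spectrum))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_s * norm_r)))
    return math.acos(cos_angle)


def sam_classify(spectrum, references):
    """Return the label of the reference spectrum with the smallest angle."""
    return min(references, key=lambda label: spectral_angle(spectrum, references[label]))


def class_ranges(labels, chla):
    """Minimum and maximum chl-a of the stations assigned to each class."""
    ranges = {}
    for label, value in zip(labels, chla):
        lo, hi = ranges.get(label, (value, value))
        ranges[label] = (min(lo, value), max(hi, value))
    return ranges
```

Because the angle ignores vector length, two spectra with the same shape but different overall intensity fall into the same class, which is why the method groups spectra by shape.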

Calibration/Validation of Bio-Optical Algorithms
Before calibrating/validating the algorithms, an exploratory analysis of all bio-optical algorithms listed in Table 2 was performed to select the most suitable ones for a given chl-a concentration range, thus reducing the number of algorithms tested in Monte Carlo simulation. For this purpose, scatterplots were made considering the measured and modeled chl-a concentrations and using all sampling stations present in each specific chl-a concentration range. The algorithms listed in Table 2 were used in their original form without any coefficient re-parametrization. Linear, exponential, and polynomial fits were applied in each scatterplot, and the respective determination coefficients (R²) were computed. The criteria for selecting the algorithms submitted to Monte Carlo simulation were as follows: i) existence of a trend in "in situ chl-a versus modeled chl-a" dispersion, hence not being a random distribution; ii) R² value of the three fits tested larger than 0.6.
The calibration/validation process of bio-optical algorithms based on in situ Rrs + values simulated for MSI/Sentinel-2 bands in each specific chl-a concentration range used the Monte Carlo simulation with 10,000 iterations. The calibration step used 70% of the entire dataset (Table 1), while the validation step used the remaining 30%, randomly chosen. After each iteration, both steps computed the same statistical metrics: Mean Absolute Percentage Error (MAPE; Equation (1)), R², Root Mean Square Error (RMSE; Equation (2)), and Normalized Root Mean Square Error (NRMSE; Equation (3)). Furthermore, algorithms based on linear, polynomial, and exponential fit approaches were tested. MAPE's statistical mode was the selected metric to assess calibration/validation performance, as it provides absolute percentage errors, allowing the comparison among algorithms regardless of the dataset and concentration ranges [74].
$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{x_i - y_i}{x_i}\right| \quad (1)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - y_i\right)^2} \quad (2)$$
$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{x_{max} - x_{min}} \times 100 \quad (3)$$
where x_i is the in situ chl-a concentration value for station i, y_i is the estimated chl-a concentration value for station i, n is the number of stations, and x_max and x_min are the maximum and minimum chl-a concentration values of the dataset. From all the Monte Carlo output algorithms, those with the lowest MAPE and NRMSE as well as with the highest R² were selected for each specific chl-a concentration range. This set of algorithms was applied to all sample stations within the respective chl-a range, resulting in scatter plots (chl-a measured data versus chl-a estimated data, 1:1 line). Therefore, the final selection of the best performing algorithm for each chl-a range considered not only the statistics in the Monte Carlo validation process but also the dispersion of estimated and measured chl-a in the scatter plot.
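The Monte Carlo calibration/validation loop can be illustrated with the sketch below. It assumes a linear fit between a spectral index and chl-a, uses standard definitions of MAPE, RMSE, and NRMSE, and takes the mode of MAPE after rounding; the rounding step and all function names are assumptions made for the example, not details given in the study.

```python
import math
import random
from collections import Counter


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mape(observed, estimated):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs(o - e) / o for o, e in zip(observed, estimated)) / len(observed)


def rmse(observed, estimated):
    """Root Mean Square Error, in the units of the observations."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / len(observed))


def nrmse(observed, estimated):
    """RMSE normalized by the observed range, in percent."""
    return 100.0 * rmse(observed, estimated) / (max(observed) - min(observed))


def monte_carlo(index, chla, iterations=10000, train_fraction=0.7, seed=0):
    """Repeat a random 70/30 calibration/validation split; return validation MAPEs."""
    rng = random.Random(seed)
    n = len(index)
    n_train = round(train_fraction * n)
    mapes = []
    for _ in range(iterations):
        order = list(range(n))
        rng.shuffle(order)
        train, test = order[:n_train], order[n_train:]
        slope, intercept = linear_fit([index[i] for i in train], [chla[i] for i in train])
        estimated = [slope * index[i] + intercept for i in test]
        mapes.append(mape([chla[i] for i in test], estimated))
    return mapes


def mape_mode(mapes, decimals=0):
    """Most frequent MAPE after rounding (the rounding step is an assumption)."""
    return Counter(round(m, decimals) for m in mapes).most_common(1)[0][0]
```

Polynomial or exponential fits would replace `linear_fit` in the same loop; only the fitting and prediction lines change.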

Hybrid Algorithm Construction
The hybrid algorithm was named Optical Hybrid Algorithm (OHA) since the division of chl-a concentration classes was based on the Rrs + spectrum shape, a water optical property.In order to build the OHA, a decision tree classifier was used [43], which assigns a specific algorithm to each image pixel depending on the chl-a concentration range by using decision rules defined by thresholds.
In the study presented herein, the input data for the decision tree were divided into 70% as a training set and 30% as a test set, chosen randomly. The input attributes were based on the hybrid approaches of Le et al. [34], Matsushita et al. [22], Shi et al. [36], and Smith et al. [24]: i) in situ Rrs + simulated (Table 1) for MSI/Sentinel-2 bands (B3, B4, B5, and B6); ii) their respective band ratios (e.g., B4/B3, B5/B3, B5/B4, B5/B6, B6/B3, B6/B4, and B6/B5); iii) slope between bands B3-B4, B4-B5, and B5-B6; and iv) Maximum Chlorophyll Index (MCI; Table 2). These kinds of input attributes were considered in order to apply the hybrid algorithm to MSI/Sentinel-2 images pixel by pixel simply and quickly. Since the input attributes (A) were of a continuous type (i.e., numeric), the training set test condition was expressed as a comparison test (A < v) or (A ≥ v) with binary results, where the decision tree algorithm considered all possible partitions (v) and selected the one producing the best partition.
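The band, ratio, and slope attributes listed above can be assembled per spectrum as in the following sketch. The nominal MSI band centers (560, 665, 705, and 740 nm) are used for the slopes, and the slope is taken as the Rrs difference divided by the wavelength difference, which is an assumption about the definition used in the study. MCI is omitted because its equation (Table 2) is not reproduced here.

```python
# Nominal central wavelengths (nm) of the MSI bands used as inputs.
MSI_CENTER_NM = {"B3": 560.0, "B4": 665.0, "B5": 705.0, "B6": 740.0}

RATIO_PAIRS = [("B4", "B3"), ("B5", "B3"), ("B5", "B4"), ("B5", "B6"),
               ("B6", "B3"), ("B6", "B4"), ("B6", "B5")]
SLOPE_PAIRS = [("B3", "B4"), ("B4", "B5"), ("B5", "B6")]


def decision_tree_attributes(rrs):
    """Build the attribute dictionary for one spectrum (rrs maps band name to Rrs)."""
    features = dict(rrs)
    for num, den in RATIO_PAIRS:
        features[f"{num}/{den}"] = rrs[num] / rrs[den]
    for a, b in SLOPE_PAIRS:
        # Assumed slope definition: change in Rrs per nanometer between band centers.
        features[f"slope_{a}_{b}"] = (rrs[b] - rrs[a]) / (MSI_CENTER_NM[b] - MSI_CENTER_NM[a])
    return features
```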
The Gini index (Equation (4)) was used to select the best partition, as this index measures the probability of a particular variable being wrongly classified when it is randomly chosen [43]. Gini varies between 0 and 1, where 0 denotes that all elements belong to a particular class (higher separability among classes) and 1 denotes that the elements are randomly distributed across various classes (lower separability among classes). Lastly, the decision tree training halts when it classifies at least 95% of the training data.
$$\mathrm{Gini}(t) = 1 - \sum_{i}\left[p(i|t)\right]^2 \quad (4)$$
The performance of the classification was evaluated by analyzing the confusion matrix, the precision (Equation (5)), as well as the recall (Equation (6)). Precision is a measure of the result relevance, and recall is a classification algorithm sensitivity measure. Thus, high values for both performance metrics indicate precise results (high precision), with higher frequency for positive results (high sensitivity algorithm).
$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (5)$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (6)$$
where p(i|t) is the fraction of records that belong to class i in a specific node t, TP is the number of true positives, FP is the number of false positives, and FN is the number of false negatives [43].
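As an illustration of how a threshold v is chosen for one continuous attribute and how the classification is scored, the sketch below computes the Gini index, searches candidate thresholds, and derives precision and recall for one class. Using midpoints between sorted values as candidate thresholds is a common convention assumed here, and the function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold v, weighted Gini) for the best test A < v on one attribute."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for (a, _), (b, _) in zip(pairs, pairs[1:]):
        if a == b:
            continue
        v = (a + b) / 2.0  # candidate threshold: midpoint (assumption)
        left = [lab for x, lab in pairs if x < v]
        right = [lab for x, lab in pairs if x >= v]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (v, score)
    return best


def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for one class, from true and predicted labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A weighted Gini of zero after the split means both child nodes contain a single class each, which corresponds to the pure subsets a well-trained tree aims for.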

Hybrid Algorithm Validation-In Situ Data
The OHA validation, based on in situ data, used the data from the randomly chosen 30% for Monte Carlo validation in each specific chl-a concentration range. Moreover, the hybrid algorithm performance was assessed using MAPE, R², and NRMSE statistics, as mentioned above.

Hybrid Algorithm Application on MSI/Sentinel-2 Image
The MSI/Sentinel-2 image acquired concurrently with the field campaign on August 13, 2018 was downloaded from the United States Geological Survey (USGS), corresponding to the T22KGA tile. The atmospheric correction was based on the 6S (Second Simulation of the Satellite Signal in the Solar Spectrum) model [75], parameterized as described by Martins et al. [76]. A modified version of Py6S [77] developed at the Instrumentation Laboratory for Aquatic Systems (LabISA, http://www.dpi.inpe.br/labisa/index_en.html) was used to apply 6S [78], which is a physically based atmospheric correction algorithm; its good performance over tropical inland aquatic ecosystems for Rrs retrieval has been demonstrated [76,79]. Then, the surface reflectance image was divided by π to convert it to the atmospherically corrected remote-sensing reflectance (Rrs_atmcorr). Lastly, the MSI/Sentinel-2 bands were resampled to a 20-meter spatial resolution.
The atmospheric correction validation was executed as follows: i) the Rrs_atmcorr was determined from the image (August 13, 2018) as the mean value of pixels extracted from a 3 × 3 pixel window at each sampling station (lat/long; 8 stations) in order to reduce the effects of a low signal-to-noise ratio (SNR) [80]; ii) the atmospheric-correction performance assessment was based on the statistical metrics MAPE, R², and NRMSE, comparing the extracted Rrs_atmcorr with the simulated in situ Rrs +. Likewise, the atmospheric correction performance was assessed after applying glint correction to the image, using a shortwave infrared (SWIR) subtraction methodology proposed by Wang and Shi [81].
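The conversion from surface reflectance to Rrs and the SWIR-subtraction glint correction can be sketched per pixel as follows. The sketch assumes the correction is a plain subtraction of the SWIR-band value (band 11 in the results of this study) from every other band, which simplifies the cited methodology; the function names are illustrative.

```python
import math


def surface_reflectance_to_rrs(reflectance):
    """Convert surface reflectance to remote-sensing reflectance (sr^-1)."""
    return reflectance / math.pi


def swir_glint_correction(rrs_bands, swir_band="B11"):
    """Subtract the SWIR-band Rrs from every other band (simplified assumption)."""
    offset = rrs_bands[swir_band]
    return {band: value - offset for band, value in rrs_bands.items() if band != swir_band}
```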
OHA was applied over the corrected image, generating a class map and a chl-a concentration estimation map. The hybrid algorithm validation considered the estimated chl-a concentration extracted from the map using the median of a 3 × 3-pixel window at each sampling station (lat/long) of the August/2018 field campaign (n = 8). Lastly, R², MAPE, and NRMSE statistics were calculated comparing these estimated results with in situ chl-a concentration measurements of the same field mission. It is noteworthy that, in the class map, the classification of each sampling station was determined from the statistical mode of a 3 × 3-pixel window, which allowed us to evaluate whether the decision tree classified each sampling station with the same optical class as the SAM algorithm.
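The window-based extraction at each station can be sketched as below, assuming the maps are stored as nested lists indexed by row and column and that the station has already been converted from lat/long to a pixel position. Clipping the window at the image border is an assumption; the function names are illustrative.

```python
from statistics import median, mode


def window_values(grid, row, col, size=3):
    """Collect the values of a size x size window centered on (row, col)."""
    half = size // 2
    values = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            # Skip positions that fall outside the image (assumption).
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                values.append(grid[r][c])
    return values


def station_chla(chla_map, row, col):
    """Median chl-a of the 3 x 3 window around a station."""
    return median(window_values(chla_map, row, col))


def station_class(class_map, row, col):
    """Most frequent class label in the 3 x 3 window around a station."""
    return mode(window_values(class_map, row, col))
```

The median keeps a single anomalous pixel from dominating the station estimate, and the mode does the same for the class label.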

SAM Trophic Classes
SAM's classification resulted in three trophic classes (classes 1, 2, and 3), which are respectively represented in Figure 3A-C. Class 1 (low eutrophication) had a green reflectance peak at ~560 nm, which was higher than both the red peak (~650 nm) and the infrared peak (~700 nm); the red peak was higher than the infrared peak, which becomes noticeable around 810 nm. Class 2 (medium eutrophication) had a green reflectance peak at ~550-560 nm, which was higher than the red peak (~650 nm) and the infrared peak (~700-705 nm); the infrared peak was higher than the red peak; and the peak around 810 nm was enhanced but was still lower than the previous peaks. Class 3 (high eutrophication) had a green reflectance peak at ~550 nm, which was higher than the red peak (~650 nm) but with variable intensity at ~705-710 nm; the red peak (~650 nm) shows the same pattern in relation to the peak around 810 nm; and the peak around 810 nm could be a little lower than the green (~550 nm) and infrared peaks (~705-710 nm) but could also be higher than both, forming a plateau (vegetation-like spectral behavior). Table 3 shows the specific chl-a concentration ranges defined by the optical method described in Section 2.4 for the hybrid algorithm. As the chl-a concentration partition was based on the similarity of in situ Rrs + spectra shapes, this optical method is a suitable technique to establish the specific chl-a ranges in another eutrophic optically complex aquatic system with distinct spectral behavior. As the SAM algorithm establishes optical classes, the chl-a concentration ranges display small overlaps between classes related to the fuzzy nature of the water masses and the presence of the remaining optically active components.

Assessment of In Situ Chl-a Bio-Optical Algorithms for Each Chl-a Concentration Range
The best performing algorithms for each chl-a concentration range are shown in Table 4. Generally, the statistics of the model in all the spectra vary (minimum/maximum): MAPE from 20.12% to 34.36%; R² from 0.78 to 0.98; NRMSE from 7.92% to 26.78%. The results from applying the same algorithms calibrated and validated for wider chl-a concentration ranges (

Decision Tree for Detecting Trophic Classes
Simple band ratios proved to be the best input for the decision tree classification model training and test (methodology Section 2.6) applied in the hybrid algorithm conditional structure (Figure 4). Band ratios were the only input able to avoid the interference of Rrs + intensity into the conditional threshold and to not restrict the application of the hybrid algorithm on the image to specific Rrs + intensities. In general, the OHA decision tree showed good training results (Figure 4A) expressed by Gini index values equal or very close to zero, indicating that the classification algorithm divided the training data into the purest subsets. Furthermore, the OHA decision tree validation result not only presented an accuracy of 95% (Figure 4B) but also showed that the classification algorithm had a high precision and sensitivity (Table 6).

OHA Framework and In Situ Validation
Figure 5 depicts the final OHA framework. This figure summarizes the results of each step followed to develop the OHA, which are shown in Table 3, Table 4, and Figure 4. The OHA framework is composed of i) the division of the specific chl-a concentration ranges (classes 1, 2, and 3, which are low, medium, and high eutrophication, respectively) resulting from the SAM classifier (using in situ Rrs + (λ) in Table 3); ii) the best performing chl-a bio-optical algorithm associated with each trophic class (Table 4); and iii) a decision tree classifier, based on decision rules defined by thresholds, that switches among the trophic classes and their respective chl-a algorithms pixel by pixel in satellite images (or using in situ data). In the example of Figure 5, the first step for applying OHA to MSI/Sentinel-2 images is to check the decision rules of the decision tree (black circles). These decision rules are tested pixel by pixel in order to classify each pixel into one of the three trophic level classes, producing a map of the trophic level classes (low, medium, or high eutrophication conditions). Then, the best performing chl-a algorithm selected for each trophic class by the Monte Carlo simulation is applied pixel by pixel according to its classification. Thus, an estimated chl-a concentration map is obtained.
The validation of OHA with in situ data (Figure 6) indicated that the hybrid algorithm performed well using both class 3 algorithm cases (B5/B3 for Class3_600 (Figure 6A) and B6/B3 for Class3_1000 (Figure 6B; Table 4)). This shows that, for tropical reservoirs, bio-optical algorithms perform better when describing datasets with similar spectral behaviors, which reinforces the importance of the hybrid approach when estimating chl-a concentration in these complex aquatic systems based on optical remote sensing. The performance of the OHA bio-optical algorithms in each specific concentration range (Figure 7) allows assessing the decision tree performance regarding the hybrid algorithm validation process (Section 2.7). The results (Table 4 and Figure 7) show that there was no decision tree classification error during the OHA validation process (statistical values were the same).
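The pixel-by-pixel switching described above can be illustrated with a minimal sketch. It assumes that the decision rules reduce to thresholds on the B5/B4 band ratio. Only the class 1 rule (B5/B4 ≤ 1.20) is stated in the text; the class 2/3 threshold, the function names, and the three models with their coefficients are hypothetical placeholders and do not reproduce the calibrated fits of Table 4.

```python
import numpy as np

T_CLASS1 = 1.20  # from the text: B5/B4 <= 1.20 -> class 1
T_CLASS3 = 2.00  # HYPOTHETICAL class 2/3 cut; not stated in this section


def classify_pixels(b4, b5, t_class1=T_CLASS1, t_class3=T_CLASS3):
    """Label each pixel 1, 2, or 3 from its B5/B4 ratio."""
    ratio = np.asarray(b5, dtype=float) / np.asarray(b4, dtype=float)
    classes = np.full(ratio.shape, 2, dtype=int)
    classes[ratio <= t_class1] = 1
    classes[ratio > t_class3] = 3
    return classes


def apply_oha(bands, models):
    """Classify every pixel, then apply the model registered for its class."""
    bands = {name: np.asarray(values, dtype=float) for name, values in bands.items()}
    classes = classify_pixels(bands["B4"], bands["B5"])
    chl = np.full(classes.shape, np.nan)
    for label, model in models.items():
        mask = classes == label
        if mask.any():
            chl[mask] = model({name: values[mask] for name, values in bands.items()})
    return classes, chl


# PLACEHOLDER models with made-up coefficients, for illustration only.
models = {
    1: lambda b: 10.0 * b["B5"] / b["B4"],
    2: lambda b: 40.0 * b["B5"] / b["B4"],
    3: lambda b: 100.0 * b["B5"] / b["B3"],
}
# Made-up Rrs values (sr^-1) giving B5/B4 ratios of 1.12, 1.28, and 2.50.
bands = {
    "B3": [0.010, 0.010, 0.010],
    "B4": [0.010, 0.010, 0.010],
    "B5": [0.0112, 0.0128, 0.0250],
}
classes, chl = apply_oha(bands, models)
print(classes, chl)
```

Because the models are passed in as a dictionary, the classification step stays separate from the choice of bio-optical algorithm, so a different algorithm can be registered for a class without touching the decision rules.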

Atmospheric Correction Evaluation and OHA Image Validation
The validation results of the atmospheric correction, with and without glint correction (Figure 8), indicate good performance in both cases. However, the glint correction based on MSI/Sentinel-2 band 11 subtraction (Figure 8B) brought the scatter points of bands 2 and 6-8 closer to the 1:1 line, while causing overcorrection in the remaining bands. When the bands were analyzed separately (Table 7), it was noticeable that glint correction improves the accuracy for NIR MSI bands (reduction of up to 20% in MAPE values, expected for B05), corroborating previous studies [79]. MAPE values for glint correction using MSI B11 were lower than 29% for all bands except for B08. In comparison, Rrs without glint correction presented MAPE values higher than 50% for B06, B07, and B08. Several authors have already demonstrated that errors in atmospheric correction increase at longer wavelengths [79,82]. It is noteworthy that R² values are not high, which could be explained by the time differences between in situ data and satellite overpass as well as by the low variability in Rrs values for each band.
Despite that, as MAPE and RMSE presented accurate results for B11 glint-corrected Rrs (MAPE < 33%, RMSE < 0.003 sr⁻¹), the 6S atmospheric correction with B11 glint correction was selected for imagery application of the chl-a hybrid algorithm. Moreover, as the algorithms selected for class 1 and class 2 (the classes in which chl-a concentration was observed during the 2018 field campaign) use B04, B05, and B06, 6S atmospheric correction combined with glint correction using B11 presented the most suitable results. Figure 9 shows OHA validation results over the MSI/Sentinel-2 image, with and without glint correction. We can observe that the validation results improved when taking glint correction into account and that both glint correction methods obtained the same statistical values, as we can see in Figure 9B,C. The lack of glint correction (Figure 9A) caused two decision tree classification errors, while only one error was found after glint correction. OHA validation results were not affected by the expanded range of the chl-a "Class 3_600" and "Class 3_1000" algorithms (Table 4). The reason is that, at the sampling stations of the August/2018 field campaign (n = 8), only low and medium eutrophication conditions were observed. Furthermore, it is possible to notice that there are some outliers (colored circles in Figure 9B) even when glint correction is applied. P1 and P2 stations (blue and red circles, respectively) were collected one day before the Sentinel-2 overpass, and P6 station (orange circle) was collected an hour before the satellite overpass. Beyond the temporal gap between matchups, these errors could also be related to algal bloom hydrodynamics in the reservoir [60].
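The band 11 subtraction mentioned above can be sketched as follows. The sketch assumes that the residual signal in the SWIR band (B11) is attributed entirely to glint and is removed from every other band of the same pixel; the function name and the numbers are illustrative and are not the values of Table 7.

```python
import numpy as np


def glint_correct(rrs, b11):
    """Subtract the B11 signal from each band, pixel by pixel.

    rrs: array of shape (n_bands, n_pixels), atmospherically corrected Rrs.
    b11: array of shape (n_pixels,), Rrs retrieved for band 11.
    Returns the corrected array and a mask flagging negative results,
    which would indicate overcorrection.
    """
    rrs = np.asarray(rrs, dtype=float)
    b11 = np.asarray(b11, dtype=float)
    corrected = rrs - b11[np.newaxis, :]
    overcorrected = corrected < 0.0
    return corrected, overcorrected


# Made-up Rrs values (sr^-1) for two bands and two pixels.
rrs = [[0.010, 0.020],
       [0.005, 0.0005]]
b11 = [0.002, 0.001]
corrected, overcorrected = glint_correct(rrs, b11)
print(corrected)
print(overcorrected)
```

Negative values are flagged rather than clipped here, so that overcorrected bands remain visible when the result is compared with in situ measurements.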

Trophic Class Optical Properties
Differences amongst the three subsets derived from SAM optical classification can be observed through in situ examples of Rrs + spectra; phytoplankton absorption coefficient (aphy); non-algal particles (NAP) absorption coefficient (anap); and CDOM absorption coefficient (acdom) (Figure 11). Moreover, each field station represents a distinct trophic state determined by SAM classes (Table 3) and their respective water colors at the time of field acquisition. In the three subsets, the Rrs + spectral properties are quite different, with red and NIR being the most affected spectral regions as chl-a concentration increases. The change is marked by both an increase in chl-a absorption around 676 nm and backscattering by phytoplankton cells around 709 nm and 810 nm. As the bloom cell density rises, the reflectance throughout the NIR region increases until it forms a plateau. Finally, water dominated by surface-accumulated phytoplankton has a spectral behavior similar to that of vegetation. Under those conditions, a thick phytoplankton scum is formed, which hinders or entirely blocks the interaction between water and incident EMR.
Based on the aphy, anap, and acdom spectra derived under low eutrophication conditions (Figure 11A), in the red region (above 600 nm), acdom (λ) has a smaller influence on total absorption than aphy (λ). On the other hand, anap (λ) has a greater influence on total absorption than aphy (λ). Thus, anap may mask aphy features at this eutrophication level, affecting the performance of the algorithm selected to estimate chl-a concentration for class 1 (2.89 ≤ chl-a ≤ 22.83 mg/m³). This can be seen in Table 4, where the performance of the 3-band algorithm (665, 705, and 740 nm) of Gitelson et al. [70] was lower (MAPE = 34.36%) than that of those selected for the other concentration ranges (MAPE < 24%). It is important to highlight that, as chl-a concentration increases, the spectral influence of acdom and anap decreases relative to that of aphy (Figure 11B,C), especially in the red and NIR regions. This shows that, when Ibitinga reservoir has medium to high eutrophication conditions, phytoplankton is the dominant optical component. Therefore, this may be a factor that contributed to a better performance of the algorithms selected for the chl-a ranges of SAM's classes 2 and 3 (Table 4).
Moreover, from Figure 11, it is also possible to observe that, as chl-a concentration increases, the absorption around 440 nm in the Rrs + spectrum becomes more evident, overcoming the effect of acdom and anap and thus preventing them from masking the feature of chl-a in the Rrs + spectrum. Both acdom and anap have greater influence on the blue/green region under oligotrophic/mesotrophic conditions or under high precipitation in the Jacaré-Pepira River basin, an environmental preservation area with high accumulation of leached organic matter flowing into the reservoir.
It is noteworthy that the differences in optical properties of the subsets are highlighted by the apparent differences in water color, thus having low, medium, and high eutrophication levels in the subsets A, B, and C (Figure 11), respectively. Figure 11 also highlights the water-color changes for the same sampling station (near the dam) over time. All this diversity of water optical properties shown in each subset for Ibitinga reservoir endorses the necessity of a hybrid approach to describe all this spectral variability and thus to provide good estimates of chl-a concentration.

Specific Chl-a Concentration-Range Assessment
The definition of chl-a concentration ranges in the present study (Section 2.4; Table 3) was based on a more robust criterion than those of the hybrid algorithms proposed by Matsushita et al. [22] and Smith et al. [24]. These authors [22,24] established the specific concentration ranges empirically (chl-a ≤ 10 mg/m³, 10 < chl-a ≤ 25 mg/m³, and chl-a > 25 mg/m³), claiming that these concentration values were reported by previous studies as decisive thresholds for the operation of various algorithms. Hence, the authors used chl-a ranges that had been defined for purposes other than supporting a hybrid algorithm. Furthermore, the effect that such classes would have on water optics, which is fundamental for algorithm accuracy, was not taken into account.
On the other hand, the present study used a supervised optical classification method based on Rrs + spectra whose shapes represented different trophic conditions of the aquatic system. This way, the hybrid algorithm allows the mapping of the Trophic State Index (TSI) from an optical perspective, through which a given set of Rrs + spectral shapes in a given specific chl-a concentration range can be associated with a specific trophic condition. TSI classes are broad, with trophic states split into low, medium, and high eutrophication conditions, since only three optical classes were obtained from SAM due to the size and representativeness of the sample set (Table 3; Figure 3). Likewise, other studies have used optical classification to establish OAC concentration ranges from different optical water types, such as optical classification based on a fuzzy logic classification algorithm [33] and a hierarchical clustering method [36]. It is also noteworthy that, comparing the values of the chl-a range limits, SAM's class 1 range in this paper (2.89 ≤ chl-a ≤ 22.83 mg/m³) encompasses almost the first two chl-a ranges of Matsushita et al. [22] and Smith et al. [24]. The chl-a ranges of the authors cited are narrower since their databases include Rrs + spectra with spectral characteristics of oligotrophic waters.
It is also important to point out that there are transition regions among SAM classes (Figure 3, Figure 12), causing an overlap between two consecutive specific chl-a concentration ranges (Table 3). It can be observed (Figure 12A) that, although the chl-a concentrations are similar and the spectral shape differences are small (mainly in the region between 600-825 nm, where SAM was performed), the spectra were assigned to different classes by SAM. This is because the Rrs + spectral shape is associated not only with the chl-a concentration value but also with its distribution in the water column and with the physiological state of phytoplankton, influenced by environmental factors and the impact of other OACs.
However, this small difference between the shapes of the Rrs + spectra in SAM's transition region may not be sufficient to separate them into different trophic classes at the time of decision tree application (Figure 4A). This happened with the in situ Rrs + spectra simulated for MSI/Sentinel-2 bands presented in Figure 12A, where the B5/B4 band ratio of the "P4_Jul_2014" sampling station, classified as class 2 by SAM, was not high enough (B5/B4 = 1.18) for the decision tree to classify it as class 2 (at B5/B4 ≤ 1.20, the spectrum/pixel is classified as class 1, trophic level: low eutrophication) in OHA validation (in situ data). Thus, these transition regions may also generate misclassifications while applying the hybrid algorithm to satellite imagery (Figure 12B). The "P6_Aug_2018" sampling station, which is also present in the transition region of SAM's class 1 and class 2, was the only station that had a decision tree misclassification when OHA was applied to the satellite image (OHA validation over image). This station was classified as class 1 by SAM and as class 2 by the decision tree. The B5/B4 band ratio of this station was 1.12 for the in situ simulated Rrs + and 1.28 for the satellite Rrs, which explains the decision tree misclassification (Figure 4A). One possible solution to such confusion in the transition region between classes would be to restrict the spectral region for SAM calculation, in this case, to a narrower region within 600-825 nm, focusing on the wavelengths considered in the training and test data of the decision tree.
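For reference, the spectral angle on which SAM relies, and the effect of restricting the spectral window, can be sketched as below. This is the standard SAM formulation (the angle between two spectra treated as vectors), not necessarily the exact implementation used in this study; the function names, the default 600-825 nm window taken from the text, and the example spectra are assumptions for illustration.

```python
import numpy as np


def spectral_angle(spectrum, reference):
    """Angle (radians) between two spectra treated as vectors."""
    s = np.asarray(spectrum, dtype=float)
    r = np.asarray(reference, dtype=float)
    cos_angle = np.dot(s, r) / (np.linalg.norm(s) * np.linalg.norm(r))
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


def sam_classify(wavelengths, spectrum, references, lo=600.0, hi=825.0):
    """Assign the class whose reference spectrum has the smallest angle,
    using only wavelengths inside [lo, hi] nm."""
    wl = np.asarray(wavelengths, dtype=float)
    window = (wl >= lo) & (wl <= hi)
    target = np.asarray(spectrum, dtype=float)[window]
    angles = {
        label: spectral_angle(target, np.asarray(ref, dtype=float)[window])
        for label, ref in references.items()
    }
    best = min(angles, key=angles.get)
    return best, angles


# Made-up spectra sampled at five wavelengths (nm).
wavelengths = [560.0, 665.0, 705.0, 740.0, 783.0]
references = {
    1: [0.012, 0.008, 0.008, 0.004, 0.003],  # flat red/red-edge shape
    2: [0.012, 0.006, 0.012, 0.005, 0.004],  # marked peak near 705 nm
}
spectrum = [0.020, 0.009, 0.017, 0.007, 0.006]
label, angles = sam_classify(wavelengths, spectrum, references)
print(label, angles)
```

Since the angle depends only on spectral shape and not on magnitude, two spectra with similar chl-a but slightly different red/red-edge shapes can fall on either side of a class boundary, which is the situation described for the transition region.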

Chl-a Bio-Optical Algorithm Comparisons
In the present paper, the 3-band algorithm by Gitelson et al. [70] performed better in the class 1 concentration range (2.89 ≤ chl-a ≤ 22.83 mg/m³) (Table 2). This concentration range presented a greater influence of NAP and CDOM (Figure 11A) than the other ranges (class 2 and class 3, Table 3). This algorithm was designed to minimize the effects of these OACs on the Rrs spectrum, thus highlighting the chl-a absorption feature and being able to estimate the chl-a concentration in optically complex turbid waters [83,84]. The results observed in the present work for the Gitelson et al. [70] algorithm were similar to those obtained by Gitelson et al. [85,86]. They also applied the 3-band algorithm to Fremont Lakes (Nebraska, USA) using a set of Rrs + spectra similar to those of the present study and showed that this type of algorithm has potential to be applied at low/medium concentrations. Gitelson et al. [85] reported that the algorithm (using the ~670-, 710-, and 750-nm bands) explained more than 89% of chl-a variation over a range of 2-20 mg/m³ and can be used to estimate chl-a concentration with an RMSE < 1.65 mg/m³. Gitelson et al. [86] demonstrated that the algorithm has sensitivity (R² = 0.84) to estimate chl-a concentration in a range between 0-30 mg/m³.
Matsushita et al. [22] tested the 3-band algorithm (without re-parametrization) for the 10-25 mg/m³ concentration range and for the entire dataset (chl-a = 1.8-153.9 mg/m³). For the 10-25 mg/m³ range, the authors obtained an NRMSE of 35.9% and an R² of 0.71. However, when the 3-band algorithm was applied to the entire dataset, its performance decreased (NRMSE = 261.9% and R² = 0.93). The results for the 10-25 mg/m³ range were close to the ones observed in this study for class 1 (NRMSE = 26.78% and R² = 0.78).
For class 2 (19.51 ≤ chl-a ≤ 87.63 mg/m³), the slope algorithm of Mishra and Mishra [69] performed better, with chl-a values being close to those used by the authors [69] (14.14 ≤ chl-a ≤ 70.97 mg/m³) to test the algorithm in Pontchartrain Lake (USA). Mishra and Mishra [69] evaluated the slope algorithm using the Rrs + of green and red MODIS (Moderate Resolution Imaging Spectroradiometer) bands (green-red slope with R² = 0.65 and RMSE = 9.43 mg/m³). In the present study, the slope algorithm using in situ Rrs + simulated for the red and red-edge bands of the MSI/Sentinel-2 sensor (red-NIR slope) obtained a better performance (R² = 0.93 and RMSE = 12.09 mg/m³; Table 4). The slope algorithm was also tested by Watanabe et al. [71] in Barra Bonita/SP reservoir, which is part of the Tietê River cascade system, using the MSI/Sentinel-2A sensor for a chl-a concentration range of 17.7-797.8 mg/m³. Analyzing the in situ spectra, one can notice that the Rrs + shape of the present study fits amongst the spectra used by Watanabe et al. [71] for algorithm calibration/validation. Comparing the results, the algorithm had a better performance for the specific chl-a concentration range of this study (red-NIR slope in the class 2 range, exponential fit: R² = 0.93, MAPE = 23.35%) than for the full concentration range tested by Watanabe et al. [71] (red-NIR slope, with linear fit: R² = 0.76, MAPE = 49.02%).
In this study, however, the B5/B3 and B6/B3 band ratio algorithms performed better for estimating chl-a under extremely critical eutrophication conditions (class 3, Table 3). Such bands were considered given that the higher the chl-a concentration, the more significant the difference between band values B5-B3 (705 nm and 560 nm) and B6-B3 (740 nm and 560 nm). At higher chl-a concentrations, the Rrs spectra show a stronger signal at 705 nm and 740 nm than they do at lower concentrations. That variability is due to the greater influence of scattering by phytoplankton cells (denser blooms on the water surface) and the consequent reduction of the influence of water absorption at longer wavelengths. To date, no application of these types of band combinations has been found in the literature to estimate high chl-a concentrations using the MSI/Sentinel-2 sensor, making this study an innovative option for this purpose.
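The spectral indices compared in this section can be written compactly as below. The sketch uses the commonly cited forms: the 3-band index as (1/Rrs(665) − 1/Rrs(705)) × Rrs(740), the slope as a reflectance difference divided by the wavelength difference, and a simple band ratio. Pairing the red-NIR slope with B4 and B5 is an assumption, as are the function names; the fitted relations (linear, polynomial, or exponential) that convert each index into chl-a are given in Table 4 and are not reproduced here.

```python
import numpy as np

# Approximate MSI band centres (nm) as quoted in the text.
WAVELENGTH = {"B3": 560.0, "B4": 665.0, "B5": 705.0, "B6": 740.0}


def three_band_index(b4, b5, b6):
    """3-band index: (1/Rrs(665) - 1/Rrs(705)) * Rrs(740)."""
    b4, b5, b6 = (np.asarray(x, dtype=float) for x in (b4, b5, b6))
    return (1.0 / b4 - 1.0 / b5) * b6


def red_nir_slope(b4, b5):
    """Slope between the red (B4) and red-edge (B5) bands, in sr^-1 per nm.
    ASSUMPTION: the B4/B5 pairing is chosen here for illustration."""
    b4, b5 = np.asarray(b4, dtype=float), np.asarray(b5, dtype=float)
    return (b5 - b4) / (WAVELENGTH["B5"] - WAVELENGTH["B4"])


def band_ratio(numerator, denominator):
    """Two-band ratio, e.g. B5/B3 or B6/B3 for class 3."""
    return np.asarray(numerator, dtype=float) / np.asarray(denominator, dtype=float)


# Made-up Rrs values (sr^-1) for a single spectrum.
b3, b4, b5, b6 = 0.010, 0.010, 0.020, 0.015
print(three_band_index(b4, b5, b6))
print(red_nir_slope(b4, b5))
print(band_ratio(b5, b3), band_ratio(b6, b3))
```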

OHA Framework Assessment
The choice of the decision tree classification approach to develop the conditional structure of the hybrid algorithm (Figure 4A) was motivated by its simplicity and easy reproducibility. Other studies have also used simple conditional structures for hybrid algorithm operation [22,24,33,34,36,42], which allow switching between the application of different algorithms according to the chl-a concentration range or the optical water type. For example, Matsushita et al. [22] used the MCI algorithm; Shi et al. [36] used the values and ratios of MERIS (MEdium Resolution Imaging Spectrometer) bands 7-9 (665, 681, and 709 nm); Le et al. [34] tested the slopes between the wavelengths of 650 nm and maximum reflectance in the green and NIR regions; and Smith et al. [24] applied a band ratio between reflectance at the 708 and 665 nm wavelengths. In these papers, the conditional thresholds of the hybrid algorithms were empirically determined. Moreover, there was no performance evaluation of how these conditional structures split the data. In this study, to overcome this limitation, the dataset was divided into training and testing subsets. Additionally, performance metrics such as the confusion matrix, precision, and recall were used to evaluate the performance of the classification algorithm. The performance metric results (Figure 4B; Table 6) showed that the conditional structure of the hybrid algorithm allows different trophic conditions to be appropriately classified before the application of the hybrid algorithm on satellite imagery. Furthermore, it makes the hybrid framework applicable to other tropical inland aquatic systems with similar optical conditions.
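The classification metrics named above can be computed from reference (SAM) and predicted (decision tree) labels as in the following sketch. The labels in the example are made up and do not correspond to Figure 4B or Table 6; the function names are illustrative.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, labels=(1, 2, 3)):
    """Rows: reference (SAM) class; columns: decision tree class."""
    index = {label: i for i, label in enumerate(labels)}
    cm = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1
    return cm


def precision_recall(cm):
    """Per-class precision (column-wise) and recall (row-wise)."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)
    actual = cm.sum(axis=1)
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    return precision, recall


# Made-up labels: one class 1 spectrum is predicted as class 2.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 3]
cm = confusion_matrix(y_true, y_pred)
precision, recall = precision_recall(cm)
accuracy = np.trace(cm) / cm.sum()
print(cm)
print(precision, recall, accuracy)
```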
One of the differences between the hybrid algorithm of this paper and those developed by Matsushita et al. [22] and Smith et al. [24] is that, in this study, the bio-optical algorithm selected for each specific chl-a concentration range was calibrated/validated using the Monte Carlo simulation, while the other authors tested the algorithms in their original form without any re-parameterizations. The best performing algorithm for each chl-a concentration range of Matsushita et al. [22] had validation statistics (in situ data) which varied from 9-16% NMAE (normalized mean absolute error, equivalent to MAPE), 0.30-10.28 mg/m³ RMSE, and 0.54-0.90 R². Regarding the bio-optical algorithms used by Smith et al. [24], there was a variation of 37.9-44.5% for MARD (median absolute relative difference; the MARD statistic is comparable to MAPE because the median and mean values are generally close). Compared to the present study (MAPE = 20.12-34.36%, RMSE = 5.34-58.90 mg/m³, R² = 0.78-0.98; Table 4), the bio-optical algorithm validation in the study of Matsushita et al. [22] had better results, but it is worth remembering that they were tested within a narrower range.
Comparing the hybrid algorithm in situ validation performance of this study (Figure 6) with those in the literature, it was possible to observe that the performance of OHA (OHA up to 1000 mg/m³: MAPE = 26.33%, R² = 0.98) was better than that of Smith et al. [24] (MARD = 36.4%, R² = 0.81) and slightly lower than that of Matsushita et al. [22] (NMAE = 13.3%, R² = 0.94). Furthermore, for the satellite data, the hybrid algorithm validation presented a MAPE of 28.32% and an R² of 0.42 (Figure 9B). When comparing this result with that of Smith et al. [24] (MARD = 45.7%, R² = 0.69), one notices that, although the hybrid algorithm of Smith et al. [24] explains 69% of the chl-a variation for a range of 0.43-309 mg/m³ (more than this study, 42% for a range of 14.96-59.19 mg/m³), their error percentage is about 1.6 times that of the present study (45.7% versus 28.32%).
The hybrid model developed in this study also had a better performance in estimating chl-a than that of the semi-analytical algorithm parametrized (based on Sentinel-3 OLCI, Ocean and Land Colour Instrument, bands) for the highly productive Barra Bonita Reservoir (São Paulo; chl-a higher than 700 mg/m³) [27]. The validation of this algorithm was carried out with data measured in a downstream eutrophic reservoir of the cascade system along the Tietê River (Bariri Reservoir; chl-a 25.1-694.3 mg/m³). The validation results were inferior to those of OHA (MAPE = 62.3%, R² = 0.81, chl-a < 300 mg/m³), and the semi-analytical algorithm was not capable of accurately retrieving chl-a for the highest chl-a concentrations (> 600 mg/m³). Furthermore, Andrade et al. [30] showed that neither the original nor the parameterized quasi-analytical algorithm (QAA) versions tested in Ibitinga Reservoir were capable of estimating the absorption coefficients in all wavelengths. The results [30] showed a challenge in coping with high optical variability in the cascading system and highlighted the limitation of such a quasi-analytical scheme to monitor spatiotemporal OACs in this type of tropical water. In general, OHA performed better than if only one algorithm was considered to describe all the optical and trophic variability of the reservoir (Table 5).
One of the advantages of the proposed methodology is that the conditional structure of the hybrid algorithm, generated by the decision tree, is independent of the development of bio-optical algorithms in each specific chl-a concentration range, regardless of how the algorithms are implemented. Thus, other types of algorithms can be combined using the same framework to estimate chl-a concentration considering different optical conditions and specific concentration ranges. Besides that, considering the hybrid algorithm applicability to satellite imagery, the conditional structure developed has made this process more operational since it is only necessary to use relationships between the B4 and B5 band ratios of the MSI/Sentinel-2 sensor to classify each pixel into one of the specific chl-a concentration ranges and thus to estimate chl-a using the corresponding bio-optical algorithm for that range.
A disadvantage of this process is that the successful application of the conditional structure to switch among algorithms and of the bio-optical algorithms to estimate the chl-a concentration in each range depends on the quality of the atmospheric correction applied to the image. In the present study, atmospheric correction performed better considering glint correction (Sentinel-2 band 11 subtraction), although some bands were overcorrected (Figure 8 and Table 7). The uncertainties in Rrs retrieval influence the overall accuracy of the hybrid algorithm. However, it is important to note that, despite the low values of R² for satellite-derived Rrs when compared to those of in situ data, OHA MAPE values were lower than 29%. This uncertainty is commonly observed in inland waters. For example, Maciel et al. [79] obtained MAPE values of about 20% for turbid waters in Amazon Floodplain lakes. Wang et al. [87] also observed MAPE values of about 20% for 6SV atmospheric correction for green and red bands of OLI (Operational Land Imager) Landsat-8 over eutrophic environments of lakes in China. In the present work, the variability between in situ and satellite-derived Rrs may have influenced satellite OHA validation results (Figure 9B, MAPE = 28.32%, RMSE = 12.98 mg/m³, NRMSE = 29.35%, and R² = 0.42) compared to that of in situ Rrs + data (Figure 6, MAPE = 26.33-26.99%, RMSE = 26.39-34.07 mg/m³, NRMSE = 4.05-5.96%, and R² = 0.94-0.98).
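The error statistics quoted throughout can be computed as in the sketch below. MAPE and RMSE follow their standard definitions. Normalizing RMSE by the range of the observed values is an assumption; it is consistent with the satellite validation figures reported above, since 12.98 / (59.19 − 14.96) ≈ 0.2935, but the normalization is not defined in this section. The sample values in the example are made up.

```python
import numpy as np


def mape(observed, estimated):
    """Mean absolute percentage error (%)."""
    obs, est = np.asarray(observed, dtype=float), np.asarray(estimated, dtype=float)
    return float(100.0 * np.mean(np.abs(est - obs) / obs))


def rmse(observed, estimated):
    """Root mean square error, in the units of the input (mg/m^3 for chl-a)."""
    obs, est = np.asarray(observed, dtype=float), np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((est - obs) ** 2)))


def nrmse(observed, estimated):
    """RMSE normalized by the observed range (%). ASSUMED normalization."""
    obs = np.asarray(observed, dtype=float)
    return 100.0 * rmse(observed, estimated) / float(obs.max() - obs.min())


# Made-up chl-a values (mg/m^3).
observed = [10.0, 20.0, 40.0]
estimated = [12.0, 18.0, 40.0]
print(mape(observed, estimated), rmse(observed, estimated), nrmse(observed, estimated))

# Consistency check against the reported satellite validation figures.
print(round(100.0 * 12.98 / (59.19 - 14.96), 2))
```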
Another factor that may have affected OHA performance when applied to the image was that not all sampling stations were collected on the same day as the satellite overpass. Furthermore, the in situ data were collected over approximately 4 hours (~10 am to 2 pm) while the image was acquired almost instantaneously at approximately 10:30 am. Thus, the low R² observed in Figure 9 between measured and modeled chl-a from satellite images could be due to differences in hydrodynamic and environmental conditions for each sampling station. The Ibitinga hydroelectric reservoir has variable hydrodynamics, influenced mainly by the hydraulic residence time, the energy demand, and the water release due to water-quality control issues [60]. These processes can lead to a change in bloom dynamics, and their impact is not constant across the entire reservoir. In general, the regions near the dam are more affected by mechanical shock and hydraulic washing while the upstream regions may remain more protected from those actions. This can be seen in Figure 13, where a comparison between the true-color image of OLI/Landsat-8 from 12-Aug-2018 and the MSI/Sentinel-2 image used in this paper (13-Aug-2018) demonstrated the variability of blooms between these days. The regions of stations P1 and P2 (sampled on 12-Aug-2018; blue and red circles in Figure 9) had a denser bloom one day before the satellite overpass (13-Aug-2018), and this may have led to underestimating the chl-a concentration at these two stations.
Moreover, bloom dynamics in the reservoir could also be influenced by the outflow of the tributaries, as can be seen in Figure 13. In this figure, the region of P6 station (sampled on 13-Aug-2018; orange circle in Figure 9), located in the Jacaré-Pepira River mouth and near a fish farm, had more bloom formation on the day of the satellite overpass. Moreover, this station did not have a satisfactory agreement in OHA validation (Figure 9) as the sample was collected one hour before the image acquisition. In general, considering the spatialization of the classes and of the chl-a concentration estimate in Ibitinga reservoir (Figure 10), it can be said that the classification using the decision tree was efficient, since it classified all of Jacaré-Pepira River as class 1 (environmental preservation area) and classified a fraction of the Jacaré-Guaçu River as class 2 (receives untreated sewage from Ibitinga city). The red region (class 3) in Figure 10A was also expected to have more blooms as it is a fish-farming region. The reservoir meanders and the region near the dam (top of the figure) both have a marked presence of blooms, as they are regions that tend to have low water-flow velocity [62], which facilitates the development of Microcystis aeruginosa cyanobacteria [13]. Cyanobacteria are commonly found in eutrophic waters and may dominate phytoplankton due to various adaptations, such as the regulation of buoyancy in the water column by the presence of vacuoles and the efficient use of light in the yellow/orange region for photosynthesis [88]. A concern regarding their dominance in the Ibitinga reservoir under different eutrophication conditions is that cyanobacteria produce a variety of toxins that negatively affect both human health and aquatic life [89].

Conclusions
This was the first study in Brazil to use a hybrid approach to estimate chl-a concentration in optically complex inland waters. From this study, it was possible to verify that, for tropical reservoirs, the bio-optical algorithms applied to estimate chl-a concentration perform better when calibrated/validated for specific chl-a concentration ranges than for more extensive ranges.
This study observed that hybrid chl-a algorithms might have variations in their framework according to two types of optical water conditions in optically complex aquatic systems: i) waters with phytoplankton being the optically dominant component (which is the case of this study) and ii) waters with different combinations of OACs. In the former, the hybrid algorithm focuses on the trophic state variation and the bio-optical algorithms that compose it are calibrated/validated for specific chl-a concentration ranges. In the latter, the hybrid algorithm focuses on the different optical water types and the bio-optical algorithms that compose it estimate chl-a in each considered optical class. The hybrid algorithm developed in this paper provides two types of water quality mapping: i) a trophic state map for classes established with SAM and applied to the image by the decision tree and ii) a chl-a concentration estimate map derived from the bio-optical algorithm encompassed in each concentration range.
One issue to consider in the present study is that, since the spectral properties of the collected Rrs + are representations of water optical conditions that are snapshots of a continuum, it is not possible to state whether all possible types of Rrs + spectral properties of Ibitinga reservoir waters are represented in the dataset. Thus, the development of OHA was limited to the data used in this study, and it can be assumed to be applicable in inland tropical aquatic systems where phytoplankton is the dominant optical component. The suggestion is that more aquatic systems with different OAC combinations, Rrs + spectral properties, and environmental conditions should be added to the dataset, thus generating a hybrid algorithm of chl-a concentration estimation that is more representative of different optical water types in tropical regions.
Chl-a and TSI mapping using the OHA framework has the potential to be expanded to other types of aquatic systems (e.g., lakes, rivers, and reservoirs) in different environmental conditions and regions worldwide. This might be due to the fact that each chl-a bio-optical algorithm that composes the hybrid framework is adjusted to describe each set of spectral behavior resulting from different combinations of OACs in water. Thus, the use of the hybrid approach would be suitable for systematically monitoring the chl-a concentration in aquatic systems with high optical and trophic variability in space and time, having the potential to be consolidated as a methodology for outstanding water-quality monitoring projects. This approach will allow creating chl-a concentration time series to assess the impact of sugar cane expansion on the increase of algae blooms in different reservoirs of São Paulo state.

Figure 1. Ibitinga reservoir and its main tributaries: Survey campaigns by Londe [58] carried out in 2005 (A) and by Cairo [59] carried out in November 2013; in February, March, May, July, and September 2014; and in August 2018 (B). The red dots in Figure 1B are sample stations collected in 2013, 2014, and 2018.

Figure 2. (A) Spectral Angle Mapper (SAM) reference classes; (B) class 1 zoom, making explicit the shape of the reference spectrum to compare the shape with other reference spectra. Classes 1, 2, and 3 refer to low, medium, and high eutrophication conditions, respectively.

Figure 3. Classified in situ Rrs + spectra: (A) class 1, (B) class 2, and (C) class 3. From left to right, red arrows indicate peaks on the green, red, and infrared regions.

Figure 4. (A) Decision tree results for OHA, where classes 1, 2, and 3 are the respective SAM's classes; (B) OHA decision tree confusion matrix. Classes 1, 2, and 3 refer to low, medium, and high eutrophication conditions, respectively.

Figure 6. (A) OHA with the class 3 algorithm up to 600 mg/m³ and (B) OHA with the class 3 algorithm up to 1000 mg/m³. Class 3 refers to high eutrophication conditions.

Figure 10 shows the spatial distribution of the classes and of the chl-a concentration estimate at Ibitinga reservoir for the August/2018 field campaign. As one can see, most of the reservoir was classified as classes 1 and 2 on the date of the field campaign (August 13, 2018). Furthermore, the estimated chl-a concentration values for this date were mostly between 0 and 149 mg/m³.

Figure 10. Spatial distribution of (A) chl-a range classes and of (B) chl-a concentration estimates based on the application of the OHA to the Sentinel image concurrent to the August 13, 2018 field campaign at Ibitinga reservoir. Classes 1, 2, and 3 refer to low, medium, and high eutrophication conditions, respectively.

Figure 11. Analysis of water-color change and Rrs + (λ), aphy (λ), anap (λ), and acdom (λ) variation for the three subsets generated using the optical method: (A) P1 station of May/2014, present in the specific chl-a range "class 1"; (B) P1 station of July/2014, present in the specific chl-a range "class 2"; and (C) P1 station of February/2014, present in the specific chl-a range "class 3". Classes 1, 2, and 3 refer to low, medium, and high eutrophication conditions, respectively. Note 1: In the right corner of Figure 11A is Rrs + (λ) with an enlarged scale to emphasize the shape of the spectrum in relation to Figure 11B,C. Note 2: In Figure 11C, the absorption scale is different from that of Figure 11A,B.

Figure 12. Examples of (A) transition spectra (in situ Rrs + simulated for MSI/Sentinel-2 bands) between SAM's classes 1 (orange line) and 2 (green line), with close chl-a concentration values (mg/m³), and (B) in situ simulated (red line) and satellite-derived (purple line) Rrs spectra of the "P6_Aug_2018" sampling station, showing in situ (red) and estimated (purple) chl-a concentration values.

Table 1. Field campaign information.

Table 3. Specific chl-a concentration ranges defined by SAM.

(Table 5) indicate a loss of accuracy with poorer statistics (minimum/maximum) as follows: up to 600 mg/m³ with MAPE of 77.52-247.10%, R² of 0.67-0.97, and NRMSE of 4.89-26.65% and up to 1000 mg/m³ with MAPE of 77.52-247.10%, R² of 0.67-0.97, and NRMSE of 4.89-26.65%. A table with the statistical estimators for all algorithms tested in each specific chl-a concentration range is presented in the Supplementary Materials.

Table 4. Monte Carlo results: the best performing algorithms for each specific chl-a concentration range generated by the optical method, where N is the number of samples used in the validation, Lin is linear fit, Pol is polynomial fit, Exp is exponential fit, MAPE is mean absolute percentage error (%), RMSE is root mean square error (mg/m³), and NRMSE is normalized root mean square error (%). Classes 1, 2, and 3 refer to low, medium, and high eutrophication conditions, respectively.

Table 5. Monte Carlo results using the best performing algorithms from Table 4 in a wider chl-a concentration range up to 600 mg/m³ and up to 1000 mg/m³, where Lin is linear fit, Pol is polynomial fit, Exp is exponential fit, MAPE is mean absolute percentage error (%), RMSE is root mean square error (mg/m³), and NRMSE is normalized root mean square error (%).

Table 6. OHA decision tree classification performance assessment. Classes 1, 2, and 3 refer to low, medium, and high eutrophication conditions, respectively.

Table 7. Performance of atmospheric and glint correction for Rrs retrieval (n = 8).