Remote Retrieval of Suspended Particulate Matter in Inland Waters: Image-Based or Physical Atmospheric Correction Models?

: The objective of this paper was to compare the limits of three image-based atmospheric correction models (top of the atmosphere (ToA), dark object subtraction (DOS), and cosine of the sun zenith angle (COST)), and three physical models (atmospheric correction for ﬂat terrain (ATCOR), fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH)), and ACOLITE) for retrieving suspended particulate matter (SPM) concentrations in inland water bodies using Landsat imagery. For SPM concentration estimates, all possible combinations of 2-band normalized ratios (2 b NR ) were computed, and a stepwise regression was applied. The correlation analysis allowed highlighting that the red/blue 2 b NR was the best spectral index to retrieve SPM concentrations in the case of image-based models, while the red/green 2 b NR was the best in the case of physical models. Contrary to expectations, image-based atmospheric models outperformed the accuracy of physical models. The cross-validation results underlined the good performance of the DOS and COST models, with R 2 > 0.83, NASH-criterion (Nash) > 0.83, bias = − 0.01 mg/L, and RMSE < 0.27 mg/L. This outperformance was conﬁrmed using blind test validation data, with an R 2 > 0.86 and Nash > 0.58 for the DOS and COST models. The challenges and limitations involved in the remote monitoring of SPM spatial distribution in turbid productive waters using satellite data are discussed at the end of the paper.


Introduction
Suspended particulate matter (SPM) is an important element for water quality evaluations [1] and plays a major role in the ecological regulation of aquatic systems [2]. SPM is an indicator of water turbidity [3] that can be of minerals, humic, or planktonic origin and can be considered as a stimulating factor for cyanobacteria bloom growth [2,4]. Indeed, a part of the sunlight's ultraviolet wavelengths are backscattered by SPM, which both favors the increase of bacteria and decreases their mortality [5]. Such conditions are ideal for the development of cyanobacteria, and once established, they are very difficult to control and to mitigate [6]. Lakes experiencing recurrent cyanobacteria are subject to an accelerated eutrophication, and their water can be a real threat to human and animal health [7][8][9], since some species are toxin-carrying.
In situ sampling is the most accurate and commonly used technique to monitor SPM concentrations, but it is costly and limited in time and space for representative lake water quality characterization. The SPM in lakes causes distinct changes in the water color by absorbing and scattering the sunlight at specific spectral wavelengths. This physical behavior can be used to retrieve SPM concentrations based on the reflectance of remotely sensed data, by relating optical changes observed in these specific spectral wavelengths to in situ measurements of SPM concentration. Based on this principle, several empirical models [4,[10][11][12][13][14] were developed, and they were able to estimate SPM concentrations using many sensor types (MERIS, MODIS, SPOT, Landsat, and Sentinel-2). However, estimation accuracy has often been moderate. Numerous authors have reported this issue and have related it to atmospheric correction using physical models, which were not initially designed for inland waters applications [14][15][16]. Furthermore, the analysis made by Wang, Chen [17], which aimed to compare more than 20 algorithms used for retrieving SPM concentration using Landsat images corrected by the 6S model (second simulation of a satellite signal in the solar spectrum) also emphasized the lack of SPM estimate accuracy (R 2 < 0.78).
Indeed, up to 80% of remotely sensed signals recorded by sensors are backscattered by the air column of the atmosphere and do not represent the real optical feature of interest [18]. Unlike open oceans (known as case-1 waters), the optical properties of inland waters (known as case-2 waters) are very complex [19]. In addition to the SPM, there are many other optically active components (chlorophyll-a of phytoplankton (chl_a) and colored dissolved organic matter (CDOM), among others) that render SPM modeling very challenging in these aquatic systems [20]. A poor atmospheric correction could thus lead to significant errors in estimating the reflectance and affect the accuracy of SPM modeling [21].
In fact, researchers have subdivided the effect of the atmosphere on sun-light radiance into two groups: a multiplicative effect that is caused by the aerosol optical depth at 550 nm, ozone, and transmittance; and an additive effect that is mainly caused by haze [22][23][24]. Physical models implemented in most industrial software (atmospheric correction for flat terrain (ATCOR) in PCI-Geomatics and fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) in ENVI, for instance) were developed to correct both multiplicative and additive atmospheric effects, but are based on the 0 water-leaving radiance assumption in the near infrared (NIR), which is incorrect for case-2 waters. Therefore, their use is usually error-tainted, specifically in the case of turbid waters [25,26]. As a solution, a new algorithm, called ACOLITE, was recently developed for water applications that is based on the 0 water-leaving radiance assumption in the short wave of infrared (SWIR) [27]. Image-based models (apparent reflectance at the top of the atmosphere (ToA), dark object subtraction (DOS), and cosine of the sun zenith angle (COST), for instance), on the other hand, only correct for the additive effect. This correction consists of subtracting the extra energy caused by the atmosphere for features that are dark, assuming that they should have a reflectance of 1%. Of course, the images are partially corrected and are still tainted by the multiplicative effect, but the original signal recorded by the images is barely modified and preserves its initial optical properties.
Thus, the aim of this work was to test whether complex physical model corrections on spectra are to the advantage of SPM modeling in inland water bodies or whether minimal corrections to spectra using image-based models yield the best results. To do so, a comparative analysis was performed between three physical models (ATCOR, FLAASH, and ACOLITE) and three image-based models (ToA, DOS, and COST) to evaluate the accuracy of SPM modeling over two inland water bodies using Landsat images. Models were calibrated using a stepwise multivariate regression and were evaluated using the cross-validation technique and a blind test database. For the accuracy assessment, four statistical indices were used (coefficient of determination (R 2 ), root mean square error (RMSE), bias, and NASH-criterion (Nash)).

Calibration and Cross-Validation Database
The data used in this study were collected from two water bodies (Missisquoi Bay (M.B.) of Champlain Lake and Lake Saint-Pierre (St-P.)) known to be turbid in the Québec Province. M.B. of Champlain Lake is located in the northern end (southern part of the province of Quebec) of the lake in the Lake Champlain Valley, on the border of the states of Vermont and New York. The lake area is 1269 km 2 and its watershed is 23,720 km 2 . The length and width are 201 and 23 km respectively, and the maximum depth is 122 m. The St-P. Lake is located on the St. Lawrence River, between Sorel-Tracy and Trois-Rivières, in Quebec, Canada. The lake is located downstream and east of Montreal, and upstream and west of Quebec City. The lake area is 353 km 2 and its watershed is 990,000 km 2 . The length and width are 35 and 10 km, respectively, and maximum depth is 20 to 11 m.
In situ analyses were made for physical nutriments (SPM, Chlorophyll-a (chl_a), dissolved organic carbon (DOC), total organic carbon (TOC), total nitrogen (TN), and total phosphorus (TP)). Samples were collected over three years (2008 to 2010) for a total of 769 samples. Only two sampling dates matched with the Landsat sensor's cross-pass over the two studied water bodies (25 August 2009, for M.B. of Champlain Lake, and 2 September 2009, for St-P. Lake) for a total of 20 samples. However, due to the cloud coverage over M.B. of Champlain Lake (red zoom on Figure 1) and the lack laboratory results for some samples, three additional measures were removed, reducing the size of the calibration database to 17 samples, with a maximum, minimum, mean, and standard deviation of: 40 of Vermont and New York. The lake area is 1269 km 2 and its watershed is 23,720 km 2 . The length and width are 201 and 23 km respectively, and the maximum depth is 122 m. The St-P. Lake is located on the St. Lawrence River, between Sorel-Tracy and Trois-Rivières, in Quebec, Canada. The lake is located downstream and east of Montreal, and upstream and west of Quebec City. The lake area is 353 km 2 and its watershed is 990,000 km 2 . The length and width are 35 and 10 km, respectively, and maximum depth is 20 to 11 m. In situ analyses were made for physical nutriments (SPM, Chlorophyll-a (chl_a), dissolved organic carbon (DOC), total organic carbon (TOC), total nitrogen (TN), and total phosphorus (TP)). Samples were collected over three years (2008 to 2010) for a total of 769 samples. Only two sampling dates matched with the Landsat sensor's cross-pass over the two studied water bodies (25 August 2009, for M.B. of Champlain Lake, and 2 September 2009, for St-P. Lake) for a total of 20 samples. However, due to the cloud coverage over M.B. of Champlain Lake (red zoom on Figure 1) and the lack laboratory results for some samples, three additional measures were removed, reducing the size of the calibration database to 17 samples, with a maximum, minimum, mean, and standard deviation of: 40 (Figure 1)) and Lake Saint-Pierre (St-P (blue zoom (Figure 1))) for the three studied years (2008)(2009)(2010). Values marked in red bold refer to database statistics used in this study.     (Figure 1)) and Lake Saint-Pierre (St-P (blue zoom (Figure 1))) for the three studied years (2008)(2009)(2010). Values marked in red bold refer to database statistics used in this study.

Validation Test Database
In order to test the performance of the SPM modeling, a blind test validation database was used. These data are freely available at the COGESAF (Conseil de gouvernance de l'eau des bassins versants de la rivière Saint-François) web-server (http://cogesaf.sigmont.org/cogesaf/ cogesaf.php, accessed date 11 July 2021). Since 2000, COGESAF has been working with sampling results to monitor water quality in the Saint-Francis (St-F.) river watershed. Water parameters such as SPM and phosphorus, among others, are constantly measured in the rivers belonging to this watershed. However, only the annual medians are uploaded to the server. The St-F. River is 200 km long and has an average width of 2.5 km. Data collected from nine stations (green frame of Figure 1) located on the St-F. River were used in this testing process, with a maximum, minimum, mean, and standard deviation of: 10.00, 2.50, 5.22, and 2.29 mg/L per annual median. Since the in situ SPM data are annual medians, it was necessary to compute annual medians of remotely estimated SPM concentrations as well. Thus, cloud-free Landsat images from April to November, 2009 (N = 12) were used for this purpose.

Image Pre-Processing
Six different models were used to atmospherically correct the two Landsat images: three physical models and three image-based models.

Atmospheric Correction for Flat Terrain (ATCOR)
ATCOR is an approach based on physical data developed by Richter [28]. It aims to minimize atmospheric and illumination effects on high spatial resolution satellite sensors to retrieve physical parameters of the Earth's surface, such as atmospheric conditions (emissivity and temperature), thermal, and atmospheric radiance (https://www.satimagingcorp. com/services/atcor/, accessed date 15 July 2020). The ATCOR model has several submodels (ATCOR-2, ATCOR-3, ATCOR-4, and ATCOR Thermal). Each is optimized for specific conditions (flat terrain, mountainous terrain, airborne remotely sensed images, and correction for thermal images, respectively [29]). This algorithm is implemented in commercial software such as ERDAS Imagine and PCI Geomatics.

Fast Line-of-Sight Atmospheric Analysis of Spectral Hypercubes (FLAASH)
FLAASH is an approach based on physical data developed by Anderson, Felde [30]. It is a sophisticated atmospheric correction model based on the MODTRAN algorithm of radiative transfer. The model considers the radiance reflected from the surface of the Earth and scattered by the atmosphere towards the sensor. The difference between these two radiances is due to the adjacency effect (spatial mixture of radiance among nearby pixels) caused by atmospheric scattering. This algorithm is implemented in ENVI software.

ACOLITE
The ACOLITE is an approach initially developed by Vanhellemont and Ruddick (2014). It is a model designed for aquatic remote sensing applications. In general, the algorithm performs in two steps: (1) correction for the Rayleigh scattering using a look-up table generated using the 6S model (Kotchenova, Vermote [31]), and (2) correction for aerosol based on the assumption of the 0 water-leaving radiance in the SWIR and an exponential spectrum for multiple scattering aerosol reflectance. For this study, an improved version of this algorithm was used. The improvement was in the use of the dark spectrum fitting method instead of the exponential spectrum, which allows selecting the most appropriate band automatically and, hence, avoids amplification of glint and adjacency effects in the atmospheric correction. The ACOLITE model is compiled on Python GUI, which is freely available at: https://odnature.naturalsciences.be/remsem/software-and-data/acolite, accessed date 15 July 2020.

Apparent Reflectance at the Top of Atmosphere (ToA)
The ToA is an image-based approach inspired by the general equation of spectral reflectance proposed by Moran, Jackson [32]. It allows converting the digital number (DN) of images into reflectance using radiance computation. It is not a real atmospheric correction model, since it computes the reflectance captured at the top of the atmosphere, hence the word "apparent". This algorithm is implemented in most commercial software. It is also available in open-source software such as QGIS and coded in packages in R, Python, and MATLAB.

Dark Object Subtraction (DOS)
DOS is an image-based approach developed by Chavez Jr [33]. It aims to minimize the additive effect of the atmosphere caused by haze. The main assumption is that dark objects should represent 1% of reflectance. The amount of added energy in these pixels is caused by haze. Dark objects within the image are identified by an area with clear water in deep lakes or by the histogram method, which selects the DN of haze from the DN frequency histogram of a digital image. This algorithm is implemented in most commercial software. It is also available in open-source software such as QGIS, as well as being coded in packages such as R, Python, and MATLAB.

Cosine of the Sun Zenith Angle (COST)
COST is an image-based approach and is an enhanced version of DOS. In addition to the additive effect correction, like the one used for the DOS algorithm, COST uses an additional correction for transmittance along the path from the ground toward the sensor (known as TAUz). The correction of this module, which is part of the multiplicative effect of the atmosphere, is made by the computation of the cosine of the solar zenith angle. According to Chavez [34], the solar zenith angle cosine is a good approximation of TAUz. This algorithm is also implemented in most commercial software and coded in the R, Python, and MATLAB packages.

Methodological Approach and Evaluation Indices
The flowchart in Figure 2 summarizes the methodological steps adopted in this study. After preprocessing images using the six atmospheric correction models, a correlation Water 2021, 13, 2149 6 of 18 coefficient matrix (CCM) of all possible combinations of the 2-band normalized ratios (2b NR ) was calculated ( Figure 3) (band ratios and subtractions were also tested; the normalized ratios performed the best). Correlation coefficients were computed based on a simple regression (both linear and exponential correlations were tested; the exponential function fit the best.) between the 2b NR indices and in situ measurements of SPM. The 2b NR indices for which the R 2 was higher than 0.35 and were orthogonal one to another were thereafter selected and used in a stepwise multivariate regression for the last calibration. This step reduces the collinearity among the variables (2b NR indices) selected by the stepwise algorithm (blue lozenge in Figure 2).
were thereafter selected and used in a stepwise multivariate regression for the last calibration. This step reduces the collinearity among the variables (2 indices) selected by the stepwise algorithm (blue lozenge in Figure 2).
Due to the reduced size of the database (N = 17), the cross-validation technique was favored. This method involves temporarily removing a given value of SPM from the calibration database and using the remaining observations as a sub-calibration group to estimate the withdrawn value. This operation is repeated for every sample in the database. To assess the accuracy of the modeling, each measured SPM sample is compared to its corresponding estimate using statistical evaluation indices.
For performance analysis with the blind test validation database, Landsat images from April to November 2009 (open water period) over the St-F. River were corrected using the different atmospheric correction models used in this study. Afterwards, the corresponding calibration functions were applied for SPM concentration estimation. The annual medians were thereafter calculated and challenged with the in situ COGESAF data, using the same statistical evaluation indices (orange lozenge in Figure 2).
Flowchart of the methodological approach; blue frames refers to calibration and cross-validation steps, and orange frames refer to the testing step. were thereafter selected and used in a stepwise multivariate regression for the last calibration. This step reduces the collinearity among the variables (2 indices) selected by the stepwise algorithm (blue lozenge in Figure 2).
Due to the reduced size of the database (N = 17), the cross-validation technique was favored. This method involves temporarily removing a given value of SPM from the calibration database and using the remaining observations as a sub-calibration group to estimate the withdrawn value. This operation is repeated for every sample in the database. To assess the accuracy of the modeling, each measured SPM sample is compared to its corresponding estimate using statistical evaluation indices.
For performance analysis with the blind test validation database, Landsat images from April to November 2009 (open water period) over the St-F. River were corrected using the different atmospheric correction models used in this study. Afterwards, the corresponding calibration functions were applied for SPM concentration estimation. The annual medians were thereafter calculated and challenged with the in situ COGESAF data, using the same statistical evaluation indices (orange lozenge in Figure 2). Due to the reduced size of the database (N = 17), the cross-validation technique was favored. This method involves temporarily removing a given value of SPM from the calibration database and using the remaining observations as a sub-calibration group to estimate the withdrawn value. This operation is repeated for every sample in the database. To assess the accuracy of the modeling, each measured SPM sample is compared to its corresponding estimate using statistical evaluation indices.
For performance analysis with the blind test validation database, Landsat images from April to November 2009 (open water period) over the St-F. River were corrected using the different atmospheric correction models used in this study. Afterwards, the corresponding calibration functions were applied for SPM concentration estimation. The annual medians were thereafter calculated and challenged with the in situ COGESAF data, using the same statistical evaluation indices (orange lozenge in Figure 2).
An example of the computing of the 2b NR and its mirror is presented below: with R s (λ A ) and R s (λ B ) being the reflectance of the central wavelengths λ A and λ B , respectively. L and C refer to lines and columns of the correlation matrix (i = ii = 6 (number of Landsat bands)). Four statistical indices, coefficient of determination (R 2 ), root mean square error (RMSEr), bias, and NASH-criterion (Nash) were used. The Nash assesses the performance by comparing the estimated values to the mean of the in situ measurements, producing a result between −∞ and 1.0 (inclusive). A negative Nash result means that it would be better to use the mean of the in situ measurements rather than the model estimates, whereas values between 0.0 and 1.0 are generally considered acceptable levels of performance. The model performance is satisfactory and good for values above 0.6 and 0.8, respectively; the model is perfect when Nash = 1.0. The mathematical equations for the indices are as follows: where n is the size of the calibration database, M and Es are, respectively, measured and estimated SPM values, and M and Es are the average of measured and estimated SPM values, respectively. Figure 4 shows boxplots of the spectral reflectance picked up over the in situ measurements for all models used, including their corresponding medians. Based on the median results, the spectra extracted from the FLAASH, ATCOR, and ACOLITE models have the same shape, typical of case-2 waters; a reflectance peak at the green wavelength, and almost total absorption of backscattered water in longer wavelengths, especially in the SWIR portions. This typical spectral behavior was also perceived for DOS and COST, with a less pronounced reflectance peak at the green. For ToA, no concrete atmospheric correction was made, it was just a conversion of radiance to the reflectance. Hence, shorter wavelengths are most affected by the Rayleigh scattering and consequently, the reflectance at shorter wavelengths is the higher. In fact, by using the dark object subtraction method (for the DOS and COST), it is possible to reduce the additive effect, but the signal recorded by images is still affected by the multiplicative effect. Thereby, the signal is partially corrected using the image-based models, particularly in the visible part of the spectrum, from where the lack of this pronounced peak at the green part of the spectrum originates.

Spectral Analysis
for the DOS, COST, and ACOLITE models were narrower for the blue band compared to the rest of the models. The boxplots presented here underline the homogeneity of the calibration database reflectance values, except for the NIR and SWIR bands, where some outliers are noticed. This is likely related to the effect of another optically active element, such as the chl_a of algae, which are known to be highly reflective in these spectral regions.
Clearly, the computed reflectance is different from one atmospheric model to another, as was expected. Indeed, modeling is an approximation of reality, and each model is based on its own assumptions to approximate this reality. Notably, the reflectance estimated by the six models converges on the same results, but a slight variability is perceived. This inter-variability (medians) and intra-variability (boxplots) could considerably influence the relationship between spectral indices and in situ SPM concentrations and, consequently, the accuracy of SPM modeling using remote sensing data.  The boxplots show that the boxplot standard deviation (BSD) of reflectance values for all models tended to be narrower above the 830 nm wavelength, except for ATCOR; for which BSDs were large for all bands. It can also be observed that the reflectance's BSD for the DOS, COST, and ACOLITE models were narrower for the blue band compared to the rest of the models. The boxplots presented here underline the homogeneity of the calibration database reflectance values, except for the NIR and SWIR bands, where some outliers are noticed. This is likely related to the effect of another optically active element, such as the chl_a of algae, which are known to be highly reflective in these spectral regions.
Clearly, the computed reflectance is different from one atmospheric model to another, as was expected. Indeed, modeling is an approximation of reality, and each model is based on its own assumptions to approximate this reality. Notably, the reflectance estimated by the six models converges on the same results, but a slight variability is perceived. This inter-variability (medians) and intra-variability (boxplots) could considerably influence the relationship between spectral indices and in situ SPM concentrations and, consequently, the accuracy of SPM modeling using remote sensing data.

Modeling of Suspended Particulate Matter
Two main steps were used to model the SPM concentration: (1) calibration of empirical models using a stepwise multivariate regression, and (2) performance evaluation using the cross-validation technique. The evaluation of each model's performance was quantified using Equations (1)-(4).

Calibration of the Models
It is important to underline that the correlation coefficients from either side of the CCM diagonal are mirrors (see the algorithm on Figure 3). The analysis of the CCMs (presented at the bottom right of Figure 5's plots) highlights three main findings. First, the correlations of image-based algorithms are practically the same and are very different from those of the physical algorithms. Thus, the SPM relationships to reflectance along the electromagnetic spectrum are controlled by different patterns, depending on the type (image-based or physical) of atmospheric correction model used. Second, the best correlations for image-based algorithms were recorded in the visible part of the spectrum, whereas for physical algorithms the highest correlations were mainly located in the green/red part. Third, correlations of SPM concentrations in the NIR and SWIR parts of the spectrum were qualified as low to modest for most atmospheric correction models. Some relatively moderate correlations (either positive or negative) were, however, recorded for the ATCOR model.
The findings of the CCMs were confirmed with the regression plots. The best stepwise estimator of SPM concentrations was the 2b NR (R, B) (2-band normalized ratio between red and blue bands) for image-based models, whereas the best stepwise estimator of SPM concentrations was the 2b NR (R, G) (green and blue bands) for physical models. In the case of the DOS and COST models, the calibration dot distribution and R 2 were very similar (the plots are practically the same, but the intercepts and slopes of their calibration functions are slightly different). These findings were expected, because image-based models use a similar algorithm, with only a few differences in the input parameters. Image-based models can explain up to 87% of the SPM concentration variance (R 2 = 0.87 for the COST and DOS). Physical models were able to explain up to 75% (R 2 = 0.75 for ACOLITE) of the variance using the same database. It is also important to highlight that all models are univariable, despite some existing high correlations (see CCMs), notably for the ATCOR model.

Cross-Validation of the SPM Modeling
Results of the cross-validation and the statistical evaluation indices are presented in Figure 5 as well (frames at the top left of Figure 5's plots). Again, image-based models reached the best results, notably for the COST and DOS models. They were almost unbiased and could explain up to 84% of the SPM concentration variance. The robustness of SPM concentration modeling was also confirmed by Nash (up to 0.84). This index is very sensitive, because it uses the mean value of in situ measurements in its evaluation. The slightest estimation error can diminish the Nash results. It is, however, important to emphasize that according to the 1:1 line, except for ToA, all models tend to underestimate high SPM concentrations, with different variability levels. For the physical models, the ACOLITE achieved the best performance, as expected, with R 2 = Nash = 0.67. The worst results were recorded for the FLAASH model. Again, except for the FLAASH model, all models showed a low bias (<0.04 mg/L), with the best performances recorded equally for the DOS and COST models (=−0.01 mg/L). The models' RMSE, which quantifies the modeling accuracy, was less than 0.58 mg/L. The best performance was recorded for DOS with 0.27 mg/L, and for the physical models, ACOLITE had the best performance, with an accuracy of 0.39 mg/L. The findings of the CCMs were confirmed with the regression plots. The best stepwise estimator of SPM concentrations was the 2 ( , ) (2-band normalized ratio between red and blue bands) for image-based models, whereas the best stepwise estimator of SPM concentrations was the 2 ( , ) (green and blue bands) for physical models.
In the case of the DOS and COST models, the calibration dot distribution and R 2 were very similar (the plots are practically the same, but the intercepts and slopes of their calibration functions are slightly different). These findings were expected, because image-based models use a similar algorithm, with only a few differences in the input parameters. Image-based models can explain up to 87% of the SPM concentration variance (R 2 = 0.87 for the COST and DOS). Physical models were able to explain up to 75% (R 2 = 0.75 for ACO-LITE) of the variance using the same database. It is also important to highlight that all models are univariable, despite some existing high correlations (see CCMs), notably for the ATCOR model.

Evaluation of the SPM Modeling with the Blind Test Validation Data
To assess the long-term modeling accuracy, the annual medians of atmospheric correction model Landsat-derived SPM estimates were challenged with in situ SPM measurements (annual medians) collected by the COGESAF monitoring program. The results are presented in Figure 6 and which again underline the high performance of image-based algorithms compared to physical models. The DOS and COST models achieved the best estimates with R 2 of 0.87 and 0.86 (in the same accuracy range as for the calibration/crossvalidation steps) and Nash of 0.60 and 0.58, respectively, highlighting their robustness for long-term modeling. The ToA model did not achieve the same quality of estimates in this validation step. In fact, since the ToA is a radiance to reflectance conversion algorithm, the effect of the atmosphere is not corrected at all (images from one date to another are not "standardized"), which could affect the quality of the SPM modeling. The high performance of ToA in the calibration/cross-validation steps was probably related to the small number (2) of images used and the short date difference between them (August to September). These two months are part of the same season and, therefore, the weather conditions are almost identical. For the validation with the test database, the images used were part of three different seasons (late spring, summer, and early fall). This variation in weather conditions likely played a key role in decreasing the accuracy of the SPM estimates using the same calibration function.
Concerning the physical models, the modeling accuracy was mediocre for all models (Nash <0; this means that the average of the in situ measured SPM data is a better estimator than the model estimates). It is important to recall, however, that the modeling process is generally tainted by two uncertainties: systematic (bias) and random (RMSE) errors. The first is quantifiable and correctable, while the second is not. With this in mind, the FLAASH model could be a potential candidate for the SPM modeling of inland water bodies. Indeed, it is clear that all SPM concentrations are systematically overestimated (the alignment of the points is parallel to the 1:1 line) with an R 2 of 0.69. Thereby, it would be possible (in future works) to use the bias of this study to systematically correct the SPM concentrations as so:ŜPM =ŜPM + bias. This could hugely enhance the modeling accuracy using the FLAASH algorithm.
The results of the blind validation may appear inconsistent, as the accuracy of the FLAASH modeling was the worst for the calibration/cross-validation steps. However, it is important to note the difference, in terms of concentration range, between the two in situ SPM databases (up to 40 mg/L for the calibration/cross-validation database versus up to 10 mg/L for the blind test validation database) that may have played an important role in this incompatibility. According to Figure 5 the FLAASH model tends to significantly underestimate the concentration for high values. In contrast, low SPM values (<10 mg/L) were well distributed around the 1:1 line. Since the blind test validation data does not include high values, this may explain the relatively high performance of the FLAASH model compared to the ACOLITE and ATCOR models for this specific validation database. Therefore, it is necessary to challenge the FLAASH model with higher SPM values.
The above results answered the main hypothesis of this work. Are physical atmospheric correction models, with their complex radiative transfer functions, more appropriate for modeling inland water quality parameters (SPM for this specific work), or are image-based models, with their simplistic corrections, the most appropriate? Whether based on cross-validation or validation with the blind test data, image-based models (DOS and COST) performed the best. Although these results were not expected, they are actually quite logical. Knowing that the water leaving radiance is initially very low (of the order of a hundredth), any slight over-or under-correction of the initial signal recorded by satellite images would alter the reflectance shape along the spectrum and, consequently, the images post-processing.
in this incompatibility. According to Figure 5 the FLAASH model tends to significantly underestimate the concentration for high values. In contrast, low SPM values (<10 mg/L) were well distributed around the 1:1 line. Since the blind test validation data does not include high values, this may explain the relatively high performance of the FLAASH model compared to the ACOLITE and ATCOR models for this specific validation database. Therefore, it is necessary to challenge the FLAASH model with higher SPM values. The above results answered the main hypothesis of this work. Are physical atmospheric correction models, with their complex radiative transfer functions, more appropriate for modeling inland water quality parameters (SPM for this specific work), or are image-based models, with their simplistic corrections, the most appropriate? Whether based on cross-validation or validation with the blind test data, image-based models (DOS and COST) performed the best. Although these results were not expected, they are actually quite logical. Knowing that the water leaving radiance is initially very low (of the order of a hundredth), any slight overor under-correction of the initial signal recorded by satellite images would alter the reflectance shape along the spectrum and, consequently, the images post-processing.  (Figure 7)) and waters highly laden in SPM (in the case of Saint-Pierre (St-P.) Lake (Figure 8)). It is important to highlight that these two images were the ones used in the calibration/cross-validation steps. The SPM maximum in situ concentrations were 12.30 and 40.30 mg/L for M.B. of Champlain Lake and St-P. Lake, respectively (Table 1)

Discussions
The above results illustrate the ability of remote sensing data to estimate the SPM concentrations in inland water bodies with an acceptable accuracy rate, when using the right atmospheric correction model. Contrary to expectations, the results highlighted that image-based atmospheric models are the best for retrieving SPM concentrations. This finding is nonetheless coherent with the results of Nazeer, Nichol [39], who assessed five different atmospheric correction methods, including three physical methods (6S, FLAASH, and ATCOR) and two image-based methods (DOS and ELM) using in situ multispectral radiometer measurements (as a radiometric analysis, unlike this study which The Figure 7 shows an example of the SPM modeling results over the M.B. of Champlain Lake. From the RGB product, it is clear that the waters are poor to moderately laden with SPM (dark shade), which is also confirmed by the outputs of all models. However, image-based models seem to better reproduce the spatial distribution, as it is perceived on the RGB product, compared to physical models. For the latter, the SPM spatial distribution is lost and the estimate seems to be random (pixels with high to very high concentrations are adjacent to pixels with low to very low concentrations, giving a salt and pepper aspect to SPM the modeling).
On the other hand, when waters were highly laden with SPM, the spatial distribution was quite similar for all the model outputs ( Figure 8). It is, however, important to underscore that SPM concentrations were slightly overestimated by all the models compared to the outputs of the ToA-based model, particularly for the St-F. (Saint Francis) river. According to [35,36], the St-F. River is known to drain waters with low to moderate SPM concentrations and the Yamaska river (Y.R.) is known to drain waters with the highest SPM concentrations in the St. Lawrence basin [37]. This is because the two tributaries belong to different watersheds. The St-F. River watershed is composed of 66% forested lands (versus 42% for the Y. R.) and of 23% agrarian lands (versus 54% for the Y. R.) [38]. This may explain this remarkable difference in the SPM load of the two rivers. In all model outputs, the Y.R. is associated with high SPM concentrations, confirming the assumption that all models reach the same accuracy when waters are highly laden in SPM. On the other hand, excepting the image-based models, the physical models also associated high to very high SPM concentrations with the St-F. River. This was expected, as all physical models tend to overestimate low to medium SPM concentrations ( Figure 6).

Discussions
The above results illustrate the ability of remote sensing data to estimate the SPM concentrations in inland water bodies with an acceptable accuracy rate, when using the right atmospheric correction model. Contrary to expectations, the results highlighted that image-based atmospheric models are the best for retrieving SPM concentrations. This finding is nonetheless coherent with the results of Nazeer, Nichol [39], who assessed five different atmospheric correction methods, including three physical methods (6S, FLAASH, and ATCOR) and two image-based methods (DOS and ELM) using in situ multispectral radiometer measurements (as a radiometric analysis, unlike this study which used a limnological-based analysis). They concluded that the DOS method outperformed the FLAASH and ATCOR methods.
In addition, our results highlighted a strong correlation with 2b NR (R, B) when using image-based images, versus 2b NR (R, G) when using physical images. This may be related to the fact that physical algorithms tend to over-correct the reflectance in the blue part of the spectrum compared to the image-based models ( Figure 4). Indeed, the short wavelengths of the spectrum (some parts of ultraviolet and most of the blue) are less absorbed by clear waters and consequently penetrate the deepest in the water column [19]. Thus, there is an inverse correlation between the reflectance captured by the blue band and the concentration of SPM (the less SPM, the clearer the water [3]). This relationship is most likely due to the strong absorption of CDOM or chl_a from algae in this part of the spectrum. On the other hand, the presence of SPM in water causes turbidity. Turbid waters (caused either by SPM [40], dissolved organic carbon [41], algal blooms [40], etc.) are known to reflect strongly in the red part of the spectrum. Therefore, this natural optical contrast (strong absorption in the blue versus strong reflection in the red) across the spectrum, should be most correlated with SPM. The overcorrection of the blue band reflectance by physical models probably destroys this natural physical dependence that exists between the SPM concentration and the energy absorbed by waters in the blue part of the spectrum and shifts it to the closest band (green; less correlated with clear waters). This may explain the higher correlations for image-based algorithms (up to 87%), which are based on the blue Landsat band, and the relatively lower correlations (up to 75%) for physical models, which are based on the green Landsat band ( Figure 5).
Physical atmospheric correction methods usually use two or more NIR (or SWIR) wavebands, where the marine signal is assumed to be zero (open ocean waters), thus the signal in the NIR (or SWIR) can be regarded as entirely atmospheric, and it is used to determine the aerosol model. However, the signal in the NIR (or SWIR) is not negligible in case-2 waters, due to the concentrations of particulate matter in inland water bodies, and, consequently, maritime correction over inland water causes low or even negative water reflectance in the visible bands [15,16,42]. In addition, the physical models' inputs are not initially optimized to generate accurate reflectance estimates for inland water bodies. Indeed, limited choices of aerosol type are available in commercial remote sensing software; for instance, rural, desert, urban, and maritime are the choices available for the ATCOR plug-in integrated into PCI Geomatics. The nearest choice would be the rural option, but it is not optimal for inland waters, since the aerosols captured by these aquatic systems are very different from those located on top of forests or on crops in agricultural fields. The rural option is optimized to correct the aerosols of vegetation more than those of water bodies [39]. As a consequence, reflectance over inland water bodies is estimated based on incorrect assumptions (0 water-leaving radiance in the NIR (or SWIR) and aerosol origin/type models), leading to the destruction of this natural optical-physical relationship (in terms of reflectance) of the water parameters across the electromagnetic spectrum.

Conclusions
The goal of this work was to compare the performance image-based atmospheric correction (ToA, DOS, and COST) models to the common physical models (ATCOR, FLAASH, and ACOLITE) most used for retrieving SPM concentrations in inland water bodies. Landsat images were used for this purpose. The results highlighted the potential of remote sensing data to retrieve SPM concentrations. The best performances were recorded for the image-based models (R 2 up to 0.84, Nash up to 0.84, bias down to −0.01 mg/L, RMSE down to 0.27 mg/L) using the blue-red spectral indices. The results of physical models were acceptable (R 2 up to 0.67, Nash up to 0.67, bias down to −0.02 mg/L, RMSE down to 0.39 mg/L) using the green-red spectral indices. Validation with the blind test data, as a long-term analysis, also affirmed the above finding, with an R 2 up to 0.87, Nash up to 0.60, bias down to −1.28 mg/L, and RMSE down to 1.54 mg/L. Nevertheless, the accuracy of the ToA decreased significantly, probably due to large weather variations resulting from the long-term analysis. The spatial distribution of SPM concentration estimates underlined the good performance of image-based models when waters are poorly to moderately laden with SPM. For water highly laden with SPM, the modeling accuracy was almost the same for all models. To summarize, the results of this study highlighted that image-based models, particularly the COST and DOS, are more appropriate than physical models for retrieving SPM concentrations in inland waters if the inputs of the physical atmospheric parameters are not well controlled. Reanalysis of these results using a larger database and covering a wider range of SPM concentrations is strongly recommended and would help to approve/disapprove our results. Nevertheless, challenges are still on the table to remotely estimate inland water quality parameters using satellite images for large-scale applications. To this end, the development of physical atmospheric correction algorithms specific to inland waters is greatly needed.