1. Introduction
Sustainable management of the quality of inland and coastal waters has emerged as a growing demand due to human-driven nearshore activities and climate change that substantially threaten aquatic ecosystems [1,2]. For instance, eutrophication mainly caused by increased agricultural and industrial activities introduces major problems to ecosystem health, aquaculture and fisheries activities, recreation, and tourism [3,4]. In this context, spatially and temporally explicit information on water quality parameters is central to furthering our understanding of ecosystem services and health as well as environmental impact assessment [5,6]. Field-based measurements of constituents have traditionally been a key source for monitoring water quality in inland and coastal environments. However, in situ observations are limited in space and time, which severely restricts their utility for capturing the spatiotemporal dynamics of constituents such as chlorophyll-a (Chl-a), total suspended matter (TSM), and colored dissolved organic matter (CDOM) [7,8]. Moreover, field sampling and laboratory analysis of the samples are costly and time-consuming. To overcome these issues, remote sensing techniques are pursued as a complement to in situ measurements, particularly in open oceans and coastal waters. The variation of every optically active constituent such as Chl-a, TSM, and CDOM alters the water-leaving radiance across the spectrum [8]. The concentration of Chl-a, i.e., the photosynthetic pigment of every phytoplankton species, is a common proxy of the trophic status of water bodies [1,9,10]. The dynamics of phytoplankton biomass, and particularly harmful blooms, are major stressors impacting the aquatic food web and habitats, biogeochemical cycles, and aquaculture [6,11]. TSM encompasses organic and mineral suspended solids and is closely associated with water turbidity and Secchi depth [10]. The spatiotemporal monitoring of TSM can contribute to studies of sediment transport, water quality assessment, and lake management [10,12]. CDOM is a blend of organic molecules arising from terrestrial and aquatic vegetation, phytoplankton, and bacteria [10]. This parameter affects the available light in the water and serves as a proxy of the carbon content in lakes, which can contribute to carbon cycle studies and drinking water treatment projects [13,14].
A variety of techniques have been developed for the spectrally based retrieval of constituents, which can be categorized into three main approaches [8]. (i) The first approach consists of empirical methods that train/calibrate a regression model (e.g., polynomial) between image-derived features (e.g., band ratios) and associated concentrations of the constituent of interest known from in situ observations. Once the relation between spectral features and the concentrations of the constituent is formed (i.e., the coefficients of the regression model are estimated), the model can be used to estimate the constituent for any other spectrum. (ii) The second approach includes semi-empirical methods that are built upon training a regression model using a broad range of in situ observations. Unlike empirical methods, for which the training is local (site-specific), the semi-empirical methods are designed as generic models owing to the large database of in situ bio-optical measurements used in training the regression model, as in the OC3 bio-optical model [15,16]. However, the training samples of existing models are mostly from oceanic or coastal waters [15], which can introduce uncertainties when applying them to optically complex waters. (iii) The third approach consists of physics-based methods that rely on inverting the radiative transfer model to retrieve the constituents. The inversion procedure is mainly based on two different approaches: (1) neural networks that are trained using a large database of radiative transfer simulations, such as the Case-2 Regional/Coast Colour (C2RCC) algorithm [17], and (2) inverse modeling that compares any measured (image) spectrum with radiative transfer simulations over a range of constituents, seeking the best fit, such as the Water Color Simulator (WASI) model [18]. Several studies in the literature rely on one of the mentioned approaches. For instance, the WASI processor was used for physics-based estimation of TSM and bathymetry in the Venice Lagoon using high-resolution PlanetScope imagery [19] as well as for the first study on retrieving water quality parameters from hyperspectral PRISMA imagery [20]. Time-series of Landsat imagery are exploited to capture the dynamics of Chl-a in inland lakes in Finland based on empirical methods [6]. Multi-sensor imagery and in situ observations are integrated for the long-term retrieval of constituents over the Dutch Wadden Sea [21]. Empirical methods are applied to Landsat TM imagery to map the constituents of lakes and reservoirs in Spain [22]. A set of novel features, such as those derived from the transformation of the color space or coordinate system of the feature space, is proposed for the empirical retrieval of constituents over a broad range of optical conditions [8].
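The empirical approach (i) described above can be illustrated with a minimal sketch. It assumes a polynomial regression between the logarithm of a blue/green band ratio and the logarithm of Chl-a; the band choice, the polynomial degree, the function names, and the synthetic matchups are all hypothetical and are not taken from any of the cited methods.

```python
import numpy as np

def fit_band_ratio_model(rrs_blue, rrs_green, chl, degree=2):
    """Calibrate log10(Chl-a) as a polynomial of log10(blue/green band ratio)."""
    x = np.log10(np.asarray(rrs_blue) / np.asarray(rrs_green))
    y = np.log10(np.asarray(chl))
    return np.polyfit(x, y, degree)  # coefficients, highest order first

def predict_chl(coeffs, rrs_blue, rrs_green):
    """Apply the calibrated model to any other spectrum."""
    x = np.log10(np.asarray(rrs_blue) / np.asarray(rrs_green))
    return 10.0 ** np.polyval(coeffs, x)

# Synthetic, noise-free "matchups" generated from hypothetical coefficients.
rng = np.random.default_rng(0)
ratio = rng.uniform(0.5, 3.0, size=50)
rrs_green = np.full(50, 0.005)
rrs_blue = ratio * rrs_green
true_coeffs = np.array([0.3, -1.5, 0.4])  # hypothetical, for illustration only
chl = 10.0 ** np.polyval(true_coeffs, np.log10(ratio))

coeffs = fit_band_ratio_model(rrs_blue, rrs_green, chl)
chl_hat = predict_chl(coeffs, rrs_blue, rrs_green)
```

With real data the calibration would be site-specific, which is the limitation of empirical methods noted in the text.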
The remote sensing of inland waters still requires significant advancements compared to the studies carried out in coastal and oceanic environments [23,24,25,26,27,28,29]. With the recent availability of imagery captured by the Landsat-8 Operational Land Imager (OLI) and the Sentinel-2 MultiSpectral Instrument (MSI), there is a growing trend in applications of this imagery in mapping constituents of inland bodies of water [30,31]. This is mainly due to the enhanced radiometric resolution (12-bit) and high spatial resolution (10–30 m) of the imagery provided by these sensors, because the spatial resolution of ocean color sensors, whether sun-synchronous or geostationary, is too coarse (hundreds of meters at best) for most inland waters [32,33,34]. The first attempts to derive lake constituents using Sentinel-2 imagery have been made in Estonian lakes based upon empirical methods [31] and in an oligotrophic German lake using a physics-based method [35], which demonstrated promising potential. A set of empirical methods is examined for Chl-a estimation from Sentinel-2 data with particular attention to the atmospheric correction methods [7]. The applicability of Sentinel-2 images in the estimation of TSM is demonstrated in Poyang Lake, China [36]. The utility of the MSI’s red-edge band (centered at 783 nm) is proved in estimating water quality parameters in black lakes where the reflectance over the visible spectrum is negligible due to high absorption by CDOM [5]. The WASI tool is employed for the estimation of constituents across an oligotrophic lake (Lake Starnberg, Germany) using Sentinel-2 images, which is accompanied by the retrieval of bathymetry and substrate types in optically shallow parts of the lake [35]. C2RCC is examined for the estimation of constituents in Baltic lakes using Sentinel-2 images [10].
Given that current versions of C2RCC and WASI are relatively new tools available to the public, their applications in inland waters are still scarce, and no comparison between their products is available yet. Moreover, OC3 is also being used in optically complex waters for retrieving the Chl-a concentration [37,38]. Thus, there is a need to perform a comparative analysis among the three methods to better understand the effectiveness of each processor in retrieving water quality parameters over lakes with different bio-optical conditions. The main goal of this study is to perform such a cross-method comparison for evaluating the retrievals of Chl-a, for which in situ matchup data are also available from Italian lakes. Complementary to our Chl-a analyses, relative comparisons are performed for all constituents, including the TSM and CDOM retrievals of C2RCC and WASI. In this regard, the following objectives are pursued: (i) demonstrating the utility, challenges, and performance of publicly available methods (C2RCC, WASI, and OC3) in retrieving the Chl-a concentration of lakes; (ii) analyzing the cross-method consistency of Chl-a and other constituents (TSM and CDOM); and (iii) demonstrating the potential of Sentinel-2 (MSI) imagery in the multitemporal retrieval of the mentioned constituents in lakes.
Section 2 introduces the studied lakes and datasets. Section 3 describes the C2RCC, WASI, and OC3 methods and compares their characteristics. A set of metrics for accuracy and consistency assessments is also described. The results and discussions are provided in Section 4 and Section 5, respectively. The conclusions and outlooks are given in Section 6.
2. Studied Lakes and Datasets
We have considered three subalpine lakes (Garda, Idro, and Ledro) in northern Italy as well as a turbid lake in central Italy (Trasimeno) for evaluation of the water quality retrieval algorithms in a multitemporal analysis framework (Figure 1). The studied lakes are considered as case II waters with complex optical properties [39,40]. Lake Garda (Figure 1a) is the largest lake in Italy. It is a key source of drinking and agricultural water and hydropower production. The main inflow of Lake Garda is the Sarca River in the northern part of the lake. Lakes Ledro (Figure 1b) and Idro (Figure 1c) are smaller lakes close to Lake Garda. The Chl-a concentration is low in these subalpine lakes, particularly in Lake Garda (<2.5 mg/m³ since 2012 [41]), as are TSM (< 15 g/m³) and CDOM absorption (< 1.1 m⁻¹) [42]. Lake Trasimeno (Figure 1d) has different bio-optical conditions from the other lakes. It is a shallow (depth < 6 m), turbid (Secchi depth ≈ 1.1 m), and eutrophic lake. Long-term measurement of Chl-a shows a range of 2 to 40 mg/m³ at two stations in the lake (Figure 1). The average TSM is about 10 g/m³ [40], and a mean value of 0.3 m⁻¹ can be considered for CDOM absorption [43]. The southeast corner of Lake Trasimeno is colonized by aquatic vegetation. Seasonal algal blooms occur in the lake mostly in the period between July and September [40,43]. The selected lakes cover a relatively broad range of bio-optical conditions (from oligotrophic-mesotrophic to eutrophic) [44,45] that provide a comprehensive dataset for our inter-comparison analyses (Figure 1). They are optically deep, so the target methods are all applicable.
2.1. Sentinel-2 Imagery
Forty and 23 scenes (in total 63) with minimal cloud cover (<5% within the frame) from Sentinel-2A and Sentinel-2B, respectively, are selected for the subalpine lakes for a multitemporal analysis spanning from July 2016 to September 2019. The joint use of Sentinel-2A and Sentinel-2B data allows for a denser temporal analysis and increases the number of in situ matchups. Each image covers all three subalpine lakes. Seventeen cloud-free images (12 from Sentinel-2A and 5 from Sentinel-2B) of Lake Trasimeno are also available with corresponding in situ Chl-a data. Since most of the bands relevant to water quality (< 1000 nm) are acquired at either 10 or 20 m, Level-1C images are resampled to 20 m spatial resolution. Downsampling the 10 m bands to 20 m improves the signal-to-noise ratio and tends to improve the retrievals.
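The signal-to-noise argument for downsampling can be sketched as follows. This assumes simple 2 × 2 block averaging and spatially uncorrelated noise, neither of which is confirmed as the resampling actually applied to the imagery; the function name and the synthetic band are hypothetical.

```python
import numpy as np

def block_mean(band, factor=2):
    """Aggregate a 2-D band by averaging non-overlapping factor x factor blocks."""
    band = np.asarray(band, dtype=float)
    rows, cols = band.shape
    rows -= rows % factor  # trim edges that do not fill a whole block
    cols -= cols % factor
    trimmed = band[:rows, :cols]
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

# Synthetic 10 m band: constant signal plus uncorrelated Gaussian noise.
rng = np.random.default_rng(42)
band_10m = 0.01 + rng.normal(0.0, 0.002, size=(400, 400))
band_20m = block_mean(band_10m)

# Averaging four independent pixels should roughly halve the noise level.
noise_ratio = band_10m.std() / band_20m.std()
```

Under these assumptions the noise standard deviation drops by a factor of about two, at the cost of halving the spatial resolution.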
The atmospheric correction is performed by the C2RCC processor, which provided accurate remote sensing reflectance (Rrs) in previous studies [34,46,47]. Apart from the demonstrated high quality of Rrs derived from C2RCC, it should be noted that the publicly available version of C2RCC works only with the built-in atmospheric correction. Thus, to make results consistent and comparable, the water quality retrieval methods are supplied with the same Rrs after C2RCC atmospheric correction. To ensure the reliability of input Rrs data, we investigated C2RCC’s quality flags [46] including (i) Rtosa_OOS: the input spectrum to the atmospheric correction neural net is out of the scope of the training range and the inversion is likely to be wrong, (ii) Rtosa_OOR: the input spectrum to the atmospheric correction neural net is out of the training range, (iii) Rhow_OOS: the Rhow input spectrum to the inherent optical properties (IOPs) neural net is probably not within the training range of the neural net and the inversion is likely to be wrong, (iv) Rhow_OOR: one of the inputs to the IOP retrieval neural net is out of training range, and (v) Cloud risk: high downwelling transmission indicates cloudy conditions. None of the first four flags were raised for the analyzed images, indicating that the processor identified no issue regarding the quality of inputs to the atmospheric correction and IOP retrieval neural networks. We excluded the pixels with the cloud risk flag for all images. Samples of Rrs spectra derived from the C2RCC atmospheric correction are illustrated in Figure 2. The spectra are provided for 8 July 2017, when all case studies are captured by Sentinel-2. Spatial windows (5 × 5 pixels) are applied to extract the average spectra at the location of stations (the central station for Trasimeno). As expected, Lake Trasimeno has very different optical characteristics from the others. In particular, the significantly higher Rrs across the spectrum (>500 nm) can be attributed to the higher TSM concentration [48] of Lake Trasimeno compared to the others.
2.2. In Situ Data
Twenty-eight samples were available from a measurement station in the northern part of Lake Garda (shown in Figure 1a). Furthermore, two stations in Lake Trasimeno (Figure 1d) provided 33 in situ matchups. Moreover, four and five in situ matchups were available from stations in Lake Idro and Ledro, respectively (Figure 1b,c). The measurements in Lakes Idro and Ledro are less frequent than the other lakes, and the availability of the data is further restricted by preserving a minimal time gap with the satellite overpass. The stations do not provide TSM and CDOM measurements. The image-derived values of Chl-a centered at the location of the stations are averaged for the matchup analyses [49]. The in situ concentration of Chl-a is measured based on spectrophotometric experiments following the standard methods [50,51]. The measurements in Lake Garda are all near simultaneous (
h) with the Sentinel-2 overpasses. The measurements in other lakes are acquired within
days on average from the satellite overpasses.
5. Discussion
In this study, the freely available inversion methods of C2RCC and WASI, as well as the semi-empirical OC3 method, are examined and compared through processing a long time-series of Sentinel-2 imagery acquired over different bio-optical conditions. In the subalpine lakes, C2RCC and WASI provided products with high consistency over low-CDOM cases (CDOM absorption < 0.5 m⁻¹). Although in situ matchup validation is performed only for Chl-a (the main goal of the study), the high consistency of the derived TSM and CDOM concentrations and their spatial distributions derived from two independent methods indicates their reliability also for these parameters. Given the very different inversion approaches of C2RCC and WASI, it is unlikely that there is a systematic error that affects both methods in the same way. Moreover, the ranges of parameters agree well with the available information from the studied lakes (Section 2). This is also in line with the in situ Chl-a data. The comparison of methods and in situ matchups from Lake Garda reveals unrealistically high Chl-a estimates of C2RCC-N when the WASI-based CDOM absorption is relatively high. The Chl-a values based upon C2RCC-N exceed 25 mg/m³ for the high-CDOM cases (Figure 5 and Figure 7), which is much higher than the values reported in a recent study in Lake Garda (< 2.5 mg/m³) over the studied period [41]. A visual inspection of
Figure 4 conveys that temporal trends of average TSM and CDOM derived from WASI and C2RCC-N are in good agreement in Lake Garda. The large mismatches are again related to high-CDOM cases. The training of C2RCC-N has been based on relatively low values of CDOM absorption (< 1 m⁻¹), whereas according to the long-term observations, this parameter can reach values above 1 m⁻¹ in Lake Garda [42]. The limited range of CDOM absorption considered in training C2RCC leads to an underestimation of this parameter for the high-CDOM cases. Excluding the high-CDOM cases leads to a remarkable enhancement of the agreements, e.g., an improvement of Chl-a R² on the order of 0.39 and of RMSD of 0.21 mg/m³ when comparing C2RCC-N against WASI. According to a previous long-term study of Chl-a, it can be inferred that the spatiotemporal RSTD does not exceed 0.8 within three representative stations considered over Lake Garda [42]. This serves as a proxy that the RSTD values of WASI and OC3 are more reliable than those of C2RCC. In Lake Ledro, the largest discrepancies among retrievals of methods are again associated with relatively higher values of CDOM. The differences are not as large as those for the high-CDOM cases of Lake Garda. However, this indicates a potential confusion between CDOM and Chl-a spectral characteristics while processing the data with C2RCC. The temporal profile of average Chl-a for Lake Idro is in line with the results from Lake Garda, which confirms that retrievals of Chl-a based on C2RCC-N are problematic (extreme values) for high-CDOM cases (
Figure 14). Although C2RCC-E provided more reliable estimates of Chl-a for the high-CDOM cases, the retrievals of CDOM and TSM were problematic (extreme values and noisy maps). C2RCC-E involves a very broad range of IOPs (e.g., up to 60 m⁻¹) that may introduce a risk of diverging from the actual solution during the inversion. However, the effect of the range of training IOPs requires more investigation. C2RCC-E showed benefits in the estimation of constituents in Lake Trasimeno. The flexibility of WASI in parametrization allowed for better characterization of the high-CDOM cases in the subalpine lakes and high-Chl-a cases in Lake Trasimeno. In situ matchups of Chl-a reveal the better overall performance of WASI. OC3 captures the overall relative spatiotemporal changes of Chl-a, although the values are mainly overestimated. This problem was also noted in other studies [37,38].
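The effect of excluding high-CDOM cases on cross-method agreement can be sketched as below. The sketch assumes that R² is the squared Pearson correlation and RMSD the root-mean-square difference between two sets of retrievals; the 0.5 m⁻¹ threshold follows the low-CDOM criterion quoted above, while the function name and all numerical values are hypothetical and are not results of the study.

```python
import numpy as np

def consistency_metrics(a, b):
    """Return (R^2, RMSD) between two sets of retrievals of the same quantity."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    r = np.corrcoef(a, b)[0, 1]            # Pearson correlation
    rmsd = np.sqrt(np.mean((a - b) ** 2))  # root-mean-square difference
    return r ** 2, rmsd

# Hypothetical scene-averaged retrievals (Chl-a in mg/m^3, CDOM absorption in 1/m).
chl_wasi = np.array([1.0, 1.5, 2.0, 1.2, 1.8, 2.2])
chl_c2rcc = np.array([1.1, 1.4, 2.1, 1.3, 12.0, 25.0])
cdom_wasi = np.array([0.1, 0.2, 0.3, 0.2, 0.8, 1.0])

low_cdom = cdom_wasi < 0.5  # keep only the low-CDOM cases

r2_all, rmsd_all = consistency_metrics(chl_wasi, chl_c2rcc)
r2_low, rmsd_low = consistency_metrics(chl_wasi[low_cdom], chl_c2rcc[low_cdom])
```

In this toy example the two high-CDOM scenes dominate the disagreement, so both metrics improve once they are masked out, which mirrors the qualitative behavior described in the discussion.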