Monitoring Water Transparency in Shallow and Eutrophic Lake Waters Based on GOCI Observations

Water transparency represented by the Secchi disk depth (Zsd) plays an important role in understanding water ecology environment variations, especially for optically complex and shallow lake waters. In this study, using in situ measured remote sensing reflectance (Rrs), diffuse attenuation coefficient (Kd), and Zsd data collected in Lake Taihu (China), a regional algorithm for estimating Kd from Rrs was designed, and the semi-analytical model proposed by Lee et al. (2015) (hereafter called Lee_2015 model) was refined using a linear scaling correction for remote sensing of Zsd. The results showed that a good agreement between the derived Kd and in situ measured data (mean absolute percentage error (MAPE) = 26% for Kd(490); MAPE < 5% for Kd at 443, 555, and 660 nm). The in situ Rrs-derived Zsd results using the refined Lee_2015 model compared well with the in situ measured Zsd (R2 = 0.72 and MAPE = 36%), which was an obvious improvement over the Lee_2015 model in our study region. Subsequently, the refined Lee_2015 model was applied to the geostationary ocean color imager (GOCI) observations between 2012 and 2018 to yield the spatial and temporal variations of water transparency in the Lake Taihu waters. The long-term mean distribution of Zsd revealed that water transparency values in the northeastern Lake Taihu were generally higher than those in the southwest part. Monthly climatological Zsd patterns suggested that the Zsd distributions had large temporal variability, and distinct monthly patterns of Zsd existed in different subregions of Lake Taihu. The significant interannual variations of Zsd in Lake Taihu are probably affected by a combination of the water column stability mainly caused by wind, water temperature, human activity, and riverine discharge. The present study can provide a new approach for quantifying water visibility and serve for water-color remote sensing of optically complex and highly turbid waters.


Introduction
Water transparency (or water clarity) is a first-order indicator of water quality and generally quantified as the Secchi disk depth (Zsd) [1]. It is related to the light penetration and attenuation of the underwater ecosystems [2], which plays an important role in understanding water ecology environment variations and biogeochemical processes, such as phytoplankton photosynthesis and growth [3,4], heat transfer in the upper water layer [5][6][7], sediment transportation, and resuspension [8,9]. Actually water transparency is a function (with a generally negative relationship) of the concentrations of chlorophyll-a (Chla), total suspended matter (TSM), and colored dissolved organic matter (CDOM) [10]. Therefore, knowledge of water transparency could provide valuable information when studying aquatic environmental changes and marine ecosystem.
Traditionally, the access to Zsd measurements largely depend on the cruise surveys. However, the traditional method is time-consuming and laborious. The main disadvantage of this method is the fact that large spatiotemporal variations Zsd are difficult to reveal due to sparse sampling data. Luckily, satellite remote sensing technology can overcome the limitation of the field method by providing a synoptic observation at large spatial and temporal scales. Over the past decades, satellite remote sensing as an effective tool has been widely used to monitor water clarity or water quality at both global and regional scales [11,12]. The remote estimation of Zsd in the oceanic waters or inland lakes were frequently based on various satellite images recorded by both ocean color sensors (e.g., MERIS, MODIS, SeaWiFS) [2,13,14] and land-observation systems (e.g., Landsat series and Huanjing-1) [15,16]. The revisit periods of these polar-orbiting satellites are at least longer than one or two times a day (e.g., 16 days for Landsat series) [17]. They are largely limited to perform comprehensive analysis without the adequate data of high temporal resolution, not to mention the pixels contaminated by opaque clouds. Compared to polar-orbiting satellite sensor, the geostationary satellite sensors can provide observations at a short revisit period to monitor the daily dynamics of water clarity. As the world's first ocean color geostationary satellite sensor, the geostationary ocean color imager (GOCI) provides near real-time observations over northeast Asia (including Lake Taihu as our study region), with hourly intervals up to eight times a day and the spatial resolution of 500 m [18]. Thus, the GOCI satellite data are very useful to monitor the water transparency in submesoscale regions such as the estuaries and large individual lakes.
In recent years, many remote sensing models and algorithms have been designed to estimate Zsd using satellite observations [19], including empirical models and semi-analytical models. The empirical methods are based on the empirical relationships between Secchi disk depth and Rrs of single band or band combination [20][21][22]. The empirical methods are easy to develop and implement, once an in situ field dataset in a certain region are available. However, these methods are locationdependent and may not be suitable to other waters, especially for the mixed inherent optical properties [2]. Compared to empirical models, semi-analytical models utilize a radiative transfer theory or bio-optical models, coupled with some empirical relationships to derive several important parameters. Thus, the semi-analytical models are generally feasible without the need to recalibrate [23]. Up to now, two main semi-analytical models were widely used to estimate Zsd from remote sensing data. One model was developed by Doron et al. [24] based on a classic underwater visibility theory, and Zsd was well described by the inverse of diffuse attenuation coefficient (Kd; m -1 ) and the beam attenuation coefficient (c; m -1 ). Here, c is the sum of the total absorption and scattering coefficients. Consequently, the remotely sensed estimation of Kd and c became the critical process of detecting vertical visibility of water columns. The Kd can be directly obtained using remote sensing data. However, because remote sensing only obtain backscattering (not forward scattering) using the inversion models [25], the scattering coefficient for c calculation is impossible to obtain from passive remote sensing in theory, introducing uncertainty into the retrieved c data and further into remote sensed Zsd estimations. In addition, Lee et al. [26] addressed some limitations and shortcomings in the classic underwater visibility theory. Thus, based on a novel underwater visibility theory, Lee et al. [26] [10,12,15]. However, Feng et al. [2] found that the Zsd estimation using the Lee_2015 model showed systematically lower than the measured results, and thus used a linear scaling correction over the Zsd retrievals to improve the performance of the Lee_2015 model on fifty large lakes on the Yangtze Plain. More importantly, the in situ measured Zsd values in most shallow inland lakes with optically complex waters are distributed only in a narrow range below 1.5 m. Therefore, in the current study, it is still necessary to examine the effectiveness and feasibility of the Lee_2015 model in such inland lake waters.
In this paper, we focus on the applicability of the Lee_2015 model for remote sensing of water transparency in shallow and eutrophic lake waters, taking Lake Taihu (China) as a typical example. We first validated the performance of the Lee_2015 model in Lake Taihu using in situ measured observations; and then applied the refined model to long-term GOCI data between 2012 and 2018 for documenting the spatiotemporal variability of Zsd in Lake Taihu. The structure of the paper is as follows. First, the limnological conditions of the Lake Taihu waters is introduced and then the spectral variations of Rrs(λ) and Kd(λ) were analyzed. We highlight the assessment and modification of the Lee_2015 model for regional application in Lake Taihu, and the seasonal and spatial variation patterns of Zsd in Lake Taihu are subsequently done. Finally, corresponding discussion and conclusions are presented.

Study Area
Lake Taihu (30.9-31.6° N, 119.9-20.6° E) is one of five major freshwater lakes in China. It has an area of 2338 km 2 and an average depth of 1.9 m, and is a typically shallow inland eutrophic lake [27]. It is situated at the border between Jiangsu and Zhejiang Provinces in East China (Figure 1a), which is linked to 220 rivers and canals of various sizes. Waters in the lake have suffered from a drastic deterioration in quality over recent years due to discharge of wastewater from near-shore industries and residents into it. The widespread and frequent algal blooms have become commonplace [28], and sediment resuspension due to wave effect also strongly affects water environment in the lake [29].
In this study, a total of 69 groups of field data were collected during May, August, and October 2012. The sampling stations of surface measurements were distributed throughout Lake Taihu, as shown in Figure 1b. At each station, the optical measurements were performed on the surface water, and water samples were collected using the Niskin bottles. These water samples were immediately preserved under a low temperature (2-4 °C), and were then taken to the laboratory for further analysis within the same day. Meanwhile, considering geographic locations, eight specific subregions in Lake Taihu were selected for further investigating regional discrepancy in Zsd variation (red squares in Figure 1c), including Zhushan Bay, Meiliang Bay, Gonghu Bay, West region, Central region, Xukou Bay, southwest region, East Lake Taihu. These subregions had the same size with 36 GOCI pixels, and the arithmetic averages of all valid pixels within each subregion were obtained for further detailed analysis.

In Situ Measurements
The parameters measured from water samples in laboratory analysis included Chla, TSM, organic suspended matter (OSM), and inorganic suspended matter (ISM) concentrations. To obtain Chla data, the water samples were filtered on Whatman GF/C fiberglass filters, and then were extracted with ethanol (90%) at 80 °C. The Chla concentration was determined spectrophotometrically using the method of Lorenzen [30] and Chen et al. [31]. The concentrations of TSM, OSM, and ISM were determined gravimetrically. Water samples were filtered through precombusted Whatman GF/C fiberglass filters (550 °C for 4 h) to remove organic traces, and dried (105 °C for 4 h), and weighed according to Huang [32]. The filters were then recombusted at 550 °C for 4 h in order to remove the organic fraction, and weighed again to obtain the ISM concentration. By subtracting ISM from TSM, the OSM concentration was obtained.
For the collections of the field of Zsd in Lake Taihu, a standard Secchi disk, a circular white-andblack disk with 30 cm diameter, was used to measure Zsd at the sampling stations shown in Figure 1c. At each station, the depth at which the Secchi disk cannot be observed by the human eye of an observer when deployed in the water is recorded as a Zsd value.
Remote sensing reflectance (Rrs) spectra were determined from radiance measurements using a portable ASD FieldSpec spectroradiometer under suitable conditions of solar illumination. With a viewing field of 25°, ASD instrument has a sensitivity range from 350 to 1050 nm at an increment of 1.58 nm with 512 wavebands. The measurement of water surface reflectance spectra followed the method described by Mueller et al. [33]. When the boat was anchored, the radiance spectra of the reference panel, water, and the sky were measured 10 times, respectively. The spectra at each sampling station were averaged to derive Rrs(λ). The Rrs(λ) data were obtained as Equation (1). In the process of deriving Rrs(λ), the reflectance of skylight at the air-water surface (γ) was taken as 2.2% for calm weather, 2.5% for wind speed of up to 5 m/s, and 2.6%-2.8% for wind speed of about to 10 m/s [34], and the reflectance of the gray board (ρp) has been accurately calibrated to 30%.
where Lt, Lsky, and Lp are the radiance values measured from the water, sky, and reference panel, respectively.
For diffuse attenuation coefficient measurements, a hyperspectral irradiance sensor (TriOS GmbH) was used to simultaneously measure the downwelling irradiance from 320 to 950 nm with a spectral sampling of 3.3 nm. The observed depths were designed beforehand from just beneath water surface (0) to 1.4 m at a 0.2 m interval. In practical terms, the practical measured depths are dependent on water depths and penetration depths. When measuring, the weather is sunny and with no or little wind. Even so, we also tried our best to avoid water wave for reducing its effect on our measurement. Subsequently, abnormal spectral irradiance curves would be deleted when checking the collected data. On this basis, diffuse attenuation coefficients for downward irradiance (Kd(λ)) were obtained from the nonlinear regression of the underwater irradiance profile. Additionally, only Kd(λ) values with a fit of R 2 ≥ 0.95 were regarded to be accurate. The number of depths used in these fits was no less than three, depending on the penetration depth [35].
where Ed(λ, z) is downward irradiance at depth z at wavelength λ; Ed(λ, 0) is downward irradiance just beneath the water surface at wavelength λ.

Satellite Data
The geostationary ocean color imager (GOCI), the world's first geostationary ocean color satellite sensor, observes the Northeast Asian region eight times within a day from 08:15 am to 3:15 pm local time (temporal resolution of 1 h). In the visible region, GOCI contains six visible wavebands with central wavelengths of 412, 443, 490, 555, 660, and 680 nm and two NIR wavebands with central wavelengths of 745 and 865 nm, and has a spatial resolution of 500 m [18,36]. The GOCI satellite observations with fine temporal and spatial resolutions facilitate more detailed analysis (e.g., diurnal dynamics) for monitoring Zsd in our study area.
In this study, the GOCI Level-1B data from 2012 to 2018 were obtained from the Korea Ocean Satellite Center (KOSC, http://kosc.kiost.ac.kr/). The radiometric and geometric corrections were performed by using the GOCI Data Processing System (GDPS) version 1.3, and the remote sensing reflectance Rrs values were then obtained using the 6S atmospheric correction procedure. The hourly GOCI Zsd products were estimated from the hourly satellite Rrs data using the refined Zsd model in this study (see details in Section 2.3). However, it should be noted that the collecting time of the in situ measured dataset (the year 2010) was beyond the temporal coverage of GOCI observation available (since the year 2012); therefore, it was unlikely to obtain field-satellite matchup dataset for the satellite-derived Zsd validation. Alternatively, many previous studies have proved that the performance of the 6S atmospheric correction on the GOCI data for obtaining satellite Rrs(λ) information was encouraging, confirmed by the comparison between the matchup data of GOCI Rrs and the measured data in Lake Taihu (see Figure 3 in Huang et al. [37] study; Figure 3 in Du et al. [38] study).

Alogrithms for Zsd Estimation
Due to environmental and artificial factors, some probabilistic errors may be introduced into the Secchi depth measurement [1]. Lee et al. [26] reported that the fundamental problem is caused by the mismatch of the physical processes of the sighting of Secchi disk in water by human eyes [2]. Therefore, according to a novel underwater visibility theory, Lee et al. [26] developed a new semianalytical model for estimation of Zsd. The Lee_2015 model assumed that Zsd is inversely related to only the diffuse attenuation coefficient Kd parameter, as follows, where Min(Kd(λi)) is the minimum Kd within the visible wavelength region (410-665 nm); λi are 443, 490, 555, and 660 nm for GOCI sensor. Rrs,m is the remote sensing reflectance at the wavelength with the minimum Kd. Here, the in situ measurements of Kd and Rrs spectra were fed into the Lee_2015 model to obtain the Zsd,Lee retrievals. As shown in Figure 2a, the in situ spectra-derived Zsd,Lee data (black scatter) showed a significant correlation with the measured Zsd data, with higher R 2 values of 0.73 (p < 0.01). However, the Zsd,Lee retrievals did not agree with the measured Zsd, as evidenced by the relatively higher RMSE (0.955), MAE (2.6), and MAPE (58%) values. Luckily, the modelled Zsd,Lee data derived from the Lee_2015 model had some systematic error (black scatter in Figure 2a). Similar to previous studies [2], we used the linear correlation (as expressed in Figure 2a) to correct the bias of the estimated Zsd,Lee result (hereafter called Zsd). Thus, the refined Lee_2015 model can be expressed as Equation (4). As shown in Figure 2a, the accuracy of Zsd retrievals (red scatter) via Equation (4) was significantly improved, with the much lower RMSE value of 0.016, MAE value of 1.3, and MAPE value of 34%. They were in better agreement with the in situ measured Zsd than the Zsd,Lee retrievals ( Figure 2a). In addition, using the independent validation dataset (N= 14), the refined Zsd model for Zsd estimation were examined by comparison with the measured Zsd and the estimated Zsd data ( Figure 2b). The in situ spectra estimated Zsd were consistent with the measured data, with higher R 2 values of 0.85 and lower MAPE values of 23%. Most of the samples fell within a ± 10% fraction range.  In the Lee_2015 model, the Kd retrieval is the first and key step to derive Zsd. Previous studies indicated that there are good linear relationships between Kd(490) and Kd at other wavebands (e.g., Figure 4 in Mao et al. [10]). Therefore, the key to Zsd estimation is the calculation of Kd(490). In the current study, we examined the performance of two published Kd(490) algorithms (the semianalytical algorithm proposed by Lee et al. [39] and the standard Kd(490) algorithm by NASA Oceancolor; see https://oceancolor.gsfc.nasa.gov/atbd/kd_490/) for Kd(490) estimation in Lake Taihu. Comparison of the measured Kd(490) with the estimated Kd(490) values uisng the two published Kd(490) algorithms, respectively indicated that the semi-analytical Kd(490) algorithm and the standard Kd(490) algorithm had poor performance with low estimation accuracy in our study region (see details in Section 4.1). In our study, Kd(490) were quantified from Rrs data using a multivariable linear relationship, as follows: where  ,680)). The details of why and how to achieve Kd(490) from Rrs(λ) using Equation (5) can be found in Section 3.2. Therefore, once the accurate in situ or at-satellite Rrs data are available, Kd at 490 nm and other visible wavebands are inferred, and thereby Zsd can be derived using Equation (4).

Statistical Analysis and Accuracy Evaluation
Statistical description including minimum, maximum, mean, standard deviation (SD), correlation analysis, and regression analysis were performed using the MATLAB software. Additionally, to quantitatively assess the performance of the estimation algorithms, correlation coefficient (R), determination coefficient R 2 , mean absolute percentage error (MAPE), and root mean square error (RMSE), bias, and mean absolute error (MAE) were calculated between the estimated and measured data [40], according to the following formula: where n is the number of samples; yi is the measured value; yi ' is the estimated value.

Limnological Conditions
The variations of water quality parameters in Lake Taihu, including concentrations of Chla (mg/ m 3 ), TSM (mg/l), OSM (mg/l) and ISM (mg/l), and the constituent ratios including ISM:TSM and OSM:TSM, were shown in Table 1. In the collected dataset, the Chla covered a large range of 0.55-349.75 mg/m 3 with a mean value of 25.54 mg/m 3 (SD = ± 43.32 mg/m 3 ). TSM concentration varied from 3.40 to 132.33 mg/l. Note that ISM accounts for a larger part in TSM than OSM, and the mean ISM:TSM and OSM:TSM ratios are 0.73 and 0.24, respectively. This result indicated that the suspended particulates in Lake Taihu are mainly subject to sediment resuspension, which undoubtedly and significantly reduce the vertical visibility of waters. Meanwhile, a very close linear regression relationship between ISM and TSM were fitted with the high determination coefficient (R 2 = 0.93); while OSM has no very close correlation with TSM (data not shown). The Kd(490) ranged from 1.27 to 9.52 m -1 with a mean of 4.72m -1 (SD = 1.99 m -1 ). 1.99 Here, the shape and coefficient of variation (CV, defined as the ratio of SD to mean) of in situ Rrs(λ) and Kd(λ) spectra in Lake Taihu were analyzed to better show the variability in limnological conditions ( Figure 3). As shown in Figure 3a, the obtained Rrs(λ) spectra in Lake Taihu were similar to those found in other turbid productive waters [41][42][43]. The reflectance peaks in the vicinity of 570 and 700 nm were very clear, when there was a reflectance trough near 675 nm caused by Chla absorption. The in situ Rrs(λ) spectra of all samples showed low variability in magnitude with CV lower than about 20% for < 700 nm and high CV values (25%-50%) for > 700 nm. The Rrs(λ) spectra were generally characterized by low values in the short wavelength domain (approximately between 400 and 500 nm). This is directly related to the high absorption of colored dissolved organic matter and suspended particles (including phytoplankton and non-algal particles). The relatively mild spectral variations usually emerge in the long wavelengths domain between about 740 to 800 nm, which are mainly affected by particulate backscattering and pure water absorption.
It can be seen from Figure 3b that the Kd(λ) has more spectral variations. For all samples, CV varied from 20% to 45% with relatively high CV between 450 and 700 nm. At the short wavelengths, the Kd(λ) spectra presented a rapid decrease from 400 to about 580 nm. In the spectral domain of 580-700 nm, the Kd values showed a small peak emerges near 675 nm. There were some Kd(λ) troughs in the vicinity of 700 nm, and subsequently a sharp increase emerges from 700 to about 750 nm. Interestingly, note that the Kd(λ) spectra from 700 nm to near infrared (NIR) were very similar to that of pure water absorption. This is because the non-water constitutes (CDOM, non-algal particles, and phytoplankton) all have quite weak absorption in this spectral range, as well as smooth particulate backscattering spectra. Generally speaking, Kd(λ) at the short wavebands has a larger variable range than that at the long wavebands.

Performance of the Estimated Kd(λ) Algorithm
As mentioned above, the key for Zsd estimation using Lee_2015 model is the calculation of the diffuse attenuation coefficient Kd at 490 nm. Many previous studies have demonstrated that Kd can be directly obtained from satellite remote sensing data (e.g., water-leaving radiance and Rrs) using either empirical or semi-analytical algorithms [39,44,45]. Therefore, for Lake Taihu, a regional empirical Kd(490) algorithm was developed here to infer Kd(490) directly from Rrs data.
In this study, many empirical models (using single band, band ratios, and band combinations) were established to investigate the relationship between Kd(490) and Rrs(λ) [46,47], considering that empirical algorithms can be simple and established efficiently. In addition, empirical models using different forms of bands are also widely designed for water quality parameters estimation such as TSM and turbidity [48,49]. Here, seven forms of Rrs(λ) were tested, including single bands, band ratios in linear and logarithmic space, and different-bands combinations ( Table 2). For each form of Rrs(λ), all of band combinations from eight GOCI bands were trained and tested using various functions (linear, power, quadratic, and cubic polynomial functions) based on the field dataset of Rrs(λ) and Kd(490) collected in Lake Taihu. The optimal band combinations using the linear function with the highest R 2 were provided after comparison, as shown in Table 2. X5 = ln(Rrs,555)/ln(Rrs,745) displayed with the best correlation with Kd(490), with R 2 value of 0.56. However, the band combinations of ln(Rrs,555)/ln(Rrs,745) did not have the higher performance as we expected. Thus, to further improve the retrieval accuracy of Kd(490) with a reasonable error, this study used ln(Rrs,λi)/ln(Rrs,λj) forms of GOCI multiband with positive correlations to develop the Kd(490) estimation algorithm.  Figure 4 showed variations of the R values between the X5 functions ln(Rrs,λi)/ln(Rrs,λj) and Kd(490) in the spectral range of 400-800 nm. The wavelengths of positive correlations were basically consistent with that of negative correlations. After comparison of these results, the maximal positive R value appears to be 0.748 (p < 0.01) with λi = 555 and λj = 745 nm, while the maximal negative R value is -0.744 (p < 0.01) with λi = 745 and λj = 555 nm. Additionally, we examined the correlations between Kd(490) and all possible combinations of ln(Rrs,λi)/ln(Rrs,λj) from eight bands. Eight band ratios all had relatively high and significant correlations with Kd(490). Here, Table 3 only summarized the six combinations of the X5 functions with the relatively high R 2 values. Considering the better accuracy assessment of multivariate regression model, we designed the multivariate linear relationship (Equation (5)) with different quantitative variables using the band combination forms above for Kd(490) estimation in the Lake Taihu waters.   (5)) was fitted to the independent in situ dataset of Rrs(λ) and Kd(490) (N = 46) to obtain the regression coefficients using a nonlinear least-squares fitting procedure (Figure 5a). The established parameters (i.e., wi and C) were presented in the explanation of Equation (5). Figure 5a showed the strong agreement between the in situ Rrs-derived Kd(490) and measured results, with R 2 , RMSE, MAE, and MAPE values of 0.73%, 0.16%, 1.2%, and 21%, respectively. The samples were close to the 1:1 line, and almost all of the samples fell within the ± 20% fraction range. Meanwhile, the performance of the multivariate linear model was evidently improved and better than those of the X5 functions with a single variable shown in Table 3. To validate the performance of Kd(490) estimation algorithm, Equation (5) with the established coefficients were applied to the other independent Rrs(λ) dataset (N = 23) to estimate the Kd(490) data. Figure 5a showed the comparison of obtaining Kd(490) retrievals with in situ measured data in Lake Taihu. Most of the samples were clustered around the 1:1 line within the ± 10% fraction range, although a slight overestimation of Kd(490) was found in three samples only. The R 2 , RMSE, MAE, and MAPE values were 0.46%, 0.3%, 1.2%, and 26%, respectively. Overall, the results of Figure 5 suggested that the multivariate linear model designed in this study for Lake Taihu waters is able to accurately obtain the Kd(490) from remote sensing reflectance Rrs. As mentioned above, the calculation of Kd values of the visible bands is the important step to estimate Zsd. Following the previous studies [10,50], the diffuse attenuation coefficient spectra Kd(λ) were calculated using simple linear models, by considering the good relationship between Kd(490) and Kd(λ) in other visible wavelengths. Therefore, the independent simple linear models were developed for estimating Kd(λ) at 443, 555, and 660 nm in Lake Taihu, respectively (Figure 6a). The corresponding model expression that determined parameters and associated R 2 values for each of the linear models were also shown in Figure 6a. It was found that Kd(490) was significantly related to the Kd(λ) at 443, 555, and 660 nm, with both R 2 values above 0.97. Using 23 measurements of Kd(λ) from the independent validation dataset, we assessed the performance of Kd(λ) estimation models by comparison against the in situ measured Kd(λ) at other GOCI wavebands, as shown in Figure 6b. It can be seen that the retrievals of Kd(443), Kd(555), and Kd(660) were both in greet agreement with the measured values, respectively, and their scatter distributions were closer to the 1:1 line. According to the statistical indicators (Figure 6b), all the Kd(λ) estimation models shown in Figure 6a had good performance. Their R 2 values were all above 0.97, and MAPE values were all below 5.0%. Additionally, based on our field dataset collected in Lake Taihu, we built a series of linear models with very high determination coefficients (R 2 ≥ 0.91) for estimating Kd(λ) in the spectral range of 400-800 nm (the detailed coefficients and R 2 data were not shown here). Here, two sampling stations (station S6 and S19) were selected to better display the comparison of the measured and modelled Kd(λ) spectra, as shown in Figure 6c. We found that the good fit emerged between the measured and modelled Kd(λ) spectra, while noting that some underestimations appeared in the spectral range of 750-800 nm for station S6. More positively, the good agreement between the measured and modelled Kd(λ) were observed at 443, 490, 555, and 660 nm, the wavelengths used for the Zsd calculation. These comparison results above suggested that the Kd(443), Kd(555), and Kd(660) can be accurately retrieved using the simple linear models in our study region.
Overall, the results of Figures 5 and 6 indicated that the two developed methods for the estimation of Kd(490) and Kd(λ) had reasonable simulations, and were suitable for characterizing the spectral variations of Kd(λ) in the Lake Taihu waters.

Validation of Zsd Estimation with Independent Measurements
Using the independent validation dataset (N = 23) with the related in situ measured Rrs and Zsd data, we examined our regionally tuned Zsd model (i.e., Equation (4)) for estimating Zsd results in the Lake Taihu waters. First, we used the multivariate linear model (Equation (5)) to retrieve Kd(490) from in situ measured Rrs data, and subsequently used the simple linear models (three equations in Figure  6a) to obtain Kd(λ) at 443, 555, and 660 nm. Then, the Zsd values were inferred from the retrieved Kd(λ) using Equation (4), as shown in Figure 7. The in situ Rrs-derived Zsd values were consistent with the measured data, and most of the sampling points fell with ± 10% fraction range. According to Figure  7 and the statistical indicators (R 2 = 0.72, RMSE = 0.06, MAPE = 36%), our refined Zsd model had a good performance on remote sensing reflectance Rrs and yielded reasonable Zsd inversion. Therefore, we further investigated the spatiotemporal variability of the Zsd in Lake Taihu based on satellitederived Zsd products from GOCI satellite Rrs data.

Interannual Variability of Zsd in Lake Taihu
To investigate the interannual variability of Zsd in Lake Taihu, the refined Lee_2015 model as described in Section 2.3 was applied to the hourly GOCI Rrs data between 2012 and 2018 to yield the hourly satellite-derived Zsd products. Then, the mean values over a seven-year period and monthly climatology of Zsd for Lake Taihu between 2012 and 2018 were estimated for further analysis as shown below.

Spatial and Variability Distributions of Zsd
The spatial and variability distributions of Zsd in Lake Taihu were generated by the temporal mean and SD values during 2012 -2018, respectively ( Figure 8). In general, water transparency values were high in the northeast part of Lake Taihu, while those were low in the southwest part ( Figure  8a). The highest water transparency values (about 0.90 m) were found in East Lake Taihu. As shown in Figure 8b, the spatial pattern of the variability of water transparency in Lake Taihu was generally similar to the pattern of the mean value. The high variability values of water transparency (0.25-0.30 m) were mainly distributed in the northeast and central regions, whereas these low variability (< 0.12 m) values were generally observed in the eastern and southern Taihu Lake.

Monthly Climatological Zsd
The monthly mean Zsd of Lake Taihu from 2012 to 2018 (i.e., monthly climatology) were obtained and were demonstrated in Figure 9. Monthly variation patterns of Zsd indicated that the water transparency had large temporal variability in the entire Lake Taihu. The temporal dynamic of water transparency in Lake Taihu resulted in a depletion process from May to October and an increase from November to April. The Zsd values during the winter and early spring (November to April) (with most of the values > 0.6 m) were significantly higher than those from late spring to early fall (May to October) (with most of the values < 0.4 m). The mean water transparency of the entire lake varied from 0.33 ± 0.16 m in October to 0.93 ± 0.21 m in April during the whole year.

Regional Difference in the Monthly Climatological Zsd
In this current study, eight different subregions were selected to represent the central and typical bays of Lake Taihu (Figure 1c), in order to investigate these temporal variation patterns more clearly. As shown in Figure 10, although some regional differences were observed, water transparency of the most subregions showed peaks in spring and winter. Specifically, in the Zhushan Bay, west region, and Xukou Bay (Figure 10a, d, and f), the temporal variation patterns of water transparency were generally similar, showing two peaks which were in April and November, respectively. The maximum water transparency of these subregions was observed in April, with values of about 1.07 ± 0.04, 0.92 ± 0.08, and 1.27 ± 0.10 m, respectively; and the minimum values that occurred in September or August were about 0.27 ± 0.14, 0.32 ± 0.07, and 0.43 ± 0.10 m, respectively. The East Lake Taihu (Figure 10h) had a similar temporal variation pattern with those observed for regions in the Zhushan Bay, west region, and Xukou Bay, while the inter-annual variability was quite large, especially during late spring and summer. For the Meiliang Bay and Gonghu Bay (Figure 10b and c), water transparency also showed low values from late spring to early fall and high values during late fall to early spring; while the first peak occurred in March which was slightly different from those observed for regions in the Zhushan Bay, west region, and Xukou Bay. For both the Meiliang Bay and Gonghu Bay, the maximum water transparency was observed in March with values of about 1.25 ± 0.11 and 1.28 ± 0.09 m, respectively, and the minimum water transparency appeared in September with values of 0.24 ± 0.04 and 0.24 ± 0.05 m, respectively (Figure 10b and c). Different from these subregions above, the temporal variation pattern of water transparency in the central region and southwest region (Figure 10e and g) were generally irregular, in which the water transparency respectively varied from 0.35 ± 0.10 to 1.04 ± 0.08 m and from 0.30 ± 0.02 to 0.78 ± 0.03 m during the whole year without clear peaks or troughs.

Satellite Application of the Modified Zsd Algorithm
The water clarity which is widely quantified as Zsd, is an important parameter for monitoring water quality. It is also used for deducing some inherent properties for ocean, such as Chla and primary productivity. Fortunately, the application of satellite remote sensing could obtain optical property parameters in a wide range, and thereby enabled us to complete the derivation and inversion of Zsd. The novel Zsd algorithm (Equation (3)) was designed by Lee et al. [26] based on the new underwater visibility theory, and it has been verified with high accuracy in many regions [10,26,51]. However, it was unsatisfactory when applying in Lake Taihu with its low accuracy ( Figure  2a). This is possibly caused by the systematic errors introduced from those in situ measurements on the optical parameters, yet it needs to be finely studied in subsequent works. Therefore, in this study we refined the Lee_2015 model for deriving Zsd in Lake Taihu from remote sensing reflectance (Equation (4)). The refined Lee_2015 model could yield the reasonable Zsd retrievals using both the measured Kd ( Figure 2) and the estimated Kd data from Rrs data (Figure 7).
The successful application of the refined Lee_2015 model to satellite remote sensing reflectance depends largely on the estimation accuracy of Kd from Rrs. Many empirical or semi-analytical algorithms have been designed for obtaining Kd directly using remote sensing observations [10,39,44]. Here, we attempted to use the semi-analytical algorithm proposed by Lee et al. [39] and the standard algorithm by NASA Ocean Color for estimating Kd(490) from the Rrs data in Lake Taihu, respectively (the detailed description of the semi-analytical algorithm and standard Kd(490) algorithm can be found in Lee et al. [39] and https://oceancolor.gsfc.nasa.gov/atbd/kd_490/; not described here). The modelled Kd(490) by Lee et al. [39] with original coefficients had relatively large deviations from the measured data, with MAPE = 41.7% (figure not shown here). The low accuracy of the Kd(490) would thereby introduce and affect the estimation accuracy of the satellite-derived Zsd, which suggest that the semi-analytical method proposed by Lee et al. [39] for estimating Kd(490) is not very suitable in our investigated water column. Additionally, more steps in the process of using the semi-analytical models are objectively existent and also unavoidable presently, which may accumulate and propagate errors, and finally cause a poor performance of models. Meanwhile, the standard Kd(490) algorithm also had poor performance with lower accuracy in the Lake Taihu waters (MAPE = 91%), which was evident in comparison between the measured Kd(490) and the derived Kd(490) values (figure not shown here). Therefore, in this current study, the empirical methods (Equation (5) and Figure 6a) developed by the local dataset the was employed to successfully estimate Kd(490) and Kd(λ) at other GOCI bands using the Rrs data ( Figure 6), which were thereby used to obtain reasonable Zsd retrievals (Figure 7) in Lake Taihu. Therefore, we presently believe that our proposed models are a tried and true approach for the Kd estimation in Lake Taihu waters.
Although the high accuracy of water transparency can be obtained with our refined Zsd model utility in Lake Taihu, we get the limitation of our model as the other empirical model, which is largely dependent on the quality and integrity of the dataset for model development. In addition, we only get the empirical relationship between Kd and Rrs without considering the specific physical process. Note that although a certain amount of the local field data was used in our study, most of them were historical data. Meanwhile, our study region suffered from a paucity of plenty recent data to get better accuracy for the model, and thus there was also no assessment of the credibility of GOCI satellite-derived Zsd. Therefore, further investigations are still required to verify the satellite-derived product and to evaluate the applicability and validation of the Kd model and the refined Zsd model in other regions, when a large amount of near-by measured data are available in the future.

Spatial and Temporal Variations of Zsd in Lake Taihu
When applying the refined Zsd model to GOCI long-term data during 2012-2018, clear spatial and temporal variations in water transparency were observed in Lake Taihu (Figures 8, 9 and 10). It is well known that Lake Taihu is a typical shallow and turbid lake, with a maximal depth of less than 3 m and an average depth of only 1.9 m [52]. For such shallow waters in Lake Taihu, the optical properties (such as light absorption, scattering, attenuation, and thereby water clarity, etc.) have been found to be mainly regulated by the total suspended matter, especially the inorganic suspended matter [53,54]. The spatial distribution of water transparency derived from GOCI data in this study generally showed high values in the northeast part while low values in the southwest part during the whole year (Figures 8 and 9). This distribution pattern might be related to changes in TSM concentrations that are driven by wind. In the southwest and central regions of Lake Taihu, due to the shallow water depth, the wind forcing often can drive strong water mixing and therefore sediment re-suspension as suggested by previous studies [27,55]. This process further causes increase in TSM concentrations, which then results in strong light attenuation and low water transparency [28,56]. However, in the northeast part of Lake Taihu, especially the East Lake Taihu, the submerged aquatic vegetation is usually abundant, which has functions of wave abatement, water filtration, and water purification. This process may significantly reduce the sediment resuspension, and causes low TSM concentrations, thereby results in high water transparency levels. In addition, river discharge which may bring a large amount of suspended matters in the southern part of the Lake Taihu is also a possible factor responsible for the low water transparency there.
The temporal variation pattern of GOCI-derived water transparency in most regions of Lake Taihu generally showed high values during spring and summer but low values during autumn and winter (Figures 9 and 10). This is generally consistent with the findings of Shi et al. [56] who found that light attenuation in Lake Taihu are high in spring and summer and low in autumn and winter. Regarding the driving factors responsible for the temporal variation pattern of the water transparency in Lake Taihu, the seasonal variability in wind speed may be an important one, which has the maximum values in spring and the minimum values in winter, respectively [56]. The other important factor may be attributed to the algal growth. Previous studies have demonstrated that algal blooms have frequently occurred in most regions of Lake Taihu during late spring and summer [57][58][59]. These blooms can significantly increase the light attenuation and therefore cause low water transparency.
The spatial and temporal distributions of water transparency obtained in this study may provide useful information for better understanding the water dynamics, ecosystem, and environment etc., in Lake Taihu; while regarding the driving mechanisms of these variation patterns which are usually complex, this study only conducted preliminary investigations. Detailed studies specifically focusing on this topic is still required in the future based on a complete dataset composed of both satellite data and in situ measurements of water biogeochemical and environmental parameters, such as phytoplankton abundant, light field, and water mixing, etc. In addition, only seven years old satellite data (2012-2018) were used in this study due to the limited temporal coverage of GOCI observation. In the future study, the climatological water transparency derived from multiple ocean color satellite dataset with longer time scale may yield more reasonable spatial and temporal variation patterns of water transparency in Lake Taihu.

Future Implications for Aquatic Environmental Changes
The use of GOCI in our study was largely because of very high temporal resolution (eight hourly revisits per day) and good spatial resolution with 500 m. These characteristics can provide a sufficient amount of valid satellite data for studying water transparency in Lake Taihu. However, GOCI satellite observations were only available since 2012. Alternatively, the MERIS, MODIS, and Landsat series measurements could be used for similar application to improve the satellite-derived products with a higher spatial resolution or long-term data archive (2002-present). Moreover, the satellite imageries with higher spatial resolution (such as Landsat8-OLI of 30 m, GF-WFV of 16 m) are more advantageous to estimate water transparency in small inland waters [60]. In addition, the Zsd estimation model presented in this study could also be applied to the airborne hyperspectral sensors of remote sensing, such as the compact airborne spectrographic imager (CASI), the airborne visible/infrared imaging spectrometer (AVIRIS), and the NASA HYPERION sensor. However, it may require a cross-sensor calibration between different satellite sensors or regional adjustments of the model coefficients due to the difference in spectral characteristics from one sensor to another [17].
Spatiotemporal variability in water clarity is a good indicator for water quality and aquatic environmental changes [10]. For instance, record-breaking algal blooms have frequently occurred in the inland waters, which are affected by a combination of human activities and climate changes [61,62]. The occurrence of algal blooms is largely sensitive to the accumulation of nutrients, due to pollutant discharges, cultivation, and runoffs [63,64]. Therefore, using synoptic satellite observation of the historical and updated information, more reliable products (such as Chla and water transparency in the current study) could contribute to our understanding the aquatic environmental changes and its response to natural (such as, SST, precipitation, and climate changes) and anthropogenic forcing (such as agricultural and industry activities).

Conclusions
In the present study, a novel semi-analytical model (Lee_2015 model) for the water transparency estimation were linearly corrected to improve the regional retrieval accuracy in Lake Taihu, using a local field dataset collected from the Lake Taihu waters. The refined Lee_2015 model showed the reliable performance of the Zsd estimation from remote sensing reflectance with almost all of the samples falling within the ±10% fraction range, which demonstrated the potential of satellite ocean color data to estimate Zsd distribution from space in shallow and productive lake waters (e.g., the Lake Taihu). Long-term GOCI Zsd products showed that the Zsd patterns in Lake Taihu were varied across both temporal and spatial scales. The interannual variations of Zsd in Lake Taihu was likely to be related to wind, water temperature, riverine discharge, and human activity. The water transparency results presented in this study provide critical datasets and basic knowledge for further water quality assessment and monitoring. However, further studies on the validation of satellitederived products and the mechanisms of the Zsd variations in Lake Taihu are still required due to the lack of abundant in situ measurements for support.