Inversion and Monitoring of the TP Concentration in Taihu Lake Using the Landsat-8 and Sentinel-2 Images

Abstract: Eutrophication is a significant factor that damages the species balance of water ecosystems. The total phosphorus (TP) concentration is a vital water quality indicator for assessing surface water eutrophication. This paper predicts the spatial distribution of TP concentration using remote sensing, measured data, and the partial least squares regression (PLSR) method. Based on the correlation analysis, the models were built and tested using the TP concentration and Sentinel-2 Multispectral Instrument (MSI) and Landsat-8 Operational Land Imager (OLI) image spectra. The results demonstrated that the best models, based on band combinations of the Sentinel-2 and Landsat-8 images, achieved good precision. The coefficient of determination (R²), root mean square error of prediction (RMSEP), and residual prediction deviation (RPD) were 0.771, 0.023 mg/L, and 2.086 for Sentinel-2 images and 0.630, 0.032 mg/L, and 1.644 for Landsat-8 images, respectively. The TP concentration maps were interpolated using the inverse distance weighting method, and the inversion results obtained from the images were in good agreement. The western and northwestern regions of Taihu Lake, where significant cyanobacterial blooms occurred, had TP concentrations greater than 0.20 mg/L, whereas the central and eastern regions had concentrations ranging from 0.05 to 0.20 mg/L. To prove the extensibility of the model, the optimal algorithm was applied to the Sentinel-2 and Landsat-8 images in 2017. The optimal algorithm based on Landsat-8 images has a better verification effect (RMSEP = 0.027 mg/L and R = 0.879 for one Landsat-8 image), and the optimal algorithm based on Sentinel-2 images has a moderate verification effect (RMSEP = 0.054 mg/L and 0.045 mg/L, and R = 0.771 and 0.787 for two Sentinel-2 images). The interpolation and inversion maps are in good agreement, indicating that the model is suitable for the Landsat-8 and Sentinel-2 images, which can be complementary for higher temporal resolutions. Monitoring water quality using multiple remote sensing images can provide the scientific basis for water quality dynamic monitoring and prevention in China.


Introduction
Water is a vital foundation for human survival and social development, and freshwater lakes are critical natural resources that serve as drinking water sources, tourism destinations, and habitats within the biosphere [1,2]. However, pollution emissions have increased with rapid economic development and the growing intensity of land development and human activities [3]. Lakes face severe problems, including shrinking lake areas and eutrophication [4]. Water eutrophication has been identified as a significant threat to freshwater resources since the 1960s [5], and 54%, 53%, 48%, 41%, and 28% of the lakes in Asia, Europe, North America, South America, and Africa, respectively, are eutrophic [6].
Studies found that water quality remote sensing monitoring methods have developed through multivariate data analysis [34-36], machine learning [12,31,37], and integrated methods. Machine learning methods work well in remote sensing studies of water quality parameters, but accurate models require many training samples. The partial least squares (PLS) method was developed in the 1960s by Herman Wold [38] for constructing predictive models when many factors are collinear. Partial least squares regression (PLSR) is a multivariate statistical analysis method for regression modeling with multiple dependent variables and multiple independent variables. Its basic algorithm finds linear combinations of the original regression variables through successive least squares fits [38]. The method has been used in water quality monitoring, for example for the Chl-a concentration of Shitoukoumen Reservoir in China [39] and the TP concentration of Morse Reservoir in the USA [40].
The objectives of this study are to (1) construct the PLSR model by combining the measured TP concentration with Sentinel-2 and Landsat-8 multispectral images, (2) compare different maps of interpolated TP concentration with the retrieved results of remote sensing images to determine the spatial and temporal distribution of TP concentration in Taihu Lake, and (3) discuss the related factors of the TP variations. Using the results of TP remote sensing inversion to accurately depict the fluctuations in eutrophication levels and the status of water quality conditions in Taihu Lake, the study also aims to offer environmental managers vital information.

Study Site and Experimental Design
Taihu Lake (30°55′-31°33′ N, 119°52′-120°37′ E) is mainly located in the southern Jiangsu province and covers parts of the Jiangsu and Zhejiang provinces, including some cities such as Wuxi, Suzhou, Changzhou, and Huzhou. The water area, total shoreline length, average water depth, maximum water depth, and total water storage of Taihu Lake are 2428 km² (the water surface area is 2338 km², and the island area is 90 km²), 393.2 km, 1.90 m, 3.34 m [41], and 4.40 × 10⁹ m³ [35], respectively [19]. Most of the runoff comes from the southwestern mountainous watershed and flows out at the east side of the lake after regulation and storage [5]. It has a subtropical monsoon climate, with an average temperature of 14.90-16.20 °C and an annual precipitation of 1000-1400 mm [13,42].
Twenty sampling sites were evenly spaced throughout Taihu Lake for this investigation, with no sampling sites in east Taihu Lake (Figure 1). Water samples were collected from monitoring sites monthly from January to December 2016, totaling 240 samples. Some of the measured data used in this study are shown in Table S1. The water samples were collected using 500 mL plastic bottles and stored at 4 °C before subsequent laboratory analysis.

Water Quality Parameters and Environmental Factors
The water quality monitoring process complies with the China Inspection Body and Laboratory Mandatory Approval (CMA) metrology certification. The water samples were filtered through a 0.45 μm membrane and left to settle for 30 min. The supernatant was taken by a siphon into sampling bottles and stored with an H2SO4 fixative to bring the pH of the sample to ≤ 1. The water quality parameters measured include TP, total nitrogen (TN), ammonia nitrogen (NH3-N), permanganate index (CODMn), biochemical oxygen demand (BOD5), chemical oxygen demand (COD), dissolved oxygen (DO), and potential of hydrogen (pH). The detection methods of all water quality parameters are based on the environmental quality standards for surface water (State Environmental Protection Administration, 2002). All parameters except for pH are expressed in milligrams per liter (mg/L). All water quality indicators are assessed according to the standard Chinese method, Standard Methods for the Examination of Water and Wastewater Editorial Board, published in 2002. The data on the TP concentration in 2017 were taken from a previously published paper [43], with 15 sampling sites in Taihu Lake. Twelve groups of TP concentration data, one per month, were extracted from that paper. The monthly precipitation and temperature data in 2016 were provided by Loess Plateau Science Data Center, National Earth System Science Data Sharing Infrastructure, and National Science & Technology Infrastructure of China (http://loess.geodata.cn (accessed on 27 November 2021)). The spatial resolution of the precipitation and temperature data is 0.008333°. The average precipitation in August was 57.695 mm, and that in September was 238.87 mm. The downloaded temperature and precipitation data were vector files and were processed with ArcGIS 10.3 software.

Remote Sensing Images Collection and Correction
The Landsat-8 and Sentinel-2 images were analyzed and compared in this research. Landsat-8, which carries two sensors, the OLI and the Thermal Infrared Sensor (TIRS), was launched in 2013. The OLI has eight multispectral bands with a 30 m spatial resolution and a panchromatic band with a 15 m spatial resolution [44]. The MSI optical sensor on Sentinel-2 includes 13 bands with three different spatial resolutions: the visible and Near Infrared (NIR) bands with a 10 m resolution; the red edge, Narrow NIR, and Shortwave Infrared (SWIR) bands with a 20 m resolution; and the coastal aerosol, water vapor, and SWIR-Cirrus bands with a 60 m resolution [30]. Band1 to Band7 of the Landsat-8 image were selected as the bands corresponding to Band1, Band2, Band3, Band4, Band8, Band11, and Band12 of the Sentinel-2 image. The Landsat-8 image (Path: 119; Row: 038) and the Sentinel-2 image (Relative Orbit Number R089) were from the USGS Landsat Look Viewer https://landlook.usgs.gov/landlook/viewer.html/ (accessed on 4 September 2020) and the Copernicus Open Access Hub https://scihub.copernicus.eu/ (accessed on 21 November 2020), respectively. Two Landsat-8 images acquired on 27 July and 13 September 2016, along with two Sentinel-2 images acquired on 23 July and 1 September 2016, were chosen for further analysis because their acquisition times were close to the sampling times. One Landsat-8 image acquired on 30 July 2017, along with two Sentinel-2 images acquired on 28 July and 27 August 2017, were chosen for verification. The time interval between the images and sampling sites is shown in Table S2.
The pre-processing of the Landsat-8 images included radiance calibration and atmospheric correction through the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) model [45], which removes atmospheric effects and converts the images to surface reflectance [46]. The sensor height, surface elevation, and atmospheric model parameters for atmospheric correction were similar to those in other studies [45,47]. The mean value and standard deviation of the corrected image were smaller than those of the pre-corrected image, but the signal-to-noise ratio of the corrected image was larger than that of the pre-corrected image. The pre-processing of the Sentinel-2 images consisted mainly of atmospheric correction and resampling. The raw Sentinel-2 images (L1C level) were atmospherically corrected by the Sentinel-2 toolbox, Sen2Cor v2.5.5, developed by the European Space Agency (ESA), and converted to reflectance data (L2A level). After all Sentinel-2 images were atmospherically corrected, Band 10 became unavailable. All bands of the atmospherically corrected images were resampled by nearest-neighbor interpolation to a 10 m resolution using the Sentinel Application Platform (SNAP) software provided by ESA and were saved in the Environment for Visualizing Images (ENVI) standard format.
The image spectra of the 20 samples were extracted by ArcGIS 10.3 based on the coordinates of the samples. The correlation between the image spectra (single-band reflectance and band combination reflectance) and the measured TP concentration was analyzed, and sensitive bands were selected based on the correlation coefficients.
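The extraction of image spectra at sampling coordinates can be illustrated with a minimal sketch. The study used ArcGIS 10.3 for this step; the code below is not that workflow. It assumes a north-up raster whose upper-left corner and pixel size are known, and the names `pixel_index` and `sample_spectrum` are hypothetical helpers introduced only for illustration.

```python
import math


def pixel_index(x, y, origin_x, origin_y, pixel_size):
    """Map a map coordinate to (row, col) in a north-up raster.

    Assumption: (origin_x, origin_y) is the upper-left corner of the raster
    and pixels are square with side `pixel_size` (same units as x and y).
    """
    col = math.floor((x - origin_x) / pixel_size)
    row = math.floor((origin_y - y) / pixel_size)
    return row, col


def sample_spectrum(bands, x, y, origin_x, origin_y, pixel_size):
    """Return {band name: reflectance} for the pixel containing (x, y).

    `bands` maps a band name to a 2D list (rows of reflectance values);
    all bands are assumed to share the same grid.
    """
    row, col = pixel_index(x, y, origin_x, origin_y, pixel_size)
    first = next(iter(bands.values()))
    if not (0 <= row < len(first) and 0 <= col < len(first[0])):
        raise ValueError("sampling site lies outside the image")
    return {name: grid[row][col] for name, grid in bands.items()}


# Hypothetical 2 x 3 reflectance grids for two bands
bands = {
    "B4": [[0.10, 0.11, 0.12], [0.20, 0.21, 0.22]],
    "B8": [[0.30, 0.31, 0.32], [0.40, 0.41, 0.42]],
}
spectrum = sample_spectrum(bands, 125.0, 185.0, 100.0, 200.0, 10.0)
```

In practice a GIS package handles projections and data formats; the sketch only shows the coordinate-to-pixel lookup behind such an extraction.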

Model Simulation and Assessment
The PLSR model was chosen to retrieve the measured TP concentration based on the image spectra. PLSR is a method developed in the 1960s by Herman Wold for constructing predictive models when the factors are many and highly collinear [38]. The PLSR model was developed with sensitive bands serving as the independent variables and measured concentrations serving as the dependent variable. The model coefficients were acquired using the MATLAB (R2018a, USA) program during the modeling phase. The accuracy and validity of the model were evaluated based on the validation results. The validation parameters of the PLSR model were the root mean square error of prediction (RMSEP), the coefficient of determination (R²), and the residual prediction deviation (RPD).
The model with a lower RMSEP and a higher R² and RPD is suitable for prediction. The R² is an essential factor in evaluating the prediction model with the following classifications: excellent prediction (R² > 0.90), good prediction (R² = 0.82-0.90), approximate quantitative prediction (R² = 0.66-0.81), prediction that can possibly distinguish between high and low values (R² = 0.50-0.65), and unsuccessful prediction (R² < 0.50) [48]. In addition, the model has poor results and cannot obtain accurate values when the RPD ≤ 1.5, the model is considered moderately effective when the RPD > 1.5 and the RPD < 2, and the model has an excellent predictive ability when RPD ≥ 2.
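The three validation parameters can be sketched in a few lines of Python. The exact formulas used in the study are not reproduced in the text, so the definitions below are assumptions based on common usage: RMSEP as the root of the mean squared prediction error, R² as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the measured values divided by the RMSEP. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def rmsep(measured, predicted):
    """Root mean square error of prediction (assumed definition)."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    m_bar = mean(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - m_bar) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rpd(measured, predicted):
    """Residual prediction deviation: SD of measured values over RMSEP (assumed)."""
    return stdev(measured) / rmsep(measured, predicted)


def rpd_class(value):
    """RPD categories as stated in the text above."""
    if value <= 1.5:
        return "poor"
    if value < 2:
        return "moderate"
    return "excellent"


# Hypothetical TP values in mg/L, for illustration only
measured = [0.05, 0.10, 0.15, 0.20]
predicted = [0.06, 0.09, 0.16, 0.19]
summary = (rmsep(measured, predicted), r_squared(measured, predicted),
           rpd_class(rpd(measured, predicted)))
```

Applied to the values reported in this paper, `rpd_class(2.086)` falls in the excellent category and `rpd_class(1.644)` in the moderate one.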

Correlation Analysis
Correlation analysis is a term used to denote the association or relationship between two (or more) quantitative variables [49]. The experimental data were analyzed using Statistical Product and Service Solutions (SPSS 22.0) and Origin 2022. Band reflectance and TP concentration were subjected to correlation analysis to investigate the relationship between the spectral reflectance and TP concentration. Pearson's correlation coefficient was calculated to reveal the relationship between TP and various environmental factors (CODMn, BOD5, NH3-N, TN, DO, pH, COD) and meteorological factors (precipitation and temperature). The equation for the correlation coefficient is:
$$R = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^{2}\,\sum_{i=1}^{n}(y_i - \bar{y})^{2}}}$$
Here, R is the correlation coefficient, x_i and y_i are the i-th of n paired observations of variables x and y, and \bar{x} and \bar{y} are the mean values of variables x and y, respectively.
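As a small illustration of Pearson's correlation coefficient, the following sketch computes R for two equally long lists of values. The study used SPSS 22.0 and Origin 2022 for this analysis; the function name `pearson_r` is a hypothetical helper and the input values are made up.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical band reflectance and TP concentration (mg/L) at four sites
reflectance = [0.031, 0.042, 0.047, 0.060]
tp = [0.04, 0.07, 0.06, 0.12]
r = pearson_r(reflectance, tp)
```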

The Variation in the TP Concentration in Taihu Lake
The TP concentration ranged from 0.02 mg/L to 0.27 mg/L, with an average of 0.07 mg/L (Figure 2). The mean TP concentration in summer and autumn (0.07 mg/L) was higher than that in spring and winter (0.06 mg/L). The TP concentration at all sampling sites was higher in February, May, June, July, August, and September. Apart from February, the most significant changes in TP concentration in Taihu Lake were observed in the middle of the year. The suitable water temperature in the summer and fall provides better environmental conditions for algae growth [50], and Taihu Lake is hypereutrophic. Therefore, the TP concentration in the warm months was selected for the study. Most of the water enters Taihu Lake from the west and flows towards the east; therefore, a large amount of domestic and industrial wastewater is discharged into Taihu Lake from the upstream rivers in the northwestern and western regions [51]. The rainy season in the study area is from May to September [52], and the center of rainfall in 2016 was in the western part of the lake, resulting in a high TP concentration [53]. Comparing the different sites, TP concentrations at MLH, ZSH, DPK, and JS, located in the northwestern and northern parts of Taihu Lake, are higher than at the others (Figure 2). In particular, at ZSH and DPK, the average TP concentrations were 0.127 mg/L and 0.103 mg/L, respectively, which exceed the limit for type IV water, showing that the P pollution in the western part of Taihu Lake was more severe than that of other regions [54] (Figure 2). Based on the prior analysis results, combined with the proportional contribution of P in eutrophication [11], TP concentrations in the middle of 2016 were selected for analysis in this study.

Correlation between TP Concentration and Sentinel-2 Image Spectra
Band1, Band2, Band3, Band4, Band8, Band11, and Band12 of the Sentinel-2 images were selected, and the image spectra of the sampling sites were collected for further analysis. Table 1 shows all the correlation coefficients between the single-band reflectance and TP concentration and some of the high correlation coefficients between the band combinations and TP concentration. The single bands and TP concentration were positively correlated, with the correlation between Band8 and the TP concentration being the highest (0.484). The Sentinel-2 image spectra were combined in several ways, including the sum, difference, ratio, and ratio combination (ri + rj)/(ri − rj), where ri and rj denote the reflectance of bands i and j (i ≠ j). The band combination correlation coefficients ranged from −0.533 to 0.540. Then, the optimal band combination was selected by the exhaustive method, which allowed us to establish a high-accuracy regression model of the water quality parameters in Taihu Lake [6]. It was found that the R² of the PLSR model constructed using a single band and the TP concentration was about 0.30. The results of several trials indicated that the R² of the models constructed on sensitive spectral variables and the TP concentration ranged from 0.20 to 0.77, and the ratio-type sensitive spectral variables had a better effect. Thus, the ratio combinations were used to construct the PLSR model.
Table 1. The correlation between band combinations and TP concentration on Sentinel-2 and Landsat-8.
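The band-combination screening described above can be sketched as an exhaustive loop over band pairs. This is an illustrative reading of the procedure, not the authors' code: it builds the sum, difference, ratio, and (ri + rj)/(ri − rj) combinations for every pair of bands, correlates each with the TP concentration, and ranks them by absolute correlation. The names `rank_band_combinations` and `COMBINATIONS` and all input values are hypothetical.

```python
import math
from itertools import permutations


def pearson_r(x, y):
    """Pearson's correlation coefficient (raises ZeroDivisionError if constant)."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# name -> (function of two reflectances, whether swapping the bands changes nothing)
COMBINATIONS = {
    "sum": (lambda a, b: a + b, True),
    "difference": (lambda a, b: a - b, False),
    "ratio": (lambda a, b: a / b, False),
    "ratio_combination": (lambda a, b: (a + b) / (a - b), False),
}


def rank_band_combinations(spectra, tp):
    """Return (combination, band_i, band_j, r) tuples sorted by |r|, descending.

    `spectra` maps a band name to the reflectance at each sampling site;
    `tp` holds the measured TP concentration at the same sites.
    """
    results = []
    for bi, bj in permutations(spectra, 2):
        for name, (func, symmetric) in COMBINATIONS.items():
            if symmetric and bi > bj:
                continue  # skip the mirrored duplicate of a symmetric combination
            try:
                values = [func(a, b) for a, b in zip(spectra[bi], spectra[bj])]
                r = pearson_r(values, tp)
            except ZeroDivisionError:
                continue  # undefined combination or constant variable
            results.append((name, bi, bj, r))
    results.sort(key=lambda item: abs(item[3]), reverse=True)
    return results


# Hypothetical reflectance at four sites for two bands, and TP in mg/L
spectra = {"B4": [0.10, 0.20, 0.30, 0.40], "B8": [0.05, 0.05, 0.05, 0.05]}
tp = [0.02, 0.04, 0.06, 0.08]
ranked = rank_band_combinations(spectra, tp)
```

The highest-ranked combinations would then be the candidate sensitive spectral variables for the regression model.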

To eliminate the effects of cloud, land, and aquatic vegetation from the images, three sampling sites, DLS on 23 July 2016 and MLH and JS on 1 September 2016, were removed from the experiment. The measured data were divided in a 3:1 ratio into the training dataset (26 sample sites) and the testing dataset (11 sample sites). The model was constructed by combining the Sentinel-2 image spectra on 23 July 2016 and 1 September 2016 with the measured TP concentration on 1-4 August 2016 and 1-8 September 2016. The optimal model is given by formula 5, where B1, B2, B3, B4, and B8 are the spectral reflectance of Sentinel-2 Band1, Band2, Band3, Band4, and Band8, respectively.
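To make the PLSR step more concrete, here is a minimal sketch of a single-response PLS regression (PLS1) fitted with the NIPALS algorithm in plain Python. It is not the MATLAB routine used in the study and does not reproduce formula 5; the function names, the number of components, and the data are assumptions chosen for illustration. Prediction deflates the new sample component by component, which avoids a matrix inversion.

```python
import math


def fit_pls1(X, y, n_components):
    """Fit PLS1 by NIPALS. X: list of rows (samples), y: list of responses."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        # weight vector: direction in X space with maximal covariance with y
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # deflate X and y before extracting the next component
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p_load, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict the response for one sample x using sequential deflation."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p_load, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p_load)]
    return y_hat


# Hypothetical data: two spectral variables, response exactly 0.1 + 2*x1 - x2
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
y = [0.1 + 2 * a - b for a, b in X]
model = fit_pls1(X, y, n_components=2)
prediction = predict_pls1(model, [6, 4])
```

With as many components as independent predictors, PLS1 coincides with ordinary least squares, which is why the sketch recovers the exact linear relation in this toy data; with collinear band combinations, fewer components would normally be retained.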

Verification of the Optimal Algorithm
A relatively satisfactory correlation is observed between the estimated and measured TP concentration (Figure 3). Figure 3a is the scatter plot of the training dataset, which has an RMSEP of 0.023 mg/L, an R² of 0.771, and an RPD of 2.086, indicating that the model is suitable for predicting the TP concentration in Taihu Lake. The scatter plot of the testing dataset (Figure 3b) has a moderate correlation coefficient of 0.782 between the estimated and measured TP concentration and a modest RMSEP of 0.020 mg/L, which indicates that the model has a medium prediction effect.
Remote Sens. 2022, 14, x FOR PEER REVIEW 9 of 22
To demonstrate the model's accuracy, the optimal algorithm was validated using data on the TP concentration for the same period in 2017. The algorithm was applied with the same band combinations to the Sentinel-2 images acquired on 28 July 2017 and 27 August 2017. Similarly, several sampling sites were removed from the experiment to eliminate the effects of cloud, land, and aquatic vegetation from the images (three and four sites on 28 July and 27 August 2017, respectively). The algorithm has a moderate verification effect, where the R and RMSEP of the verification set (28 July 2017) reached 0.771 and 0.054 mg/L (Figure 3c), and the R and RMSEP of the verification set (27 August 2017) reached 0.787 and 0.045 mg/L (Figure 3d). Figure 3c shows that low and medium values are predicted better, but the high values are overestimated. Figure 3d shows that the low and medium values are predicted better, but the high values are underestimated.

Temporal and Spatial Distribution of TP Concentration Based on Sentinel-2 Images
The inverse distance weighting (IDW) approach was applied to spatially interpolate the TP concentration of 1-4 August and 1-8 September 2016. Figure 4a,d show that the TP concentration gradually increased from the southeast to the northwest, with a high TP concentration in the west and northwest of Taihu Lake. The model was constructed with a two-month dataset representing different water quality conditions. Formula 5 was applied to Sentinel-2 images on 23 July and 1 September 2016 to obtain the inversion maps of TP concentration (Figure 4b,e). The inversion maps show an increasing trend of TP concentration from the southeast to the north and northwest, with a higher TP concentration in Taihu Lake's west and southwest parts. The TP concentration in the south and some areas of the north of Taihu Lake was low, with a concentration of about 0.00-0.05 mg/L, which belongs to type II and type III water based on the environmental quality standards for surface water published by the State Environmental Protection Administration in 2002; the TP concentration in the central and north regions was higher, with a concentration of 0.05-0.10 mg/L, which belongs to type IV water. The highest TP concentration exceeds 0.10 mg/L in the northwest and southeast regions, belonging to type V and inferior V water. Specifically, a series of anomalies with a significantly high TP concentration occurred in the central to the west part of Taihu Lake in the inversion map of 23 July 2016 (Figure 4b).
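The idea behind inverse distance weighting can be shown with a short sketch: the value at an unsampled location is a weighted average of the measured values, with weights that decrease with distance. The distance exponent is not stated in the text, so the common choice of 2 is assumed here; the function name `idw` and the coordinates and concentrations are hypothetical.

```python
import math


def idw(points, values, query, power=2.0):
    """Inverse distance weighted estimate at `query`.

    points: list of (x, y) sampling coordinates
    values: measured values at those points (e.g. TP in mg/L)
    power:  distance exponent (2.0 is an assumed, commonly used default)
    """
    numerator = 0.0
    denominator = 0.0
    for (px, py), value in zip(points, values):
        distance = math.hypot(query[0] - px, query[1] - py)
        if distance == 0:
            return value  # query coincides with a sampling site
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Two hypothetical sites with TP of 0.05 and 0.15 mg/L
sites = [(0.0, 0.0), (2.0, 0.0)]
tp = [0.05, 0.15]
midpoint_estimate = idw(sites, tp, (1.0, 0.0))
```

Evaluating `idw` over a regular grid of query points covering the lake would give an interpolated surface comparable in spirit to the maps in Figure 4a,d.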

Modeling Using the Landsat-8 Image

Correlation between TP Concentration and Landsat-8 Image Spectra
Table 1 shows all the correlation coefficients between the image spectra of Landsat-8 and the measured TP concentration. Similarly, the correlation between the single band and TP concentration was lower, except for Band5, with a correlation coefficient of 0.572. The sensitive bands were selected from band combinations based on the relatively high correlation coefficients. With the same band combination method as the Sentinel-2 image, correlation coefficients of band combinations ranged from −0.576 to 0.590. An exhaustive process was used to select the optimal band combinations. The results of several trials showed that models constructed using sensitive spectral variables and the TP concentration have an R² ranging from 0.30 to 0.63, and band combinations including B1 − B4, B1 − B5, B2 − B3, B2 − B4, B2 − B5, and B5 − B6 indicate a better effect, with an R² of 0.63. Therefore, these band subtractions were used to construct the PLSR model.
Four sampling sites, XMK and DLS on 27 July 2016 and JS and YXG on 13 September 2016, were removed from the experiment. The remaining data were divided in a 3:1 ratio into a training dataset (27 sites) and a testing dataset (9 sites). Formula 6 is the optimal model of the TP concentration, where B1, B2, B3, B4, B5, and B6 are the spectral reflectance of Band1, Band2, Band3, Band4, Band5, and Band6 on the Landsat-8 image, respectively.

Verification of the Optimal Model
A relatively satisfactory correlation is observed between the estimated and measured TP concentration (Figure 3). Figure 3e shows the optimal model for the training dataset, which had an R², RPD, and RMSEP of 0.630, 1.644, and 0.032 mg/L, respectively, indicating that the algorithm could be used to estimate the TP concentration. The scatter plot of the testing dataset revealed that the correlation coefficient and the RMSEP between the estimated and measured TP concentrations were 0.802 and 0.016 mg/L, respectively (Figure 3f). To prove the accuracy and extensibility of the model, formula 6 was applied to the Landsat-8 image acquired on 30 July 2017. After eliminating the sampling sites affected by cloud, land, and aquatic vegetation in the images, satisfactory results were achieved with the optimal algorithm, where the R and RMSEP of the verification set (30 July 2017) reached 0.879 and 0.027 mg/L (Figure 3g).

Temporal and Spatial Distribution of the TP Concentration
Formula 6 was applied to the Landsat-8 images acquired on 27 July and 13 September 2016 to obtain the remote sensing inversion maps of the TP concentration (Figure 4c,f), and the TP concentration distribution in the inversion results is in good agreement with the spatial interpolation results. Based on the TP concentration of the inversion results, the water quality for most areas of Taihu Lake belongs to type II and type III, mainly distributed in the south and central regions of Taihu Lake, with a value of about 0.00-0.05 mg/L. The high TP concentration, with a value exceeding 0.05 mg/L, was primarily distributed in the northwestern and southeastern regions, corresponding to type IV, type V, and inferior V water.
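The mapping from a retrieved TP concentration to the water types discussed above can be written as a small lookup. The sketch uses only the concentration ranges quoted in this paper (about 0.00-0.05 mg/L for type II and type III, 0.05-0.10 mg/L for type IV, and above 0.10 mg/L for type V and inferior V); it does not reproduce the full standard, and the function name `tp_water_type` and the treatment of the boundary values are assumptions.

```python
def tp_water_type(tp_mg_per_l):
    """Coarse water type from TP concentration, using the ranges quoted in the text.

    Assumption: a value exactly on a boundary is assigned to the better class.
    """
    if tp_mg_per_l < 0:
        raise ValueError("TP concentration cannot be negative")
    if tp_mg_per_l <= 0.05:
        return "type II-III"
    if tp_mg_per_l <= 0.10:
        return "type IV"
    return "type V or inferior V"


# Site averages reported in the text for ZSH and DPK, in mg/L
site_types = {name: tp_water_type(tp) for name, tp in {"ZSH": 0.127, "DPK": 0.103}.items()}
```

Applying such a function to every pixel of an inversion map would give a categorical water-type map of the kind described for Figure 4.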

Validation and Extension of the Optimal Model
The optimal models were applied to the Sentinel-2 and Landsat-8 images in 2017 to prove the accuracy and extensibility of the models. Figure 4g,j show the interpolation maps of the TP concentration for August and September 2017 using the IDW method. The interpolation maps showed that the TP concentration was higher, with a value of 0.10-0.20 mg/L, in the northwestern regions and lower, with a value of 0.025-0.10 mg/L, in the other areas. Formulas 5 and 6 were applied to the Sentinel-2 image (28 July 2017) and Landsat-8 image (30 July 2017), respectively, to estimate the TP concentration. According to Figure 4h, the TP concentration is higher in the northwestern and southeastern parts of Taihu Lake (higher than 0.20 mg/L) and low in the east and south of the lake (less than 0.10 mg/L). The TP concentration derived from the Landsat-8 image (Figure 4i) shows values ranging from 0.10 mg/L to 0.20 mg/L in the northwestern regions and values ranging from 0.025 mg/L to 0.10 mg/L in other areas.
Similarly, the optimal models were applied to the Sentinel-2 image acquired on 27 August 2017, to map the TP concentration of late August 2017. In Figure 4k, the low TP concentration is mainly located in the eastern and southern regions, while the high concentration (higher than 0.20 mg/L) is dominant in northwestern areas.

Correlation Analysis between TP Concentration and Influencing Factors
To delve further into the factors influencing TP concentration changes in Taihu Lake, the environmental factors (CODMn, BOD5, NH3-N, TN, DO, pH, COD) and meteorological factors (precipitation and temperature) in this study were correlated with the TP concentration, and the correlation analysis is shown in Figure 5. Most of the variables are positively related to the TP concentration. At the 0.01 level, the relationship between TP and the factors CODMn, BOD5, NH3-N, TN, DO, and pH is significant. At the 0.05 level, the relationship between TP and COD is significant. CODMn and BOD5 had the strongest association with TP, with correlation coefficients of 0.671 and 0.677, respectively. The correlation coefficients between NH3-N, DO, pH, COD, and TP concentration are 0.468, 0.443, 0.557, and 0.332, respectively. The correlation between TN and TP concentration is poor, with a correlation coefficient of 0.244. The meteorological factors, including precipitation and temperature, do not correlate with TP concentration.

Table 2 shows the related studies of TP remote sensing inversion. Multispectral images were extensively used for TP inversion studies [6,8,13,14,26,51,55,56]. Although hyperspectral images usually have a higher accuracy than multispectral images, the low temporal resolution hinders practicability. Landsat TM, ETM+, OLI imagery, and Sentinel-2 were widely used to estimate the TP concentration in freshwater lakes [6,8,13-15,56], and the obtained R² mainly ranged from 0.582 to 0.810. The inversion studies of TP concentration using Landsat-8 images were mostly in Taihu Lake (R² = 0.712), Poyang Lake (R² = 0.582), Dongting Lake (R² = 0.758) [6], and Xin'an River (R² = 0.660) [26], and the study using Sentinel-2 images was mainly in Huaihe River (R² = 0.690 and 0.810) [14].
This study's models derived from the TP concentration and remote sensing images (Sentinel-2 and Landsat-8) show a higher accuracy and could be used to predict the TP concentration in Taihu Lake. The R², RPD, and RMSEP of the PLSR model built using the modeling dataset of Sentinel-2 images were 0.771, 2.086, and 0.023 mg/L, respectively, whereas the validation dataset had a medium R of 0.782 and a small RMSEP of 0.020 mg/L between the predicted and observed TP concentration. The R², RPD, and RMSEP of the modeling dataset for Landsat-8 images were 0.630, 1.644, and 0.032 mg/L, respectively, and the R and RMSEP of the validation dataset were 0.802 and 0.016 mg/L, respectively.
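The three accuracy measures quoted above can be computed from paired observed and predicted values. The following is a minimal sketch in Python, assuming the common chemometric definitions (RMSEP as the root mean squared residual, RPD as the sample standard deviation of the observations divided by RMSEP); the exact conventions used in this study, such as dividing by n or n - 1, are not stated in this section, and the function name and sample values are hypothetical.

```python
import math

def validation_metrics(observed, predicted):
    """Return R^2, RMSEP and RPD for paired observed/predicted TP values (mg/L).

    Assumed definitions (not confirmed by the paper):
      RMSEP = sqrt(mean squared residual)
      RPD   = sample standard deviation of observations / RMSEP
      R^2   = 1 - SS_res / SS_tot
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmsep = math.sqrt(ss_res / n)
    sd_obs = math.sqrt(ss_tot / (n - 1))  # sample standard deviation
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmsep": rmsep,
        "rpd": sd_obs / rmsep,
    }

# Hypothetical TP values in mg/L, for illustration only (not data from this study)
obs = [0.05, 0.08, 0.12, 0.20, 0.25]
pred = [0.06, 0.07, 0.13, 0.18, 0.26]
metrics = validation_metrics(obs, pred)
print(metrics)
```

Under these definitions, R² is approximately 1 - 1/RPD² for large samples, which is roughly consistent with the reported pairs (0.771 with 2.086, and 0.630 with 1.644).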

Validation of Model Accuracy and Inversion Results on Sentinel-2 and Landsat-8 Images
The Landsat-8 and Sentinel-2 images used in this study were used to verify the accuracy of the models listed in Table 2. The predictions of the different models were compared with the TP concentration measured in this study, and the correlation coefficients and root mean square errors were calculated (Table 2). Because the bands of Landsat-8 and Sentinel-2 images are relatively similar, the prediction accuracy based on these two images is similar, which further indicates that the two images are complementary. The prediction accuracy of both Landsat-8 and Sentinel-2 images for Taihu Lake is relatively high compared to other lakes. The model has the best validation results (R = 0.720) [6] with the data collected from Taihu Lake in July and August 2018. The study [6] had the same sampling locations and dates as this study, so the accuracy of predictions was relatively high. This demonstrates that stable regional models can estimate TP concentrations in different years.
Previous studies mainly used the TP concentration from a single date to construct remote sensing models in freshwater lakes [8,13,17,55,57,58], reservoirs [22], and rivers [56], so their applicability was greatly limited by study area and data. Moreover, an analysis of the TP concentration in Taihu Lake (Figure 1) revealed that the TP concentrations were relatively stable in the summer and fall. The suitable water temperature provides better environmental conditions for algae growth [50]. The sources of pollutants discharged from the same area are similar; therefore, the same model can be used for TP monitoring in the same area at different times. In this paper, forty sampling sites from August and September were used to develop the TP estimation model in order to broaden its temporal applicability. The optimal model applied to Landsat-8 and Sentinel-2 images achieved satisfactory results, with both high and low values in good agreement with the measured values. There was a relatively large error in predicting high values due to the small amount of dissolved phosphorus (DP), which was in agreement with another study [16]. The optimal algorithm based on Landsat-8 images has a better verification effect (RMSEP = 0.027 mg/L and R = 0.879 for one Landsat-8 image), and the optimal algorithm based on Sentinel-2 images has a moderate verification effect (RMSEP = 0.054 mg/L and 0.045 mg/L, and R = 0.771 and 0.787 for two Sentinel-2 images).

Stability of the Models
To verify the accuracy of the model, the optimal models of formula 5 and formula 6 were applied to the Sentinel-2 and Landsat-8 images from 2017. The scatter plots derived from Sentinel-2 images (Figure 3c,d) show that the model predicted moderately well. There was a relatively large error in predicting high values, which agreed with the inversion results (Figure 4b,e). The large error in the scatter plots was due to the small number of sample points in the verification datasets (Figure 3c,d). The optimal model applied to the Landsat-8 image provided better results, with high and low values in good agreement with the measured values (Figure 3g). According to Figure 3g, the optimal algorithm based on Landsat-8 images and in situ data has a better verification effect (RMSEP = 0.027 mg/L and R = 0.879). Therefore, the model constructed from the band combinations of Landsat-8 images had a better accuracy than that of Sentinel-2 images.
To better verify the accuracy of the model and the complementarity of the two images, 160 sampling points (Figure 6a) were uniformly laid out in Taihu Lake, and the remote sensing inversion values of the different images were extracted to draw the scatter plots (Figure 6).
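The point-based comparison described above can be sketched as follows. This is an illustrative Python example, assuming north-up rasters stored as nested lists and nearest-pixel extraction; the grids, coordinates, and function names are hypothetical and do not reproduce the study's data or software. The pixel sizes reflect the 30 m Landsat-8 OLI multispectral bands and the 10 m Sentinel-2 bands.

```python
import math

def sample_raster(grid, x0, y0, pixel_size, x, y):
    """Nearest-pixel value at map coordinate (x, y).

    grid[row][col] is a north-up raster whose upper-left corner is (x0, y0).
    Returns None when the point falls outside the raster.
    """
    col = int((x - x0) // pixel_size)
    row = int((y0 - y) // pixel_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical TP inversion maps (mg/L) covering the same 60 m x 60 m extent:
# a 2 x 2 grid at 30 m (Landsat-8-like) and a 6 x 6 grid at 10 m (Sentinel-2-like).
landsat = [[0.10, 0.15], [0.20, 0.25]]
sentinel = [[landsat[r // 3][c // 3] + 0.01 for c in range(6)] for r in range(6)]

# Hypothetical sampling points (map coordinates in metres)
points = [(15, 45), (45, 45), (15, 15), (45, 15)]
l8_vals = [sample_raster(landsat, 0, 60, 30, x, y) for x, y in points]
s2_vals = [sample_raster(sentinel, 0, 60, 10, x, y) for x, y in points]

r = pearson_r(l8_vals, s2_vals)
print(l8_vals, s2_vals, r)
```

Plotting `l8_vals` against `s2_vals` would give the kind of scatter plot used to judge how well the two sensors agree; points outside either raster (returned as `None`) would need to be dropped first.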

A comparison of the inversion results revealed that the distribution of TP concentration in the interpolated and inversion results in August 2017 was consistent, with the high- and low-value areas coinciding well with each other, showing that the model had good prediction results for the 2017 images. According to the distribution of TP concentration in the inversion results (Figure 4k) in September 2017, the TP concentration is lower in the eastern and southern regions, which is in agreement with the interpolated map, and the TP concentration is higher (greater than 0.20 mg/L) in the northwestern areas, which differs from the interpolated map. In the central part of Taihu Lake, the TP concentration is higher in the interpolated map than in the inversion map. By comparing the interpolation plots (Figure 4a,d,g) and the Landsat-8 inversion plots (Figure 4c,f,i), we find that the Landsat-8 predictions are relatively effective, but Landsat-8 has a long revisit period of 16 days, and images are easily lost to cloud cover. There was a lack of available Landsat-8 imagery in September 2017. When Landsat-8 images are unavailable, concurrent Sentinel-2 images can be selected for TP concentration monitoring.
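The interpolated maps used in this comparison were produced with the inverse distance weighting method. A minimal Python sketch of that method is given below; the power parameter of 2 is an assumption (the value used in this study is not stated in this section), and the sample coordinates and TP values are hypothetical.

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse distance weighting estimate at (x, y).

    samples: list of (xi, yi, value) tuples, e.g. measured TP in mg/L.
    power:   distance exponent; 2.0 is a common default and an assumption here.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return value  # the target coincides with a sampling site
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical sampling sites: (x, y, TP in mg/L)
sites = [(0.0, 0.0, 0.10), (10.0, 0.0, 0.20)]

midpoint = idw(sites, 5.0, 0.0)   # equidistant from both sites
near_first = idw(sites, 2.0, 0.0) # dominated by the nearer site
print(midpoint, near_first)
```

Evaluating `idw` on a regular grid of lake pixels gives an interpolated surface that can be compared with the image-based inversion map cell by cell.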

Causes of the TP Concentration Distribution
This study's results showed that high TP concentration values are located in the northwest, north, and southeast regions, while the low values are located in the east and central areas. The water quality in the northwest of Taihu Lake is Class V or inferior to Class V. Furthermore, the northern region close to the MLH and WLH and the southeast region of Taihu Lake are seriously polluted (Figure 4).
The P in Taihu Lake comes from both internal and external sources. Farmland drainage and domestic wastewater are the primary external sources. Industrial waste discharges in the Taihu Lake area have been effectively controlled since the implementation of "Operation Zero" in 1998 [52]. P is mainly discharged into Taihu Lake by runoff and drainage processes, and the driving forces are precipitation and landform. The rainy season of the Taihu Lake region is from May to September, and the TP concentration from May to September was higher than that from November to March (Figure 2). The northwest region is the Wujin District of Changzhou city, a large agricultural area in Jiangsu Province. The pesticide residue problem adversely affected the water quality of Wujin District and even the whole Taihu Lake basin [59]. Hence, the TP concentration is higher in the northwest of Taihu Lake. The water pollution in MLH, ZSH, and WLH in the north of Taihu Lake is severe. MLH and ZSH, located in the north of Taihu Lake, had a higher TP content than other regions and are among the most seriously polluted areas, which is consistent with other studies [51,60]. The estuaries of WLH are located in Wuxi city, which has a population of more than seven million, and domestic sewage is discharged into Taihu Lake. The topography of Taihu Lake is high in the west and low in the east (Figure 1); most runoff comes from the western mountainous watershed and flows out at the east side of the lake. Therefore, the TP concentration was higher in Taihu Lake's western and northwestern regions [54].
The internal sources are the P release from sediment and the P released by the decomposition of cyanobacteria. The sediment P release is affected by various factors such as light, temperature, pH value, oxygen concentration, biological activity, Microcystis blooms [61], and sediment resuspension [62]. Several of these factors affect P release indirectly through biological activity. Light has an adverse effect on P release. Light gives additional energy for algal growth [63], and more oxygen is released, which is harmful to the anaerobic bacteria and thus limits cyanobacterial decomposition. Meanwhile, some of the P could be assimilated by the growing algae, which would further reduce the TP concentration in the system [64]. Temperature could remarkably increase P release: higher temperatures increase biological activity, thereby promoting the P release from the sediment. The peak of TP concentration generally occurs in July, April, or September (Figure 2). Increased water column stability resulting from increased temperatures is likely to increase cyanobacteria dominance by relieving light limitation [65]. The correlation analysis showed that the pH value is positively correlated with TP concentration (Figure 5), and the correlation coefficient is 0.446. Higher pH values affect the growth of algae in lakes, especially cyanobacteria, and can even lead to the outbreak of blooms [66]. The biological action of the water column promotes the P release. The correlation analysis showed that TP was positively correlated with CODMn and BOD5, which was probably related to organic pollution. Cities surround the northwest of Taihu Lake, and household and industrial wastes are discharged into the water through the rivers [67]. Organic matter and nutrients in household and industrial wastes consume DO [68]. In anoxic and anaerobic conditions, P can be quickly released from the sediments to the overlying water [69].
Taihu Lake is a shallow lake; wind waves cause noticeable disturbances and lead to a large amount of sediment resuspension [70]. Studies have shown that wind and wave disturbances can increase P levels in the water column by about 10 times, or even 20-30 times, in the short term. Taihu Lake experiences a monsoon climate, with prevailing southeast winds in the summer and northwest winds in the winter. The monsoon climate causes pollutants to accumulate, and the degree of pollution is more severe in the marginal areas than in the lake's center [51].
The enclosure culture area in the southeast region of Taihu Lake was affected by residual bait, which increased nutrients such as P in the lake, causing severe eutrophication of the water in East Taihu Lake. The TP concentration in the marginal zone of Taihu Lake is affected by domestic wastewater discharge because Taihu Lake is surrounded by five cities [67].

Conclusions
This paper predicts the spatial distribution of TP concentration using remote sensing and measured data to monitor the water quality of Taihu Lake. Sentinel-2 and Landsat-8 images acquired in 2016 were used for modeling and inversion of the TP concentration. It was found that the band combinations of (B 1 + B 8  The optimal model was applied to the remote sensing images of 2016 and 2017 to obtain the inversion maps, which were comparable with the interpolation results. Therefore, it can be concluded that it is feasible to monitor the TP concentration using high-resolution Landsat-8 and Sentinel-2 multispectral imagery. The inversion maps based on Landsat-8 and Sentinel-2 images are in good agreement, indicating that the two image sources can be complementary for water quality monitoring. The inversion results were used to analyze the spatial variation in TP concentration. Comparing all the inversion maps, pollution in the northwestern part of Taihu Lake was more severe, whereas the TP concentration in the south and east of the lake was relatively low. This method can potentially allow for monitoring the TP concentration on a large scale and contributes to a more targeted treatment of water pollution.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs14246284/s1, Table S1: Dates of each sampling site in Taihu Lake in August and September, 2016; Table S2: Time interval between images and sampling sites.