Retrieval of Chlorophyll a Concentration in Water Considering High-Concentration Samples and Spectral Absorption Characteristics

The envelope removal method has the advantage of suppressing the background spectrum and amplifying weak absorption features. However, for second-class water bodies with relatively complex water quality, few studies on the inversion of the chlorophyll a (Chl-a) concentration consider the spectral absorption characteristics. In addition, current research on the inversion of the Chl-a concentration has been carried out under the condition of sample concentration equilibrium. For areas with a highly variable Chl-a concentration, it is still challenging to establish a highly applicable and accurate Chl-a concentration inversion model. Taking Dongting Lake in China as an example, this study used high-concentration samples and spectral absorption characteristics to invert the Chl-a concentration. The decap method was used to preprocess the high-concentration samples with large deviations, and the envelope removal method was used to extract the spectral absorption characteristic parameters of the water body. On the basis of the correlation analysis between the water Chl-a concentration and the spectral absorption characteristics, the water Chl-a concentration was inverted. The results showed the following: (1) The bands that were significantly related to the Chl-a concentration and had a large correlation coefficient were mainly located in the three absorption valleys (400–580, 580–650, and 650–710 nm) of the envelope removal curve. Moreover, the correlation between the Chl-a concentration and the absorption characteristic parameters at 650–710 nm was better than that at 400–580 nm and 580–650 nm. (2) Compared with the conventional inversion model, the uncapped inversion model had a higher $R_P^2$ and a lower $RMSE_P$, and its predicted values were closer to the 1:1 line, indicating that the decap method is an effective preprocessing method for high-concentration samples with large deviations. (3) The predictive capabilities of the ER_New model were significantly better than those of the R_New model. This shows that the envelope removal method can amplify the absorption characteristics of the original spectrum and thereby improve the performance of the prediction model. (4) Among the inversion models based on the absorption characteristic parameters, the prediction models of A650–710 nm_New and D650–710 nm_New exhibited the best performance. The three combined models (A650–710 nm&D650–710 nm_New, A650–710 nm&NI_New, A650–710 nm&DI_New) also demonstrated good predictive capabilities. This demonstrates the feasibility of using spectral absorption features to retrieve the chlorophyll concentration.


Introduction
Lake eutrophication is one of the most important water-related environmental problems facing humanity [1][2][3]. Although the current degree of eutrophication in Dongting Lake is relatively low, social and economic development in the basin has led to evident nitrogen and phosphorus pollution, deteriorating water quality, and an intensifying eutrophication trend [4,5]. Chlorophyll is an important component of phytoplankton, and chlorophyll a (Chl-a) is the type of chlorophyll contained in all phytoplankton categories. The Chl-a concentration is an important index of photoautotrophic biomass. It can be used to estimate the biomass and productivity of phytoplankton, and it is an important parameter reflecting the trophic state of water bodies [6][7][8]. Monitoring the spatial-temporal changes in the Chl-a concentration in lakes helps to assess the progress towards the sustainable development goals (SDGs) outlined by the government, especially in terms of measuring the SDG 6.6.1 indicators. Current water quality monitoring methods can accurately determine the various indicators of water quality at a given location, but they are costly and time-consuming, and the limited range of monitoring points cannot reflect the distribution of water quality in time and space [9][10][11]. Hyperspectral remote sensing has the advantage of many narrow bands. It can quickly obtain surface reflection spectra of water bodies, detect the relationship between spectral characteristics and water quality indicators, and provide a powerful tool for real-time, rapid, and large-scale water quality monitoring [12][13][14][15][16].
At present, the method of measuring Chl-a concentration based on hyperspectral remote sensing mainly uses the change rule of hyperspectral reflectance with the Chl-a concentration, and quantitative inversion of the Chl-a concentration through various related indicators [17][18][19]. There is little in-depth discussion on the potential of using spectral absorption characteristic parameters for quantitative inversion of the Chl-a concentration. Studies have shown that the envelope removal method is an effective spectral analysis method [20][21][22]. The envelope refers to the background contour line covering the upper part of the spectrum curve, which is composed of the connecting line of the local maximum at the inflection point of the reflectance spectrum [23,24]. Taking the envelope as the background, the envelope is removed to obtain the characteristic absorption band of the spectrum, so that the spectral absorption characteristic parameters can be extracted. This method can eliminate irrelevant background information, enhance the absorption feature of interest, and normalize it to a uniform spectral background, thus having the advantage of suppressing the background spectrum and expanding the weak absorption feature information. The envelope removal method was originally used for mineral mapping, and later extended to soil composition inversion, vegetation mapping, etc. [25][26][27][28][29][30][31]. However, there is no report on the use of the envelope removal method to extract the spectral absorption characteristic parameters and invert the Chl-a concentration of water bodies.
In addition, the current research on the inversion of the Chl-a concentration is carried out under the condition of sample concentration equilibrium [32][33][34][35][36]. For areas with highly variable Chl-a concentrations in water bodies, in order to improve the accuracy of inverting Chl-a concentrations, existing studies often remove samples that exceed the standard as outliers [37][38][39][40][41][42][43]. However, these outliers are often the focus of researchers. Therefore, on the basis of existing research, this paper proposes a new preprocessing method (the decap method) for the hyperspectral retrieval of the Chl-a concentration in order to improve the accuracy of Chl-a concentration retrieval for excessive samples with large deviations.
Taking Dongting Lake in China as an example, this study takes into account high-concentration samples and spectral absorption characteristics to retrieve the Chl-a concentration. The decap method was used to preprocess the high-concentration samples with large deviations, and the envelope removal method was used to extract the water spectral absorption characteristic parameters. On the basis of the correlation analysis between the Chl-a concentration and the spectral absorption characteristics, the Chl-a concentration was inverted. The study is expected to provide a useful reference for the use of hyperspectral data to carry out large-scale normalized Chl-a concentration monitoring, and to provide theoretical and technical support for the environmental protection and sustainable development of Dongting Lake.
Sustainability 2021, 13, 12144

Study Area
Dongting Lake is located in the northern part of Hunan Province, China, south of the Yangtze River. It is the second largest freshwater lake in China and an internationally important wetland. The geographical location is between 28°30′–29°31′ N and 111°40′–113°10′ E. Four waters (Xiangjiang, Zishui, Yuanshui, and Lishui) flow into Dongting Lake from the south, and the north side of Dongting Lake is connected to the Yangtze River. It is a typical water-passing lake. At present, it is divided into East Dongting Lake, South Dongting Lake, West Dongting Lake, and Datong Lake (Figure 1), covering an area of about 2789 km². With the rapid development of the national economy, population growth, siltation, lake construction, the discharge of a large number of pollutants into the lake, and other natural and human factors, Dongting Lake faces many water-related environmental problems [44][45][46]. At present, Dongting Lake is mainly mesotrophic, and eutrophic water bodies are mainly concentrated in areas with slow water flow such as the Inner Lake and the western part of East Dongting Lake [4].

Figure 1. Location of sampling points in Dongting Lake.

Sample Collection and Measurement
In accordance with the "Guidance on sampling from lakes, natural and man-made" (ISO/TC 147/SC 6) [47], water quality sampling points were set up in Dongting Lake (Figure 1). Field experiments were carried out in sunny weather with little cloud cover. A total of 86 water samples were collected and brought back to the laboratory, where the Chl-a concentration was measured using the hot ethanol method. The water samples were filtered through a GF/C filter membrane. The filter membrane was placed in a refrigerator for more than 48 h, and 90% hot ethanol was used for extraction. With 90% ethanol as the reference solution, the absorbance at 665 and 750 nm was measured on a Shimadzu UV2401 spectrophotometer. One drop of 1% dilute hydrochloric acid was then added for acidification, and the chlorophyll a concentration was calculated. The water surface spectrum was collected by the water surface spectrum measurement method. For each sample point, an "ASD Field Spec" portable field spectrometer (350–1050 nm) was used to collect 10 water surface spectra; abnormal values were eliminated, and the remaining spectra were averaged to obtain the measured spectrum of the sampling point.

Spectral Data Processing and Absorption Feature Extraction
In view of the random error in the process of spectral measurements and the weak chlorophyll spectral response signal, spectral denoising, spectral resampling, spectral smoothing, and envelope removal were used to preprocess the spectral data to enhance the characteristic band of the spectral response. In this study, the low signal-to-noise ratio bands (350-399 nm and 891-1050 nm) were removed. Adjacent hyperspectral data bands are often highly correlated, and there is information redundancy. The 491 bands between 400-890 nm collected by the spectrometer were resampled with an interval of 10 nm to reduce the correlation between the bands and improve the efficiency of data processing. The Savitzky-Golay algorithm was used to smooth the spectrum.
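As an illustration of the band removal, resampling, and smoothing steps, the following Python sketch trims a spectrum to 400-890 nm, resamples it at 10 nm intervals, and applies a Savitzky-Golay filter. It is a minimal sketch rather than the processing chain used in this study: the function name `preprocess_spectrum`, the use of linear interpolation for resampling, and the filter settings (`window_length=5`, `polyorder=2`) are assumptions, since the text does not specify them.

```python
import numpy as np
from scipy.signal import savgol_filter


def preprocess_spectrum(wavelengths, reflectance, lo=400, hi=890, step=10,
                        window_length=5, polyorder=2):
    """Trim, resample, and smooth one reflectance spectrum.

    window_length and polyorder are illustrative values, not the
    settings used in the study.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)

    # Drop the low signal-to-noise bands outside [lo, hi] nm.
    keep = (wavelengths >= lo) & (wavelengths <= hi)
    wl, refl = wavelengths[keep], reflectance[keep]

    # Resample onto a regular grid (400, 410, ..., 890 nm by default).
    # Linear interpolation is an assumed resampling scheme.
    target = np.arange(lo, hi + step, step, dtype=float)
    resampled = np.interp(target, wl, refl)

    # Savitzky-Golay smoothing of the resampled spectrum.
    smoothed = savgol_filter(resampled, window_length=window_length,
                             polyorder=polyorder)
    return target, smoothed
```

With 1 nm input, interpolating onto the 10 nm grid simply selects every tenth band, which reduces the 491 bands between 400 and 890 nm to 50.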
The envelope removal method is able to effectively highlight the characteristics of the reflectance spectrum curve and normalize the reflectance to between 0-1; the absorption characteristics of the spectrum are also normalized to a uniform spectral background. On the basis of envelope removal, some spectral absorption characteristic parameters have been developed, mainly including the absorption depth (D) and absorption valley area (A). The absorption depth is the difference between 1 and the minimum value of the envelope removal curve in the absorption valley, and the absorption area is the integrated area of the absorption valley. The ENVI software was used to establish a water sample spectrum database. The "Continuum Removed" function in ENVI was utilized to remove the water sample spectrum data envelope. In the Matlab environment, the code was written to calculate the spectral absorption characteristic parameters of each water sample.
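The envelope removal and the two absorption parameters can be sketched in Python as follows. This is an illustrative sketch, not the ENVI "Continuum Removed" routine or the Matlab code used in this study. It assumes that the envelope is the upper convex hull of the reflectance spectrum, a common formulation of continuum removal, and that the absorption area is the area between the envelope removal curve and 1 within the valley. The function names are hypothetical.

```python
import numpy as np


def upper_hull_indices(x, y):
    """Indices of the upper convex hull of points sorted by increasing x."""
    hull = []
    for i in range(len(x)):
        # Drop the last hull point while it lies on or below the chord
        # joining the point before it to the new point i.
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            cross = ((x[i1] - x[i0]) * (y[i] - y[i0])
                     - (y[i1] - y[i0]) * (x[i] - x[i0]))
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    return hull


def continuum_removed(wavelengths, reflectance):
    """Divide the spectrum by its envelope (assumes positive reflectance)."""
    wl = np.asarray(wavelengths, dtype=float)
    refl = np.asarray(reflectance, dtype=float)
    idx = upper_hull_indices(wl, refl)
    envelope = np.interp(wl, wl[idx], refl[idx])
    return refl / envelope


def absorption_features(wavelengths, cr, lo, hi):
    """Absorption depth D and valley area A within [lo, hi] nm."""
    wl = np.asarray(wavelengths, dtype=float)
    cr = np.asarray(cr, dtype=float)
    mask = (wl >= lo) & (wl <= hi)
    w, v = wl[mask], 1.0 - cr[mask]
    depth = float(v.max())  # 1 minus the minimum of the curve
    # Trapezoidal integration of (1 - curve) over the valley.
    area = float(np.sum((v[1:] + v[:-1]) * np.diff(w) / 2.0))
    return depth, area
```

For example, `absorption_features(wl, cr, 650, 710)` would give the depth and area of the 650-710 nm valley for one sample.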

Spectral Index Construction
The chlorophyll concentration and the envelope removal curve after 10 nm resampling were used for the correlation analysis. The maximum positive correlation band (550 nm) and the maximum negative correlation band (680 nm) were selected to construct the normalized index (NI), difference index (DI), and ratio index (RI) from $ER_{550}$ and $ER_{680}$, the envelope removal values at 550 nm and 680 nm, respectively.
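A minimal Python sketch of the three indices is given below. It assumes the conventional two-band forms, NI = (ER550 − ER680)/(ER550 + ER680), DI = ER550 − ER680, and RI = ER550/ER680. The band order in each expression and the function name `spectral_indices` are assumptions made for illustration and may differ from the definitions used in this study.

```python
def spectral_indices(er_550, er_680):
    """Two-band indices from envelope removal values at 550 and 680 nm.

    The conventional forms below are assumed; the band order is
    illustrative only.
    """
    ni = (er_550 - er_680) / (er_550 + er_680)  # normalized index
    di = er_550 - er_680                        # difference index
    ri = er_550 / er_680                        # ratio index
    return ni, di, ri
```

The function also works element-wise on NumPy arrays, so the indices of all samples can be computed in one call.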

Model Construction and Accuracy Evaluation
The coefficient of determination ($R^2$) and the root mean square error (RMSE) were selected as the criteria for judging the predictive ability of the model. $R^2$ included the determination coefficient of the prediction samples ($R_P^2$) and that of the training samples ($R_T^2$), and the RMSE included the root mean square error of the prediction samples ($RMSE_P$) and that of the training samples ($RMSE_T$). In these criteria, $y_i$ is the measured value of sample $i$, $n$ is the number of samples ($i = 1, 2, \cdots, n$), $\bar{y}$ is the mean of the measured values, and $\hat{y}_i$ is the predicted value of sample $i$. Generally, the larger the $R_T^2$ and $R_P^2$ and the smaller the $RMSE_T$ and $RMSE_P$, the higher the model accuracy and the smaller the deviation.
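The two criteria can be computed as in the following Python sketch. The RMSE follows its standard definition. For $R^2$, the sketch assumes the common form 1 − Σ(y − ŷ)²/Σ(y − ȳ)², which is only one of several definitions in use (the squared correlation coefficient is another), so values computed this way need not match those reported in this study. The function names are hypothetical.

```python
import numpy as np


def rmse(y, y_hat):
    """Root mean square error between measured and predicted values."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def r_squared(y, y_hat):
    """Coefficient of determination, assumed form 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)     # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)
```

Applying both functions to the training samples gives $R_T^2$ and $RMSE_T$, and applying them to the prediction samples gives $R_P^2$ and $RMSE_P$.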

Descriptive Statistical Analysis of Chlorophyll a Concentration
Dongting Lake is divided into several lakes, and the water exchange rate in each lake is different. In water bodies with a faster water flow, the Chl-a concentration is low. In relatively closed water bodies, the Chl-a concentration is higher (the maximum value is 124.25 mg/m³). The sample average value is 31.35 mg/m³. Moreover, the Chl-a concentration varies greatly between different sample points (with a standard deviation up to 30.00 mg/m³, and a coefficient of variation up to 0.94). According to the calculation formula of the comprehensive nutritional status index [4], the critical chlorophyll concentration value is 63.03 mg/m³ when the water body exhibits severe eutrophication. Among the 86 water samples, 13 samples exceeded the critical value, giving an exceedance rate of 15.12%.
The measured data exceed the standard at a high rate, and the standard deviation and the coefficient of variation are large, which indicates that eutrophication in the study area is pronounced, that the Chl-a concentration varies widely, and that the spatial heterogeneity is significant. In order to improve the inversion accuracy, existing studies often remove samples that severely exceed the standard as outliers. However, these outliers are often the focus of researchers. This study therefore retains them and preprocesses them with the decap method. The basic idea is as follows: First, samples that exceed the standard (>63.03 mg/m³) are selected, the critical severe eutrophication value (63.03 mg/m³) is used to replace the concentration of each such sample, and the difference between the measured concentration of the outlier sample and the critical severe eutrophication value is calculated. Then, the new sample concentration data and hyperspectral data are used to establish an inversion model. Finally, the model inversion value of the outlier sample plus the difference between the outlier sample and the critical severe eutrophication value is used as the predicted value of the outlier sample.
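The three steps of the decap method can be sketched in Python as follows. This is a minimal sketch of the procedure as described above, not the code used in this study. The threshold of 63.03 mg/m³ is taken from the text, while the names `decap` and `restore_prediction` are hypothetical.

```python
import numpy as np

CRITICAL_CHL_A = 63.03  # mg/m^3, critical severe eutrophication value


def decap(chl_a, critical=CRITICAL_CHL_A):
    """Cap concentrations at the critical value and keep the excess.

    Returns the capped concentrations (used to fit the inversion model)
    and the per-sample excess above the critical value (zero for
    samples that do not exceed it).
    """
    chl_a = np.asarray(chl_a, dtype=float)
    excess = np.clip(chl_a - critical, 0.0, None)
    capped = np.minimum(chl_a, critical)
    return capped, excess


def restore_prediction(model_value, excess):
    """Add the stored excess back to the model inversion value."""
    return np.asarray(model_value, dtype=float) + excess
```

In use, the inversion model would be fitted to the capped concentrations, and `restore_prediction` would then be applied to the model output of each outlier sample.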

Spectral Absorption Characteristic Analysis
The 86 water body samples collected in Dongting Lake were sorted from low to high Chl-a concentration. According to the Chl-a concentration, all samples were divided into six groups at equal intervals, and the spectral reflectance of all samples in each group was averaged to obtain the grouped spectral characteristic curve (Figure 2). The envelope removal curves of all samples in each group were likewise averaged to obtain the group envelope removal curve (Figure 3). It can be seen from Figure 2 that, due to the strong absorption of chlorophyll a and yellow substances in the blue-violet light band, the reflectivity of water bodies in the range of 400–500 nm was low. As a result of the weak absorption of chlorophyll and carotene in the range of 550–580 nm and the scattering effect of cells, the water spectrum formed a reflection peak. As a result of the absorption of phycocyanin pigment near 620 nm, absorption valleys appeared in the water spectrum. Due to the strong absorption of chlorophyll near 675 nm, absorption valleys also appeared in the water spectrum. There was an obvious reflection peak at 685–715 nm, which is generally considered to be the fluorescence peak of chlorophyll a and which shifts in the long-wave direction as the Chl-a concentration increases. Due to the absorption of pure water in the infrared band, the reflectivity decreased rapidly after 730 nm.
It can be seen from Figure 3 that, after the envelope removal treatment of the reflectance, the absorption characteristics were significantly enlarged. For example, the weak absorption bands at 440, 490, 620, 670, 740, and 840 nm can be clearly observed in the envelope removal curve, whereas they are not obvious in the reflectance curve. The envelope removal curves of different Chl-a concentrations have five typical absorption characteristic bands, i.e., 400–580, 580–650, 650–710, 710–820, and 820–880 nm. Among them, at 400–580, 580–650, and 650–710 nm, the Chl-a concentration of high-concentration samples was positively correlated with the absorption depth and absorption area, while low-concentration samples exhibited no obvious regularity.
Figure 4 shows the correlation curve between the Chl-a concentration and the envelope removal values. It can be seen from Figure 4 that the water chlorophyll concentration and the envelope removal data were significantly negatively correlated at 421–517 nm, of which the maximum negative correlation was at 450 nm, reaching −0.69. There was a significant positive correlation at 540–562 nm, and the maximum positive correlation was at 553 nm, reaching 0.58. There was a significant negative correlation at 587–692 nm, and the largest negative correlation was at 681 nm, reaching −0.71. There was a significant positive correlation at 702–758 nm, but the correlation coefficient was small. The bands that were significantly related to the chlorophyll concentration and had a large correlation coefficient were mainly located in the three absorption valleys (400–580, 580–650, and 650–710 nm) of the envelope removal curve.
Table 1 shows the correlation analysis results of the chlorophyll concentration and the spectral absorption characteristic parameters extracted by the envelope removal method. It can be seen from Table 1 that the correlations between the six spectral absorption characteristic parameters and the chlorophyll concentration all passed the extremely significant test level, i.e., the 0.01 level. Among them, A650–710 nm had the largest correlation coefficient, reaching 0.76, followed by D650–710 nm, with a correlation coefficient of 0.73. The correlation between the Chl-a concentration and the spectral absorption characteristic parameters at 650–710 nm was better than that at 400–580 nm and 580–650 nm. In brief, the correlation analysis between the Chl-a concentration and the spectral absorption characteristic parameters shows that the spectral absorption characteristic parameters had the potential to quantitatively evaluate the Chl-a concentration.
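The band-by-band correlation analysis can be sketched in Python as follows. The sketch computes the Pearson correlation coefficient between the Chl-a concentration and the envelope removal value of every band, then picks the bands with the largest positive and negative correlations. It is an illustration under assumed names (`band_correlations`, `strongest_bands`) and an assumed data layout (one row per sample, one column per band), and it does not include the significance test.

```python
import numpy as np


def band_correlations(er, chl_a):
    """Pearson correlation between Chl-a and each band.

    er has shape (n_samples, n_bands); chl_a has shape (n_samples,).
    A band with constant values has zero variance and yields nan.
    """
    er = np.asarray(er, dtype=float)
    chl_a = np.asarray(chl_a, dtype=float)
    er_c = er - er.mean(axis=0)      # centre each band
    chl_c = chl_a - chl_a.mean()     # centre the concentrations
    num = er_c.T @ chl_c
    den = np.sqrt((er_c ** 2).sum(axis=0) * (chl_c ** 2).sum())
    return num / den


def strongest_bands(wavelengths, r):
    """Wavelengths of the maximum positive and negative correlation."""
    wavelengths = np.asarray(wavelengths)
    return wavelengths[np.argmax(r)], wavelengths[np.argmin(r)]
```

The two wavelengths returned by `strongest_bands` correspond to the band pair from which two-band spectral indices can be constructed.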

Model Performance Analysis under Different Processing Methods
In order to study the performance of the Chl-a inversion model for high-concentration samples under different pretreatment methods, the samples exceeding the critical value of severe eutrophication (63.03 mg/m³) were used as prediction samples, and the other samples were used as training samples. NI, DI, D650–710 nm, and A650–710 nm are four spectral parameters that exhibited a good correlation with the chlorophyll concentration. Taking these four spectral parameters as independent variables and the chlorophyll concentration under different preprocessing methods as the dependent variable, eight Chl-a concentration inversion models were constructed by linear regression. They included four conventional inversion models (NI_Con, DI_Con, D650–710 nm_Con, and A650–710 nm_Con) and four uncapped inversion models (NI_New, DI_New, D650–710 nm_New, and A650–710 nm_New). The performance of these models is shown in Table 2 and Figure 5.
In terms of $R_P^2$, the eight inversion models rank from large to small as follows: D650–710 nm_New > A650–710 nm_New > DI_New > NI_New > A650–710 nm_Con > DI_Con > NI_Con > D650–710 nm_Con. Among them, the average value of the four uncapped inversion models was 0.83, and that of the four conventional inversion models was 0.09. In terms of $RMSE_P$, the eight inversion models rank from small to large as follows: A650–710 nm_New < D650–710 nm_New < DI_New < NI_New < A650–710 nm_Con < DI_Con < D650–710 nm_Con < NI_Con. Among them, the average value of the four uncapped inversion models was 24.88, and that of the four conventional inversion models was 47.97. In summary, the uncapped inversion models had a higher $R_P^2$ and a lower $RMSE_P$ than the conventional inversion models, which indicates that they performed better. In addition, it can be seen from Figure 5 that, in the conventional inversion model, the predicted values of high-concentration samples fell far below the 1:1 line, indicating that the model significantly underestimated the high-concentration samples. In the uncapped inversion model, the predicted values of high-concentration samples were close to the 1:1 line, indicating that, compared with the conventional inversion model, the uncapped inversion model exhibited a greatly improved inversion performance for high-concentration samples.

Model Performance Analysis Taking into Account High-Concentration Samples and Spectral Absorption Characteristics
Herein, 70% of the samples were randomly selected as training samples and 30% of the samples were selected as prediction samples. In the decap mode, considering the spectral absorption characteristics, the chlorophyll concentration inversion models were established. The performance of each model is shown in Table 3. In order to study the performance of the envelope removal method in terms of retrieving the Chl-a concentration, this study used the envelope removal value and the original reflectance value as the inversion factors, and established stepwise regression models (ER_New and R_New) to retrieve the Chl-a concentration. It can be seen from Table 3 that, although the modeling effect of the R_New model (R T 2 = 0.80, RMSE T = 13.90) was slightly better than that of the ER_New model (R T 2 = 0.78, RMSE T = 14.89), the predictive capabilities of the ER_New model (R P 2 = 0.79, RMSE P = 13.12) were much better than those of the R_New model (R P 2 = 0.66, RMSE P = 16.61). According to the correlation analysis result, the band with the largest positive correlation and the largest negative correlation was selected to establish the spectral index of the envelope removal value. Spectral indices (NI, DI, and RI) were used as independent variables, and linear regression models (NI_New, DI_New, and RI_New) were constructed to invert the Chl-a concentration. The predictive capabilities of the three models (NI_New(R P 2 = 0.78, RMSE P = 17.20), DI_New(R P 2 = 0.78, RMSE P = 13.64), RI_New (R P 2 = 0.77, RMSE P = 13.64)) had little effect on the prediction. The bands that were significantly related to the chlorophyll concentration and had a large correlation coefficient were mainly located in the three absorption valleys (400-580, 580-650, and 650-710 nm) of the envelope removal curve. 
To extract the absorption characteristic information of the water samples with different Chl-a concentrations in the three absorption valleys, MATLAB software was used to calculate the absorption depth (D) and the absorption valley area (A) of each valley. Taking the absorption depth and absorption area of the three absorption valleys as independent variables, six linear regression models were constructed to invert the Chl-a concentration: D 400-580 nm _New, D 580-650 nm _New, D 650-710 nm _New, A 400-580 nm _New, A 580-650 nm _New, and A 650-710 nm _New. In terms of predictive capability, A 650-710 nm _New (R P 2 = 0.88, RMSE P = 13.46) was better than the other five models, followed by D 650-710 nm _New (R P 2 = 0.81, RMSE P = 12.80); D 580-650 nm _New (R P 2 = 0.58, RMSE P = 18.46) had the worst performance. Taking the spectral parameter with the best inversion effect (A 650-710 nm ) together with each of the spectral parameters with good inversion results (D 650-710 nm , NI, and DI) as independent variables, three linear regression models were constructed to perform the chlorophyll concentration inversion: A 650-710 nm &D 650-710 nm _New, A 650-710 nm &NI_New, and A 650-710 nm &DI_New. These three models had good predictive capabilities (A 650-710 nm &D 650-710 nm _New (R P 2 = 0.81, RMSE P = 14.05), A 650-710 nm &NI_New (R P 2 = 0.83, RMSE P = 17.77), and A 650-710 nm &DI_New (R P 2 = 0.80, RMSE P = 15.25)).
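For readers who wish to reproduce this kind of calculation outside MATLAB, the following Python sketch shows one common way to obtain an envelope removal curve and the depth and area of an absorption valley. It is an illustration under stated assumptions, not the procedure used in this study: the envelope is taken as the upper convex hull of the spectrum, the depth is assumed to be one minus the minimum envelope removal value in the valley, the area is assumed to be the trapezoidal integral of one minus the envelope removal value over the valley, and all function names are hypothetical.

```python
import numpy as np

def upper_hull_indices(x, y):
    """Indices of the upper convex hull of points sorted by increasing x."""
    hull = []
    for i in range(len(x)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
            if cross >= 0:  # point b lies on or below the segment a -> i
                hull.pop()
            else:
                break
        hull.append(i)
    return hull

def envelope_removed(wavelength, reflectance):
    """Divide the spectrum by its upper-hull envelope (values in (0, 1])."""
    wl = np.asarray(wavelength, dtype=float)
    r = np.asarray(reflectance, dtype=float)  # assumed strictly positive
    hull = upper_hull_indices(wl, r)
    envelope = np.interp(wl, wl[hull], r[hull])
    return r / envelope

def valley_depth_area(wavelength, er, lo, hi):
    """Assumed definitions: depth = max(1 - ER), area = integral of (1 - ER)
    over the wavelength range [lo, hi]."""
    wl = np.asarray(wavelength, dtype=float)
    er = np.asarray(er, dtype=float)
    mask = (wl >= lo) & (wl <= hi)
    band_wl, deficit = wl[mask], 1.0 - er[mask]
    depth = float(deficit.max())
    area = float(np.sum(0.5 * (deficit[1:] + deficit[:-1]) * np.diff(band_wl)))
    return depth, area

# Synthetic three-point spectrum, not measured data
wl = np.array([650.0, 680.0, 710.0])
refl = np.array([0.2, 0.1, 0.2])
er = envelope_removed(wl, refl)
depth, area = valley_depth_area(wl, er, 650, 710)
```

The resulting depth and area for each valley could then serve as the independent variable of a linear regression against the Chl-a concentration.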

Discussion
The envelope removal method has the advantage of suppressing the background spectrum and enhancing the weak absorption characteristic information. However, for second-class water bodies with a relatively complex water quality, there are few studies that focus on the inversion of the Chl-a concentration while considering the spectral absorption characteristic parameters. In addition, the current research on the inversion of the Chl-a concentration was carried out with a balanced sample concentration. For areas with a highly variable Chl-a concentration, it is still challenging to establish a Chl-a concentration inversion model with strong applicability and high accuracy. This study explored the possibility of retrieving the Chl-a concentration in high-concentration samples using spectral absorption characteristics. Its novelty lies in the pretreatment of high-concentration samples with large deviations using the decap method, and the extraction of the spectral absorption characteristic parameters of the water body using the envelope removal method. On the basis of the correlation analysis between the Chl-a concentration and the spectral absorption characteristics, the Chl-a concentration spectral inversion was performed. This provides an optimal solution for inversion modeling of the Chl-a concentration that uses high-concentration samples and spectral absorption characteristics.
The predictive capabilities of the ER_New model were significantly better than those of the R_New model, indicating that the envelope removal method can amplify the absorption characteristics of the original spectrum and thereby improve the performance of the prediction model. The prediction models of the two absorption characteristic parameters (A 650-710 nm _New and D 650-710 nm _New) and the three combined models (A 650-710 nm &D 650-710 nm _New, A 650-710 nm &NI_New, and A 650-710 nm &DI_New) also demonstrated good predictive capabilities. This shows that using the spectral absorption feature to retrieve the Chl-a concentration is a feasible solution, which is consistent with reports from previous studies. Peng, J. et al., in "Research on Inversion of Soil Salt Content Based on Continuum Removal Method", found that applying continuum removal processing to the original reflectance can significantly improve the prediction performance of the inversion model [48]. Peng, X. et al., in "Spectral Inversion of Soil Parameters Based on Envelope Removal and Partial Least Squares", reported that, as compared with the model established using the reflectance curve, the prediction ability of the model established using the envelope removal curve improved significantly [49]. Liu, X. et al., in "Retrieving the Moisture Content of Yellow Cotton Soil Using the Envelope Elimination Method", observed that the spectral absorption characteristic parameters at 1900 nm have the best correlation with the soil moisture content, and that the logarithmic model established using the absorption area at 1900 nm is the best predictive model for the inversion of soil moisture content [50]. However, prior to this study, the envelope removal method had not been used to extract the spectral absorption characteristic parameters for hyperspectral inversion modeling of the Chl-a concentration in water bodies.
In view of the high variability of the Chl-a concentration in Dongting Lake, in this study, we constructed four conventional inversion models and four uncapped inversion models, and carried out a comparative study of model performance under the two processing methods. The results show that the performance of the uncapped inversion models is better than that of the conventional inversion models. This shows that, in the Dongting Lake area, where the Chl-a concentration has a high degree of variability, the decap method can significantly improve the ability of the model to generalize to outliers, and that it is an effective and feasible pretreatment method. In this study, the critical severe eutrophication value was used as the decap standard; however, the decap standard should ideally be chosen so as to achieve the best inversion performance. How to determine this standard is a question that needs to be considered in future research. In addition, in this study, the decap method was applied under the condition that the outliers were known. However, in hyperspectral remote sensing inversion research, outliers are the target of detection and are unknown. In lake areas with a high variability in Chl-a concentration, a study of the decap method that takes into account the autocorrelation characteristics of spatial data is required for the fast and accurate inversion of outliers.
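This section does not spell out the arithmetic of the decap step. One possible reading, borrowed from top-cutting as practiced in geostatistics, is that concentrations above the decap standard are replaced by the standard itself before the model is fitted. The Python sketch below illustrates only that assumed reading; the function name and the `threshold` parameter are hypothetical, and the threshold value must be supplied by the user.

```python
import numpy as np

def decap(values, threshold):
    """Assumed top-cut: replace values above `threshold` with `threshold`.

    Returns the capped values and a boolean mask marking which samples
    were capped, so that they can be tracked during validation.
    """
    values = np.asarray(values, dtype=float)
    capped_mask = values > threshold
    return np.minimum(values, threshold), capped_mask
```

Keeping the mask alongside the capped values makes it straightforward to compare, sample by sample, how the high-concentration samples are predicted with and without the pretreatment.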
The spectral characteristics of water are complex and vary by region and season, which means that a single algorithm can only guarantee accuracy in a certain area. Limited experimental data restrict the temporal and spatial applicability of the constructed model. If more sample data are used, this limitation can be eliminated or weakened, so that the effectiveness of Chl-a concentration inversion modeling can be further analyzed and discussed.

Conclusions
Taking Dongting Lake in China as an example, this study used high-concentration samples and spectral absorption characteristics to invert the Chl-a concentration. The decap method was used to preprocess high-concentration samples with large deviations, and the envelope removal method was used to extract the spectral absorption characteristic parameters of the water body. On the basis of the correlation analysis between the water Chl-a concentration and the spectral absorption characteristics, the water Chl-a concentration was inverted. The results showed that: (1) the bands that were significantly related to the Chl-a concentration and had a large correlation coefficient were mainly located in the three absorption valleys (400-580, 580-650, and 650-710 nm) of the envelope removal curve. Moreover, the correlation between the Chl-a concentration and the absorption characteristic parameters at 650-710 nm was better than that at 400-580 nm and 580-650 nm; (2) as compared with the conventional inversion model, the uncapped inversion model exhibited a higher R P 2 and a lower RMSE P , and its predicted values were closer to the 1:1 line. The performance of the uncapped inversion model was better than that of the conventional inversion model, indicating that the uncapped method is an effective preprocessing method for high-concentration samples with large deviations; (3) the prediction effect of the ER_New model was significantly better than that of the R_New model. This indicates that the envelope removal method can amplify the absorption characteristics of the original spectrum and thereby improve the performance of the prediction model; (4) among the inversion models of absorption characteristic parameters, the prediction models of A 650-710 nm _New and D 650-710 nm _New had the best performance. The three combined models (A 650-710 nm &D 650-710 nm _New, A 650-710 nm &NI_New, and A 650-710 nm &DI_New) also exhibited good predictive capabilities.
This indicates that using the spectral absorption feature to retrieve the chlorophyll concentration is a feasible solution.