Water Environmental Capacity Analysis of Taihu Lake and Parameter Estimation Based on the Integration of the Inverse Method and Bayesian Modeling

An integrated approach using the inverse method and Bayesian approach, combined with a lake eutrophication water quality model, was developed for parameter estimation and water environmental capacity (WEC) analysis. The model was used to support load reduction and effective water quality management in the Taihu Lake system in eastern China. Water quality was surveyed yearly from 1987 to 2010. Total nitrogen (TN) and total phosphorus (TP) were selected as water quality model variables. Decay rates of TN and TP were estimated using the proposed approach. WECs of TN and TP in 2011 were determined based on the estimated decay rates. Results showed that the historical loading was beyond the WEC, thus, reduction of nitrogen and phosphorus input is necessary to meet water quality goals. Then WEC and allowable discharge capacity (ADC) in 2015 and 2020 were predicted. The reduction ratios of ADC during these years were also provided. All of these enable decision makers to assess the influence of each loading and visualize potential load reductions under different water quality goals, and then to formulate a reasonable water quality management strategy.


Introduction
In recent years, the deterioration of surface water bodies is becoming more and more serious, which has become one of the most important environmental issues facing national and local governments worldwide. Therefore, it is very important to make effective water quality management strategies to reduce the resulting environmental pressure and protect the water ecological system. The water environment management system in China has evolved from a concentration control and goal amount control focus to capacity amount control. Water environmental capacity (WEC), as one of the core contents of any capacity amount control system, plays an important role in water quality management. WEC is defined as the maximum load that can be carried by natural waters to meet the required water quality target. There are many kinds of methods to calculate WEC [1][2][3][4][5]. Among them, the formula method is the most basic method, which is deduced under certain conditions according to the WEC definition and the water quality model (WQM). There are also many kinds of WQMs [1,[6][7][8]. Therefore, many WEC calculation formulas applying to different conditions can be obtained. The required parameters in the formula are obtained by the WQM. Therefore, WQM is considered very important for water quality management [9][10][11].
In view of the uncertainty of the water environmental system, the unascertained mathematical method is widely used to calculate the WEC [7,12]. In the unascertained mathematical method, water environment parameters in the water system (such as flow rate, pollutant concentration, and pollutant degradation coefficient) are all defined as unascertained numbers and often generated by stochastic simulation. Then, in combination with the WEC model, the unascertained WEC calculation model is established. Possible values of WEC and their credibility are calculated, and then the WEC is obtained. However, errors may be caused during the process of defining parameters as unascertained numbers. Therefore, it will be more reasonable if using the known data to estimate the unknown data. The integration of Bayesian and inverse methods can solve this problem well by getting the posterior distributions of the uncertain parameters according to their prior distributions, and then WEC is obtained under the uncertainty condition.
The prominent advantage of Bayesian statistics is their capability to transform the uncertainty problem into estimated model parameters or loads in terms of a joint posterior distribution. It can easily assess the uncertainty for decision makers. In addition, Bayesian methods incorporate existing expert knowledge and experiences via the prior distribution [13,14], and so can result in more precise estimation than traditional methods do, especially when the observed data are insufficient. Inverse methods have been widely applied to address environmental issues in recent years, including model parameter estimation [10,15,16], groundwater [17][18][19], non-point source estimation [20,21], and wastewater systems [22].
The utility of the integrated approach has been confirmed in many environmental case studies, including ground water quality modeling [23], contaminant source identification [24], shellfish aquaculture ecosystem modeling [25], non-point source load estimation [16], loading estimation and uncertainty assessment [26,27], and the multi-pollution source water quality model [28]. However, this combination has seldom been used in lake eutrophication water quality model and corresponding WEC analysis.
In this study, a Bayesian approach combined with inverse method for the lake eutrophication water quality model was developed to estimate parameters and analyze WEC under uncertainty conditions. Taihu Lake in the Jiangsu Province in eastern China was studied. The model results will help local decision makers reduce and allocate loads in an effective and efficient manner.

Study Area and Data Source
Taihu Lake is located in the south of Jiangsu Province and the south of the Yangtze River Delta. The entire water area of Taihu Lake is in Jiangsu Province. The south lake is adjacent to Zhejiang Province. Taihu Lake is the largest lake in east China. It is also the third largest freshwater lake in China. The area of the lake is 2338 km 2 ; the average lake capacity is 44.3 × 10 8 m 3 ; the average surface area is 2425 × 10 6 m 2 ; and the watershed area is 36,895 km 2 . As a storage center of river water resources, Taihu Lake has the functions of flood control, water supply, ecology, shipping, tourism and aquaculture, so its WEC is directly related to the social and economic development of Jiangsu Province, Zhejiang Province and Shanghai City. In 2007, cyanobacteria malignantly bloomed in Taihu Lake, which caused huge economic losses. Given the serious perniciousness of the cyanobacteria blooms, state and local governments have adopted various measures to reduce emissions of pollutants in the basin and restore the lake's ecological environment. In recent years, the Taihu Lake management has made certain progress. At present, the permanganate index and ammonia nitrogen in Taihu Lake have been obviously improved; total nitrogen (TN) and total phosphorus (TP) are still the main pollutants. A study on Taihu Lake's environmental capacity has an important significance for Taihu Lake management and the sustainable development of the Taihu Lake basin. Research on Taihu Lake WEC started in the 1990s, and there have been many scholars discussing it over the last twenty years [29][30][31][32][33].
Previous studies were based on deterministic methods. Considering the uncertainty of the water environmental system, the Bayesian and inverse methods are used to get the posterior distributions of the uncertain parameters according to their prior distributions, and then WEC is obtained under uncertainty conditions. Data from Taihu Lake from 1987 to 2010 was used to estimate the parameters. Monitored parameters included pollutant concentration, and many other important variables. Concentration of TN and TP are derived from routine monitoring data of the environmental protection department [30,32]. Out flow amount and loads of TN and TP are calculated according to routine patrol data of the water conservancy department [31,34]. In this study, a lake eutrophication water quality model and Bayesian statistics were developed for parameters estimation and variable WEC calculation with limited data requirements. The used data refer to Table S1 in the Supplementary Materials.

Lake Eutrophication Water Quality Model
When one takes years as the time scale to study the process of lake eutrophication, lakes often can be seen as a completely mixed reactor. The Vollenweider model [35] is adopted here. This model often does not describe physical, chemical and biological processes in lakes, and also does not consider the thermal stratification of the lake. According to Vollenweider model assumption, the change rate of pollutant concentration in lake is a function of input, output and the deposition amount of this pollutant in the lake, which can be expressed by the following equilibrium equation: where V is the storage capacity of the lake (10 8 m 3 ), C is the water quality component's concentration in the lake (t/10 8 m 3 ), I is the total load of some nutrient (t/a), Q is the out flow amount of the lake (10 8 m 3 /a), S is the decay rate for nutrient in the lake (1/a).
where r is the hydraulic scour rate (1/a) and = Q r V .
If t = 0, C = C0, then the analytical solution of this equation is: When t → ∞, the balance concentration of nutrient is: ( , , , ) ( ) The model from Equation (4) can be viewed as a forward model, which is from S to C. The reverse model for Equation (4) is: Thus, the problem of decay rate S estimation can be transformed into an inverse model with the aim of finding S so that defined objective function can be met. The Bayesian approach was then applied for S estimation.
According to Equation (4), the water environment capacity can be obtained as follows: where V, r are as mentioned above, CS (t/10 8 m 3 ) is the required water quality target, S adopts its posterior distribution from the Bayesian model results. By WEC, the corresponding allowable discharge capacity (ADC) can be got. According to reference [33], the ADC is: where W is the uncontrolled amount of pollutants into the lake, including the amount of pollutants into the lake through the river (including water diversion from the Yangtze), atmospheric dry and wet deposition, lake island and seine, α is the coefficient of pollutants into the lake. WTN = 12115 t/a, WTP = 808 t/a, αTN = 0.84, αTP = 0.90, calculated by the historical data in [33].

Bayesian Approach
Bayesian statistics have been increasingly used to address environmental and other scientific issues in recently years [36]. Bayesian methods provide a framework within which all unknown parameters are treated as random variables and their distributions are derived from pre-existing information, then a probability distribution on the parameter space can be obtained, and the uncertainty about the parameters is summarized. Thus, Bayesian statistics provide an important method for uncertainty analysis and present crucial information for management decision making [37]. The Bayes theorem can be written as: where P(θ / X) is the posterior distribution of θ and represents the conditional distribution of the parameter θ values given observed data X. In this study, the parameter θ refers to S, i.e., θ = {S1,S2,…Sn}. P(X) is the expected value of the likelihood function over the parameter distribution as a normalizing constant. P(θ) is the prior distribution of θ based on the observed data. P(θ / X) is the likelihood function, which describes the mechanistic and statistical relationship between the predictor and variables. While the Bayesian framework provides an attractive approach to parameter estimation and uncertainty analysis, it is impossible obtain the analytically posterior distributions, which hinder the practical application of Bayesian approach. To this end, the Markov chain Monte Carlo (MCMC) algorithm has been applied to obtain the numerical summarization of parameters [36].
There are three major steps in the Bayesian model using the MCMC sampling method [38]: (i) formulating the prior probability distributions for targeted parameters; (ii) specifying the likelihood function; and (iii) MCMC sampling for the posterior probability distributions. Gibbs sampling and Metropolis-Hastings algorithms are popularly used for MCMC sampling. The aim of MCMC sampling is to generate samples from the posterior distribution of the model parameters by simulating a random process, and the process has the posterior distribution as its stationary distribution [39]. Then the random sample can be provided and used for further estimation. In this study, Gibbs sampling was used to address the practical implementation of MCMC via a specialized and free software package, WinBUGS [40].
To specify the likelihood function of S, Equation (4) was incorporated into Equation (8). The observed concentration of C* is taken as the modeled value with a normally distributed random error ε, which is formulated by: where ε has zero mean and variance of σ 2 . The value of σ was assumed to follow a Gamma distribution [21] and can be estimated in the Bayesian model based on the prior distribution. Then the likelihood function for all twelve years of observations of Taihu Lake can be expressed as: Posterior simulation, based on Bayesian model results for Taihu Lake, was performed to understand the uncertainty associated with WEC.
The MCMC simulations for the TN and TP models were carried out separately in WinBUGS. All years were modeled in the same WinBUGS program. The prior distribution of S was determined using interval values from the literature at the range from 1 to 2.5 a −1 for TN and the range from 3 to 6 a −1 for TP [33]. The MCMC was carried out in WinBUGS with one Markov chain and 50,000 iterations [38]. The WinBUGS code used in this study can be found in the Supplementary Materials.

Estimation of S and Assessment of Model Fit
Posterior distributions of STN, STP are given in Table 1, including mean, 5%, 25%, 75% and 95% credible level values, the standard deviation and the Monte Carlo (MC) errors (an estimate of the difference between the mean of the sampled values and the true posterior mean). The MC errors for posterior STN, STP were less than 8% of the SD, indicating that the model converged well [40]. The posterior distributions shown in the above table are close to the average decay rates calculated from Vollenweider model in [33] and the decay rates used in other studies [41][42][43], so the estimated results are reliable. However, in this study, more historical data were used for parameter seeking than that in [33]. And the decay rate was estimated by the Bayesian method rather than calculating by the average of each year's data in [33]. Therefore, for complex and uncertain water environment systems, the method in this study is more reasonable.
The observed TN and TP concentrations of the 24 years (1987-2010) were compared with corresponding simulated data. The contrast results are shown in Figure 1. The simulated concentrations and observed ones are strongly correlated with R 2 > 0.69 (p < 0.01) and Nashe-Sutcliffe efficiency [44] > 0.47 for TN, and R 2 > 0.57 (p < 0.01) and Nashe-Sutcliffe efficiency > 0.34 for TP. Thus, the calculation of water quality capacity and according water quality management strategies could be carried out based on the model.

Variable WEC and Required Load Reduction Ratio
The water quality targets (annually average value) for TN and TP of Taihu Lake in 2015 and 2020 are described in Table 2. Under the water quality targets of 2015, WECs in 2011 in Taihu Lake are calculated, using the estimated decay rates in Table 1 and according to the Equation (6). The calculation results are shown in Table 3. The historical loads of TN and TP in 2011 are 58336 t/a and 3308 t/a, which have beyond the WECs at any confidence level. Then load reduction ratios were calculated at different confidence levels. These are also shown in Table 3.   It is worth noting that the historical loads are all above the WEC Thus, pollutant reduction for TN and TP is necessary to meet the water quality goals. According to Table 3, load reduction ratio of TN is above 15% and load reduction ratio of TP is above 30%. Therefore, both TN and TP loads are quite high and among them the TP pollution is more serious. Therefore, the effectiveness for TN and TP pollution control and management is not obvious. Therefore, the urgency to control the nutrients should be recognized. New nitrogen and phosphorus pollution sources must be prohibited. Management and examination of TN and TP should be strengthened according to the relative plans [45], so as to reach the standard.
In order to control TN and TP pollution and reduce the emissions reasonably, the WEC in 2015 and 2020 were predicted according to Equation (6), where the average hydraulic scour rate r was used, that is, out flow Q used the annual average data of 1987-2010. Decay rate S used the estimated data. CS (t/10 8 m 3 ) used the required water quality target in Table 2. The predicted results are shown in Table S2 in the Supplementary Materials. Then, the ADC in 2015 and 2020 can be predicted according to Equation (7) and data in Table S2. As the lake pollutant emissions haven't changed much in the past few years according to [33], here it is assumed unchangeable in the next few years. The TN and TP emissions of Taihu in 2010 are 66082 t/a and 3301 t/a, respectively. Therefore, the reduction ratio from 2010 to 2015 and from 2015 to 2020 can be calculated. All these are shown in Figure 2. In the figure, the 5%, 25% mean, 75%, and 95% are corresponding to the different confidence levels in Table S2. The ADCs in 2015 and 2020 are predicted by the estimated S. The ADC in 2010 is assured, so it is the same value under different confidence levels.
From Figure 2, it can be seen that from 2010 to 2015 the TN emission reduction ratio is 49%-54%, and the TP emission reduction ratio is 60%-71%, which are both very high. From 2015 to 2020, the reduction ratios of TN and TP are above 13% and 28%-32%, respectively, which both decline when compared with the former five years, but are still not low. In a word, to achieve the water quality goals is a very difficult task. In addition, over a fairly long period of time, TP and TN concentrations in Taihu Lake will be in an interval which is quite prone to cyanobacteria blooms [46]. It is difficult to fundamentally eliminate the possibility of cyanobacteria blooms. In order to ensure Taihu Lake water quality security, some effective emergency measures, such as "water diversion from the Yangtze" [47], cyanobacteria salvage, should be carried out over a longer period of time.

Conclusions
An integrated approach of inverse method and Bayesian inference, combined with the lake eutrophication water quality model, was applied to estimate the decay rate of TN and TP of Taihu Lake, China. The model was fitted to water quality monitoring data from 1987 to 2010. The estimated parameters were used to calculate WEC of Taihu Lake in 2011, and the reduction ratios of pollutant loads were given. From the calculated results, the effectiveness for TN and TP pollution control and management is not obvious, and there is still a lot of work to do to meet the water quality target. Therefore, some effective emergency measures should be carried out. The estimated parameters were also used to predict the WEC and ADC in 2015 and 2020, and the reduction ratio of ADC from 2010 to 2015 and the one from 2015 to 2020 were given, respectively. This calculation will provide certain theoretical basis for strategy formulation in the next few years.
The model result is based on annual average data. The model can be improved by incorporating a longer-period dataset for better interpretation of water quality changes and monthly data for seasonal variations. WEC could then be estimated on a seasonal scale or an even shorter period. Then, more detailed water quality management strategies can be made.