Quantitative Assessment of Climatic and Reservoir-Induced Effects on River Water Temperature Using Bayesian Network-Based Approach

: River ﬂow regulations and thermal regimes have been altered by human-induced interventions (such as dam construction) or climate change (such as air temperature variations). It is of great signiﬁcance to adopt a well-performed data-driven model to accurately quantify the impact of human-induced interventions or climate change over river water temperature (WT), which can help understand the underlying evolution mechanism of the river thermal regimes by dam operation or climate change. This research applied the Bayesian network-based model (BNM), which can easily identify inherently stronger associated variables with a target variable from multiple inﬂuencing variables to predict the daily WT and make a quantitative assessment of the effect produced by Three Gorges Reservoir (TGR) construction in the Yangtze River, China. A comparative study between the proposed model and two other models was implemented to verify the predicted accuracy of the BNM. With the help of the BNM model, the impact of reservoir impoundment over water temperature was quantitatively analyzed by calculating the difference between reconstructed water temperature series and observed series during the post-TGR period. The construction of the TGR posed more impact on variations in WT than the impact induced by the climate change according to results. The effect of TGR on WT can be concluded as follows: WT from October to January in post-TGR showed a remarkable warming tendency and an increase in released warmer water volumes than before, while WT showed a cooling tendency during March to June because of the hysteretic effect of WT response to increasing air temperature. The proposed BNM model shows great potential for WT prediction and ecological risk management of rivers.


Introduction
Water temperature (WT) is the key monitoring indictor of aquatic ecosystems, as evidenced by the fact that temporal variations in WT could drive biogeochemical evolution regime, such as biochemical reaction velocity and water surface evaporation rate [1,2]. The multi-scaled characteristics (day, month, year) of WT fluctuations could be attributed to the complex heat flux interchange process at air-water interface and temperature of contributing hydrological elements to the river ranging from surface to underground level [3]. Various studies of WT have issued that thermal behavior is significantly related with large-scale climate change [4,5] and anthropogenic perturbations, such as constructions and operations of dams [6,7] at a local scale. As a result, it is of great importance to model and forecast the WT time series accurately to attribute the contributions of climate change and human-induced perturbations over WT.
Data-driven models are commonly used water temperature models over recent decades [8,9]. Data-driven models establish the predictor-predictand association between WT and other covariates such as air temperature (AT) and discharge (Q) through statistical or data-mining techniques. AT is always assumed to be the potential predictor due to its the dominant role in deciding the heat fluxes at the air-water interface. Besides the natural heat fluxes quantified by AT, river flow patterns are another key driver of the thermal evolution regime of the river, which is particularly the same situation after constructing and operating of large reservoirs. Thus, discharge (Q) should be selected as another potential predictor. Recent studies have implemented comparative research of the performance of a variety of machine-learning methods for WT forecasting and modelling [10,11]. However, considering the lag effect for the relation between AT and WT, a great number of interdependent influencing variables (different potential time-lag air temperature series) should be considered, which is laborious for selecting optimal predictors among the tremendous number of influencing variables. It is of great importance to master relatively complete conditional independent information for the potential predictors identification. The concepts of network structures that can help to gain a wonderful insight into the behavior of a complex system with multiple random variables have been addressed in the field of hydrology and hydroclimatology [12,13]. In this study, we used a graphical modelling (GM)-based Bayesian network approach [14] to identify the directly associated variables and establish the predictor-predictand association between WT and air temperature and discharge.
The Yangtze River, the longest river in China, has been a hotspot after the operation of the Three Gorge Reservoir (TGR), which is the largest water conservancy project in the world [15]. Recent studies about TGR focus on quantitative assessment of the effects of TGR over hydrological processes, risk of hydrometeorological extremes, and the impact of aquatic animal habitat sustainability and diversity along the Yangtze River [16][17][18]. A multivariate regression method [19] and a semi-physical air2stream model [6] have been used to diagnose the potential influence over WT variations posed by TGR operation. However, few studies in the literature have addressed the above effect at the daily scale. Further investigation should be made to verify the potential of Bayesian network-based models (BNM) to quantify daily WT variations accurately under the impact of humaninduced perturbations.
This study focused on the following two angles: (1) to evaluate the prediction accuracy of the proposed BNM model to estimate mean daily WT observed at the Yichang station of the Yangtze River compared with other two commonly used data-driven models; and (2) to use the BNM to make a reconstruction of the natural daily river water temperature time series, ignoring the operation of TGR in order to isolate the human-induced effect (caused by TGR) on the thermal regime of this river.

Study Area and Data
The Yangtze River, which is the third longest river in the world, originates from the Qinghai-Tibet Plateau, ranging from 4000 to 5000 m, and flows eastward to the East China Sea. The TGR, which is located along the main stream of the Yangtze River between Chongqing and Yichang, started to impound water, with the water level rising from 70 m in 2003 to 143 m in 2005. For the purpose of flood control, irrigation, and power generation, the TGR was set at a standard normal operation phase by October 2010 [6].

Methodology
The framework of this study can be considered to consist of the three following sequential components: Model performance evaluation: Daily water temperature (WT), air temperature (AT), and discharge (Q) data were fitted by the Bayesian network-based models (BNM) to establish potential predictor-predictand association between them. Considering the delayed change in WT relative to AT fluctuations, it is necessary to incorporate a time-lag effect for AT [20]. If we use WT(t) denote the target variable, time lags ranging from 1 to 30 days of air temperature, which can be represented by AT(t − 1), AT(t − 2), . . . , AT(t − 30) in this study, are considered as possible input variables. In addition, discharge Q(t) was set as another input variable. As a result, a 32 node-based Bayesian network model should be established to predict WT by plugging in the new values for the above potential input variables. Two other data-driven models including an adaptive neuro-fuzzy inference system (ANFIS) [21] and a support vector regression (SVR) [22] were also used as benchmark models to evaluate the performance of the proposed BNM models in daily WT prediction.
Quantify the influence of TGR over WT: The pre-TGR period was used for calibrating the BNM model. Daily inflow discharge to the TGR was assumed as natural discharge of the Yichang station, which can reconstruct the expected WT ignoring the operation of TGR by the calibrated BNM model. Based on several change impact quantifying indexes, the impact of TGR over WT can be easily quantified.

Bayesian Network-Based Prediction Models
The workflow of Bayesian network modelling involves three fundamental steps: (1) identify the potential BNM structure; (2) estimate the parameters of the network; (3) perform prediction of the target variable from a Bayesian network.
In this study, graphical model-based (GM) Bayesian networks (BNs) were utilized using the bnlearn package in R software [14]. A scored-based Hill-Climbing (HC) greedy search algorithm was implemented to optimize the potential BNM structure. On the basis of the Bayesian Information Criterion (BIC) score, the graph structure was updated iteratively by optimizing the number of edges with the BIC value. Since the score represents the model fit performance, the highest BIC score suggests the best model fit. After GM-based BNM structure optimization, the connecting edge strength (CES) for each pair variables was quantified in the following steps. Here CES can be quantified by the BIC score. As shown in Figure 1 by the network structure for January, the variables, Q, AT_1, AT_2, AT_4, AT_6, AT_10, AT_13, AT_17, AT_25, AT_27, and AT_30, have stronger association with WT based on BIC score. These variables, also called the parent set of variable WT in this graph, were selected as the predictors. Here AT_i represents lag-i for air temperature (AT) series.
After the best network structure and the parent variables were identified, the parameters of the selected network were estimated by the maximum likelihood estimation (MLE) method. Let { X 1 , X 2 , . . . , X m } be a data set composed of m random variables with an underlying BNM structure. The joint distribution of m random variables, also called global distribution of the data can be expressed as follows: where, θ i represents a vector of parameters for conditional distribution of X i given Λ X i . Λ X i denotes the parent set of X i in the graphical modelling of the network. In other words, Λ X i is just the set of several other random variables (also called parents) that are associated with X i . Θ = {θ 1 , θ 2 , . . . , θ m }. F(X i Λ X i , θ i ) is just the local distribution function to exhibit the association between X i and its parent variables. Since the graph structure is identified from the previous step, this can be accomplished efficiently by estimating the parameters of the local distributions.
Once the parameters of global and local distribution function are estimated, the BNMbased prediction model was initiated. Subsequently, forecasting of the output variable could be performed by plugging in the new values for the parents of the node in the local distribution (also called as potential input variables).

Performance Metrics
In order to achieve the unbiased evaluation of BNM modelling accuracy, three kinds of residue error analyzing metrics were applied: (1) root-mean-square error (RMSE); (2) Kling-Gupta efficiency (KGE) [23]; (3) Nash-Sutcliffe efficiency coefficient (NSE). These three metrics of the BNM-based prediction model were compared with those of ANFIS-based prediction model and SVR-based prediction models.

Reconstruction of Water Temperature
The operation of TGR has directly affected the hydrological process of Yangtze River, particularly for the observed Q at Yichang station. Due to the effect of Q over thermal inertia of the river, the WT of Yichang station would be influenced by the operation of TGR. As a result, the BNM model was applied to reestablish the WT series ignoring the impact of TGR to make a quantitative assessment of the influence of TGR on WT. In order to realize the above research goal, the original data set was spilt into two parts: (1) a pre-TGR period from 1984 to 2002 and (2) a post-TGR period from 2003 to 2013.The data during pre-TGR period was fitted for the purpose of model verification. The calibrated BNM-based model was applied to derive the WT predictions during the post-TGR period in the absence of the TGR by replacing the observed value of Q to TGR with daily inflow values of Q, which represents natural discharge series, getting rid of human-induced interventions (i.e., operation of TGR). These inflow series (Q in ) could be obtained from information on the temporal variations in water storage of TGR.
Based on the predictions quantified by the BNM model, the influence posed by the operation of TGR (∆ TGR ) and climate change (∆ CLI ) over the WT from post-TGR to pre-TGR period can be quantified as follows [6]: ∆ TOT is just the total change in WT because of dam construction. WT obs 1 and WT obs 2 are the WT observations in the pre-TGR and post-TGR periods, respectively.
∆ CLI denotes the change in WT attributed to the meteorological forcing (possibly influenced by climate change). WT sim 1 and WT sim 2 are the simulated WT values in the pre-TGR and post-TGR periods, respectively.
∆ TGR represents the change in WT attributed to TGR construction.
ε is just model bias. WT sim 2 was derived from the fitted BNM model with natural discharge (Q in ) and observed time lag-based AT as inputs, while WT sim 1 was derived from the fitted BNM model with observed discharge and time lag-based AT as inputs.
In this study, other human-induced activities besides the TGR construction were not taken into consideration. According to the above equations, ∆ TOT = ∆ CLI + ∆ TGR + ε. Once the model bias meets the condition that ε ∆ TGR , the proposed BNM model can be credible when performing a quantitative assessment of the influence of the TGR on WT. In this study, |ε| < |1.1 × 10 −3 × ∆TGR|) reflects that the proposed model is reasonable for quantifying the influence of the TGR on WT.

Performance of Bayesian Network-Based Prediction Models
In this study, since a month-wise prediction strategy was adopted, a month-wise network structure was derived in Figure 1 and Figure S1a-k according to the procedures of establishing the BNM model. According to the Bayesian Information Criterion (BIC) and graph in Figure 1, Q, AT_1, AT_2, AT_4, AT_6, AT_10, AT_13, AT_17, AT_21, AT_25, AT_27, and AT_30 have the strongest associations in the January WT series, which were identified as the final potential input variables. In the same way, potential input variables for other months' WT series can be decided ( Figure S1a-k). Based on maximum likelihood estimation (MLE) method, the parameters of these local distributions can be estimated. The number of the final potential input variables is different from month to month, which suggests that the possible stronger association for WT varies from month to month. In addition, this issue just helps validate the rationality of a month-wise prediction strategy to improve the forecasting accuracy.
After choosing the best-fitted BNM models for WT, the performance of daily WT forecast for the pre-TGR and post-TGR are evaluated in this section ( Table 1). Details of the control parameters of ANFIS and SVR are shown in Table S1a,b. According to three performance metrics, the proposed BNM model has reduced RMSE by 16.4% and raised the KGE and NSE by 15.5% in average compared with the other two models, which suggests that the BNM model provides a superior simulation performance in comparison with the ANFIS and SVR models. The annual mean value of AT during pre-TGR period was 16.89 • C, and it rose to 17.50 • C during the post-TGR period, while annual mean value of Q experienced an 8.8% increase during the post-TGR period in comparison with the pre-TGR period. The above two variations in AT and Q would more or less influence the BNM performance during those above two periods (pre-TGR and post-TGR). Fortunately, the BNM model still outperformed the other two models in the post-TGR period. For most months' simulation, all three models showed relatively poor forecasting performance in the post-TGR period, with RMSE becoming larger and KGE and NSE becoming smaller. Totally speaking, the performance for three prediction models can be ranked as BNM > SVR > ANFIS.

Quantifying the Impact of Climate and Reservoir Construction over WT
Due to the superior performance of the proposed BNM model, the BNM model was used as a powerful tool for quantifying the effects of climate change (as AT variations here) and human-induced interventions (TGR construction in this study) on the mean daily thermal regime by successful reconstruction of natural variation in WT.
In the follow-up study, the expected WT values forecasted by the BNM model during the post-TGR with natural discharge and observed time lag-based AT as inputs were reconstructed to separate the impact over WT only because of AT variations in comparison with the pre-TGR period (∆ CLI ), and only because of the presence of human-induced activities, here construction of the TGR was assumed as the main attribution (∆ TGR ). The four statistics by Equations (2)-(5) at annual and seasonal scale are presented in Table 2. As listed in Table 2, all seasonal and annual model bias ε was smaller than the calculated ∆ TGR (generally |ε| < |1.1 × 10 −3 × ∆TGR|), reflecting that the BNM model is reasonable for quantifying the TGR impact over WT. From the annual scale, both the meteorological variations in AT and the TGR have contributed to the rise in annual mean WT value (∆ TGR accounting for 53% of ∆ TOT and ∆ CLI accounting for 47%). When it comes to the intra-annual variability in ∆ TGR , the TGR effect can be decomposed into two cases. The construction of TGR has led to a 1.42 • C decrease in monthly mean WT on average during spring-summer period, while the construction of TGR has caused a 2.1 • C increase during autumn-winter period. The most affected season by TGR was winter, with a 2.50 • C increase, which accounts for 96.8 of total change ∆ TOT . The summer season experienced a cooling of 0.57 • C, which indicates that the construction of the TGR had the least impact on changes in water temperature during this period. Figure 2 reveals the difference in response mechanism of WT within a year from the perspectives of thermal inertia and river regulation mechanism. The discharge series experienced a downward trend for annual peak flow and an upward trend for annual low river flow during the post-TGR period (Figure 2b), caused by flow storage for the purpose of flood prevention during the rainy period (May to October) and flow surge for agricultural irrigation and domestic water during the dry period (November to April) [6].
As a combined result, WT from October to January in Yichang showed a remarkable warming due to the high degree of the TGR's thermal inertia and the increased release of warmer water (Figure 2d). In the same way, WT from March to June showed a cooling tendency because of hysteretic effect of WT in response to increasing AT. The cooling effect of TGR on WT reached the strongest in April, while the strongest warming effect of TGR on WT happened in December. According to Figure 2c, the variations in seasonal cycle of WT under the impact of TGR can be concluded as: WT in spring becomes colder, while autumn and winter becomes warmer. Focus on Figure 2d, the impact induced by TGR was greater than that caused by climate change, with |∆ TGR | greater than |∆ CLI |. Figure 3a shows the variations in response relation between WT and AT from pre-TGR to post-TGR period, while Figure 3b shows the variations in response relation between WT and Q. The wider hysteresis cycles between WT and AT can be displayed after the TGR construction, which is caused by the high degree of TGR's thermal inertia. The effect should be not only on the breadth of the hysteresis but also on the slope. From Figure 3a,b, the slope of both two pairs of response relations experienced a decrease. According to the thermal classification in the previous literature [24], the slope extracted from the linear regression between WT and AT is a quantitative tool for identifying the difference between comparative patterns. A slope larger than 0.55 suggests a strong response of WT and AT. In this way, the slope for WT and AT reduced from 0.67 to 0.54, indicating strong effect of the operation of TGR on the river thermal response. The response between Q and WT was also influenced by the TGR.

Distinguishing the Impact of Different Operating Phases at TGR
In order to make a fair comparison of three phases of the operation of TGR, we adjusted the time period for the above three phases (detailed information in Section 3.  Figure 4 shows the seasonal variations in WT for different operation phases of the TGR. The divergence between observed and simulated WT can help verify the remarkable thermal impact caused by the operation of TGR. For phase 1, a minor effect caused was identified by TGR inducing the seasonal variations of cooling and warming of WT (Figure 4a). The TGR acted as a cold source from March to June, while acting as a heating source from October to January of next year, which is in accordance with the conclusion in Section 4.2. In phases 2 and 3, the above cooling-heating process varies from season to season. Due to the difference between different operation phases of TGR, there was a difference in the extent of the cooling (light-blue shadow in Figure 4) and heating (pink shadow in Figure 4). For example, the extent of cooling and heating in phase 2 was stronger than that in phase 1, which was caused by the lower similarity between observed discharge and natural discharge series during phase 2 (higher correlation coefficient in Figure 4d). Although the similarity between the observed discharge and natural discharge series in phase 3 was greater, the extent of cooling and heating became stronger with the highest water level in the reservoir, which indicates that the increase in reservoir volume leads to an increase in thermal inertia. The obvious correlation of cooling-heating processes corresponding to different operation phases at the TGR would further verify the assumption that the overall effect of the human-induced interventions on WT can be mainly attributed to the TGR.

Conclusions
A Bayesian network-based approach was proposed to accurately quantify the impact of human-induced interventions or climate change over river water temperature (WT), which can help understand the underlying evolution mechanism of the river thermal regimes due to the construction of the dam or climate change. The high efficiency of the proposed Bayesian network-based (BNM) approach to exhibit the potential association between hydroclimatic variables is verified in this study. The developed BNM models made the best use of the capacity of graphical modelling based-network structures. According to comparison of the performance metrics of this BNM and two other commonly used data-driven models (ANFIS and SVR), the proposed BNM model reduced RMSE by 16.4% and raised the KGE and NSE by 15.5% on average, as compared with the other two models, which suggests that the BNM model outperformed two benchmark models for the data in Yichang station of the Yangtze River, China.
Based on the reconstruction of water temperature, ignoring the Three Gorges Reservoir (TGR) construction by the proposed BNM model at daily and seasonal scales, we can address the following conclusions: For seasonal scale, the presence of TGR has caused a 1.42 • C decrease in monthly mean WT on average during the spring-summer period and a 2.1 • C increase during the autumn-winter period. Monthly mean WT in winter showed the most obvious heating trend, with 2.50 • C increase accounting for 96.8 of total change ∆ TOT , while monthly mean WT in summer experienced most obvious cooling of 0.57 • C. From the annual scale, both the variations in AT (meteorological forces) and the TGR increased the annual mean WT value (∆ TGR accounting for 53% of ∆ TOT and ∆ CLI accounting for 47%).
For the daily scale, WT from October to January in Yichang showed a remarkable warming, which was caused by the high degree of TGR's thermal inertia and its increased release of warmer water than before. In the same way, WT from March to June showed a cooling tendency because of the hysteretic effect of WT response to increasing AT. The cooling effect of TGR on WT reached the strongest in April, while the strongest warming effect of TGR on WT happened in December.
According to the analyzed results of the seasonal variations in WT for different operation phases of the TGR, the extent of cooling and heating becomes stronger with the highest water level in the reservoir, which reflects that the thermal inertia of the reservoir is directly proportional to the increase in reservoir volume.
By quantitatively isolating the independent influence of the TGR, which is regarded as a human-induced intervention, from climate change on water temperature with the help of BNM model, this study provides a new perspective for the study of river ecosystems under the background of climate change and anthropological activities.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/w14081200/s1, Figure S1a-k: Typical network structure obtained for February to December, respectively. Notations for the variables are as follows: Water temperature (WT), discharge (Q). AT_1, AT_2, . . . , AT_30 represent the lag-1, lag-2, . . . , lag-30 for air temperature (AT) series. The nodes denoted with blue circle are selected as final potential predictors. WT is target variable while other variables are set as input variables; Table S1b: Control parameters for support vector machine (SVM) for pre-TGR period and post-TGR period; Table S1b: Control parameters for adaptive-network-based fuzzy inference system (ANFIS) for pre-TGR period and post-TGR period.

Informed Consent Statement: Not applicable.
Data Availability Statement: All data that support the findings of this study are available from the corresponding author upon reasonable request: (1) original inflow data and water and air temperature data from the selected gauges; (2) the corresponding codes of the Bayesian network-based model implemented in the R environment.