Analysis of Factors Influencing the Trophic State of Drinking Water Reservoirs in Taiwan

Abstract: Eutrophication is an environmental pollution problem that occurs in natural water bodies. Regression analyses with interaction terms are carried out to identify the factors influencing the Shimen, Mingde, and Fongshan Reservoirs in Taiwan. The results indicate that the main factor influencing these reservoirs is total phosphorus. In the Shimen and Mingde Reservoirs, the influence of total phosphorus, when interacting with other factors, on water quality trophic state is more serious than that of total phosphorus per se. This implies that the actual influence of total phosphorus on the eutrophic condition could be underestimated. Furthermore, there was no deterministic causality between climate and water quality variables. In addition, time lagged effects, or the influence of their interaction with other variables, were considered separately in this study to further determine the actual relationships between water trophic state and influencing factors. The influencing patterns for the three reservoirs are different, because the type, size, and background environment of each reservoir are different. This is as expected, since it is difficult to predict eutrophication in reservoirs with a universal index or equation. However, the multiple linear regression model used in this study could be a suitable quick-to-use, case-by-case model option for this problem.


Introduction
Constructing reservoirs is one of the most effective ways of storing water in Taiwan. Surface runoff may be caught during periods of high flow and provide water for people's livelihood, industry, and agriculture during periods of water shortage.
Eutrophication, the most challenging water pollution problem in water bodies, will eventually become an issue in many reservoirs [1]. Eutrophication negatively affects the water quality, safety, ecological integrity, and sustainability of global water resources [2][3][4]. It has long been believed that excessive phosphorus is the main reason for eutrophication [5]. However, population density, urbanization, and agricultural activities are also factors that influence the water quality of freshwater systems [6][7][8][9][10]. Since the 1940s, a substantial population increase, land-use intensification, and the use of agricultural fertilizers from developed countries [11], as well as the use of detergents containing phosphate compounds since the 1950s, have accelerated the eutrophication of waterbodies [12].
Eutrophication influences the water volume and quality in reservoirs. Regarding water volume, algae distributed on the water surface cause water hypoxia and a decrease in water transparency [13,14], leading to the substantial death of aquatic organisms, which are then deposited on the bottom of the reservoir and, in turn, reduce the reservoir's capacity over time. Regarding water quality, the proliferation of algae causes algal blooms and releases algal toxins, which influence water quality conditions such as dissolved oxygen, transparency, odor, and pH value, and also cause problems during the filtration of drinking water, increasing health risks to people [15,16]. Eutrophication also causes the ...

The Mingde Reservoir provides more water for consumption in Miaoli County, where there are more mountains and few fields (Figure 2). The Mingde Reservoir has a catchment area of 61 km², a surface area at full water level of 1.7 km², a total storage capacity of 17.7 million m³, and an effective storage capacity of 12.2 million m³. The irrigation area of the Mingde Reservoir is 13 km², and it provides 27,000 m³ of water daily [25].

The Fongshan Reservoir is an off-site reservoir located in Kaohsiung City. It provides Kaohsiung with a large additional water supply because the high population concentration and the rapid development of industry and commerce increase water consumption in the area (Figure 3). It has a catchment area of 2.75 km², a surface area at full water level of 0.75 km², a total storage capacity of 9.2 million m³, and an effective storage capacity of 8.5 million m³ [26]. The Fongshan Reservoir supplies 1.6 million tons of water daily, of which 350,000 tons (22%) are for industrial consumption [27].

Dataset
The Taiwan Environmental Protection Administration (EPA) changed reservoir water quality monitoring from quarterly monitoring in previous years to monthly monitoring from January 2017. In this study, we used monthly weather and water quality data from the Shimen, Mingde, and Fongshan Reservoirs from January 2017 to December 2019.
Weather data included daily statistics of rainfall (mm) and inflow (mm) in the catchment area from 2017 to 2019 and were downloaded from the disaster prevention information service website of the Water Resources Agency (WRA). The data collection period corresponds to that of the EPA water quality monitoring data from each reservoir. In addition, monthly water temperature (WT) data from 2017 to 2019 were collected from the national water quality monitoring information website of the EPA [28].
Water quality data were collected from the monthly statistics on the national water quality monitoring information website of the EPA and included chlorophyll a (Chl-a), dissolved oxygen (DO), transparency (SD), total phosphorus (TP), pH, conductivity, suspended solids (SS), chemical oxygen demand (COD), and ammonia nitrogen (AN) sampled from 2017 to 2019 [28].
In order to investigate whether the different seasons affect the degree of influence, March to May were designated as Season 1 (S1), June to August were designated as Season 2 (S2), September to November were designated as Season 3 (S3), and December to February were designated as Season 4 (S4). S4 was used as a control, and S1 to S3 were analyzed to identify the degree of influence that each variable has on water quality in the different seasons.
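The seasonal grouping above can be sketched in code as follows. This is a minimal illustration rather than the authors' implementation: the function names `season_of` and `season_dummies` are hypothetical, and coding S4 as the all-zero reference level is an assumption about how the control season enters the regression.

```python
def season_of(month):
    """Map a calendar month (1-12) to the season label used in the text."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    if month in (3, 4, 5):
        return "S1"
    if month in (6, 7, 8):
        return "S2"
    if month in (9, 10, 11):
        return "S3"
    return "S4"  # December to February


def season_dummies(month):
    """Return 0/1 indicators for S1-S3; S4 is the control (all zeros).

    Assumption: the control season is represented by omitting its dummy.
    """
    season = season_of(month)
    return {name: int(season == name) for name in ("S1", "S2", "S3")}
```

With this coding, the coefficient of each season dummy would be read as a shift relative to S4.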

Regression Analysis
We used multiple linear regression (MLR) for the regression analysis (Equation (1)). In order to identify how much each factor influences the eutrophication of the reservoirs, we used rapidly adjusting variables in regression models to analyze weather and water quality factors.

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i \tag{1}$$

where $Y_i$ is the i-th observation of the dependent variable, which represents the concentration of chlorophyll a in this study. The independent variables $X_{1i}$ to $X_{ki}$ are weather (rainfall, inflow, and water temperature) and water quality (DO, TP, etc.) factors. $\beta_0$ is the intercept term, and $\beta_1$ to $\beta_k$ are the slope terms, i.e., the unknown coefficients corresponding to the independent variables $X_{1i}$ to $X_{ki}$. $\varepsilon_i$ is a random error term.

Time-Lag
Time-lag variables are applied in this study because the current weather or water quality factors do not necessarily have an immediate influence on the Chl-a concentration. The lag might be one week, one month, or even longer, so it is not suitable to use only the current monitoring data in the regression analysis.
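As a minimal sketch of how lagged regressors might be built from a monthly series, the helper below shifts a list so that each position holds the value observed a given number of months earlier. The function name `lagged` and the TP readings are hypothetical and serve only as an illustration.

```python
def lagged(series, lag):
    """Shift a monthly series so position t holds the value from month t - lag.

    The first `lag` positions have no earlier observation and are set to None.
    """
    if lag < 0 or lag > len(series):
        raise ValueError("lag must be between 0 and len(series)")
    values = list(series)
    return [None] * lag + values[: len(values) - lag]


# Hypothetical monthly TP readings (mg/L), for illustration only.
tp = [0.02, 0.03, 0.05, 0.04]
tp_lag1 = lagged(tp, 1)  # value observed one month earlier
tp_lag2 = lagged(tp, 2)  # value observed two months earlier
```

Rows whose lagged value is `None` would have to be dropped before fitting, which shortens the usable sample by the lag length.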
Basic analysis (no lag), Lag 1, and Lag 2 data for the regression model were selected using the following steps. First, we identified which of the three types of data were significant. Second, if more than one of the three types was significant, we selected the data collected closest to the monitoring date, i.e., basic analysis data took precedence over Lag 1 data, which took precedence over Lag 2 data. Third, if none of the three types was significant, basic analysis data were selected as the representative term.
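The three selection steps can be expressed as a short rule. The sketch below is an illustration under stated assumptions: the function name `select_lag`, the dictionary keys, and the 0.05 significance threshold are hypothetical, since the text does not state the significance level used.

```python
def select_lag(p_values, alpha=0.05):
    """Pick which version of a variable enters the model.

    p_values maps "basic", "lag1", and "lag2" to the p-value of that version.
    alpha is an assumed significance threshold (not given in the text).
    """
    # Versions are checked from closest to farthest from the monitoring date,
    # so the first significant one wins.
    for name in ("basic", "lag1", "lag2"):
        if p_values[name] < alpha:
            return name
    # No version is significant: fall back to the basic (unlagged) data.
    return "basic"
```

For example, a variable that is significant only at Lag 1 and Lag 2 would be represented by its Lag 1 version.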

Interaction Terms
We also tested whether the influence of weather and water quality factors on the water trophic state is affected by their interaction. The traditional ordinary least squares (OLS) formula is shown in Equation (2):

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon \tag{2}$$

Regarding the interaction of factors, we need to consider whether factors are correlated. We tested the pair-by-pair interaction of factors that are theoretically related using an MLR equation and then analyzed whether the correlation was still significant after the interaction. If it was significant, the two factors were grouped to form an interaction term that was added to the equation, and the correlation was analyzed as shown in Equations (3) and (4), where $\beta_5 X_1 X_2$ and $\beta_6 X_3 X_4$ are the interaction terms added on the basis of an MLR.

$$Y_1 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_1 X_2 + \varepsilon \tag{3}$$

$$Y_2 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_6 X_3 X_4 + \varepsilon \tag{4}$$

If the individual analysis results of the interaction terms $\beta_5 X_1 X_2$ and $\beta_6 X_3 X_4$ in the regressions of $Y_1$ and $Y_2$ were both significant, then the two terms were analyzed in the same regression formula for $Y_3$, as shown in Equation (5).

$$Y_3 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_1 X_2 + \beta_6 X_3 X_4 + \varepsilon \tag{5}$$

If the regression results of $\beta_5 X_1 X_2$ and $\beta_6 X_3 X_4$ in $Y_3$ were both significantly correlated, the combination was retained, and the method was repeated on other interaction terms. At most, two interaction groups were included in the regression formula of each reservoir, and the factors in the two groups did not overlap with each other.
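To make the idea of an interaction term concrete, the sketch below fits a least-squares model in which the product of two factors is added as an extra regressor. It is a pure-Python illustration on synthetic, noise-free data, not the STATA procedure used in the study. The helper names (`solve_linear_system`, `fit_ols`, `design_row`) and all numbers are hypothetical, and the significance tests that drive the selection are not covered.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (perfect collinearity?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(x1, x2):
    """Intercept, two main effects, and their interaction x1 * x2."""
    return [1.0, x1, x2, x1 * x2]


# Synthetic data generated from known (made-up) coefficients.
true_beta = [1.0, 2.0, -3.0, 0.5]
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (1.0, 3.0)]
X = [design_row(a, b) for a, b in points]
y = [sum(bt * v for bt, v in zip(true_beta, row)) for row in X]

beta_hat = fit_ols(X, y)  # recovers true_beta up to rounding error
```

Solving the normal equations directly is adequate for a small illustration; statistical packages generally use more numerically stable factorizations.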
We used three conditions for selecting two groups of interaction terms with simultaneously significant correlations. First, we considered TP and AN, which represent the nutrient factors, in the two interaction groups. Second, we considered WT, rainfall, and inflow, which represent the weather factors; WT had priority over rainfall, and rainfall had priority over inflow. Third, as a final selection step, we considered the R² value obtained by applying each group to the regression formula.

Interrelationship of Interaction Terms
Equation (2) was simplified by reducing variables and adding an interaction term, as shown in Equation (6).

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon \tag{6}$$

If other variables are fixed, $X_1$ increases by 1 unit, and $X_2$ does not increase, the dependent variable becomes $Y_1$ (Equation (7)).

$$Y_1 = \beta_0 + \beta_1 (X_1 + 1) + \beta_2 X_2 + \beta_3 (X_1 + 1) X_2 + \varepsilon \tag{7}$$

Equation (7) minus Equation (6) provides Equation (8), which gives the amount $\Delta Y_1$ by which the dependent variable increases in this situation.

$$\Delta Y_1 = Y_1 - Y = \beta_1 + \beta_3 X_2 \tag{8}$$

Similarly, if other variables are fixed, $X_1$ does not increase, and $X_2$ increases by 1 unit, the dependent variable becomes $Y_2$ (Equation (9)). Equation (9) minus Equation (6) provides Equation (10), which gives the amount $\Delta Y_2$ by which the dependent variable increases.

$$Y_2 = \beta_0 + \beta_1 X_1 + \beta_2 (X_2 + 1) + \beta_3 X_1 (X_2 + 1) + \varepsilon \tag{9}$$

$$\Delta Y_2 = Y_2 - Y = \beta_2 + \beta_3 X_1 \tag{10}$$

When both $X_1$ and $X_2$ increase by 1 unit and other variables are fixed, Equation (11) is obtained. Equation (11) minus Equation (6) provides Equation (12), which gives the amount $\Delta Y_3$ by which the dependent variable increases.

$$Y_3 = \beta_0 + \beta_1 (X_1 + 1) + \beta_2 (X_2 + 1) + \beta_3 (X_1 + 1)(X_2 + 1) + \varepsilon \tag{11}$$

$$\Delta Y_3 = Y_3 - Y = \beta_1 + \beta_2 + \beta_3 + \beta_3 (X_1 + X_2) \tag{12}$$

Equations (8), (10), and (12) were integrated (as shown in Table 1), where $\Delta X_1$ and $\Delta X_2$ are the unit amounts by which the independent variables $X_1$ and $X_2$ are increased. Table 1 shows the amount $\Delta Y$ by which the dependent variable increases when $X_1$ and $X_2$ are increased by 1 unit separately or simultaneously.

Table 1. Interrelationship of interaction terms.

Model
In this study, Chl-a is set as the dependent variable, and the eleven weather and water quality factors, WT, DO, SD, TP, pH, conductivity, SS, COD, AN, rainfall, and inflow, are set as independent variables. STATA (version 13) is used in this study to perform correlation analysis and regression analysis.
The research model of this study is based on MLR. Based on the result of a Hausman test, we included a random effects model in the MLR model. This reduces the risk of ignoring differences between the data (which results in omitted variables and leads to estimation errors) and also reduces the occurrence of collinearity problems between variables.
The MLR model built in this study therefore includes the random effects model, performs a time-lag analysis, and adds specific interaction terms separately for each reservoir according to the differences in the reservoir data.

Descriptive Statistics
This study lists the descriptive statistics of the Shimen, Mingde, and Fongshan Reservoirs (Table 2). The minimums of TP, SS, COD, and AN were below the detection limit (ND). The maximum Chl-a, TP, conductivity, COD, and AN were recorded in the Fongshan Reservoir, the maximum SS was recorded in the Shimen Reservoir, and the maximum pH was recorded in the Mingde Reservoir. There were large variations in the daily rainfall and inflow data. It is possible that there was no rain on the day monitoring was conducted but heavy rainfall the next day; therefore, we do not discuss the maximum and minimum rainfall and inflow.
The minimum DO of the Fongshan Reservoir was about one-third of those of the Shimen and Mingde Reservoirs. However, its minimum TP concentration was still much larger than that of the Shimen and Mingde Reservoirs. The Fongshan Reservoir was the only reservoir with a pH of less than 7. The minimum conductivity of the Fongshan Reservoir was more than two times that of the Shimen and Mingde Reservoirs. The minimum SS of the Fongshan Reservoir was about two times that of the Mingde Reservoir.

Correlation Analysis
Results from the correlation analysis of the Shimen, Mingde, and Fongshan Reservoirs are shown in Tables 3 and 4. An absolute value of the correlation coefficient above 0.6 was regarded as a high correlation between the two factors. In the Shimen Reservoir, WT and pH were highly correlated with a correlation coefficient of 0.700 (Table 3).
In the Mingde Reservoir, the pairs of WT and conductivity, DO and pH, pH and conductivity, and rainfall and inflow were highly correlated with correlation coefficients of −0.644, 0.634, −0.649, and 0.826 respectively (Tables 3 and 4).
In the Fongshan Reservoir, the pairs of Chl-a and COD, DO and pH, SD and SS, TP and Conductivity, TP and AN, and conductivity and AN were highly correlated with correlation coefficients of 0.602, 0.700, −0.600, 0.854, 0.827, and 0.696, respectively (Tables 3 and 4).
The correlation coefficient of rainfall and inflow in the Mingde Reservoir, as well as the correlation coefficients of TP with conductivity and TP with AN in the Fongshan Reservoir, exceeded 0.8. When performing an MLR analysis, a high correlation coefficient between independent variables causes the problem of collinearity in the regression results. In other words, the lower the correlation between the independent variables, the better each coefficient reflects the relationship of its variable with the dependent variable.
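The screening for highly correlated pairs can be sketched as below. This is an illustration only: the function names `pearson` and `high_correlation_pairs` and the sample series are hypothetical, and the 0.6 threshold is taken from the criterion stated in this section.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)  # fails if a series is constant


def high_correlation_pairs(data, threshold=0.6):
    """List (name_a, name_b, r) for every pair with |r| above the threshold."""
    names = list(data)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(data[a], data[b])
            if abs(r) > threshold:
                pairs.append((a, b, r))
    return pairs


# Made-up monthly values, for illustration only.
sample = {
    "rainfall": [0.0, 10.0, 20.0, 30.0, 40.0],
    "inflow": [1.0, 12.0, 19.0, 33.0, 41.0],
    "WT": [25.0, 18.0, 30.0, 17.0, 26.0],
}
flagged = high_correlation_pairs(sample)
```

Pairs flagged this way are candidates for collinearity and would need care when they appear together as regressors.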

Analysis Result
Tables 5-7 show the regression analysis results after including time-lag and interaction terms for the Shimen, Mingde, and Fongshan Reservoir data, respectively. This study divides independent variables into "Basic" analysis without time lag, "Lag 1" data lagged by one month, and "Lag 2" data lagged by two months. Time lags arise, for example, between an increase in water temperature and the growth of algae, or from the flow time of rainfall runoff to the reservoir.

According to the three conditions listed in Section 2.3.3, 'WT × COD' and 'TP × AN' were selected as interaction terms for the Shimen Reservoir regression model; 'WT × pH' and 'DO × TP' were the interaction terms for the Mingde Reservoir regression model. Since there was no significant interaction term for the Fongshan Reservoir, we used the result of the time-lag analysis as the final analysis result. The coefficients and correlations of the factors involved in these interaction terms in the Shimen and Mingde Reservoirs should be considered together with the interaction terms.

For example, when using the interaction term 'TP × AN' for the Shimen Reservoir in Equation (6), the concentration of Chl-a is designated as the dependent variable $Y$, the concentration of TP is designated as the independent variable $X_1$, and the concentration of AN is designated as the independent variable $X_2$. $\beta_1$, $\beta_2$, and $\beta_3$ are the coefficients of TP, AN, and the 'TP × AN' interaction term, respectively.
Assuming the value of TP is 0.02 mg/L and AN is 0.03 mg/L, then 'TP × AN' is 0.0006 after multiplying the two. These values and the coefficients −26.874, −54.684, and 3817.198 are inserted into Equation (6). When TP and AN both increase by 1 unit, the concentration of TP becomes 1.02 mg/L and AN becomes 1.03 mg/L, and, thus, the value of 'TP × AN' becomes 1.0506. The resulting increase in Chl-a is calculated as 3926.5 mg/L. This is the amount by which Chl-a increases when other variables are fixed and TP and AN are both increased by 1 unit (Equation (14)).

$$(-26.874) + (-54.684) + 3817.198 + 3817.198 \times (0.02 + 0.03) = 3926.5 \tag{14}$$

These results show that when evaluating the water quality trophic state, time lags and the additive relationship of interactions between factors should also be taken into account to evaluate the potential for water eutrophication more accurately.
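The arithmetic in Equation (14) can be checked with a few lines of code. The sketch below evaluates the change in a model of the form Y = β0 + β1·X1 + β2·X2 + β3·X1·X2 directly from the before and after values of the regressors. The function name `delta_y` is hypothetical, while the coefficients and concentrations are the ones quoted in the text.

```python
def delta_y(b1, b2, b3, x1, x2, dx1, dx2):
    """Change in Y when x1 and x2 increase by dx1 and dx2.

    The intercept cancels in the difference, so it is not needed.
    """
    interaction_change = (x1 + dx1) * (x2 + dx2) - x1 * x2
    return b1 * dx1 + b2 * dx2 + b3 * interaction_change


# Shimen Reservoir coefficients for TP, AN, and 'TP x AN' quoted in the text.
B_TP, B_AN, B_TPxAN = -26.874, -54.684, 3817.198
TP, AN = 0.02, 0.03  # mg/L, the example values used in the text

both = delta_y(B_TP, B_AN, B_TPxAN, TP, AN, 1.0, 1.0)     # about 3926.5
tp_only = delta_y(B_TP, B_AN, B_TPxAN, TP, AN, 1.0, 0.0)  # b1 + b3 * AN
an_only = delta_y(B_TP, B_AN, B_TPxAN, TP, AN, 0.0, 1.0)  # b2 + b3 * TP
```

The change for a simultaneous increase exceeds the sum of the two separate changes by exactly β3, the extra contribution of the interaction term.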
Note that the final model for each reservoir is different. This reflects the fact that the eutrophication process in reservoirs involves a complex combination of many different factors and is highly case-specific. As shown in the introduction, influencing factors sometimes work in opposite directions in different cases from different studies. The result of this study further confirms that even reservoirs within Taiwan show very different patterns when it comes to the factors influencing the trophic state.

Standardization Coefficient
The regression coefficient of each factor was multiplied by the standard deviation of its data to form a 'standardized coefficient'. The basic amounts and units of each factor were different, so it is difficult to judge the effect on the dependent variable from the regression coefficient value of each factor alone. The relative importance of each factor can be compared by standardizing the coefficients, making it easier to intuitively understand the degree of influence of each independent variable on the dependent variable. Tables 8-10 show the standardization coefficients of the Shimen, Mingde, and Fongshan Reservoirs. These results show the degree of influence of each factor on Chl-a while taking into account time-lag and variable interactions.

At the Shimen Reservoir (Table 8), the influence of interactions must be considered when analyzing WT, TP, COD, and AN, so these variables will not be discussed separately. The other highly influential factors are conductivity (standardization coefficient = −1.024), followed by pH (standardization coefficient = −0.630), and lastly SS with a lag of one month (standardization coefficient = 0.209).

At the Mingde Reservoir (Table 9), the influence of interactions must be considered when analyzing WT, DO, TP, and pH, so these variables will not be discussed separately. The other factors with a high influence were inflow (standardization coefficient = 11.295), followed by rainfall (standardization coefficient = −11.166), and lastly SD with a lag of two months (standardization coefficient = 0.548).

Data from the Fongshan Reservoir (Table 10) showed that COD had the greatest influence on Chl-a (standardization coefficient = 21.247), followed by conductivity (standardization coefficient = −20.725) and lastly SS (standardization coefficient = −3.651).
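The standardization described at the start of this section (regression coefficient multiplied by the standard deviation of the factor's data) can be sketched as follows. The function name `standardized_coefficient` and the sample values are hypothetical. Using the sample standard deviation is an assumption, since the text does not say whether the sample or population form was used.

```python
import statistics


def standardized_coefficient(coef, values):
    """Scale a regression coefficient by the spread of its factor.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return coef * statistics.stdev(values)


# Made-up example: a coefficient of 2.0 for a factor observed as 1..5.
example = standardized_coefficient(2.0, [1.0, 2.0, 3.0, 4.0, 5.0])
```

The result expresses the expected change in Chl-a for a one-standard-deviation change in the factor, which puts factors measured in different units on a comparable scale.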
To summarize, at the Shimen Reservoir, Chl-a was significantly and immediately affected by WT, pH, conductivity, COD, and AN, and significantly, but not immediately, affected by SS and rainfall. DO, SD, TP, and inflow did not significantly affect Chl-a at the Shimen Reservoir. Chl-a at the Mingde Reservoir was significantly and immediately affected by WT, DO, TP, pH, COD, AN, rainfall, and inflow, while the effect of SD was significant but not immediate. Conductivity and SS did not significantly affect Chl-a at the Mingde Reservoir. WT, SD, conductivity, SS, and COD significantly and immediately affected Chl-a at the Fongshan Reservoir; TP and pH also significantly affected Chl-a, but not immediately; the effects of DO, AN, rainfall, and inflow were not significant.

Table 11 summarizes the correlation results of each weather and water quality factor. The correlation of WT in the Shimen and Mingde Reservoirs needs to be considered within an interaction and is significantly positive in the Fongshan Reservoir. The correlation of DO in the Mingde Reservoir needs to be considered within an interaction, and it is not significantly correlated in the Shimen and Fongshan Reservoirs. The correlation of SD is significantly positive in the Mingde Reservoir and significantly negative in the Fongshan Reservoir, but it is not significantly correlated in the Shimen Reservoir. The correlation of TP in the Shimen and Mingde Reservoirs needs to be considered within an interaction and is significantly positive in the Fongshan Reservoir. The correlation of pH in the Mingde Reservoir needs to be considered within an interaction and is significantly negative in both the Shimen and Fongshan Reservoirs. The correlations of conductivity in the Shimen and Fongshan Reservoirs are significantly negative but not significant in the Mingde Reservoir.

The correlation of SS is significantly positive in the Shimen Reservoir and significantly negative in the Fongshan Reservoir, but there is no significant correlation in the Mingde Reservoir. The correlation of COD in the Shimen Reservoir needs to be considered within an interaction, but it is significantly positive in both the Mingde and Fongshan Reservoirs. The correlation of AN in the Shimen Reservoir needs to be considered within an interaction, and it is significantly positive in the Mingde Reservoir but not significantly correlated in the Fongshan Reservoir. The correlation of rainfall is significantly positive in the Shimen Reservoir and significantly negative in the Mingde Reservoir, but there is no significant correlation in the Fongshan Reservoir. The correlation of inflow is significantly positive in the Mingde Reservoir, but there is no significant correlation in the Shimen and Fongshan Reservoirs.

Table 11. Correlation of factors from the Shimen, Mingde, and Fongshan Reservoirs.

Columns: Shimen, Mingde, Fongshan.
Notes: The correlation of factors needs to take into account interaction relationships. X: Factors were not significantly correlated. +: Factors have a significant positive correlation. −: Factors have a significant negative correlation. N: The variable is not relevant for the reservoir.
There was no deterministic causality between climate and water quality variables. For example, the pH in the Fongshan Reservoir is negatively correlated with Chl-a, but Zang (2011) shows that Chl-a is positively correlated with pH and DO [29]. Similarly, Blumberg (1990) finds that WT is negatively correlated with DO [30], but Chen (2007) shows that WT is positively correlated with DO [31]. In another case, Watson (2016) shows that Chl-a is negatively correlated with DO [32], but Zang (2011) shows a positive correlation [29].
The interaction combinations for the Shimen Reservoir are 'WT × COD' and 'TP × AN', and for the Mingde Reservoir, they are 'WT × pH' and 'DO × TP'. There were no significant interaction terms for the Fongshan Reservoir. Note that temperature and total phosphorus are the only two factors that have a positive influence in all three reservoirs. However, the effect of temperature interacts negatively with COD in Shimen and with pH in Mingde, making the effect of temperature on the trophic state smaller than expected. On the other hand, the interaction terms related to total phosphorus in Shimen and Mingde magnify its effect. This result further supports the conclusion that total phosphorus is the main factor for the trophic state.
To summarize, these results indicate that the factors influencing the trophic state in reservoirs differ from case to case; thus, it is difficult to find a one-size-fits-all equation that is suitable in all cases.

Conclusions
The main factor influencing the three reservoirs is total phosphorus. At the Shimen and Mingde Reservoirs, in particular, the interactive effect of TP with other factors on the water quality trophic state was greater than that of TP alone, indicating that more attention should be paid to the interaction effect between the influencing factors. However, no significant interaction effect between weather and water quality factors was found to further aggravate the trophic state. In the case of these three reservoirs in Taiwan, an additional deterioration of eutrophication from the climate-change-related interaction effect is not a concern.
The analysis of characteristics influenced by time lags and the analysis of the interactions between factors provide a deeper understanding of the correlation between each factor and the degree to which they influence the water quality trophic state. Furthermore, the length of the time lag and the significant combinations of influencing factors vary from reservoir to reservoir, indicating that the patterns of eutrophication might differ according to different reservoir conditions. These results imply that factors influencing the trophic state in a reservoir might vary by reservoir type, geological and meteorological conditions, as well as other potential factors. In other words, forming a model that describes the trophic state of a reservoir is highly case-specific. The perfect solution of a one-size-fits-all model might not exist. Researchers should carefully review all possible factors before finalizing a model.
In this study, the R² values of the MLR models developed for the three reservoirs were all above 0.5, indicating that the regression model for each reservoir explains more than half of the variation in the water quality trophic state. The results indicate that the regression model developed during this study and the methods used are both feasible for assessing the water quality trophic state.