Assessment of Reservoir Water Quality Using Multivariate Statistical Techniques: a Case Study of Qiandao Lake, China

Qiandao Lake (Xin'an Jiang reservoir) plays a significant role in drinking water supply for eastern China, and it is an attractive tourist destination. Three multivariate statistical methods were comprehensively applied to assess the spatial and temporal variations in water quality as well as potential pollution sources in Qiandao Lake. Data sets of nine parameters from 12 monitoring sites during 2010–2013 were obtained for analysis. Cluster analysis (CA) was applied to classify the 12 sampling sites into three groups (Groups A, B and C) and the 12 monitoring months into two clusters (April-July, and the remaining months). Discriminant analysis (DA) identified Secchi disc depth, dissolved oxygen, permanganate index and total phosphorus as the significant variables for distinguishing variations of different years, with 79.9% correct assignments. Dissolved oxygen, pH and chlorophyll-a were determined to discriminate between the two sampling periods classified by CA, with 87.8% correct assignments. For spatial variation, DA identified Secchi disc depth and ammonia nitrogen as the significant discriminating parameters, with 81.6% correct assignments. Principal component analysis (PCA) identified organic pollution, nutrient pollution, domestic sewage, and agricultural and surface runoff as the primary pollution sources, explaining 84.58%, 81.61% and 78.68% of the total variance in Groups A, B and C, respectively. These results demonstrate the effectiveness of integrated use of CA, DA and PCA for reservoir water quality evaluation and could assist managers in improving water resources management.


Introduction
Owing to serious river pollution, reservoirs have become the most important drinking water resource in Zhejiang Province, one of the most developed regions in China [1]. Although various water protection regulations have been in force for years, reservoirs are under increasing pressure from economic development [2]. Pollutants from industrial production, agricultural practices and domestic activities find their way into reservoirs, impacting the aquatic environment and leading to water quality deterioration [2,3]. Monitoring networks for reservoir water quality have been widely developed in this area as a critical means of managing water resources effectively [4]. However, continuous monitoring programs produce large numbers of records covering many parameters with obscure and complex interrelationships, and it is no easy task to organize, interpret, and extract meaningful information from these data [5,6]. It is therefore important to extract useful knowledge from the monitoring records using practicable methods that can analyze the spatio-temporal trends in water quality and characterize the factors affecting it [5,7].
Qiandao Lake (Xin'an Jiang reservoir) plays an important role in drinking water supply and sustainable development in eastern China [2]. To provide meaningful suggestions to planners and managers for better water resource protection, it is therefore essential to characterize the variation patterns of water quality in Qiandao Lake and its potential pollution sources. Reports on Qiandao Lake have addressed biological community structure [8,9], chlorophyll-a concentration [10,11], ecological assessment [2,8], and eutrophication condition [12], but no comprehensive water quality evaluation has been carried out using advanced mathematical methods. In this study, we collected data from 12 monitoring sites over four years (2010–2013) for analysis.
Numerous studies of water quality in a variety of water bodies have been carried out, and various approaches have been developed, such as the water quality index (WQI) [13,14], multivariate statistical methods [15,16], artificial neural networks (ANNs) [17], and the multivariate relevance vector machine (MVRVM) [18–20]. Multivariate statistical methods, including discriminant analysis (DA), cluster analysis (CA), and principal component analysis/factor analysis (PCA/FA), have served as robust tools for processing and interpreting complex data in environmental studies [21,22]. These methods can be applied in water studies to gain in-depth knowledge of water quality conditions, and they can be used as practical tools for water resource management [6]. According to previous studies, the integrated use of several multivariate approaches is preferable because the results can be mutually verified [3].
The present study aims to characterize the spatial and temporal trends in water quality and the latent pollution sources in Qiandao Lake. To this end, the water quality data sets were explored using three multivariate methods: CA was used to divide the sampling sites and months into groups according to their spatial and temporal similarities; DA was applied to identify the significant variables for assessing the variation trends in water quality; and PCA was finally applied to detect the unidentified, latent pollution sources in Qiandao Lake.

Study Area
Qiandao Lake (Xin'an Jiang reservoir) is the largest artificial freshwater lake and a renowned place of interest in China, and it is located in Hangzhou City, Zhejiang Province. It was built in the 1950s and has served a variety of purposes, including power generation, flood control, tourism, drinking water supply, industrial usage and irrigation. The watershed of Qiandao Lake covers an area of 4402.4 km², 60% of which is located in Anhui Province. The water surface area is 574 km² at high water (108 m), and the corresponding storage capacity is 17.8 billion m³. The water resources in Qiandao Lake account for 30% of the volume of the Qiantang River. Qiandao Lake is of great significance in maintaining the eco-environmental health and water function of the Qiantang River basin [23]. Figure 1 presents the location of Qiandao Lake in Zhejiang Province and in China.

Data
Monthly data of nine water quality parameters (pH, Secchi disc depth, chlorophyll-a, ammonia nitrogen, total nitrogen, total phosphorus, permanganate index, dissolved oxygen, and fecal coliform) from 12 monitoring sites in Qiandao Lake covering four years (2010–2013) were obtained from the Zhejiang Institute of Environmental Science Research and Design. Secchi disc values were measured in situ, and the other parameters were measured in the laboratory of the Chun'an Environmental Monitoring Stations in accordance with the standard methods for observation and analysis in China [24]. Water samples were collected at 0.5 m below the water surface, at an intermediate point, and at 0.5 m above the bottom at each site, and the average values were used for analysis in this study. Table 1 shows the selected water quality parameters and their measurement methods. Statistical calculations were performed using IBM SPSS Statistics 20. The locations of the monitoring stations in Qiandao Lake are displayed in Figure 1.

Cluster Analysis (CA)
CA is an unsupervised pattern recognition approach that classifies objects into categories according to their nearness or similarity [25]. The resulting categories show strong internal homogeneity and strong external heterogeneity [26]. CA can use a wide variety of algorithms, among which the hierarchical agglomerative technique is the most widely used: a data set is classified sequentially by selecting the most similar pairs and building up categories in a stepwise way [27]. The results of hierarchical agglomerative CA can be presented in a dendrogram that visually describes the clustering course [27,28]. We applied hierarchical agglomerative CA to the normalized data to group monitoring sites and sampling months, using Ward's method with squared Euclidean distances as the measure of similarity [29]. The hierarchical agglomerative CA was performed using IBM SPSS Statistics 20.
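The clustering procedure described above can be illustrated with a minimal pure-Python sketch. It is not the SPSS routine used in the study: the function names and the site values are hypothetical, and the code only shows the idea of z-score normalization followed by Ward merging, where the two clusters whose union least increases the within-cluster sum of squares are joined at each step.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]

def ward_clusters(rows, n_clusters):
    """Agglomerative clustering with Ward's criterion on squared Euclidean distance."""
    clusters = [{"members": [i], "centroid": list(r)} for i, r in enumerate(rows)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca, cb = clusters[a], clusters[b]
                na, nb = len(ca["members"]), len(cb["members"])
                sq = sum((x - y) ** 2 for x, y in zip(ca["centroid"], cb["centroid"]))
                cost = na * nb / (na + nb) * sq  # increase in within-cluster SS
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        ca, cb = clusters[a], clusters[b]
        na, nb = len(ca["members"]), len(cb["members"])
        merged = {
            "members": sorted(ca["members"] + cb["members"]),
            "centroid": [(na * x + nb * y) / (na + nb)
                         for x, y in zip(ca["centroid"], cb["centroid"])],
        }
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return sorted(c["members"] for c in clusters)

# Hypothetical site means (rows = sites, columns = two parameters); not study data.
sites = [[5.0, 0.10], [5.2, 0.12], [3.1, 0.30], [3.0, 0.28], [1.0, 0.90]]
groups = ward_clusters(zscore_columns(sites), n_clusters=3)
print(groups)
```

Cutting the merge sequence at a chosen number of clusters corresponds to cutting the dendrogram at a given height; a full dendrogram would additionally record the merge cost at every step.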

Discriminant Analysis (DA)
DA is a well-known statistical classification technique developed to identify the parameters that distinguish between two or more categories [25]. DA assigns cases to the categories of a dependent variable by constructing a discriminant function for each group. Discriminant functions (DFs) determine the boundaries in predictor space between the classes [21]. DA constructs DFs using the standard, forward-stepwise and backward-stepwise modes. The standard mode uses all of the variables for DF construction, whereas the stepwise procedures determine which variables should be included in the model [15]. Forward-stepwise DA constructs DFs by adding variables one by one, beginning with the most significant parameter and ending with the least significant one. The procedure is the opposite in the backward-stepwise mode: starting with the least significant one, variables are removed one by one until no significant changes emerge [3,21]. DA was performed on the raw measurement data without standardization, and DFs were constructed using the three modes [30,31].
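The forward-stepwise idea can be sketched in pure Python using Wilks' lambda (the ratio of within-group to total scatter determinants) as the selection statistic. This is an illustrative assumption rather than the SPSS procedure used in the study: the stopping rule `min_gain`, the function names and the sample rows are all hypothetical, and SPSS normally decides entry with an F test rather than a fixed drop in lambda.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, d = len(a), 1.0
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(a[r][i]))
        if abs(a[piv][i]) < 1e-12:
            return 0.0
        if piv != i:
            a[i], a[piv] = a[piv], a[i]
            d = -d
        d *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return d

def scatter(rows, cols):
    """Sum-of-squares-and-cross-products matrix about the mean for chosen columns."""
    n = len(rows)
    mu = [sum(r[c] for r in rows) / n for c in cols]
    return [[sum((r[ci] - mi) * (r[cj] - mj) for r in rows)
             for cj, mj in zip(cols, mu)] for ci, mi in zip(cols, mu)]

def wilks_lambda(rows, labels, cols):
    """Lambda = det(W) / det(T); values near 0 mean strong group separation."""
    total = scatter(rows, cols)
    within = [[0.0] * len(cols) for _ in cols]
    for g in set(labels):
        sg = scatter([r for r, l in zip(rows, labels) if l == g], cols)
        within = [[w + s for w, s in zip(wr, sr)] for wr, sr in zip(within, sg)]
    return det(within) / det(total)

def forward_stepwise(rows, labels, min_gain=0.05):
    """Add the variable that lowers lambda most; stop when the drop is below min_gain."""
    chosen, current = [], 1.0
    remaining = list(range(len(rows[0])))
    while remaining:
        lam, best = min((wilks_lambda(rows, labels, chosen + [c]), c) for c in remaining)
        if current - lam < min_gain:  # ASSUMPTION: simplified stopping rule
            break
        chosen.append(best)
        remaining.remove(best)
        current = lam
    return chosen, current

# Hypothetical measurements for two groups and three variables; not study data.
rows = [[1.0, 5.0, 2.0], [1.2, 4.0, 2.1], [0.9, 6.0, 1.9], [1.1, 5.0, 2.2],
        [3.0, 5.0, 2.0], [3.1, 6.0, 2.3], [2.9, 4.0, 2.1], [3.2, 5.0, 2.2]]
labels = ["A"] * 4 + ["B"] * 4
chosen, lam = forward_stepwise(rows, labels)
print(chosen, round(lam, 4))
```

A backward-stepwise variant would start from all variables and drop, at each step, the one whose removal raises lambda the least.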

Principal Component Analysis (PCA)
PCA is a widely used technique that transforms an original data set into a smaller set of uncorrelated factors known as principal components (PCs). It uses the eigenvalues and eigenvectors of the covariance matrix to generate PCs by multiplying the original correlated variables by the eigenvectors [25,32]. The PCs allow for data reduction by extracting the most important variables to interpret the original data sets with a minimal loss of information [33].
A new set of variables, called varifactors (VFs), is then constructed by rotating the axes obtained from PCA [21]. The VFs can be used to identify unobservable, hypothetical and latent variables [7,33]. PCA was performed on three groups of normalized data sets (the three sampling regions delineated by CA) to generate VFs for each group and to further identify the most significant parameters in the different regions [34,35].
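As a rough illustration of the PCA step, the sketch below builds the correlation matrix of normalized data, extracts eigenvalues and eigenvectors with a simple Jacobi rotation scheme, and keeps the components with eigenvalues greater than 1. It is an assumption-laden toy, not the SPSS analysis of the study: the function names and sample rows are hypothetical, and the axis rotation that turns PCs into VFs is omitted.

```python
import math

def correlation_matrix(rows):
    """Correlation matrix, i.e. the covariance matrix of z-scored columns."""
    n, p = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(p)]
    sd = [math.sqrt(sum((r[j] - mu[j]) ** 2 for r in rows) / n) for j in range(p)]
    z = [[(r[j] - mu[j]) / sd[j] for j in range(p)] for r in rows]
    return [[sum(zr[i] * zr[j] for zr in z) / n for j in range(p)] for i in range(p)]

def jacobi_eigen(sym, sweeps=50, tol=1e-22):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric matrix."""
    n = len(sym)
    a = [row[:] for row in sym]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q of a and v
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # rotate rows p and q of a
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return [a[i][i] for i in range(n)], v

def pca_kaiser(rows):
    """Explained variance (%) and unrotated loadings of components with eigenvalue > 1."""
    values, vectors = jacobi_eigen(correlation_matrix(rows))
    p = len(values)
    kept = [i for i in sorted(range(p), key=lambda i: -values[i]) if values[i] > 1.0]
    explained = [100.0 * values[i] / p for i in kept]
    loadings = [[vectors[r][i] * math.sqrt(values[i]) for i in kept] for r in range(p)]
    return explained, loadings

# Hypothetical samples (rows) for three parameters (columns); not study data.
samples = [[2.1, 0.8, 7.9], [2.5, 1.0, 7.7], [3.0, 1.3, 8.1],
           [3.4, 1.2, 7.8], [2.8, 1.1, 8.0], [3.9, 1.6, 7.6]]
explained, loadings = pca_kaiser(samples)
print([round(e, 2) for e in explained])
```

Because the data are standardized, the eigenvalues sum to the number of variables, so each eigenvalue divided by that number gives the share of total variance explained by its component.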

Water Quality Classification Guideline
The water quality was analyzed according to the National Environmental Quality Standards for Surface Water (GB 3838-2002) [36]. This standard classifies water quality into five levels, Categories I to V. Category I is the highest standard, set to protect water sources and national nature conservation areas. Category II applies to key protection areas for drinking water sources, habitats for rare aquatic organisms, aquaculture, etc. Category III applies to secondary protection zones for drinking water supplies, fishery areas, and swimming regions. Category IV covers water bodies that are suitable for ordinary industrial use and for recreational purposes that do not involve direct contact with the human body. Category V covers water used for irrigation and for ordinary landscape purposes.
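Assigning a measured concentration to one of the five categories amounts to a threshold lookup, which can be sketched as follows. The limit values below are an assumption: they are total nitrogen limits for lakes and reservoirs as commonly quoted from GB 3838-2002, they do not appear in this text, and they should be checked against the standard before use. The function name is hypothetical.

```python
from bisect import bisect_left

CATEGORIES = ["I", "II", "III", "IV", "V"]

# ASSUMPTION: upper limits for total nitrogen in mg/L (lakes and reservoirs),
# quoted from memory of GB 3838-2002; verify against the standard itself.
TN_LIMITS = [0.2, 0.5, 1.0, 1.5, 2.0]

def classify(value, limits):
    """Return the best category whose upper limit is not exceeded by value."""
    idx = bisect_left(limits, value)
    return CATEGORIES[idx] if idx < len(CATEGORIES) else "worse than V"

print(classify(0.4, TN_LIMITS), classify(1.2, TN_LIMITS))
```

This lookup suits parameters for which lower values are better; a parameter such as dissolved oxygen, where higher values are better, would need the comparison reversed.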

Descriptive Statistics of Data
The statistical characteristics of the data sets for the nine parameters are displayed in Table 2. With reference to the quality standards for surface water (GB 3838-2002), the mean values of TP and FC fell within Category II, the mean concentration of TN fell within Category III, and the remaining parameters all met Category I. The maximum concentrations of CODMn and TP exceeded Category I, and the maximum FC concentration exceeded Category II. Notably, the minimum and maximum values of TN corresponded to Categories II and IV of the standard, respectively. Chl-a and SD are not listed in the standard (GB 3838-2002), but they are important indicators of water quality and have been widely used in previous investigations [37,38].

Cluster Analysis
The 12 sampling sites were clustered by CA into Groups A, B and C (Figure 2). Group A included four sites (sites 9, 5, 6 and 8), Group B included six sites (sites 4, 10, 3, 7, 11 and 12), and Group C consisted of two sites (sites 1 and 2). As shown in Figure 1, the sites in Group A were situated near the center and the outflow of Qiandao Lake, corresponding to low-pollution regions. The sites in Group B were distributed from the northeast to the southwest of the lake, corresponding to moderate-pollution regions. The Group C sites were located in areas at high risk of pollution, where they received pollutants mostly from the inflow of the Xin'an Jiang River. Site 1 presented the highest average values of most parameters, especially Chl-a, TN, TP and NH3-N. Overall, water quality tended to improve from the inflow to the outflow of the lake.

Temporal CA clustered the 12 sampling months into two groups, as presented by the dendrogram (Figure 3). Cluster A consisted of eight months (January, February, March, August, September, October, November, and December), corresponding approximately to the drought period in the study area (September to February). Cluster B consisted of four months (April, May, June, and July), roughly corresponding to the wet period (March to August). Thus, the 12 months were classified into two clusters according to the actual water quality rather than the drought/wet period. This partition of sampling periods is more practical because the water quality variation was not strictly consistent with the natural seasons or the drought/wet period.

Discriminant Analysis
To characterize the temporal pattern of water quality among the four years, DA was applied to identify the most significant discriminant parameters according to the differences between the years. The values of Wilks' lambda, chi-square and p-level for each discriminant function are shown in Table 3: Wilks' lambda varied between 0.091 and 0.943, chi-square varied between 5.7 and 118.3, and the p-levels were below 0.01, indicating that the DA performance was statistically meaningful. The three DA modes were performed on the raw data, and the resulting discriminant functions (DFs) are shown in Table 4. The DFs constructed by the standard mode, which included all nine independent variables, produced classification matrices (CMs) that correctly assigned 87.5% of the cases. The coefficients for FC were zero. The DFs obtained through the forward-stepwise mode, which selected SD, DO, CODMn, TN and TP as the discriminant parameters, generated CMs that correctly assigned 81.4% of the cases. The DFs constructed by the backward-stepwise mode provided CMs with 79.9% correct assignments using SD, DO, CODMn and TP as the discriminant variables. Therefore, SD, DO, CODMn and TP were the most significant parameters for explaining the temporal variations in water quality during the study years.
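The text does not state how the chi-square values were derived from Wilks' lambda. A common choice, assumed here purely for orientation, is Bartlett's approximation, in which N is the number of cases, p the number of discriminating variables, g the number of groups, the lambda_i are the eigenvalues of the s = min(p, g − 1) discriminant functions, and k is the number of functions already removed:

```latex
\Lambda_k = \prod_{i=k+1}^{s} \frac{1}{1+\lambda_i}, \qquad
\chi^2_k = -\left[ N - 1 - \frac{p+g}{2} \right] \ln \Lambda_k, \qquad
\mathrm{df}_k = (p-k)(g-k-1)
```

Under this approximation a lambda close to 0 gives a large chi-square and therefore strong evidence of group separation, whereas a lambda close to 1 gives a chi-square near zero.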

Figure 4 shows the temporal trends of the four discriminant variables identified by DA. The variation in SD was larger in 2012 than in the other three years, and no significant sudden changes were detected. The concentration of DO was higher in 2011, with a smaller variation than in 2010, and it was highest in 2012. The concentration of CODMn was highest in 2011 and then declined from 2011 to 2013. For TP, the variation was smallest in 2011, followed by 2010 and 2012, and it was largest in 2013. According to the tendencies of the four parameters from 2010 to 2013, the water quality in 2013 was somewhat better than in 2010.
Temporal DA was applied to the two clusters (Cluster A and Cluster B) obtained through CA to identify the most significant discriminant water quality parameters between the sampling periods. The values of Wilks' lambda and chi-square and the p-levels shown in Table 5 indicate that the DA performance was statistically meaningful. The DFs obtained from the different DA modes are shown in Table 6. The DFs from the standard and forward-stepwise modes, comprising nine and five parameters, respectively, produced CMs with total correct assignments close to 90% (88.5% and 88.2%, respectively). The DFs from the backward-stepwise mode yielded a similar result, correctly assigning 87.8% of the cases using three variables: pH, DO and Chl-a. Thus, pH, DO and Chl-a were the most significant parameters accounting for the variations in water quality between the two sampling periods.
Box plots of pH, DO and Chl-a show the temporal trends between the sampling periods (Figure 5). The pH value was significantly higher in Cluster B than in Cluster A, and some dispersed points were observed, indicating very high or very low values. The concentration of DO was lower in Cluster B, with a larger variation than in Cluster A. DO and Chl-a showed opposite patterns: compared with Cluster A, the Chl-a values in Cluster B were higher and more dispersed. Additionally, some points with very high concentrations were found in Cluster B. The temporal trends of the significant parameters suggest that the water quality in Qiandao Lake was poorer in the wet period (Cluster B) than in the drought period (Cluster A), which can be attributed to increased agricultural activity during the wet period and the higher river contribution leading to washout of agricultural soils [27].
Spatial DA was applied to characterize the spatial variations in water quality between the three groups of sampling sites delineated by CA. Table 7 shows the values of Wilks' lambda, chi-square and p-level for each discriminant function: Wilks' lambda varied between 0.080 and 0.660, and chi-square varied between 18.5 and 103.8. The DA results were significant, as indicated by p-levels below 0.01. The three DA modes were again performed to obtain the DFs, as shown in Table 8. The DFs produced by the standard and forward-stepwise modes, including nine and six discriminant variables, correctly assigned 85.1% and 83.3% of the cases, respectively. The DFs constructed by the backward-stepwise mode produced CMs with 81.6% correct assignments using SD and NH3-N, suggesting that these were the significant discriminating parameters.
Box plots of SD and NH3-N show the different spatial patterns of water quality in Qiandao Lake (Figure 6). SD was highest at the Group A sites, followed by Group B, and lowest at the Group C sites. SD and NH3-N showed clearly opposite patterns: the concentration of NH3-N was highest at the Group C sites and lowest at the Group A sites. The spatial pattern in the water quality of Qiandao Lake exhibited a trend of improvement from the northwest (inflow) to the southeast (outflow) of the lake. Among all of the sampling sites, the Group C sites were located furthest upstream, and their water quality was mostly affected by inflow from the Xin'an Jiang River, which received substantial amounts of domestic sewage, agricultural pollution, and industrial wastewater discharge from the upstream contributing area.
Sustainability 2016, 8, 243

Identification of Potential Pollution Sources
After normalization of the data sets for the three groups of sampling sites (as obtained through CA), PCA was applied to assess the compositional patterns between variables and to detect the sources influencing water quality in the different regions. PCA produced four PCs for Groups A and B and three PCs for Group C with eigenvalues greater than 1, explaining 84.58%, 81.61% and 78.68% of the total variance in Groups A, B and C, respectively. Table 9 shows the parameter loadings and explained variance. Figure 7 shows the scores of the sampling sites for the first two principal components. The variable loadings were classified as strong, moderate and weak, corresponding to absolute loading values of >0.75, 0.75–0.50 and 0.50–0.30, respectively [39].
For Group A, VF1 accounted for 29.43% of the variance, and it had strongly positive loadings on CODMn and Chl-a and a moderately positive loading on NH3-N. CODMn is an indicator of organic pollution, and it could be attributed to uncontrolled domestic discharges [3]. Chl-a is an index of phytoplankton abundance, and it likely represented the impacts of anthropogenic activities, including fish culture and agricultural runoff [40]. NH3-N can originate from various sources, including municipal waste, soil erosion and fertilizer applications [3,41]. Therefore, VF1 represented the contribution of nutrient pollution from sources such as domestic wastewater and agricultural runoff. VF2, accounting for 24.50% of the variance, displayed a strongly positive loading on DO and a strongly negative loading on SD, and it can be ascribed to biochemical pollution [31]. VF3, which explained 17.54% of the variance, showed a strongly positive loading on FC and a moderately positive loading on TN. This factor can be ascribed to pollution from domestic discharge and from agricultural and surface runoff [31,42]. VF4 explained the lowest variance (13.11%) and had a strongly positive loading on pH, representing the variability in physico-chemical aspects [21,32].
Among the four significant VFs of Group B, VF1 accounted for 30.46% of the total variance, and it was weighted on NH3-N, SD, TP, FC and TN. VF1 identified nutrient pollution from anthropogenic activities such as domestic sewage and agricultural non-point source pollution [31,43]. VF2 represented 22.65% of the variance and had a strongly positive loading on Chl-a. It may be ascribed to organic pollution from domestic wastewater. VF3 constituted 15.20% of the variance, and it had a strongly negative loading on CODMn and a moderately positive loading on DO. This factor can result from biochemical pollution [42]. VF4 explained 13.31% of the variance and was marked only by pH, representing the physico-chemical source of variability.
The sources in Group C were similar to those in Group B. VF1 explained 38.28% of the variance, and it was weighted on TN, TP and NH3-N. Agricultural non-point source pollution may explain this factor [27,43]. VF2, accounting for 25.25% of the variance, had strongly positive loadings on Chl-a and FC and a moderately positive loading on SD, likely representing organic pollution from domestic wastewater [31]. As with VF3 of Group B, VF3 (15.15% of the total variance) had a strongly positive loading on DO and a moderately negative loading on CODMn, and it can be attributed to biochemical pollution [42].
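The loading classification used to interpret the VFs can be expressed as a small helper. The thresholds follow the text (strong >0.75, moderate 0.75–0.50, weak 0.50–0.30); the handling of values exactly on a boundary, the "negligible" label for loadings below 0.30, and the function names are assumptions made only for this sketch.

```python
def loading_strength(loading):
    """Label a loading by its absolute value, following the thresholds in the text."""
    size = abs(loading)
    if size > 0.75:
        return "strong"
    if size >= 0.50:
        return "moderate"
    if size >= 0.30:
        return "weak"
    return "negligible"  # ASSUMPTION: label for loadings below 0.30

def describe(loading):
    """Combine strength and sign, e.g. 'strong positive'."""
    sign = "positive" if loading >= 0 else "negative"
    return f"{loading_strength(loading)} {sign}"

# Hypothetical loadings for one varifactor; not values from Table 9.
example = {"CODMn": 0.82, "DO": -0.60, "pH": 0.31, "TP": 0.10}
for name, value in example.items():
    print(name, describe(value))
```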

Water Quality Variations within the Lake
Pollutant transport from the river to the water reservoir is strongly controlled by hydrology.We roughly divided Qiandao Lake into three parts based on the water flow direction, which was indicated using three flow paths constituted by the sampling sites (Figure 8), to analyze the water quality variations within the lake.The three paths are the central path (site 1-site 2-site 3-site 5-site 6-site 8-site 9), the southwest path (site 12-site 11-site 5-site 6-site 8-site 9) and the northeast path (site 10-site 4-site 5-site 6-site 8-site 9).The potential pollutant sources of the drainage basin were also described in Figure 8. Figure 9 showed the trends of 4-year averages of the water quality parameters for the three paths.
The Xin'an Jiang River received substantial amounts of domestic sewage, agricultural pollution, and industrial wastewater discharge from the upstream contributing area.Therefore, the water quality of the inflow into Qiandao Lake was undesirable.Site 1 was nearest to the river inflow, and the 4-year averages of most of the parameters (COD Mn , TN, TP, NH 3 -N, Chl-a, and FC) were the highest and the SD values were the lowest among all of the sampling sites.SD, TN, TP, COD Mn , NH 3 -N, Chl-a and FC showed similar variation trends along the central path (Figure 9a), presenting an overall improvement of water quality from the inflow to the outflow, which suggested a strong capacity for self-purification in Qiandao Lake.Appreciably increased levels of COD Mn , TN, NH 3 -N and FC, as well as a decreased level of SD, were observed at site 8 compared with the adjacent sites, indicating degraded water quality at this location.The explanation was that site 8 was an important passenger port with intensified anthropogenic activities.The variation along the southwest path was similar to that along the central path (Figure 9b), showing an overall improvement of water quality from site 12 to site 9. 
Three peaks can be found at sites 12, 5 and 8, reflecting the purification process of water quality within the lake. Site 12 is close to traditional agricultural regions, where nutrients can easily be transported from the surrounding arable land into the lake. Site 5 is located at the intersection of waters from the central, southwest and northeast directions, resulting in increased TN concentrations at this site. Along the northeast path, peak values of TP, NH3-N and Chl-a occurred at site 10, and peak values of TN and FC occurred at sites 4 and 8 (Figure 9c). Site 10 was close to villages and farmlands, and it was mainly impacted by agricultural runoff and domestic sewage. Site 4 received waters from the central and northeast directions and was close to the downtown, where rapid urbanization and population growth had recently occurred; thus it was strongly influenced by the pollutants brought by the central path and by domestic sewage from the dense population. Site 7 was also near the downtown and was impacted by intense human activities, leading to relatively inferior water quality compared with sites 6 and 8. In addition, compared with the other parameters, a larger decline in TP concentration was found along each path, without an increase at site 8. There are two possible explanations. First, phosphorus was mainly transferred into the water in particulate form, and its transport distance is shorter than those of nitrogen and other pollutants in the reservoir. Second, phytoplankton absorbed more phosphorus than the other nutrients during the growth period [44].
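The path-wise reading of Figure 9 described above can be sketched in a few lines of Python: order the site averages along a flow path and flag the sites whose value exceeds that of every adjacent site on the path. The site orders are those given in the text; the function name `local_peaks` and the TN values are hypothetical, chosen only to illustrate the procedure, and are not measurements from the study.

```python
# Flow paths as listed in the text (site numbers in flow order).
PATHS = {
    "central": [1, 2, 3, 5, 6, 8, 9],
    "southwest": [12, 11, 5, 6, 8, 9],
    "northeast": [10, 4, 5, 6, 8, 9],
}


def local_peaks(path, site_mean):
    """Return the sites whose mean exceeds that of every neighbour on the path.

    An end site has a single neighbour and counts as a peak when it
    exceeds that neighbour.
    """
    values = [site_mean[site] for site in path]
    peaks = []
    for i, site in enumerate(path):
        neighbours = values[max(i - 1, 0):i] + values[i + 1:i + 2]
        if all(values[i] > v for v in neighbours):
            peaks.append(site)
    return peaks


# HYPOTHETICAL 4-year TN averages (mg/L), invented for illustration only.
hypothetical_tn = {12: 1.30, 11: 1.05, 5: 1.10, 6: 0.95, 8: 1.00, 9: 0.90}

print(local_peaks(PATHS["southwest"], hypothetical_tn))
```

With these made-up values the southwest path yields peaks at sites 12, 5 and 8, the same pattern the text describes. Real use would substitute the measured multi-year averages for each parameter.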

In addition, we explored the variation of water quality with depth. Overall, DO values were higher in surface water than in the middle and bottom waters. The relatively active photosynthetic oxygen production associated with phytoplankton growth in surface water may be responsible for this variation with depth [45]. For the other parameters (CODMn, TN, TP, NH3-N, Chl-a, and FC), the deeper samples showed lower concentrations, which can be ascribed mainly to the flow of surface water and the dilution of pollutants with depth [46].

Implications for Water Resources Management
The combined use of different multivariate methods for analyzing water quality trends and identifying pollution sources could offer effective support to governments implementing reservoir water resources management. The results obtained from CA provide insights into the spatial and temporal trends in water quality, making it possible to arrange sampling in a more rational way [32,37]. After detecting the similarity in water quality between different sampling sites and periods, the number of sites could be optimized by selecting representative sites from each group, and the sampling frequency could be reduced by selecting typical periods from each cluster. In this manner, surveying efficiency would be increased and costs would be lowered without losing any significance of the results. The identification of discriminating parameters between different sampling periods and sites by DA would help with enacting holistic regulations that take into account the spatial and temporal variations. The spatial pattern of water quality may also help local governments to understand the pollution conditions in the area under their administration and to take responsibility for the conservation of the respective aquatic ecosystems. The PCA method can be used to detect the pollution sources in different regions, and it helps managers determine their priorities by emphasizing regional distinctions. Based on the information extracted from PCA, different policies can be established to treat the pollution sources in different areas.
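One way to make the idea of "selecting representative sites from each group" concrete is to pick, within each CA group, the site whose parameter profile lies closest to the group mean. The Python sketch below illustrates this under stated assumptions: the nearest-to-centroid criterion is an illustrative choice, not a method described in the study, and the group names, site labels and standardized values are all hypothetical.

```python
from math import dist          # Euclidean distance, Python 3.8+
from statistics import fmean


def representative_sites(groups, profiles):
    """For each group, return the site closest to the group's mean profile.

    groups:   mapping of group name -> list of site labels
    profiles: mapping of site label -> tuple of standardized parameter values
    """
    reps = {}
    for name, sites in groups.items():
        n_params = len(profiles[sites[0]])
        # Mean of each parameter over the sites in this group.
        centroid = [fmean(profiles[s][k] for s in sites) for k in range(n_params)]
        reps[name] = min(sites, key=lambda s: dist(profiles[s], centroid))
    return reps


# HYPOTHETICAL groups and standardized profiles, invented for illustration.
groups = {"X": ["x1", "x2", "x3"], "Y": ["y1", "y2", "y3"]}
profiles = {
    "x1": (0.0, 0.0), "x2": (1.0, 1.0), "x3": (2.0, 2.5),
    "y1": (5.0, 5.0), "y2": (5.4, 5.2), "y3": (6.1, 5.1),
}

print(representative_sites(groups, profiles))
```

In practice the parameters would need to be standardized before distances are computed, since they are measured in different units, and any reduced network should be checked against the full data set before sites are dropped.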

Conclusions
The present study comprehensively applied different multivariate methods to explore the dynamics of water quality and the factors impacting water quality in Qiandao Lake using a 4-year (2010-2013) data set. The results indicated that the integrated application of multivariate methods may serve as an operational analysis tool for reservoir water quality assessment and water resource management.
To represent different water quality characteristics, the 12 monitoring sites were divided through the CA method into three categories (Groups A, B and C), and the 12 months were divided into two groups (April-July, and the remaining months). According to the results from DA, the parameters SD, DO, CODMn and TP were the most significant (79.9% correct assignments), accounting for temporal variations in water quality between the study years, and pH, DO and Chl-a significantly discriminated between the two sampling periods classified through CA, with 87.8% correct assignments. For spatial variation, DA identified SD and NH3-N as the significant discriminating parameters, with 81.6% correct assignments. PCA yielded four PCs for Groups A and B and three PCs for Group C (groups were classified by spatial CA), explaining 84.58%, 81.61% and 78.68% of the total variance, respectively. Organic pollution, nutrient pollution, domestic wastewater, and agricultural and surface runoff were recognized as primary latent sources affecting the water quality of Qiandao Lake.

Figure 1. Location of Qiandao Lake and sampling sites and spatial pattern of the three classified groups derived by CA.

Figure 2. The result of cluster analysis performed on the sampling sites.

Figure 3. The result of cluster analysis performed on the sampling months.

Figure 5. Temporal variations of discriminant parameters identified by DA in Qiandao Lake between sampling periods. (a) pH; (b) DO; (c) Chl-a.

Figure 6. Spatial variations of discriminant parameters identified by DA in Qiandao Lake between sampling sites. (a) SD; (b) NH3-N.

Figure 7. Scores of sampling sites for the first two principal components produced by PCA.

Figure 8. Locations of the three water flow paths constituted by the sampling sites and the potential pollutant sources.

Figure 9. Trends of 4-year averages of the water quality parameters for the three paths. (a) Central path; (b) Southwest path; (c) Northeast path.

Table 1. Water quality parameters and the corresponding measurement methods.

Table 2. The statistical characteristics for the water quality parameters and the quality standards for surface water (GB 3838-2002) (units: mg/L).

Table 3. Results of temporal DA performed in the four years during 2010-2013.

Table 4. Classification function coefficients from temporal DA performed in the four years during 2010-2013.

Table 5. Results of temporal DA performed on the two groups of sampling periods.

Table 6. Classification function coefficients from temporal DA performed on the two groups of sampling periods.

Table 7. Results of spatial DA performed on the three groups of sampling sites.

Table 8. Classification function coefficients from spatial DA performed on the three groups of sampling sites.

Table 9. Loadings of selected parameters on significant principal components for the three site groups.