Water Quality Assessment and Pollution Source Identification of the Eastern Poyang Lake Basin Using Multivariate Statistical Methods

Multivariate statistical methods, including cluster analysis (CA), discriminant analysis (DA) and principal component analysis/factor analysis (PCA/FA), were applied to surface water quality datasets comprising 14 parameters at 28 sites in the Eastern Poyang Lake Basin, Jiangxi Province, China, from January 2012 to April 2015, in order to characterize spatiotemporal variation in pollution and identify potential pollution sources. Using hierarchical CA, the sampling months were divided into two periods (wet season and dry season) and the 28 sampling stations into two regions (low pollution and high pollution). Four parameters (temperature, pH, ammonia-nitrogen (NH4-N), and total nitrogen (TN)) were identified using DA to distinguish temporal groups with close to 97.86% correct assignations. Again using DA, five parameters (pH, chemical oxygen demand (COD), TN, fluoride (F), and sulfide (S)) led to 93.75% correct assignations for distinguishing spatial groups. Five potential pollution sources, namely nutrient pollution, oxygen-consuming organic pollution, fluorine chemical pollution, heavy metal pollution and natural pollution, were identified using PCA/FA techniques for both the low pollution region and the high pollution region. Heavy metals (copper (Cu), chromium (Cr) and zinc (Zn)), fluoride and sulfide are of particular concern in the study region because of the many open-pit copper mines, such as the Dexing Copper Mine. The results offer a reasonable classification scheme for low-cost monitoring networks and inform understanding of spatio-temporal variation in water quality as it relates to water resources management.


Introduction
Water scarcity is a growing threat to economic and social development, and widespread water pollution in recent decades further complicates the threat, especially in developing countries [1][2][3]. Water pollution, caused both by anthropogenic activities such as urbanization [4], industrial accidents [5][6][7] and dam construction [8], and by natural phenomena such as soil erosion [9] and climate change [10], is a global issue that increases pressures on freshwater resources. Declining water quality is the result of spatio-temporal changes in sedimentation, temperature, pH, nutrients, heavy metals, toxic organic compounds, pesticides, and so on [11]. In order to safeguard water quality and alleviate the pressures on water resources, it is necessary to characterize spatio-temporal changes in regional water quality and identify the potential pollution sources.
A series of monitoring programs and protocols has been developed to enable reliable quantification of nutrient transport in the aquatic environment and to generate a more comprehensive picture of water quality conditions and trends (e.g., the National Land with Water Information in Japan [12,13], the National Monitoring and Assessment Program (NOVA) in Denmark [14,15], the Harmonized Monitoring Scheme (HMS) in Britain [16,17], and the National Water-Quality Assessment (NAWQA) in the United States [18,19]). Meanwhile, sophisticated data-driven analytical approaches (e.g., the projection pursuit technique [20] and neural networks [21,22]), multivariate statistical techniques [23] (e.g., discriminant analysis (DA), cluster analysis (CA) and principal component analysis/factor analysis (PCA/FA)), fuzzy theory [24] and hydrological models [11,25,26,27] have substantially improved water quality assessments. Among these methods, multivariate statistical techniques including CA, PCA/FA, and DA can readily extract important information from large water quality datasets and are therefore widely used to evaluate water quality and identify potential pollution sources [28].
In some regions of China, water quality impairment is severe, with important consequences for human health as well as sustainable economic and social development [29]. Since 2004, the State Ministry of Environmental Protection has focused on surface water quality monitoring systems in the main river basins, including the Yellow River Basin, the Yangtze River Basin, the Pearl River Basin, the Lake Taihu Basin, the Songhua River Basin, and the Southeastern Coastal Rivers. Meanwhile, environmental administrations at the national, provincial, prefectural (city), and local levels have been established in recent years to monitor and report on local surface water quality. As a consequence, a huge monitoring database, covering nutrients, sediment, physical and chemical properties, toxic organic compounds and pesticides, heavy metals, etc., has been built for regional water resources management. However, the lake water environment is still deteriorating. For example, Poyang Lake, the largest freshwater lake (3050 km²) in China, is polluted, and most pollutants originate from five rivers: the Gan River, the Fu River, the Xin River, the Rao River, and the Xiu River. Therefore, it is necessary to investigate the current state of water pollution and identify pollution sources in the rivers around Poyang Lake. Moreover, applications of CA and PCA/FA in the Poyang Lake Basin are rare.
In this study, several multivariate statistical approaches (DA, CA and PCA/FA) are therefore applied to (1) illuminate temporal and spatial variations of water quality; and (2) identify the potential influencing factors that explain changes in water quality parameters of the Eastern Poyang Lake Basin. The results can offer a reasonable classification scheme for low-cost monitoring networks and also inform understanding of spatio-temporal variation in water quality as it relates to water resources research and management.

Study Area
Poyang Lake, the largest freshwater lake (3050 km²) in China, is located on the south bank of the middle-lower Yangtze River in Jiangxi Province (Figure 1). It is shallow and connected with five main rivers: the Gan River, the Fu River, the Xin River, the Rao River, and the Xiu River. These rivers and their tributaries form the Poyang Lake Basin, which covers an area of 162,200 km², accounting for nearly 97% of Jiangxi Province. The topography of the basin varies from highly mountainous regions (maximum elevation 2200 m) to alluvial plains in the lower reaches of the primary watercourses.
The basin lies in a subtropical wet climate zone with a distinct alternation from a wet to a dry season (see Figure 2), an annual mean precipitation of 1710 mm and an annual mean temperature of 17.5 °C. Water resources in the Poyang Lake Basin are rich, but water managers now face a series of difficulties as the awareness of global climate impacts on precipitation patterns increases. Meanwhile, with rapid economic development and the population explosion in the basin, human activities including dam construction [30] and land-use change significantly affect the water supply and patterns of water demand [31], and these pressures are exacerbated by increased pollutant loading. In addition, mining activities have caused heavy metal pollution, which is a serious issue. As a result, Poyang Lake faces many environmental problems, including water quality deterioration and eutrophication. According to the Environmental Aspect Bulletin, Poyang Lake is slightly polluted, underscoring the importance of investigating and assessing water quality for the protection of the lake water environment.
In this study, the Eastern Poyang Lake Basin (Figure 1) was chosen for water quality investigations. The Eastern Poyang Lake Basin mainly contains the Xin River and the Rao River (which rises from two branches, the Chang River and the Lean River). With rapid economic development in the past decades, dozens of non-ferrous metal mines have been intensively exploited in the Eastern Poyang Lake Basin, and heavy metal pollution and related environmental changes have gradually attracted the attention of the scientific community. For example, a great deal of acidic mine drainage (pH 2-3) and waste effluents containing copper (Cu) and zinc (Zn), discharged from the neighboring Dexing Copper Mine and from dozens of smelters and mining/panning activities along the rivers, have been poured continuously into the Lean River [32], contaminating its aquatic environment.

Monitored Parameters and Analytical Methods
Twenty-eight monitoring stations were selected for water sampling in this study: stations X1-X13 are located on the Xin River, stations L1-L9 on the Chang River, and stations P1-P5 in eastern Poyang Lake (Figure 1). Water samples were taken every two months from Jan. 2012 to Apr. 2015. Considering the pollution features and the traditional water quality indices used in China, a total of 14 water quality parameters, namely temperature (TEMP), pH, ammonia-nitrogen (NH4+-N), 5-day biochemical oxygen demand (BOD), chemical oxygen demand (COD), dissolved oxygen (DO), total nitrogen (TN), total phosphorus (TP), fluoride (F), sulfide (S), copper (Cu), oil, chromium (Cr) and zinc (Zn), were selected to analyze water quality in the Eastern Poyang Lake Basin. The sampling, preservation, transportation and analysis of the water samples were conducted strictly according to standard methods (State Environment Protection Bureau of China 2002). Table 1 shows the specific analytical method for each water quality parameter.

Multivariate Statistical Methods
Spatio-temporal variation in the water quality of the Eastern Poyang Lake Basin was analyzed using CA, DA, and PCA/FA techniques. CA groups a set of objects on the basis of the characteristics they possess [33,34], and Ward's method is a linkage criterion used in hierarchical cluster analysis. Ward's method of hierarchical clustering with squared Euclidean distance was applied to explore the grouping of the 28 sampling stations.
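The clustering step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' workflow: the station-by-parameter matrix is synthetic, the z-score scaling is an assumption, and the function name `ward_groups` is hypothetical. SciPy's Ward linkage works on Euclidean distances, so merge heights are on a different scale from a squared-Euclidean implementation; the 60% cut mirrors the (Dlink/Dmax) × 100 < 60 criterion reported in the results.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def ward_groups(X, cut_percent=60.0):
    """Cluster the rows of X with Ward linkage and cut at cut_percent of Dmax.

    X is a (stations x parameters) matrix. Columns are z-scored first
    (an assumption; the paper does not state its scaling for CA).
    """
    X = np.asarray(X, dtype=float)
    scaled = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    tree = linkage(scaled, method="ward")      # Ward linkage on Euclidean distances
    d_max = tree[:, 2].max()                   # height of the final merge
    rescaled = 100.0 * tree[:, 2] / d_max      # (Dlink / Dmax) x 100
    labels = fcluster(tree, t=cut_percent / 100.0 * d_max, criterion="distance")
    return labels, rescaled


# Synthetic data: 5 "clean" stations and 23 "polluted" stations, 4 parameters.
rng = np.random.default_rng(0)
low = rng.normal(0.0, 0.1, size=(5, 4))
high = rng.normal(5.0, 0.1, size=(23, 4))
labels, rescaled = ward_groups(np.vstack([low, high]))
print(labels)
```

Because the two synthetic groups are well separated, the only merge above the 60% line is the final one, so the cut returns two flat clusters.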
DA determines the variables that discriminate between two or more naturally occurring groups/clusters on the basis of the accuracy rate of discriminant functions (DFs). It constructs a DF for each group [35]. DFs are calculated using the following equation:
$$f(G_i) = k_i + \sum_{j=1}^{n} w_{ij} P_{ij}$$
where $i$ represents the number of groups ($G$), $k_i$ represents the constant inherent to each group, $n$ represents the number of parameters, and $w_{ij}$ represents the weight coefficient assigned by DF analysis (DFA) to a given parameter ($P_{ij}$). DA was employed to calculate the mean of a variable to predict group membership. The standard, forward stepwise, and backward stepwise modes of DA were used to calculate DFs for the groups generated from CA to describe temporal and spatial variations in river water quality.
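To make the role of the discriminant functions concrete, the sketch below evaluates one function per group as a constant plus a weighted sum of parameter values and assigns the sample to the group with the largest score. All coefficients and the sample values are hypothetical placeholders, not values from Tables 2 or 4, and the function name `classify` is an assumption.

```python
import numpy as np


def classify(sample, constants, weights):
    """Score a sample with one discriminant function per group.

    constants: shape (groups,), the k_i terms.
    weights:   shape (groups, parameters), the w_ij terms.
    sample:    shape (parameters,), the measured parameter values.
    Returns the index of the best-scoring group and all scores.
    """
    scores = constants + weights @ sample
    return int(np.argmax(scores)), scores


# Hypothetical coefficients for two groups and four parameters.
constants = np.array([-10.0, -20.0])
weights = np.array([
    [0.5, 1.0, 2.0, 1.0],   # group 0
    [1.5, 1.2, 0.5, 0.4],   # group 1
])
sample = np.array([25.0, 7.5, 0.3, 1.2])  # hypothetical TEMP, pH, NH4-N, TN

group, scores = classify(sample, constants, weights)
print(group, scores)
```

With these placeholder numbers the second function scores higher, so the sample is assigned to group 1.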
PCA is a dimensionality reduction technique that simplifies the data and makes them easier to visualize by finding a set of principal components (PCs) [36,37]. PCs are orthogonal variables calculated by multiplying the original correlated variables by a list of coefficients, which can be described as
$$z_{ij} = a_{i1} x_{1j} + a_{i2} x_{2j} + \cdots + a_{im} x_{mj}$$
where $z$ represents the component score, $a$ represents the component loading, $x$ represents the measured value of the variable, $i$ represents the component number, $j$ represents the sample number, and $m$ represents the total number of variables. FA was used to extract a lower dimensional linear structure from a set of data and therefore provides a powerful means for detecting similarities among samples [38]. FA can reduce the contribution of less significant variables obtained from PCA, and the new group of variables, known as varifactors (VFs), is extracted by rotating the axes defined by PCA. The basic concept of FA is described as
$$z_{ji} = a_{j1} f_{1i} + a_{j2} f_{2i} + \cdots + a_{jm} f_{mi} + e_{ji}$$
where $z$ represents the measured value of a variable, $a$ represents the factor loading, $f$ represents the factor score, $e$ represents the residual term accounting for errors or other sources of variation, $i$ represents the sample number, $j$ represents the variable number, and $m$ represents the total number of factors.
Here, PCA/FA was applied to the normalized log-transformed data sets (14 variables) separately for the two different spatial regions (low pollution region and high pollution region) as delineated by the CA technique.
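The PCA/FA workflow described in this section can be sketched as below. It is an illustration under assumptions rather than the authors' implementation: the data are synthetic, base-10 logarithms and z-scores stand in for the unspecified normalization, components are kept with the eigenvalue-greater-than-1 rule used later in the paper, and varimax is assumed as the rotation that produces the varifactors. All function names are hypothetical.

```python
import numpy as np


def standardized_log(X):
    """Log-transform (base 10, an assumption) and z-score each column.

    Assumes every value in X is strictly positive.
    """
    logged = np.log10(X)
    return (logged - logged.mean(axis=0)) / logged.std(axis=0, ddof=1)


def pca_loadings(Z):
    """Eigen-decompose the correlation matrix and keep eigenvalues > 1."""
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)          # ascending order
    order = np.argsort(eigvals)[::-1]             # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion / criterion < 1.0 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation


# Synthetic data: 60 samples, 6 variables driven by 2 latent factors.
rng = np.random.default_rng(1)
factors = rng.normal(size=(60, 2))
log_values = np.repeat(factors, 3, axis=1) + 0.3 * rng.normal(size=(60, 6))
X = 10.0 ** log_values

eigvals, loadings = pca_loadings(standardized_log(X))
rotated = varimax(loadings)
print(np.round(eigvals, 2))
print(np.round(rotated, 2))
```

The rotation is orthogonal, so each variable's communality (row sum of squared loadings) is unchanged; only the distribution of loading across factors is simplified.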

Temporal/Spatial Similarity and Grouping
Figure 3 shows the results of the temporal cluster analysis, which grouped the six sampling months into two statistically significant clusters at (Dlink/Dmax) × 100 < 60. Cluster 1 (dry season), comprising January and March, approximately corresponds to the low flow period. Cluster 2 (wet season) contains two smaller clusters at (Dlink/Dmax) × 100 < 40: May and November reflect the mean flow periods, while the remaining months (July and September) form another group and reflect the high flow period. Notably, temporal variation of surface water quality was significantly affected by local climate seasons (spring, summer, autumn and winter) and hydrological conditions (low, mean, and high flow periods). The Poyang Lake Basin lies in a subtropical wet climate zone with a distinct alternation from a wet to a dry season, consistent with the temporal patterns of water quality (Figure 3).
Spatial CA also yielded a dendrogram with two statistically significant clusters at (Dlink/Dmax) × 100 < 60 (Figure 4). Group 1 included X1 and L1 to L4, and the remaining monitoring stations comprised Group 2. Station X1 and stations L1 to L4 in Group 1 are located in the upstream reaches of the Xin River and the Lean River, respectively, which, owing to low population density and the absence of industrial and commercial activity, are far from major point and non-point pollution sources. However, stations L1-L4 are located in the Dexing district, one of the largest copper and gold producing districts in China, where metal pollution and associated mineral pollution are a persistent problem [39,40]. Although relatively high Cu, S, and F concentrations were observed at station L4 in this study, Group 1 should be considered a region of moderate or low pollution. Group 2 corresponds to highly polluted stations, with the highest average concentrations of NH4-N, oil, BOD, COD and TP. Most stations in this group are located in the middle to downstream reaches of the Eastern Poyang Lake Basin and receive pollution from point sources, including municipal sewage and industrial wastewater, and from non-point sources.

Temporal/Spatial Variations in River Water Quality
Based on the temporal groups (wet season and dry season) from CA, DA was performed on raw data to further explore temporal changes in surface water quality. Tables 2 and 3 present the discriminant functions (DFs) and classification matrices (CMs), which were calculated by the standard, forward stepwise and backward stepwise modes of DA. In the forward stepwise mode, variables are included step by step, beginning with the most significant, until no significant changes are obtained; in the backward stepwise mode, variables are removed step by step, beginning with the least significant. Both the standard and forward stepwise mode DFs, using 14 and 7 discriminant variables, respectively, produced CMs assigning 96.43% of the cases correctly. In the backward stepwise mode, however, DA yielded a CM with approximately 97.86% correct assignations using only four discriminant parameters: TEMP, pH, NH4-N, and TN. Thus, the temporal DA indicated that TEMP, pH, NH4-N, and TN were the most significant parameters for discriminating between the wet season and the dry season, and that these four parameters could be used to account for the expected temporal changes in surface water quality in the Eastern Poyang Lake Basin. Box and whisker plots of the discriminant parameters identified by DA are shown in Figure 5. The average temperature (Figure 5a) in the wet season was clearly higher than in the dry season because of the local climate. The same pattern was found for pH (Figure 5b). In contrast, the average NH4-N and TN were higher in the dry season than in the wet season owing to the local hydrologic conditions. The discharge in the wet season is much larger than in the dry season, which significantly dilutes NH4-N and TN. Moreover, in the wet season (typically summer and autumn) there are more aquatic organisms than in the dry season, consuming more NH4-N.
As with the temporal DA, the DFs and CMs for the spatial DA were obtained from the standard, forward stepwise and backward stepwise modes on the basis of the spatial groups (low pollution stations and high pollution stations), and are shown in Tables 4 and 5. Both the standard and forward stepwise mode DFs, using 14 and 11 discriminant variables, respectively, yielded CMs assigning 95% of the cases correctly, whereas the backward stepwise DA gave a CM with about 93.75% correct assignations using only five discriminant parameters (Tables 4 and 5). Backward stepwise DA showed that pH, COD, TN, F, and S were the most significant parameters for discriminating between the low pollution stations and the high pollution stations. Figure 6 shows the discriminant parameters identified by spatial backward stepwise DA. The pH (Figure 6a) in the low pollution region was clearly lower than in the high pollution region, which is not consistent with the results of an analysis in the Danjiangkou Reservoir Basin of China [41]. This may be because river segments in this region receive a great deal of acidic mine drainage and waste effluents containing Cu and Zn discharged from the neighboring Dexing Copper Mine and from many smelters and mining/panning activities. The average COD and TN concentrations (Figure 6b,c) in the low pollution region were also clearly lower than in the high pollution region. Within the high pollution region, all stations were located in middle to downstream reaches or near urban areas and were therefore in proximity to municipal sewage and industrial wastewater. In contrast, the average F and S concentrations (Figure 6d,e) in the low pollution region were clearly higher than in the high pollution region. These excess acidic pollutants were likely the main drivers of the lower pH. Figure 7 illustrates the spatial distribution of pH, DO, TN, and F at 27 stations in the Eastern Poyang Lake Basin.
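The "correct assignations" percentages quoted for each DA mode are the share of cases that fall on the diagonal of the classification matrix. A minimal sketch of that arithmetic follows; the counts are hypothetical and are not taken from Tables 3 or 5, and the function names are assumptions.

```python
def percent_correct(matrix):
    """Overall percentage of cases on the diagonal of a classification matrix.

    matrix[i][j] is the number of cases from true group i assigned to group j.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total


def percent_correct_by_group(matrix):
    """Percentage of correctly assigned cases within each true group."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


# Hypothetical two-group classification matrix (rows: true, columns: assigned).
cm = [
    [50, 2],
    [4, 44],
]
overall = percent_correct(cm)
by_group = percent_correct_by_group(cm)
print(overall, by_group)
```

Reporting the per-group figures alongside the overall figure shows whether a stepwise mode with fewer parameters loses accuracy mainly in one group.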

Data Structure Determination and Source Identification
Based on the normalized log-transformed data sets, PCA/FA was used to further identify the potential pollution sources for the low pollution and high pollution regions. Before the PCA/FA analysis, the Kaiser-Meyer-Olkin (KMO) and Bartlett's sphericity tests were carried out on the parameter correlation matrix to examine the validity of PCA/FA. The KMO results for Group 1 and Group 2 were 0.61 and 0.71, respectively, and Bartlett's sphericity results were 547.92 and 1611.68 (p < 0.05), suggesting that PCA/FA could reasonably offer significant reductions in dimensionality. Six VFs were obtained for the low pollution region and four VFs for the high pollution region with eigenvalues greater than 1, explaining approximately 78.86% and 57.78% of the total variance in the respective surface water quality data sets (Figure 8 and Table 6).
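For readers who want to reproduce the two adequacy checks, the sketch below computes Bartlett's sphericity statistic and the overall KMO measure from a data matrix using their standard textbook definitions. The data are synthetic and the function names are hypothetical; the p-value lookup against the chi-square distribution is left out, and nothing here is meant to reproduce the values reported above.

```python
import numpy as np


def bartlett_sphericity(X):
    """Bartlett's test statistic and degrees of freedom.

    statistic = -(n - 1 - (2p + 5) / 6) * ln|R|, with p(p - 1)/2 degrees
    of freedom, where R is the correlation matrix of the p columns of X.
    """
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    _, logdet = np.linalg.slogdet(R)
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return statistic, dof


def kmo(X):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.corrcoef(X, rowvar=False)
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                      # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)       # off-diagonal mask
    r2 = (R[off] ** 2).sum()
    p2 = (partial[off] ** 2).sum()
    return r2 / (r2 + p2)


# Synthetic data: 60 samples, 6 variables driven by 2 latent factors.
rng = np.random.default_rng(2)
factors = rng.normal(size=(60, 2))
X = np.repeat(factors, 3, axis=1) + 0.3 * rng.normal(size=(60, 6))

statistic, dof = bartlett_sphericity(X)
adequacy = kmo(X)
print(round(statistic, 2), dof, round(adequacy, 2))
```

A large Bartlett statistic relative to its degrees of freedom indicates that the correlation matrix differs from the identity, and a KMO value closer to 1 indicates that partial correlations are small relative to ordinary correlations; both support applying PCA/FA.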

In the low pollution region, among the six VFs, VF1, explaining about 25.38% of the total variance, had strong positive loadings of pH and TN, moderate positive loadings of COD, strong negative loadings of S and moderate negative loadings of F. Generally, high concentrations of total nitrogen reflect agricultural runoff and municipal effluents [42,43]; COD is an indicator of organic pollution from industrial and domestic wastewater [44]. The pH is regarded as one of the main reaction conditions for redox reactions involving organic matter, which can regulate the concentration of COD [45]. Sulfide and fluoride mainly originate from copper mines in this region (e.g., the Dexing Copper Mine in Dexing City), whose ores are complex to dress and high in sulfur and fluoride [46]. VF1 therefore represents a mixture of nutrient pollution, organic pollution and mining pollution. VF2 (14.29% of the total variance) had strong positive loadings of Cu and strong negative loadings of BOD, representing metal pollution. VF3, explaining 12.21% of the total variance, had strong positive loadings of Cr and TP and moderate positive loadings of oil. This factor can be explained as representing influences from point sources, such as copper mines, industrial effluents and domestic wastewater. VF4, accounting for 10.49% of the total variance, had strong positive loadings of temperature and strong negative loadings of dissolved oxygen. The concentration of DO is controlled by temperature and therefore has both a seasonal and a daily cycle [47]. Accordingly, the DO concentration is high in winter and early spring because of low temperature, and low in summer and fall because of high temperature. VF5 (9.30%) had strong positive loadings of NH4-N, representing non-point source pollution related to agricultural activities. VF6 (7.19%) had strong positive loadings of Zn, indicating metal pollution.
With respect to the data set for the high pollution region, among the four VFs, VF1, explaining about 22.89% of the total variance, had strong positive loadings of Cu and moderate negative loadings of pH and S, basically representing metal pollution from upstream. VF2 (16.88% of the total variance) had strong positive loadings of NH4-N and moderate positive loadings of TN and COD. This factor can be explained as a typical kind of mixed pollution, consisting of point source pollution (e.g., industrial and domestic wastewater) and non-point source pollution associated with agricultural activities and atmospheric deposition. VF3, explaining 10.19% of the total variance, had strong positive loadings of DO, moderate positive loadings of F and moderate negative loadings of TEMP. Generally, fluoride comes from cement plants, fluorine chemical factories, and copper smelters in this region. The relationship between DO and temperature is explained in the same way as for VF4 in the low pollution region. VF4 (7.82%) had strong positive loadings of BOD and moderate positive loadings of oil and Cr. The high concentrations of BOD and oil could represent organic pollution and oil pollution, and Cr is likely from cement plants and copper smelters in this region.
Based on the above analysis, five latent pollution types, namely nutrients, organics, chemicals, heavy metals and natural pollution, were identified in the study area. Firstly, nutrient pollution (ammonia nitrogen and total nitrogen) was mainly from non-point sources related to agricultural activities and atmospheric deposition and from point sources including municipal effluents and fertilizer plant wastewater. Secondly, organic pollution (BOD and COD) was usually from point sources (e.g., industrial and domestic wastewater). Thirdly, chemical pollution was mainly from the petroleum industry (oil pollution) and from copper mines and plating (S and F pollution). Fourthly, heavy metal pollution (Cu, Cr and Zn) was mainly from copper mines and plating. Finally, natural pollution was strongly affected by meteorological variations, such as variation in water temperature and dissolved oxygen. Considering the types of pollution in the two regions (Table 6 and Figure 6), heavy metals (Cu, Cr and Zn), fluoride and sulfide stood out. Field investigations showed that there are many copper mines in the Eastern Poyang Lake Basin, such as the Dexing Copper Mine, which are associated with mineral effluents including fluoride and sulfide.

Conclusions and Future Work
Different multivariate statistical techniques were applied in this study to evaluate spatio-temporal variations in the surface water quality of the Eastern Poyang Lake Basin. Hierarchical CA grouped the six sampling months into two periods and the 28 sampling stations into two groups on the basis of their similar water quality characteristics, which can provide a reasonable and useful classification for optimizing the future spatial monitoring network at lower cost. Based on the results obtained from hierarchical CA, spatial and temporal changes in surface water quality were analyzed by deriving discriminant functions and classification matrices using DA. For temporal changes, the temporal DA used only four discriminant parameters (TEMP, pH, NH4-N, and TN), with approximately 97.86% correct assignations. The spatial DA gave CMs with about 93.75% correct assignations using only five discriminant parameters (pH, COD, TN, F, and S). Thus, temporal and spatial DA could be used to optimize future water quality monitoring programs by reducing the number of monitoring stations, monitoring parameters, and monitoring frequency. The results from PCA/FA identified five latent pollution types: nutrients, organics, chemicals, heavy metals and natural pollution. Heavy metals (Cu, Cr and Zn), fluoride and sulfide were serious problems in the study region. In the low pollution region, heavy metal and sulfide pollution were mainly from copper mines such as the Dexing Copper Mine, whereas in the high pollution region they were from copper mines and plating.
Although the results obtained from this study illustrate the utility of multivariate statistical techniques for extracting characteristics from large water quality data sets and identifying pollution sources in order to understand spatiotemporal variations in water quality, further study should be carried out in the future. First, some water quality parameters, including TEMP, pH, NH4-N, COD, TN, F, and S, should be monitored more accurately and controlled. In addition, it is necessary to quantitatively evaluate pollution sources to obtain the contributions of different pollutants. Finally, further investigation of heavy metal pollution should be implemented, especially for Cu pollution.

Sustainability 2016, 8, 133

Figure 1. Study area and monitoring stations for the rivers in the Eastern Poyang Lake Basin.

Figure 3. Dendrogram showing the temporal clustering of study periods.

Figure 8. Scatter plot of loadings for the four VFs for group 1 (a and b) and group 2 (c and d).

Table 1. Water quality parameters, units, analytical methods and lowest detection limits as measured from Jan. 2012 to Apr. 2015 for the Eastern Poyang Lake Basin.

Table 2. Classification function coefficients for DA of temporal changes.

Table 3. Classification matrix for DA of temporal changes.

Table 4. Classification function coefficients for DA of spatial changes.

Table 5. Classification matrix for DA of spatial changes.

Table 6. Loadings of experimental variables (14) on significant VFs for the low pollution and high pollution regions.