The descriptive statistics of the 22 water quality parameters are summarized in Table 1. The pH values ranged from 5.74 to 8.91, largely within the range of 6–9 allowed by GB3838-2002. The mean values of F, S, F. Coli, and VP in most water samples were far lower than the class III standard (GB3838-2002), while that of Petrol (0.058 mg/L) was slightly higher than the class III standard (0.05 mg/L). Among nutrients, the mean value of TN was 2.87 mg/L, far higher than the class III standard (1.0 mg/L); the mean values of NO3-N and NO2-N were 0.294 and 0.063 mg/L, respectively, far lower than the class III standard (10 mg/L); and that of NH3-N was 0.916 mg/L, lower than the class III standard (1.0 mg/L). TN is the sum of NO3-N, NO2-N, NH3-N and organic nitrogen, and is the main indicator of water eutrophication. Because the three inorganic forms together accounted for less than half of the mean TN, the main nutrient in the study area was organic nitrogen. The concentration levels of CODMn, BOD5 and COD deserve attention because these parameters represent the levels of biological, chemical and organic contamination in surface water, respectively. The maximum values of these parameters were 15.7, 9.0 and 87.9 mg/L, respectively, all exceeding the class III standard (6, 4 and 20 mg/L, respectively). Therefore, the study area had a relatively high contamination level. The coefficients of variation for NH3, Petrol, S, TN and NO2-N were relatively high, indicating significant temporal and spatial differences in the distributions of these water quality parameters.
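The nitrogen comparison above can be checked with a few lines of arithmetic. The sketch below uses only the means quoted in the text and assumes that TN is the sum of NO3-N, NO2-N, NH3-N and organic nitrogen, so organic nitrogen is estimated as the remainder. The variable names and the `coefficient_of_variation` helper are illustrative; whether the coefficients of variation in Table 1 use the population or the sample standard deviation is not stated, and the population form is assumed here.

```python
from statistics import mean, pstdev

# Mean concentrations (mg/L) quoted in the text.
means = {"TN": 2.87, "NO3-N": 0.294, "NO2-N": 0.063, "NH3-N": 0.916}

# Assumption: TN = NO3-N + NO2-N + NH3-N + organic N,
# so organic N is what remains after removing the inorganic forms.
inorganic_n = means["NO3-N"] + means["NO2-N"] + means["NH3-N"]
organic_n = means["TN"] - inorganic_n
organic_share = organic_n / means["TN"]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (dimensionless)."""
    return pstdev(values) / mean(values)


print(f"organic N ~ {organic_n:.3f} mg/L ({organic_share:.0%} of TN)")
```

Under that assumption, organic nitrogen comes to roughly 1.60 mg/L, or a little over half of the mean TN.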
3.1. Water Quality Assessment Using WQI
The water quality at most monitoring stations was classified as “medium” or “low”, together accounting for approximately 84.52% of assessments (“medium” accounted for 64.29% and “low” for 20.24%). Additionally, 13.10% of the assessments were “good”, and only 2.38% were “excellent” (Figure 2). The water quality of S1 and S2 was always “good” or better, especially that of S2, which was “excellent” in 2015 and 2018. Because these two monitoring stations are in the Daxi Reservoir and Shahe Reservoir within the urban centralized drinking water protection area, their water quality has been maintained in good condition by the good natural ecological environment and strict contamination control measures. The other monitoring stations are in urbanized or agricultural areas.
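The reported shares are mutually consistent if they are read as fractions of 84 station-year assessments (for example, 14 stations over 6 years). That total and the per-class counts below are back-calculated from the percentages, not stated in the text, so they should be treated as an assumption.

```python
# Hypothetical counts per WQI class, inferred from the reported percentages.
counts = {"excellent": 2, "good": 11, "medium": 54, "low": 17}
total = sum(counts.values())

# Percentage share of each class, rounded to two decimals as in the text.
shares = {label: round(100 * n / total, 2) for label, n in counts.items()}
medium_low = round(100 * (counts["medium"] + counts["low"]) / total, 2)

for label, share in shares.items():
    print(f"{label:>10}: {share:.2f}%")
print(f"medium+low: {medium_low:.2f}%")
```

With these counts, the combined “medium” and “low” share rounds to 84.52%, which explains why it differs by 0.01 from the sum of the two rounded shares.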
From the interannual change trend of WQI (Figure 3), about half of the monitoring stations showed an increasing trend, most of which were generally stable. The water quality of monitoring stations S1 and S2 decreased slightly. Due to rapid urbanization and population growth, water environment security is facing increased pressure, and the protection of water sources should be further strengthened. The water quality of other monitoring stations showed continuous improvement, especially from 2016 to 2019. Since 2017, Changzhou city has adopted special actions: “two reductions” (reducing total coal consumption and backward chemical production), “six governance” (governing the Taihu Lake water environment, domestic garbage, black and smelly water bodies, livestock and poultry breeding contamination, volatile organic compound contamination, and hidden environmental dangers), and “three improvements” (improving the level of ecological protection, environmental-economic policy regulation, and environmental law enforcement and supervision) [42]. The environmental quality has been significantly improved, the total discharge of major pollutants has been markedly reduced, and the environmental risks have been effectively controlled.
3.2. Spatial Similarities and Clustering
Spatial CA generated a dendrogram, dividing the 14 monitoring stations into 3 clusters at (Dlink/Dmax) × 100 < 40 (Figure 4). According to the physical, chemical and microbiological characteristics of water quality, each cluster was classified into its own contamination category. Cluster A included stations S1 and S2 and corresponded to low contamination. Cluster B contained six monitoring stations (S5, S6, S8, S9, S11 and S13) and was classified as medium contamination. Cluster C comprised six monitoring stations (S3, S4, S7, S10, S12 and S14) and was classified as high contamination.
In cluster A, S1 and S2 are in the Daxi Reservoir and Shahe Reservoir. The contamination of the six monitoring stations in cluster B mainly derives from nonpoint source contamination, such as agricultural runoff, livestock and poultry breeding, and fishpond drainage. The monitoring stations of cluster C are mainly located in urban areas and downstream reaches, and the possibility of water contamination is higher because of the comprehensive impacts of domestic sewage, industrial wastewater and upstream inflow water [5].
The above spatial CA results coincided with the average WQI of the monitoring stations. The WQI values of S1 and S2 were the highest; those of S3, S4, S7, S10, S12 and S14 were relatively low; and those of S5, S6, S8, S9, S11 and S13 were at a medium level (Figure 5). Thus, CA can provide a reliable water quality classification across monitoring stations; however, designing optimal spatial sampling strategies is warranted in the future [10].
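The cut rule (Dlink/Dmax) × 100 < 40 can be illustrated with a small agglomerative clustering sketch. The linkage method is not stated in this section, so average linkage on Euclidean distances is assumed here purely for illustration, and the eight two-dimensional points are made-up stand-ins for standardized station means, not data from the study.

```python
from itertools import combinations
from math import dist


def cluster_at_rescaled_cut(points, cut=40.0):
    """Agglomerative clustering (average linkage, assumed), keeping only
    merges whose rescaled distance (Dlink / Dmax) * 100 is below `cut`."""

    def linkage(a, b):
        # Mean pairwise Euclidean distance between members of a and b.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    # Build the full dendrogram: merge the closest pair until one cluster remains.
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((linkage(a, b), a, b))
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]

    d_max = merges[-1][0]  # linkage distance of the final merge

    # Replay only the merges that fall below the rescaled cut.
    groups = [[i] for i in range(len(points))]
    for d_link, a, b in merges:
        if 100.0 * d_link / d_max < cut:
            groups = [g for g in groups if g != a and g != b] + [a + b]
    return sorted(sorted(g) for g in groups)


# Hypothetical standardized values for eight stations (not from the paper).
points = [(0.0, 0.1), (0.2, 0.0),
          (5.0, 5.0), (5.2, 5.1), (4.9, 5.3),
          (9.0, 1.0), (9.2, 1.1), (8.9, 0.8)]
print(cluster_at_rescaled_cut(points))
```

Rescaling by Dmax makes the cut independent of the units of the distance measure, which is why the threshold can be quoted as a plain percentage.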
3.3. Spatial Variations in Water Quality
Based on the CA data, discriminant analysis was used to test the significance of the discriminant functions and to identify the important variables reflecting the variation between clusters. The Wilks’ lambda and chi-square values of all discriminant functions were in the ranges of 0.036–0.509 and 504.269–2479.317, respectively, and the p values were all less than 0.01 (Table 3), indicating that the spatial DA was valid [13]. Table 4 and Table 5 show the discriminant functions and classification matrices generated from the standard, forward stepwise and backward stepwise modes of DA. The standard and forward stepwise modes used 22 and 21 discriminant variables, respectively, and produced classification matrices that correctly assigned approximately 88% of cases. In the backward stepwise mode, however, DA correctly assigned nearly 87% of cases using only 14 discriminant parameters. Spatial DA showed that pH, Petrol, VP, COD, TP, F, S, F. Coli, SO4, Cl, NO3-N, T-Hard, NO2-N, and NH3 were the critical variables for distinguishing the water quality of the three spatial clusters and explained most of the spatial variations in expected water quality.
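The link between a Wilks’ lambda value and its chi-square statistic can be sketched with Bartlett’s approximation. This section does not say how the chi-square values in Table 3 were computed, so the formula below is an assumption about the procedure, and the sample size in the example is hypothetical; the function and argument names are illustrative.

```python
from math import log


def bartlett_chi_square(wilks_lambda, n_samples, n_variables, n_groups):
    """Bartlett's chi-square approximation for Wilks' lambda.

    chi2 = -(n - 1 - (p + g) / 2) * ln(lambda), with p * (g - 1) degrees of
    freedom when all discriminant functions are tested together.
    """
    if not 0.0 < wilks_lambda <= 1.0:
        raise ValueError("Wilks' lambda must lie in (0, 1]")
    factor = n_samples - 1 - (n_variables + n_groups) / 2
    chi2 = -factor * log(wilks_lambda)
    dof = n_variables * (n_groups - 1)
    return chi2, dof


# n_samples is hypothetical: the number of cases is not given in this section.
chi2, dof = bartlett_chi_square(wilks_lambda=0.036, n_samples=500,
                                n_variables=14, n_groups=3)
print(f"chi-square = {chi2:.1f} on {dof} degrees of freedom")
```

A smaller lambda means that less of the total variance is left unexplained by group membership, so the statistic grows as lambda approaches zero.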
Based on the discriminant parameters identified by DA, box and whisker plots of the three clusters (cluster A, cluster B, cluster C) were constructed to evaluate the spatial variations in water quality (Figure 6). Most of the parameters showed significant differences between clusters. Overall, the average concentrations in cluster A were much lower than those in clusters B and C, and the average concentrations in cluster C were slightly higher than those in cluster B. Higher Petrol, COD and TP values were found in cluster C, indicating that organic contamination and eutrophication were the most serious water environment problems in cluster C. Additionally, lower pH values were found at the monitoring stations of cluster C, likely because of the hydrolysis of acidic substances (ammonia and organic acids) [5]. In conclusion, the water contamination of cluster C was more serious than that of the other two clusters. Thus, the prevention and control of contamination sources and the treatment capacity for point source contamination must be strengthened, for example by expanding the construction and treatment capacity of STPs.
3.4. Principal Component Determination and Contamination Source Identification
Because the contamination levels of the three spatial clusters (clusters A, B, and C) were significantly different, PCA/FA was used to identify the water contamination source types for the normalized data sets of the three spatial clusters.
PCA/FA of the three data matrices obtained six, eight and seven variance factors (VFs) with eigenvalues ≥1, explaining 71.5%, 66.8% and 67.9% of the total variance in the corresponding data sets, respectively (Table 6, Table 7 and Table 8). Additionally, the loadings of parameters on VFs were categorized as “high”, “medium” and “low” based on absolute loading values of >0.75, 0.75–0.50 and 0.50–0.30, respectively [44].
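The two rules used here, retaining factors with eigenvalues ≥1 and labeling loadings by their absolute value, can be written down compactly. The sketch below is illustrative only: the function names, the “negligible” label for loadings below 0.30, and the example eigenvalues are assumptions and do not come from Tables 6 to 8.

```python
def loading_strength(loading):
    """Label a factor loading by its absolute value (thresholds from the text)."""
    magnitude = abs(loading)
    if magnitude > 0.75:
        return "high"
    if magnitude >= 0.50:
        return "medium"
    if magnitude >= 0.30:
        return "low"
    return "negligible"  # below 0.30: this label is an assumption


def retained_factors(eigenvalues):
    """Keep factors with eigenvalue >= 1 and report their share of variance.

    Assumes the eigenvalues come from a correlation matrix of normalized
    data, so that they sum to the number of variables.
    """
    kept = [ev for ev in eigenvalues if ev >= 1.0]
    explained = 100.0 * sum(kept) / sum(eigenvalues)
    return len(kept), explained


# Hypothetical eigenvalues for a six-variable example (not from the paper).
n_kept, pct = retained_factors([2.4, 1.5, 1.0, 0.6, 0.3, 0.2])
print(n_kept, round(pct, 1))
print(loading_strength(-0.82), loading_strength(0.61), loading_strength(0.35))
```

The sign of a loading is ignored when assigning the label, which is why the text reports “high negative” and “high positive” loadings under the same threshold.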
Among the six VFs of cluster A, VF1 explained 20.7% of the total variance and had high positive loadings on Petrol, VP, TN and F. This factor indicated toxic organic contamination from farmland drainage, oily sewage discharge from ship operation, domestic sewage, industrial wastewater, atmospheric deposition and precipitation leaching. VF2 (15.3% of the total variance) had high negative loadings on SO4 and TSS, and a high positive loading on NO2-N. The presence of nitrite in water indicated that the decomposition of organic matter was ongoing and that the risk of organic matter contamination persisted. VF3 (13.9%) had high positive loadings on NO3-N, Cl and S, indicating nutrients from agricultural runoff and atmospheric deposition as well as natural sources of soil erosion and salt ions (Cl, S) in the watershed [45]. VF4 (10.3%) had a high positive loading on BOD5, representing organic contamination in sewage [6]. VF5 (6.3%) had high positive loadings on NH3, pH and EC. Generally, EC indicates natural contamination, which may be due to soil erosion or an increase in the number of salt ions in water [44]. Additionally, VF6 (only 5.0%) had a medium negative loading on NH3.
Regarding the data set of cluster B, among the eight VFs, VF1, which accounted for 12.9% of the total variance, had a high negative loading on F but medium positive loadings on F. Coli and S, indicating microbial contamination from municipal sewage and from livestock and poultry breeding. VF2 (11.1% of the total variance) had a high negative loading on BOD5, indicating organic contamination in urban sewage and industrial wastewater. VF3 (9.3%) had a high positive loading on DO but a high negative loading on Temp. VF4 (9.0%) had only a moderate positive loading on TP, revealing nutrient contamination (e.g., P), especially from sewage containing detergents, industrial wastewater and fertilizer. Point source contamination (such as wastewater from the phosphorus chemical industry) and nonpoint source contamination (such as animal breeding and agricultural fertilizer) from P constitute common eutrophication-causing contamination in this area [46]. VF5 (7.0%) had only a moderate positive loading on NH3-N, representing contamination from animal feces and agricultural fertilizers. VF6 (only 6.5%) had a high positive loading only on EC, likely because of the mineral composition of river water [6]. VF7 (only 5.9%) had a high positive loading on SO4 and a medium positive loading on TN, representing wastewater from industries using sulfate or sulfuric acid. Finally, VF8 (only 5.2%) had high positive loadings on NH3 and pH, likely because of industrial wastewater containing alkaline substances, such as NH3.
Regarding the seven VFs of cluster C, VF1 (20.5% of the total variance) showed high positive loadings on NH3-N and TN, representing nutrient contamination (e.g., N) from agricultural runoff, municipal sewage and fertilizer plant wastewater. VF2 (12.2%) showed a high positive loading on F, representing industrial wastewater containing fluoride. VF3 (10.2%) showed a high positive loading on pH. VF4 (7.5%) showed a high positive loading on Temp and a moderate negative loading on DO, contrasting the results for VF3 of cluster B. VF5 (6.7%) showed a high positive loading on Petrol, representing contamination from oily sewage discharge from ship operations and wastewater from the petrochemical industry. VF6 (5.7%) showed moderate positive loadings on TSS and EC. Agricultural runoff, wastewater discharge, solid waste disposal and irrigation return increased the suspended solids loading in streams [45]. VF7 (5.0%) showed a high positive loading on SO4, similar to VF7 of cluster B.
We identified four contamination source types: nutrients, organics, feces and oil. First, nutrients represented point source contamination, such as urban domestic wastewater and industrial wastewater from chemical fertilizer plants, and nonpoint source contamination, such as that related to agricultural activities and aquaculture. Second, organics were mainly derived from oxygen-consuming and toxic organic matter in municipal sewage and industrial sewage. Third, feces were mainly derived from animal fecal drainage in the fishery and livestock breeding industries. Finally, oil represented contamination from the petroleum chemical industry and oily sewage discharge from ship operation.