Improved Principal Component-Fuzzy Comprehensive Assessment Coupling Model for Urban River Water Quality: A Case Study in Chongqing, China

Abstract: An improved principal component-fuzzy comprehensive assessment coupling model for urban river water quality is proposed, which fully considers the influence of water quality and quantity. This model can not only choose the key indexes, but also specify the spatial variation and class of water quality. The proposed model was used to assess the water quality of the Qingshui and Fenghuang streams in Chongqing, China. Data for twelve indexes used in the assessment were collected from 17 monitoring points. The assessment results show that the key indexes include TN, TP, NH3-N, CODcr, pH, DO and velocity. Water quality at 14 monitoring points is classified as class Bad V, and that at the remaining points is class V. Mainly affected by the deposition of garbage and the discharge of domestic sewage, the water quality of the midstream is the worst. The upstream is mainly influenced by farmland non-point source pollution and rural domestic sewage pollution. The downstream is close to the scenic area, where environmental control measures such as river dredging and artificial aeration are regularly carried out, so its water quality is the best. The results provide valuable information that allows local environmental departments to identify pollutant sources and formulate water resource management strategies.


Introduction
As an important part of an urban ecosystem, a river plays a key role in water supply, flood control, sewage discharge, landscape recreation and so on. River water quality assessment is essential in environmental conservation, as it provides a scientific basis for the utilization of water resources and the comprehensive prevention of water pollution [1]. However, with the continuous promotion of urbanization in China, the deterioration of water quality is strongly associated with increased point and diffuse pollution caused by rapid urban expansion, industrial sewage, domestic sewage and agricultural activities [2,3]. Therefore, it is urgent for policymakers and watershed managers to comprehensively assess the water quality of urban rivers, and then identify the main causes of pollution and propose remediation strategies [4].
It is acknowledged that a good water quality assessment method can not only specify the water quality class, but also accurately reflect the spatial variation of water quality conditions [5,6]. A water quality assessment method that is to be widely used in environmental management should be easy to calculate and master, as well as scientific and accurate [7]. The traditional water quality assessment methods are the single factor assessment method and the water quality index method, which are widely adopted by the Environmental Protection Department in China [8,9]. The single factor assessment method [10] selects the class of the worst index as the water quality class according to the classification criteria of each monitoring index. Its assessment results tend to be overly general and conservative. The water quality index method [11,12] can accurately specify the water quality class to some extent; however, too many parameters need to be considered. In recent years, many new methods have been applied to water quality assessment, such as the fuzzy comprehensive assessment method [13,14], the water pollution index method [15,16] and the grey assessment method [17].
In all water quality assessments, because of the inconsistency and peculiarity of each pollutant, there is vagueness or fuzziness related to water quality [18]. Therefore, the classification criteria used and the boundaries between different classes should be fuzzy to some extent [19]. Fuzzy assessment methods evaluate the contributions of various pollutants comprehensively according to predetermined weights, and decrease the fuzziness due to membership functions [20]. Among fuzzy assessment methods, fuzzy comprehensive assessment is popular, which has been used by many environmental researchers in China [21]. However, there are still some limits when applying the fuzzy comprehensive assessment method to water quality assessment. For example, when the number of the assessment index is too large, the main indexes with strong correlation cannot be obtained. This will lead to the loss of fuzzy matrix information and the phenomenon that assessment results tend to be uniform and difficult to distinguish.
Furthermore, most Chinese researchers have considered only some of the indexes in the Environmental Quality Standards for Surface Water of China (GB3838-2002), ignoring the influence of hydrodynamic force and water flow. Research shows that there is a close correlation between pollutant transport and water flow [22,23]. The hydrodynamic characteristics of the microfluidics formed by pollutants are basically the same as those of particles in water flow [24]. Besides, the sediment, which plays an essential role in the storage, migration and transformation of pollutants, is also affected by the water flow [12]. The release and suspension rate of sediment into the water is directly proportional to the velocity. Therefore, it is necessary to add water flow indexes to the traditional water quality assessment indexes for comprehensive assessment.
In this study, based on the traditional indexes of water quality assessment, the indexes of the water quantity characteristic are added, and the principal component analysis method and fuzzy comprehensive assessment method are adopted. According to the principal component analysis method, the key indexes are selected, and the spatial variations of the water quality condition can be reflected. The fuzzy comprehensive assessment method takes advantage of the key indexes to determine the water quality class and verify the results of principal component analysis. Therefore, the coupling model of improved principal component-fuzzy comprehensive assessment for urban rivers is established. Taking two urban streams in Chongqing, China, as an example, 17 monitoring points have been set up for investigation. The key indexes were selected from 12 indexes for analysis. The water pollution degree and water quality class of different monitoring points were explored to comprehensively analyze and determine the source and distribution of pollutants. This study can provide reference for water quality assessment and pollution control in urban rivers.

Study Area
The Qingshui and Fenghuang streams are located in the Shapingba district in the west of Chongqing, and both are first-level tributaries on the right bank of the Jialing River (Figure 1). The Qingshui stream, with a total length of 15.88 km, a drainage area of 35.54 km², and an average annual flow of 0.502 m³/s, is one of the largest of the five important streams in Chongqing's main urban area and flows through densely populated areas with a developed economy. The upstream and western tributaries mainly flow through the Gele mountain, an area consisting mostly of towns, mountains and scattered farmlands. The mainstream and eastern tributaries, flowing through the urban areas, mainly collect domestic sewage and industrial sewage. The Fenghuang stream, with a total length of 4.09 km, a drainage area of 3.18 km², and an average annual flow of 0.11 m³/s, is adjacent to the Qingshui stream, flows from west to east, and merges into the Jialing River in the Ciqikou scenic spot. The Qingshui and Fenghuang streams are water bodies of the urban natural landscape with important ecological landscape value.

Water Quality Monitoring and Sample Collection
Through a comprehensive investigation of the Qingshui and Fenghuang streams, 17 monitoring points (Figure 1) were set up. The geographical locations of the monitoring points were recorded using a portable GPS system. Twelve sampling campaigns were conducted between 1 December 2018 and 30 January 2019, a period with little rainfall, spanning spatial variations. Most of the samples were collected in sunny weather, while a few were collected in rainy weather. Four samples were collected at each point in each sampling campaign. The monitoring indexes include water depth, depth ratio, temperature, dissolved oxygen (DO), conductivity, pH, velocity, potassium dichromate index (CODcr), ammonia nitrogen (NH3-N), total phosphorus (TP), total nitrogen (TN) and suspended solids (SS). Samples were stored at low temperature and sent to the laboratory within 4 h of collection. All samples were analyzed using the procedures of the Standard Methods for the Examination of Water and Wastewater [25]. Water depth was measured with an on-line water depth probe. The depth ratio was measured during on-site inspection with a meter ruler. Temperature, DO, conductivity, and pH were measured with a YSI multi-parameter portable water quality analyzer. Velocity was measured with a Doppler flowmeter (water depth >30 cm) or a mechanical velocity meter (water depth <30 cm). NH3-N was analyzed by Nessler's reagent spectrophotometry. TN was analyzed by potassium persulfate digestion-spectrophotometry. TP was analyzed by potassium persulfate digestion-molybdate spectrophotometry. CODcr was analyzed by potassium dichromate digestion. SS was analyzed by the suction filtration gravimetric method. The means of the original data at each monitoring point are presented in Table 1.

Methods
Principal component analysis is an analysis method based on statistical characteristics, which integrates multi-dimensional factors into the same system for quantitative research. The main idea is dimension reduction, which aims to decompose a variety of original variables, reveal the regularity between the internal variables of materials, summarize the main components, and achieve the best comprehensive simplification [26,27]. Applied to water quality assessment, principal component analysis can condense multiple indicators into a few uncorrelated comprehensive indicators while retaining most of the information of the original data, and directly reflect the degree, source, cause and spatial distribution of water pollution by the main pollutants [28,29]. Nonetheless, it is difficult to determine the class of water quality with this method alone.
This study combines the principal component analysis method and the fuzzy comprehensive assessment method to establish an improved principal component-fuzzy comprehensive assessment coupling model. The calculation steps and flow chart are described as follows (Figure 2).

Principal Component Analysis
(1) Index standardization and correlation matrix
In order to eliminate the influence of different dimensions and orders of magnitude, it is necessary to standardize the original data. The standardization formula is as follows:

$$X_{ij} = \frac{Y_{ij} - \bar{Y}_j}{S_j}$$

where $X_{ij}$ is the standardized index; $Y_{ij}$ is the original index; $\bar{Y}_j$ is the mean of the jth index sample; $S_j$ is the standard deviation of the jth index sample (j = 1, 2, ..., n). After index standardization, correlation analysis is carried out on the standardized indexes. The correlation coefficient matrix is as follows:

$$T = \left(T_{ij}\right)_{n \times n}$$

where $T$ is the correlation coefficient matrix and $T_{ij}$ is the correlation coefficient between the ith and jth assessment indexes.

(3) Principal component contribution and cumulative contribution
With $\lambda_i$ denoting the ith eigenvalue of $T$, the contribution of the ith principal component is $\lambda_i / \sum_{k=1}^{n} \lambda_k$, and the cumulative contribution of the first m principal components is $\sum_{k=1}^{m} \lambda_k / \sum_{k=1}^{n} \lambda_k$.

(4) Choice of principal components and calculation of the principal component load
Principal components are selected according to the principle that the eigenvalue is greater than 1, so that the original indexes are reduced to m principal components for the final assessment. Then, the principal component load is calculated as follows:

$$q_{ij} = \sqrt{\lambda_i}\, u_{ij}$$

where $q_{ij}$ is the principal component load; $u_{ij}$ is the eigenvector; $\lambda_i$ is the eigenvalue. According to the principal component load, the main influencing indexes of each principal component are selected to form the key indexes of this assessment.

(5) Calculation of the principal component score
The standardized data of each monitoring point are substituted into the expression of each principal component to obtain the score $F_i$ of each principal component. The comprehensive score $F$ is then obtained as the sum of the products of $F_i$ and the weighted value of the corresponding eigenvalue. The pollution level is sorted quantitatively: the higher the score $F$, the higher the pollution level. The comprehensive score is calculated as follows:

$$F = \sum_{i=1}^{m} f_i F_i$$

where $F$ is the comprehensive principal component score; $f_i$ is the weight coefficient of the ith principal component (the weighted value of its eigenvalue); $F_i$ is the score of the ith principal component.
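The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `pca_scores`, the synthetic `demo` data and the choice to normalize the weights $f_i$ over the retained eigenvalues only are assumptions, since the text does not spell them out.

```python
import numpy as np


def pca_scores(Y):
    """PCA on the correlation matrix of an (n_points, n_indexes) array."""
    Y = np.asarray(Y, dtype=float)
    # Step (1): standardize each index (column) with its mean and std.
    X = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    # Correlation coefficient matrix between the indexes.
    T = np.corrcoef(X, rowvar=False)
    # eigh returns eigenvalues in ascending order, so sort descending.
    eigvals, eigvecs = np.linalg.eigh(T)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step (3): contribution and cumulative contribution.
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)
    # Step (4): keep components with eigenvalue > 1 and compute loads.
    keep = eigvals > 1.0
    loads = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    # Step (5): component scores and comprehensive score.
    F_i = X @ eigvecs[:, keep]
    f_i = eigvals[keep] / eigvals[keep].sum()  # assumed weighting
    F = F_i @ f_i
    return {"eigvals": eigvals, "contribution": contribution,
            "cumulative": cumulative, "loads": loads, "F": F}


# Synthetic data (hypothetical): 17 points, 5 indexes, 3 of them correlated.
rng = np.random.default_rng(0)
base = rng.normal(size=(17, 1))
demo = np.hstack(
    [base + 0.3 * rng.normal(size=(17, 1)) for _ in range(3)]
    + [rng.normal(size=(17, 2))]
)
result = pca_scores(demo)
ranking = np.argsort(result["F"])[::-1]  # indices sorted by descending score
```

Because the sign of an eigenvector returned by a numerical routine is arbitrary, the orientation of each component should be checked against the loads before a higher score is read as heavier pollution.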

Fuzzy Comprehensive Assessment Method
(1) Select assessment parameters and establish assessment criteria. It is vital to select assessment parameters that are rational, representative and accurate to form the assessment factor set U. The set U is based on the key indexes of the principal component analysis method and can be expressed as:

$$U = \{u_1, u_2, \ldots, u_m\}$$

The assessment criteria set V is established based on the Environmental Quality Standards for Surface Water of China (GB3838-2002). Set V is expressed as:

$$V = \{\text{I}, \text{II}, \text{III}, \text{IV}, \text{V}, \text{Bad V}\}$$

The Environmental Quality Standards for Surface Water of China (GB3838-2002) list the grade standards of each pollution index, but not of velocity, which has been added as an index in the current assessment.
In hydraulics [30], in order to avoid or reduce the change of flow pattern in an open channel caused by scouring or deposition, velocity must be controlled within a certain range. Velocity should be less than the non-scouring velocity to avoid open channel erosion, and it should be greater than the non-silting velocity to prevent sedimentation and weed growth in the water. The values of the non-scouring and non-silting velocities depend on the soil quality, reinforcement, water depth and so on. They can be measured by experiments or taken from relevant manuals, or an empirical formula can be selected for calculation according to the actual situation of the local river [31]. In this assessment, the following formula used by most Chinese researchers is selected for rough estimation based on the hydraulics and the local actual situation [32,33].
where $v_1$ is the non-scouring velocity; $v_2$ is the non-silting velocity; $R$ is the hydraulic radius; $K$, $a$ and $e$ are empirical coefficients.
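The velocity requirement stated above (greater than the non-silting velocity and less than the non-scouring velocity) can be written as a simple check. The sketch below is illustrative only: the function name `velocity_within_limits` and the threshold values 0.3 m/s and 0.8 m/s are hypothetical and are not taken from the study.

```python
def velocity_within_limits(v, v_non_silting, v_non_scouring):
    """Return True if v lies strictly between the two limiting velocities."""
    if v_non_silting >= v_non_scouring:
        raise ValueError("non-silting velocity must be below non-scouring velocity")
    return v_non_silting < v < v_non_scouring


# Hypothetical limits in m/s, for illustration only.
V_NON_SILTING = 0.3
V_NON_SCOURING = 0.8

ok_mid = velocity_within_limits(0.5, V_NON_SILTING, V_NON_SCOURING)   # acceptable
ok_slow = velocity_within_limits(0.2, V_NON_SILTING, V_NON_SCOURING)  # risk of silting
ok_fast = velocity_within_limits(0.9, V_NON_SILTING, V_NON_SCOURING)  # risk of scouring
```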
(2) Establish membership functions of each assessment parameter to assessment criteria at each class.
The value of the fuzzy membership function of each assessment parameter with respect to the assessment criteria at each class is calculated by a set of formulae, one for j = 1, one for j = 2-5 and one for j = 6, where $x_i$ (i = 1, 2, ..., m) is the original monitoring data of the ith assessment parameter, and $s_{ij}$ (i = 1, 2, ..., m; j = 1, 2, ..., 6) is the membership degree of the ith assessment parameter to the assessment criterion at the jth class.
(3) Calculate the membership function matrix. The fuzzy matrix R is produced from the membership values and the corresponding quality parameters.
(4) Calculate the membership function weight matrix. The weights are calculated according to the degree to which each single factor exceeds the standard grades. The weight of each assessment parameter is allocated by a formula in terms of $C_i$ and $S_i$, where $C_i$ (i = 1, 2, ..., m) is the actual monitoring data of the ith assessment parameter and $S_i$ (i = 1, 2, ..., m) is the standard value of the assessment level of the ith class. The weights of the m assessment parameters together form the weight matrix W.

(5) Comprehensive assessment. Fuzzy comprehensive assessment is realized by a compound operation on the fuzzy matrix. The result is obtained by the compound operation of the weight matrix W and the fuzzy matrix R, namely B = W × R. The water quality class is finally represented by the grade to which the maximum value in B belongs.
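As an illustration of how such membership degrees are commonly computed, the sketch below uses a piecewise-linear membership function between adjacent class nodes. This form is an assumption, not the formulae of this study; the function name `membership_row` and the node values are hypothetical, and the sketch only covers indexes for which a larger value means worse quality.

```python
import numpy as np


def membership_row(x, nodes):
    """Membership of value x in each class, given ascending class nodes.

    Assumed piecewise-linear form: full membership in the first class at or
    below the first node, full membership in the last class at or above the
    last node, and linear sharing between the two neighbouring classes
    in between.
    """
    nodes = np.asarray(nodes, dtype=float)
    if np.any(np.diff(nodes) <= 0):
        raise ValueError("nodes must be strictly increasing")
    r = np.zeros(len(nodes))
    if x <= nodes[0]:
        r[0] = 1.0
    elif x >= nodes[-1]:
        r[-1] = 1.0
    else:
        # Find j such that nodes[j] <= x < nodes[j + 1].
        j = int(np.searchsorted(nodes, x, side="right")) - 1
        t = (x - nodes[j]) / (nodes[j + 1] - nodes[j])
        r[j] = 1.0 - t
        r[j + 1] = t
    return r


# Hypothetical nodes for six classes (not values from GB3838-2002).
demo_nodes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
row_mid = membership_row(2.5, demo_nodes)    # shared between classes 2 and 3
row_low = membership_row(0.5, demo_nodes)    # fully in the first class
row_high = membership_row(10.0, demo_nodes)  # fully in the last class
```

Stacking one such row per assessment parameter gives a membership matrix with one row per parameter and one column per class.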
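Steps (4) and (5) can be sketched as follows. The weighting shown, a normalized exceedance ratio $C_i / S_i$, and the use of an ordinary matrix product for the compound operation are assumptions made for illustration; the names `exceedance_weights` and `fuzzy_assess` and all numbers are hypothetical.

```python
import numpy as np

CLASS_LABELS = ["I", "II", "III", "IV", "V", "Bad V"]


def exceedance_weights(C, S):
    """Weights proportional to C_i / S_i, normalized to sum to 1 (assumed form)."""
    ratio = np.asarray(C, dtype=float) / np.asarray(S, dtype=float)
    return ratio / ratio.sum()


def fuzzy_assess(W, R):
    """Return B = W x R and the class label of its largest entry."""
    B = np.asarray(W, dtype=float) @ np.asarray(R, dtype=float)
    return B, CLASS_LABELS[int(np.argmax(B))]


# Hypothetical example with three parameters and six classes.
W_demo = exceedance_weights(C=[2.0, 1.0, 1.0], S=[1.0, 1.0, 1.0])
R_demo = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.2, 0.8],
    [0.0, 0.0, 0.5, 0.5, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
])
B_demo, label_demo = fuzzy_assess(W_demo, R_demo)
```

With weights that sum to 1 and membership rows that each sum to 1, the entries of B under a matrix product also sum to 1, which makes the result easy to sanity-check.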

Choice of Key Indexes
According to Formulas (1)-(5) in Section 2.3.1, the eigenvalues, principal components, cumulative contributions and principal component loads are calculated (Figures 3 and 4). According to the principle that the eigenvalue is greater than 1, four principal components are determined, and the cumulative contribution is 82.6%. Therefore, it can be concluded that 82.6% of the information of the original data is reflected in the four principal components corresponding to these eigenvalues, which should be retained in the assessment. The data (Figures 3 and 4) show that the first principal component contains the largest amount of information, with a contribution of 36.3%. The variables with larger loads are NH3-N, TP, TN and CODcr. For urban rivers, nitrogen and phosphorus are limiting nutrients that drive the excessive growth of algae. Once their content is too high, eutrophication results. CODcr reflects the extent to which water is contaminated by reducing substances such as organics, nitrites, and sulfides. Therefore, the first principal component mainly reflects the eutrophication level of the water body and domestic pollution.
The contribution of the second principal component is 24.8%. The variables with larger loads are velocity, water depth, pH, and dissolved oxygen. Water depth and velocity reflect water volume and hydrodynamic characteristics. pH represents the acidity and alkalinity and has a certain control effect on the redox reaction. Dissolved oxygen is an important indicator for aquatic life. Therefore, the second principal component reflects the hydrodynamic characteristics as well as the physical and chemical properties of the water body.
The contributions of the third and fourth principal components are 12.6% and 8.9%, respectively. The variables with larger loads are depth ratio, velocity and conductivity. These components contribute less, and some of their main loads coincide with those of the second principal component. Therefore, after the above analysis and comprehensive consideration, NH3-N, TP, TN, CODcr, pH, DO and velocity are selected as the key indexes of the fuzzy comprehensive assessment.
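As a consistency check on the reported figures, the contributions of the four retained principal components add up to the stated cumulative contribution:

```latex
36.3\% + 24.8\% + 12.6\% + 8.9\% = 82.6\%
```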

Comprehensive Score of Pollution Degree
Principal component analysis scores for 17 monitoring points are calculated. Subsequently, the water pollution situation at each point is sorted. The higher the score, the more serious the pollution degree. The degree of pollution at each point is shown in Figure 5.

Determination of the Water Quality Class
The assessment factor set U = {NH3-N, TN, TP, CODcr, pH, DO, velocity} is determined from the key indexes chosen by principal component analysis.
Due to differences in the wetted cross-section of each monitoring point, such as bottom width, water depth and slope coefficient, the non-scouring and non-silting velocities also differ. Based on the geological data and field visits, the soil type, slope coefficient, roughness, design flow rate and so on can be obtained. According to the empirical Formula (9) in Section 2.3.2 and the Design Standard for Irrigation and Drainage Engineering of China (GB50288-2018), the non-scouring and non-silting velocities can be calculated (Table 2). The assessment criterion of velocity is similar to that of pH: the water quality belongs to class I-V if the flow velocity is between the non-silting and non-scouring velocities; otherwise, the water quality is class Bad V.

Table 3. [Table body not recovered. Surviving entries: velocity (m/s) between the non-scouring and non-silting velocity, ≥ non-scouring velocity, ≤ non-silting velocity; DO (mg/L) ≥7.5, 6-7. (row truncated).]

The membership function matrices of the 17 monitoring points are calculated. Subsequently, the weighted row matrix of each monitoring point is calculated. Taking monitoring point Q1 as an example, its membership matrix and weighted row matrix are computed, and the fuzzy comprehensive assessment matrix of water quality is obtained by applying the weighted row matrix to the membership function matrix. The fuzzy comprehensive assessment matrix of Q1 is:

$$B_{Q1} = (0.242,\ 0.187,\ 0.1,\ 0.07,\ 0.317,\ 0.085)$$

For monitoring point Q1, the maximum membership degree is 0.317, which belongs to the fifth class, so its water quality is determined as class V.
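The maximum-membership step for Q1 can be reproduced directly from the reported assessment vector, with the first entry read as 0.242. In this small sketch the helper name `max_membership_class` is hypothetical, and the class order I, II, III, IV, V, Bad V is assumed from the six classes used in the assessment.

```python
CLASSES = ["I", "II", "III", "IV", "V", "Bad V"]


def max_membership_class(b):
    """Return (class label, membership degree) of the largest entry in b."""
    if len(b) != len(CLASSES):
        raise ValueError("expected one membership degree per class")
    j = max(range(len(b)), key=lambda k: b[k])
    return CLASSES[j], b[j]


# Fuzzy comprehensive assessment vector reported for monitoring point Q1.
B_Q1 = [0.242, 0.187, 0.1, 0.07, 0.317, 0.085]
label, degree = max_membership_class(B_Q1)
print(label, degree)  # prints: V 0.317
```

The entries sum to about 1, as expected when normalized weights are combined with membership rows that each sum to 1.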
Similarly, the membership matrices, weighted row matrices and fuzzy comprehensive assessment matrices of the remaining monitoring points are calculated in turn. According to the principle of maximum membership, the water quality class of each monitoring point is determined as shown in Figure 6.

Comparison of the Two Methods
In the results of the principal component scores (Figure 5), the monitoring points Q1, Z4, and Z5 have lower scores, which means their water quality is better than that of the other monitoring points. Similarly, in the results of the fuzzy comprehensive assessment (Figure 6), only the water quality of monitoring points Q1, Z4, and Z5 is classified as class V, and that of the remaining points is class Bad V, indicating that these three points have better water quality. In addition, the monitoring point Z3, with the highest score, has the worst water quality. Meanwhile, Z3 has the highest degree of membership among the 14 monitoring points whose water quality is class Bad V; therefore, it can be concluded that its water quality is relatively the worst. Consequently, the assessment results of the two methods are consistent from the perspective of the best and worst water quality. On the whole, the membership degrees of the three monitoring points in the Fenghuang stream are consistent with the ranking results of the principal component score. Although there are some differences between the membership degrees and principal component scores of the monitoring points in the Qingshui stream, the spatial variation of water quality evaluated by the two methods is basically the same. In summary, the above analyses indicate that the selected key indexes are representative and the assessment results are reliable.

Spatial Analysis of Water Quality
According to the results of the Fenghuang stream, the principal component score from upstream to downstream changes from −0.19 to 0.14 and ends with −0.419, which reveals that the water quality of the midstream is the worst, and that of the downstream is the best. According to the results of the Qingshui stream, the water quality of the upstream is better than that of the midstream and downstream, and that of the mainstream is better than that of the tributaries. The water quality of the monitoring points in each tributary shows no obvious regularity, and the pollution degree differs from point to point.
Among the monitoring points in the Fenghuang stream, F2 scores the highest and its pollution degree is the highest, followed by F3 and F1. Monitoring point F3 is close to the source of the river and less affected by human activities, so its water quality is relatively good. The reach from monitoring point F3 to F2 is the midstream, which flows through farmland and undergoes a process of pollutant accumulation. Monitoring point F2 is close to the sewage culvert and outlet. The concentration of nitrogen and phosphorus pollutants in the sewage is high; therefore, the water quality there is the worst. The reach from monitoring point F2 to F1 flows through the central town. Some domestic sewage and industrial sewage are discharged into the river, which introduces a large amount of pollutants. However, monitoring point F1 is closer to the Ciqikou scenic area, and the dredging frequency of this reach is high. Domestic sewage is sent to urban sewage treatment plants for unified treatment and discharge, so the water quality of F1 is the best.
Among the monitoring points in the Qingshui stream, Z3 scores the highest and Q1 scores the lowest. Most of the pollutants in the Qingshui stream come from farmland non-point source pollution, domestic sewage and industrial sewage. Monitoring point Z3 is located in the lower culvert of the Sha tie community, which mainly collects domestic sewage accumulated by the municipal pipe network. The effluent tank culvert is severely silted, and the concentration of pollutants such as ammonia nitrogen and total phosphorus is high. Therefore, the water quality of Z3 is the worst. Monitoring point Q1, which is close to the Ciqikou scenic area, is at the tail end of the mainstream. Owing to the large amount of inflow water, the dilution effect on pollutants is noticeable. Meanwhile, the inflow water is treated by an installed artificial aeration device. The pollutants are removed to a certain extent and their concentration is relatively low. Consequently, the water quality of Q1 is the best, although backflow from the Jialing River occurs. Monitoring points Z4 and Z5 have good water quality and the lowest pollutant concentrations among the monitoring points of the tributaries. The reach of Z4 and Z5 is mostly a confluence of mountain runoff and rainwater, without domestic sewage or other sewage. The pollutant load is small, but there is some farmland non-point source pollution. In the reach of Q5, Q4, Z1 and Z2, part of the restaurant sewage and domestic sewage is discharged directly into the river without being treated by a septic tank, resulting in serious siltation and deposition of particles and garbage from the sewage. The reach of Q7, Q8 and Z6 is fed by mountain runoff and is greatly affected by industrial sewage, poultry slaughtering sewage and farmland non-point source pollution.

Suggestions for Improving Water Quality
For the midstream, which has serious sedimentation of particles and garbage, a dredging project should be carried out and a clear river management mechanism should be established for regular dredging to ensure the long-term effectiveness of the project. In order to prevent domestic sewage from being discharged directly into the river, sewage interception and management works should be implemented in the midstream and downstream to collect the sewage overflowing from the sewage outlets and transfer it to the urban sewage treatment plant for unified treatment. Furthermore, restaurant sewage can be treated by a biological method, coagulation method, electrochemical method or adsorption method and discharged after reaching the standard. Anaerobic ponds, artificial wetlands, oxidation ponds, etc., can be appropriately constructed in the upstream area to purify rural sewage. Pesticides and fertilizers should be applied scientifically to reduce farmland non-point source pollution.
The supervision mechanism should be improved. Moreover, the sewage discharge and siltation of river should be checked regularly. The publicity and education on water pollution prevention should be strengthened and the environmental protection awareness of urban residents should be deepened.

Conclusions
(1) Based on the traditional indexes of water quality assessment, this study adds water quantity characteristic indexes and combines the principal component analysis method with the fuzzy comprehensive assessment method to propose an improved principal component-fuzzy comprehensive assessment coupling model for urban rivers. The model is found to be appropriate for water quality assessment with multiple monitoring points, multiple indexes, and a large number of samples, and its assessment results are reasonable and effective. The model comprehensively considers various assessment indexes, including water quality and quantity, and is capable of highlighting the influence of the key indexes. Meanwhile, it can not only quantitatively describe the spatial distribution of the water pollution rank at each monitoring point, but also determine the water quality class.
(2) The model was applied to the water quality assessment of the Qingshui and Fenghuang streams in Chongqing, China. The assessment results show that the key indexes include NH3-N, TP, TN, CODcr, pH, DO and velocity. The water quality of three monitoring points is classified as class V, and that of the remaining 14 monitoring points is classified as class Bad V. The pollution was generally severe. In the Fenghuang stream, the water quality of the midstream is the worst, and that of the downstream is the best. In the Qingshui stream, the water quality of the upstream is better than that of the midstream and downstream, and the water quality of the mainstream is better than that of the tributaries. The reasons for the spatial pollution pattern of the two rivers are analyzed as follows: the upstream is mainly affected by farmland non-point source pollution and domestic sewage pollution. Some reaches of the midstream receive direct discharges of domestic sewage, industrial sewage, and restaurant sewage; therefore, the sedimentation of particles and garbage is noticeable. The downstream has taken some protection measures such as artificial aeration and sewage interception and management, but there is still some direct discharge of domestic sewage and backflow of river water. The results provide the spatial variation and class of water quality for local environmental managers, and indicate the direction for determining pollution sources, management measures and treatment means. Therefore, to deal with the current pollution situation, it is necessary to take corresponding measures to control and maintain the water body according to the pollution causes of the different river sections and to protect the urban river ecosystem.