Determination of Pollution Loads in Spillways of the Combined Sewage Network of the City of Cuenca, Ecuador

Combined sewer overflow (CSO) is one of the main causes of contamination in receiving bodies during the rainy period. The objective of this research was to evaluate the behavior of three combined sewage discharges into the Tomebamba River in the city of Cuenca, Ecuador. To this end, 18 CSO events were recorded. The following water quality parameters were analyzed from the field survey (March 2017 to May 2018): conductivity, turbidity, BOD5, COD, fecal and total coliforms, nitrates, nitrites, ammoniacal nitrogen, dissolved orthophosphate and total phosphorus. The results show that CSOs contribute to the deterioration of the water quality of the Tomebamba River during the rainy season. The analysis of the dynamics of the pollutants showed that the maximum conductivity values occur at the beginning of the discharge, and the maximum turbidity occurs near the peak discharge flow. The relationship between rain and the characteristics of the CSO was also analyzed through a canonical correlation analysis and partial least squares regression, obtaining a prediction model of pollutants based on the precipitation parameters. These results can be used for the implementation of integrated ecological models that enable a complete analysis of the city’s sanitation systems, their impact on the receiving bodies and their restoration.


Introduction
In several cities around the world, wastewater and urban runoff are transported through the same network, called the combined sewage network (CSN). Generally, the CSN conducts the flow to a wastewater treatment plant (WWTP), which then returns it to a receiving body. During intense rainfall events, the flow rates in the CSN can greatly exceed the capacity of the WWTP, so structures are built that allow water to be discharged without any treatment directly into the receiving body [1][2][3][4]. This process is called combined sewer overflow (CSO). CSO events have a significant influence on the degradation of the water quality of the receiving bodies during the wet season [2,5,6,7,8,9].
Urban wastewater is mainly composed of organic matter, which causes reduced oxygen levels [10,11]. Additionally, this water contains an important load of microorganisms, most of which are pathogens responsible for transmitting waterborne diseases [10]. On the other hand, runoff waters may contain suspended solids, microorganisms, organic matter, pesticides and metals, depending on their type [11]. The quality of the water that overflows during CSO events varies within the same storm event and between storms [12] and also depends on the characteristics of the rain [2,13,14]. [...] integrated ecological models for the analysis of the city sanitation systems, their impact on the receiving bodies and their restoration. None of these aspects has been studied before with regard to the CSOs of Ecuador.

Study Area
The study area includes three urban micro-basins within the city of Cuenca, the capital of the province of Azuay, Ecuador. Cuenca is located in the southern part of the Andean mountains, in the geomorphological zone cataloged as the inter-Andean valley, at an altitude that varies from 2300 to 2900 m ASL, covering an area of 36 km². The annual average temperature varies with altitude from 9 to 16.3 °C [28,29]. The rainy season runs from the middle of February until the beginning of July and from the second half of September until the first two weeks of November, while the rest of the year constitutes the dry season [24]. The average rainfall is approximately 879 mm per year [30].
The estimated population in the urban area of Cuenca in 2020 is 407,000 inhabitants [31]. The city is crossed by four main water bodies: the Tomebamba, Yanuncay, Tarqui and Machángara Rivers. Most contamination in the Tomebamba and Yanuncay Rivers is due to urbanization. On the other hand, in the Machángara River, contamination is also due to industrial discharge. Pollution in the Tarqui River is due to both urbanization and livestock activities in its valleys [24].
The drainage system of the city of Cuenca consists of a sewerage network of approximately 1300 km and an interceptor network of approximately 80 km, the latter leading the flow to the Ucubamba wastewater treatment plant (U-WWTP). This plant constitutes the main wastewater treatment infrastructure for the city [22]. The U-WWTP covers an area of approximately 45 hectares and is located two kilometers downstream from the confluence of the Machángara with the Tomebamba River (Figure 1). The U-WWTP treats an average flow of approximately 1.2 m³/s [32] and has been in operation since November 1999 [33]. About 80% of the wastewater in the city of Cuenca is treated before its return to the water bodies [22].
The three combined sewerage discharges analyzed in this document are located on the right bank of the Tomebamba River near the city center (Figure 1). The first CSO is called Coliseo, which covers a contribution area of around 18.1 ha. The El Vado CSO is located approximately 1100 m downstream, and its contribution area reaches 56.0 ha. Finally, 500 m downstream from the El Vado CSO is the Multifamiliares CSO, whose contribution area is 19.4 ha.
During this investigation, a total of 18 CSO events were registered, three in the Coliseo CSO, nine in the El Vado CSO and six in the Multifamiliares CSO. The samplings were carried out during the period from March 2017 to May 2018.

Discharge Flow Measurement
Flow rates at the discharge points were determined using the section-slope relationship method [34] through the automatic measurement of the water level and a characterization of the uniform flow present in the approach section near the discharge. The water height was recorded at one-minute intervals using a level sensor connected to an ISCO 6712 probe [20]. Based on the flow height data and information on the geometric and hydraulic characteristics of the pipeline, the discharge hydrograph was determined for each CSO event (Figure 2).
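The conversion from recorded water level to discharge can be sketched as follows. This is only an illustration under stated assumptions: it assumes that the uniform-flow relation is Manning's equation and that the approach pipe is circular, neither of which is specified in the text, and the diameter, slope and roughness values are hypothetical placeholders rather than the surveyed characteristics of any of the three CSOs.

```python
import math

def circular_pipe_flow(depth_m, diameter_m, slope, manning_n):
    """Uniform-flow discharge (m^3/s) in a partially full circular pipe (Manning)."""
    if depth_m <= 0.0:
        return 0.0
    depth_m = min(depth_m, diameter_m)  # cap the level at a full pipe
    radius = diameter_m / 2.0
    # Central angle (rad) subtended by the free water surface.
    cos_half = max(-1.0, min(1.0, 1.0 - depth_m / radius))
    theta = 2.0 * math.acos(cos_half)
    area = radius ** 2 * (theta - math.sin(theta)) / 2.0
    wetted_perimeter = radius * theta
    hydraulic_radius = area / wetted_perimeter
    return area * hydraulic_radius ** (2.0 / 3.0) * math.sqrt(slope) / manning_n

# Hypothetical pipe (assumed values): 1.0 m diameter, 1% slope, n = 0.013.
# One level reading per minute, as recorded by the level sensor.
levels_m = [0.0, 0.025, 0.10, 0.30, 0.50, 0.30, 0.10]
hydrograph = [circular_pipe_flow(h, 1.0, 0.01, 0.013) for h in levels_m]
```

Applying the function to each one-minute level reading yields the discharge hydrograph of an event; any other stage-discharge relation could be substituted without changing the rest of the procedure.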
Water 2020, 12, x FOR PEER REVIEW

Sample Collection
Parallel to the flow registration, the samples to determine the water quality parameters were collected. Table A1 presents a summary of the data collected from the monitoring of the CSOs. The sampling and measurement of water levels were carried out with the ISCO 6712 automatic equipment and its integrated pumping system [20].
Since some precipitation does not generate sewer overflow, it is necessary to define the threshold above which a CSO event is considered. The definition of this threshold is related to the influence of the overflow on the water quality of the river. If the overflow is slight, the dilution effect of the contaminants is negligible; therefore, it is not considered a CSO event [8]. In this research, a threshold related to the height of water in the discharge section was used. The water level at which sampling began corresponds to h_min = 2.5 cm (Figure 2).
During a CSO event, once the water level exceeds the h_min threshold (Figure 2), the ISCO 6712 begins collecting samples at 3-min intervals [8,20]. The samples are stored at the bottom of the device in 24 one-liter plastic bottles. Therefore, due to the capacity of the ISCO 6712, the maximum sampling span is 72 min (24 bottles multiplied by 3 min) for each CSO event. If the discharge extends beyond 72 min, the entire discharge process cannot be sampled. Of the 18 events recorded, 11 had a discharge time of less than 72 min and only three lasted a little over 100 min.
The water quality parameters considered for the investigation were conductivity, turbidity, five-day biochemical oxygen demand (BOD5), chemical oxygen demand (COD), fecal coliforms (FC), total coliforms (TC), nitrates (NO3-N), nitrites (NO2-N), ammoniacal nitrogen (NH3-N), dissolved orthophosphate (PO4-P) and total phosphorus (P). The conductivity and turbidity were recorded for all 24 samples from each test. Subsequently, five samples were selected for the COD analysis based on the maximum conductivity value: the sample with the highest conductivity, plus two earlier and two later samples, skipping one sample between consecutive selections. BOD5, total coliforms and fecal coliforms were obtained in most cases, but only for the sample with the highest conductivity value. In the last two samplings corresponding to the El Vado CSO, additional parameters were measured, NO3-N, NO2-N, NH3-N, PO4-P and P, for five of the 24 samples in each test (Table A2).
Conductivity was recorded with a YSI 600R multiparameter probe (YSI, a Xylem brand, Yellow Springs, OH, USA), while a Turb 555 turbidimeter (WTW, a Xylem brand, Weilheim, Germany) was used to measure turbidity. These two tests were performed in the hydrophysics laboratory of the Water and Soil Management Program, PROMAS, University of Cuenca. The parameters BOD5, COD, FC, TC, NO3-N, NO2-N, NH3-N, PO4-P and P were analyzed in the water laboratory of the Faculty of Chemical Sciences of the University of Cuenca using standardized methodologies [8,10,11].
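The sampling window and the choice of the five COD samples can be illustrated with a short sketch. The function names are hypothetical, and the handling of a conductivity peak that falls near the first or last bottle (out-of-range positions are simply dropped) is an assumption, since the text does not describe that case.

```python
BOTTLES = 24        # one-liter bottles in the ISCO 6712
INTERVAL_MIN = 3    # minutes between consecutive samples

def sampling_window_min(bottles=BOTTLES, interval_min=INTERVAL_MIN):
    """Longest discharge duration (min) that can be fully sampled."""
    return bottles * interval_min

def select_cod_samples(conductivity, step=2, per_side=2):
    """Indices of the peak-conductivity sample plus `per_side` samples on
    each side, taking every `step`-th bottle (i.e. skipping one sample)."""
    peak = max(range(len(conductivity)), key=conductivity.__getitem__)
    candidates = [peak + k * step for k in range(-per_side, per_side + 1)]
    # Assumption: positions outside the 24 bottles are dropped, not shifted.
    return [i for i in candidates if 0 <= i < len(conductivity)]

# Hypothetical conductivity series (uS/cm) with its peak at bottle index 6.
example = [150.0 + 10.0 * i for i in range(7)] + [200.0 - 5.0 * i for i in range(17)]
selected = select_cod_samples(example)
```

With the peak at index 6 the rule returns indices 2, 4, 6, 8 and 10, which corresponds to one COD value every 6 min around the conductivity maximum.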

Dynamics of Pollutants during CSO
The quality of the water that overflows during the CSO varies within the same storm event, since the first water during a storm is dirtier than that at the end of the storm. It also varies between storms, depending on the magnitude and duration of the storm, as well as the length of time without rain before the event [12].
To assess the behavior of the pollutants during the CSO process, two events from each discharge were chosen. The selection of these two events was made based on the following criteria: (i) The first event had BOD5 and COD values close to the median; (ii) The second selected event had maximum BOD5 and COD values.
Criterion (i) looked for an average event related to contamination by organic matter, while criterion (ii) looked for an event with high contamination by organic matter in the wastewater.

Relationship between Rain and CSO Characteristics
The relationship between rainfall variables and CSO characteristics can be studied using multivariate statistical methods [3,14]. Canonical correlation is a way to evaluate the relationship between two multidimensional data sets, as is the case of the variables for rainfall and CSOs [14,35,36].
Canonical correlation analysis (CCA) and partial least squares regression (PLSR) were used to explore the relationships between rainfall variables and the characteristics of the CSOs. PLSR was used to develop equations that can estimate the parameters that characterize a CSO event from rainfall data. The rainfall data were obtained from the hydrological station of the University of Cuenca, which is located between the contribution areas of the El Vado and Multifamiliares CSOs (Figure 1).
To determine the rainfall parameters, it is necessary to delimit the rainfall events. Weyrauch et al. [8] defined a single rain event as having at least a six-hour interval of no rain between two events. Instead, Yu et al. [37] defined a period of four hours. For the present study, the threshold for delimiting one rainfall event from another was three hours without precipitation. This was defined considering the size of the contribution basins, their hydrological responses and the rainfall regime observed in the pluviographic data.
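The three-hour delimitation rule can be sketched as below. The record format (a time in minutes paired with a rainfall depth in mm) and the function name are hypothetical, and the dry gap is measured between the timestamps of consecutive wet records, which is a simplification of how a pluviographic series might actually be processed.

```python
def split_rain_events(records, min_dry_gap_min=180):
    """Group wet records into rain events.

    records: list of (time_min, depth_mm) tuples sorted by time.
    A new event starts when at least `min_dry_gap_min` minutes separate
    two consecutive wet records (3 h in this study).
    """
    events = []
    current = []
    last_wet_time = None
    for time_min, depth_mm in records:
        if depth_mm <= 0.0:
            continue  # dry record: contributes only to the gap
        if last_wet_time is not None and time_min - last_wet_time >= min_dry_gap_min:
            events.append(current)
            current = []
        current.append((time_min, depth_mm))
        last_wet_time = time_min
    if current:
        events.append(current)
    return events

# Hypothetical series: rain at 0-100 min, then again from 300 min onward.
series = [(0, 1.0), (5, 0.5), (50, 0.0), (100, 0.2), (300, 1.0), (305, 2.0)]
events_3h = split_rain_events(series)                        # threshold used here
events_6h = split_rain_events(series, min_dry_gap_min=360)   # Weyrauch et al. [8]
```

In this made-up series the 200-min dry spell splits the record into two events under the three-hour rule but leaves a single event under a six-hour rule, which shows how the choice of threshold changes the resulting rainfall parameters.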
For each rain event, the following independent and predictive variables were determined (Table 1): the rainfall depth (R_d), the maximum rainfall intensity at five minutes (I_max), the mean rainfall intensity (I_mean), the duration of the rainstorm event (D) and the dry weather duration before the rainfall event (D_d) [3,13,14,20,27]. The maximum intensity values (I_max) strongly depend on the time steps in which they are determined. Sandoval et al. [14] considered a duration of 30 min to determine the maximum rainfall intensity. This interval was defined considering the response time of the catchment. According to Sandoval et al. [14], the correlations between rainfall and CSO parameters did not vary for other maximum intensities defined in intervals of 5 and 15 min. In this sense, considering that the concentration time of the contribution catchment is relatively short (10-15 min), the maximum intensity of a rainfall event in a period of 5 min was taken to define the parameter I_max [20]. The dependent variables related to the characteristics of the CSOs are (Table 1) the maximum discharge flow (Q_max), the mean conductivity (C_mean), the mean turbidity (Tur_mean) and the average chemical oxygen demand (COD_mean) [3,13,14,20].
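A minimal sketch of how the five rainfall parameters might be computed for one event is given below. It assumes a regular rain-gauge time step of one minute, intensities in mm/h and durations in minutes; these units and the time step are assumptions, because Table 1 is not reproduced here, and the function and key names are hypothetical.

```python
def rainfall_parameters(depths_mm, step_min=1, window_min=5, dry_before_min=None):
    """Rainfall descriptors for a single event.

    depths_mm: rainfall depth per time step from first to last wet step.
    Returns R_d (mm), I_max (mm/h over `window_min`), I_mean (mm/h),
    D (min) and D_d (min, passed through from the event delimitation).
    """
    n_window = max(1, window_min // step_min)
    r_d = sum(depths_mm)
    duration_min = len(depths_mm) * step_min
    i_mean = r_d / (duration_min / 60.0) if duration_min > 0 else 0.0
    # Largest depth accumulated in any moving window of `window_min` minutes.
    n_starts = max(1, len(depths_mm) - n_window + 1)
    max_window_depth = max(sum(depths_mm[i:i + n_window]) for i in range(n_starts))
    i_max = max_window_depth / (window_min / 60.0)
    return {"R_d": r_d, "I_max": i_max, "I_mean": i_mean,
            "D": duration_min, "D_d": dry_before_min}

# Hypothetical 10-min event: 1 mm/min during five consecutive minutes.
params = rainfall_parameters([0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.5],
                             dry_before_min=600)
```

Changing `window_min` to 15 or 30 reproduces the alternative definitions of the maximum intensity discussed by Sandoval et al. [14].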
Correlation matrices were preliminarily analyzed. The correlation coefficients of the different pairs of parameters (rainfall and CSO parameters) were calculated, and the estimated p value was used to determine significant correlations [38]. Only relationships with a p value less than 0.05 are considered significant. For the Coliseo CSO it was not possible to determine statistical significance, due to the small number of available data (fewer than four) [39].
In this research, the R statistical analysis software [39] and the Student version InfoStat software [40] were used to develop the canonical correlation analysis and the partial least squares regression model.

Canonical Correlation Analysis
Canonical correlation analysis (CCA) is a multidimensional exploratory statistical method [36,41,42]. The main purpose of the CCA is the exploration of sample correlations between two sets of quantitative variables [36,43]. The CCA is based on the correlation between a linear combination of the variables in one set (rainfall parameters) with a linear combination of the variables in the other set (CSO characteristics). First, the pair of linear combinations with the maximum correlation is determined. Then, the pair with the maximum correlation is determined between the pairs not correlated with the first, etc. The linear combinations of a pair are called canonical variables and the correlation between them is called canonical correlation [41].
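In generic textbook notation (not taken from the paper), with x the vector of rainfall parameters, y the vector of CSO characteristics and Σ the corresponding sample covariance blocks, the first canonical correlation described above can be written as:

```latex
\rho_1 \;=\; \max_{a,\,b}\ \operatorname{corr}\!\left(a^{\top}x,\; b^{\top}y\right)
       \;=\; \max_{a,\,b}\ \frac{a^{\top}\Sigma_{xy}\,b}
                                 {\sqrt{a^{\top}\Sigma_{xx}\,a}\;\sqrt{b^{\top}\Sigma_{yy}\,b}}
% Subsequent pairs (a_k, b_k) maximize the same ratio subject to
% \operatorname{corr}(a_k^{\top}x,\, a_j^{\top}x) = 0 and
% \operatorname{corr}(b_k^{\top}y,\, b_j^{\top}y) = 0 for all j < k.
```

Here a^T x and b^T y are the canonical variables. With five rainfall parameters and four CSO characteristics, at most four canonical pairs can be extracted.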
If the canonical relationship was statistically significant at the 0.05 level, the canonical functions were examined to determine the relative importance of each of the original variables in the canonical relationships. The method used to interpret the results was canonical loadings [14,43].

Partial Least Squares Regression
The partial least squares regression (PLSR) aims to explain one or more response variables of one of the groups through the variables of the other group [36]. PLSR has advantages over other regression methods [44][45][46] and was developed to avoid, among other problems, the effect of multicollinearity on the explanatory variables of a regression, as well as the problem that occurs when the number of individuals is less than the number of variables [45,47]. PLSR is also usually used when there is a relationship between the predictor variables [44]. Wold [48] states that PLSR is most useful for predictive causal analysis in highly complex situations with little developed theoretical knowledge. PLSR may be the most suitable model for predictive purposes [45,49]. PLSR has advantages in situations with few samples, as is the case in the present study. PLSR can be a powerful analysis tool due to its minimal demands for measurement scales, sample sizes and residual distributions [45]. It does not require that the data come from normal or known distributions [50]. However, it should be noted that although the hypothesis of data normality is rarely found in reality, and despite the fact that this restriction can be avoided with this technique, the results and decisions based on them are clearly compromised [45].
PLSR can integrate multiple dependent and independent variables [51,52]. However, each time dependent variables are included, the model loses precision and predictive power [53]. Therefore, multiple PLSR models could be formulated considering the rainfall characteristics as the multivariate input and the different CSO characteristics as the individual output in all cases, adjusting the model parameters through a cross-validation procedure [14,44].
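The single-response setup described above can be illustrated with a small PLS1 sketch based on the NIPALS algorithm. This is not the authors' implementation (the study used R and InfoStat); it is a dependency-free illustration, the function names are hypothetical, and the cross-validation used to choose the number of components is omitted.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model (one response) with the NIPALS algorithm."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no covariance left between predictors and response
        w = [v / norm for v in w]                              # weights
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate predictors and response before the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict the response for one new observation x."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat

# Synthetic check (made-up data): y is an exact linear function of two predictors.
X_demo = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
y_demo = [1.0 + 2.0 * a + 3.0 * b for a, b in X_demo]
model_demo = pls1_fit(X_demo, y_demo, n_components=2)
```

In the study's setting, the rows of X would hold the rainfall parameters of each event and y one CSO characteristic at a time (for example Q_max), with one model fitted per characteristic.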

Characterization of the Pollutant Load
The boxplots in Figure 3 give information about the spread, asymmetry and outliers of the values of the recorded parameters [54]. For all the samples analyzed in the three CSOs during the entire test period, the conductivity varied from 50 to 616 µS/cm, the turbidity was in the range of 9 to 378 NTU, and the BOD5 registered values were between 7 and 528 mg/L; the COD remained between 40 and 1450 mg/L (Figure 3). The Coliseo CSO presented higher values of turbidity, COD and BOD5 compared to those found in the El Vado and Multifamiliares CSOs (Figure 3b,d). This suggests a higher degree of pollution related to suspended material, microorganisms and organic matter from the Coliseo CSO.
In general, in the El Vado and Multifamiliares CSOs, the data appear to be more dispersed than the data in the Coliseo CSO (Figure 3). This may be due to the fact that more trials are available in the El Vado and Multifamiliares CSOs. Additionally, from the box plots shown in Figure 3, it can be seen that in almost all cases the lower whisker is shorter than the upper whisker. In the El Vado, Multifamiliares and Coliseo CSOs, 7.9%, 8.1% and 1.3% of the observations exceed the upper whisker, respectively. It can be argued that this phenomenon is not a major problem in exploratory data analysis. On the contrary, this gives an additional graphic indication of the shape of the distribution [55]. In this case, this suggests that the data distribution may be skewed to the right [54,55].
For the flows registered during the monitoring period, in the El Vado CSO, the average flow fluctuated between 16.1 and 1170.8 L/s, and the maximum peak reached 2002.2 L/s. This CSO provides the largest volume of residual water during CSO events. In the Multifamiliares CSO, the average flow was recorded between 8.5 and 252.3 L/s, and the maximum peak reached 862.0 L/s. In the Coliseo CSO, the average flow varied between 7.0 and 237.7 L/s, and the maximum peak was 508.8 L/s. Jerves-Cobo et al. [24] recorded flow measurements in the Tomebamba River between 11.12 and 19.79 m³/s (average of 17.50 m³/s) between the Coliseo and Multifamiliares CSOs during the rainy season.
The average BOD5 values for the El Vado, Multifamiliares and Coliseo CSOs were 92.7, 134.1 and 170.0 mg/L, respectively (Figure 3d). On the other hand, Jerves-Cobo et al. [24] determined the BOD5 values for the section of the Tomebamba River between the Coliseo and Multifamiliares CSOs to be below 2.5 mg/L during the wet season. The above shows a significant contribution of organic matter from the CSOs to the Tomebamba River during high rainfall events. For conductivity (Figure 3a), the values obtained in the three discharges were mostly between 120 µS/cm and 240 µS/cm. However, the Multifamiliares and El Vado CSOs presented more dispersed data than the Coliseo CSO. Jerves-Cobo et al. [24] obtained conductivity values between 92 and 95 µS/cm in the study section in the Tomebamba River during the rainy season. The average conductivity of the wastewater discharged by the CSOs is higher than the conductivity found in the Tomebamba River. The average turbidity at the El Vado, Multifamiliares and Coliseo CSOs was 130.6, 96.2 and 216.3 NTU, respectively (Figure 3b). Jerves-Cobo et al. [24] recorded turbidity values in the range of 4.4 to 8.7 NTU in the study section in the Tomebamba River during the wet season. Regarding the microbiological parameters, the three CSOs registered fecal coliform values in the range of 1.6 × 10⁶ to 1.9 × 10¹⁰ MPN/100 mL (Figure 3e) and total coliforms from 4.8 × 10⁶ to 4.1 × 10¹⁰ MPN/100 mL (Figure 3f). Jerves-Cobo et al. [24] determined that the fecal coliform and total coliform values in the stretch of river spanning the three CSOs were in the order of 1.1 × 10⁵ to 2.4 × 10⁵ MPN/100 mL and 1.4 × 10⁵ to 9.2 × 10⁵ MPN/100 mL, respectively. These records correspond to the wet season, with flows measured in the Tomebamba River between 11.12 and 19.79 m³/s. The difference between the values of the parameters registered in the three CSOs and those recorded in the receiving body is large. This means that the waters of the CSOs present a high degree of contamination compared to the waters of the Tomebamba River. The combined sewer overflows that pour into the Tomebamba River during its passage through the city contribute to the deterioration of the river's water quality during the rainy period.
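The scale of this contribution can be illustrated with a complete-mix mass balance. This is only a rough sketch and not part of the study's analysis: it assumes instantaneous and complete mixing with no decay, and it combines river values from Jerves-Cobo et al. [24] with CSO values from this survey even though they were not measured simultaneously. The function name is hypothetical.

```python
def mixed_concentration(q_river, c_river, q_cso, c_cso):
    """Fully mixed concentration downstream of a discharge.

    Flows in m^3/s, concentrations in mg/L. Assumes complete mixing
    and conservative behavior (no decay or settling).
    """
    return (q_river * c_river + q_cso * c_cso) / (q_river + q_cso)

# Illustrative inputs taken from values quoted in the text:
Q_RIVER = 17.50     # m^3/s, average wet-season river flow [24]
BOD_RIVER = 2.5     # mg/L, upper bound of river BOD5 [24]
Q_CSO = 1.1708      # m^3/s, highest event-average flow at El Vado
BOD_CSO = 92.7      # mg/L, average BOD5 at El Vado

bod_mixed = mixed_concentration(Q_RIVER, BOD_RIVER, Q_CSO, BOD_CSO)
```

Under these assumptions the mixed BOD5 comes out several times higher than the upstream river value, which is consistent with the qualitative conclusion above; the exact figure should not be read as a measured result.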
Figure 4 shows the results of the measurements of nitrates (NO3-N), nitrites (NO2-N), ammoniacal nitrogen (NH3-N), dissolved orthophosphate (PO4-P) and total phosphorus (P), which were carried out during the last two CSO events in the El Vado CSO (Table A2). The samplings registered nitrate values between 0.12 and 0.53 mg/L (Figure 4a). Jerves-Cobo et al. [24] determined NO3-N values between 0.2 and 0.3 mg/L for the stretch of river between the three CSOs during the wet season. The nitrite concentration in the El Vado CSO was in the order of 3.74 to 66.27 µg/L (Figure 4b). According to Jerves-Cobo et al. [24], the NO2-N values measured in the corresponding section of the Tomebamba River varied between 3.3 and 8.2 µg/L in the rainy season. Based on the parameters NO3-N and NO2-N, it is evident that the El Vado CSO contributes to the deterioration of the waters of the Tomebamba River during CSO events. The dissolved orthophosphate values were in the range of 0.13 to 1.03 mg/L, with a mean value of 0.49 mg/L (Figure 4d). Ammoniacal nitrogen ranged from 0.49 to 2.54 mg/L, with a mean value of 1.34 mg/L (Figure 4c). Finally, the total phosphorus values were between 2.47 and 5.65 mg/L, with an average value of 3.77 mg/L (Figure 4e). Ecuadorian regulations establish a maximum daily mean value of 30 mg/L for ammoniacal nitrogen and 10 mg/L for total phosphorus [26] for the discharge of effluents into a body of fresh water. The concentrations of these two parameters in the El Vado CSO are below their respective limits.

BOD5/COD Ratio
One of the most important characteristics of wastewater is its biodegradability, which informs the feasibility of applying biological treatment methods. The ratio between the five-day biochemical oxygen demand and the chemical oxygen demand (BOD5/COD) for all the samples analyzed was in the range of 0.04 to 0.92. This result evidences a high variation in the carbon sources of wastewater, showing a wide range from small to large amounts of biodegradable matter in wastewaters [22]. Table 2 shows that the Multifamiliares and El Vado CSOs display a greater range of variability, indicating a wider range in the type of pollutants. Jerves-Cobo et al. [22] adjusted the relationship between the BOD5 and COD variables by means of a logarithmic regression with an R² of 0.62. In this study, it was determined that the data fit better with a quadratic relationship, in which R² was 0.73 (Appendix A, Figure A1). These results can be used for the prediction of BOD5 in discharges from the COD values.

Environmental Legislation
Ecuadorian legislation establishes water quality criteria for the discharge of effluents into freshwater bodies [26]. This regulation only limits the concentrations of the daily mean values of the effluents. For the average values of COD in the Multifamiliares discharge, 67% of the events were below the concentrations established in the regulations (COD ≤ 200 mg/L). In the El Vado CSO, 33% of the events complied with the regulations. This CSO is an important source of contamination since it provides a greater volume of flow compared to other discharges. The Coliseo CSO presents high COD values, showing that none of the registered events complied with the environmental regulations for the discharge of effluents in freshwater receiving bodies.
For BOD5, only one sample was analyzed for each CSO event. Therefore, it was not possible to establish a characteristic mean value for the discharge event. However, for the purposes of comparison with the regulations, it is assumed that the point value obtained represents the average concentration of the CSO event. Thus, in the discharge data for Multifamiliares, 67% of the samples comply with the limit established by the legislation for BOD5 (≤100 mg/L). In the El Vado CSO, 56% of the samples comply with the regulations, and in the Coliseo CSO, none of the samples meet the BOD5 limit. For the microbiological parameters, the regulations establish a limit of 2000 MPN/100 mL for the daily average value of fecal coliforms [26]. Although mean fecal coliform values were not obtained for each CSO event, all the samples analyzed recorded values above this limit (between 800 and 1.0 × 10⁷ times higher).
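The arithmetic behind these comparisons can be checked with a short sketch. The exceedance factors use the fecal coliform range reported for the three CSOs (1.6 × 10⁶ to 1.9 × 10¹⁰ MPN/100 mL) and the 2000 MPN/100 mL limit. The compliant-event counts are inferred from the reported percentages and the number of events per CSO (6, 9 and 3); they are an assumption, not values stated in the text, and the function names are hypothetical.

```python
FC_LIMIT_MPN = 2000.0  # daily-average fecal coliform limit, MPN/100 mL [26]

def exceedance_factor(value, limit=FC_LIMIT_MPN):
    """How many times a measured value exceeds the regulatory limit."""
    return value / limit

def percent_compliant(n_compliant, n_events):
    """Share of events meeting a limit, rounded to the nearest percent."""
    return round(100.0 * n_compliant / n_events)

# Fecal coliform range measured in the three CSOs (MPN/100 mL).
fc_low, fc_high = 1.6e6, 1.9e10
factors = (exceedance_factor(fc_low), exceedance_factor(fc_high))

# Inferred (assumed) counts of BOD5-compliant events per CSO.
bod_compliance = {
    "Multifamiliares": percent_compliant(4, 6),
    "El Vado": percent_compliant(5, 9),
    "Coliseo": percent_compliant(0, 3),
}
```

The lower factor is exactly 800 and the upper one is 9.5 × 10⁶, which rounds to the 1.0 × 10⁷ quoted above.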
The results show the need for legislation that incorporates specific parameters that characterize the combined sewer overflow processes, an aspect that has already been implemented in other places around the world [13]. For instance, in the province of Quebec, Canada, the legislation recommends a maximum frequency of seven CSO spill per year under wet weather conditions [27]. In the El Vado ( Figure 5) and Coliseo ( Figure A3) CSOs, despite the peak flow being higher in event i, the average values of conductivity, turbidity and COD are higher in event ii. However, in the Multifamiliares CSO ( Figure A2), event ii (which has the highest flow) coincides with the highest records of concentrations of conductivity, turbidity and COD. This analysis suggests that the concentration values of these pollutants do not necessarily grow with the magnitude of the CSO event in terms of flow.

Dynamics of Pollutants during CSO
Some characteristics of the pollutants observed in the six analyzed events were identified, regardless of the discharge or the application of criterion i or ii. These characteristics are analyzed below. Figure 5a(i,ii), Figure A2a(i,ii) and Figure A3a(i,ii) show that peak conductivity values occur between the initial part of the CSO event and the maximum flow value of the discharge hydrograph in the three CSOs. This suggests a higher degree of contamination related to dissolved pollutants at the start of the CSO event. A similar conclusion was reached by Passerat et al. [2], who evaluated the variation in the contaminating components during a precipitation event in one of the main CSOs that discharge to the Seine River in Paris, France.
In all the events analyzed and presented in Figure 5b(i,ii), Figure A2b(i,ii) and Figure A3b(i,ii), the maximum turbidity values occur near the peak flow of the discharge hydrograph; a similar result was determined by García et al. [56]. Turbidity is related to the suspended matter load in the wastewater. The increase in turbidity around the hydrograph peak may be due to the resuspension of sedimented material in the pipes, caused by an increase in tractive force as the flow increases. Passerat et al. [2] determined that around 75% of the suspended matter found in the combined sewer discharge came from the resuspension of sedimented particles in the pipes. The authors also determined that the intensity of precipitation influences the percentage of suspended matter found in the wastewater during discharge.
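The timing pattern described in this section (conductivity peaking at or before the flow peak, turbidity peaking near it) can be checked for each event by comparing the time of each pollutant maximum with the time of the hydrograph maximum. A minimal sketch follows; the series are hypothetical values chosen only to illustrate the calculation, and the function names are not from the study.

```python
def peak_time(times_min, series):
    """Time at which the series reaches its (first) maximum."""
    i = max(range(len(series)), key=lambda k: series[k])
    return times_min[i]

def peak_lags(times_min, flow, conductivity, turbidity):
    """Lag in minutes of each pollutant peak relative to the flow peak.

    Negative values mean the pollutant peaks before the flow does.
    """
    t_flow = peak_time(times_min, flow)
    return {
        "conductivity": peak_time(times_min, conductivity) - t_flow,
        "turbidity": peak_time(times_min, turbidity) - t_flow,
    }

# Hypothetical single event sampled every 5 minutes (not measured data).
times_min = [0, 5, 10, 15, 20, 25, 30]
flow = [0.1, 0.4, 0.9, 1.5, 1.1, 0.6, 0.2]           # m3/s
conductivity = [420, 510, 380, 260, 240, 250, 270]   # uS/cm
turbidity = [80, 150, 310, 450, 390, 220, 120]       # NTU

lags = peak_lags(times_min, flow, conductivity, turbidity)
print(lags)
```

In this made-up event the conductivity lag is negative and the turbidity lag is zero, which is the qualitative behavior reported for the three CSOs.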

Relationship between Rain and the CSO Characteristics
Correlation matrices that are complementary to the CCA were preliminarily analyzed [36,38,43]. These results allow one to visualize the relationship between rainfall and CSO variables for the three CSOs (Figures 6, A4 and A5).
The correlation graphs for the El Vado (Figure 6) and Multifamiliares (Figure A4) CSOs show that the average intensity of rain (I mean) has an influence on the maximum discharge flow (Q max); for these correlations a p value less than 0.05 was obtained (Tables A3 and A4), so they are considered significant. Furthermore, in the El Vado CSO the maximum intensity of precipitation (I max) is related to the maximum flow Q max (p value < 0.05). The same relationship is obtained in the Multifamiliares CSO, but with a p value of 0.10. Sandoval et al. [14], in their research on the main CSO in Berlin, found that the maximum and average intensities of precipitation are related to the volume of water discharged, as well as to the maximum and average discharge flows. At the El Vado CSO, a significant relationship was also found between total precipitation (R d) and maximum flow (Q max).
In the El Vado and Multifamiliares CSOs, I max and I mean, respectively, also seem to influence the average turbidity (Tur mean) (Figures 6 and A4); these correlations likewise obtained a p value less than 0.05. According to Murillo [57], turbidity helps determine the amount of suspended material: the higher the intensity of rain, the greater the drag of suspended solids. Likewise, Sandoval et al. [14] found a relationship between I max and I mean and the value of total suspended solids.
For the Multifamiliares CSO (Figure A4), Tur mean was also found to be related to the duration of the dry period prior to a rainfall event (D d), with a p value less than 0.05. In this CSO, COD mean was also related to the average intensity of precipitation (I mean), with a p value of 0.12. Both D d and I mean can influence the amount of material dragged by runoff and resuspended from deposits in the drainage ducts, which leads to higher pollution loads [2,14,20].
The results of the correlational analysis between the rainfall and CSO parameters for the Multifamiliares CSO showed a similar pattern to the results obtained for the El Vado CSO, mainly between the parameters I max, I mean, Q max and Tur mean.
When comparing the correlation graphs of the three CSOs (Figures 6, A4 and A5), it can be seen that the correlation matrix of the Coliseo CSO is very different from the other two. This discrepancy could be due to the Coliseo CSO registering only three trials, fewer than the other two CSOs. Given this small sample size, the results for the Coliseo CSO should be interpreted with caution, and future studies are recommended to collect a similar number of samples at each CSO to obtain comparable results.
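The screening step described in this section, a correlation coefficient for each rainfall/CSO pair together with a p value, can be sketched as follows. This section does not state which significance test was used, so the sketch swaps in a two-sided permutation test as a stand-in; the nine (I mean, Q max) pairs are hypothetical and the function names are illustrative.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided permutation p value for the Pearson correlation.

    Shuffles y repeatedly and counts how often |r| is at least as
    large as the observed |r|. This is an assumed stand-in for the
    test used in the study.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if abs(pearson_r(x, y_shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical values for nine events (not data from the study).
i_mean = [2.1, 3.4, 1.2, 5.6, 4.3, 2.8, 6.1, 3.9, 1.8]         # mm/h
q_max = [0.30, 0.55, 0.20, 0.95, 0.70, 0.40, 1.10, 0.60, 0.35]  # m3/s

r = pearson_r(i_mean, q_max)
p = permutation_p_value(i_mean, q_max)
print(f"r = {r:.2f}, permutation p = {p:.4f}")
```

With only three events, as at the Coliseo CSO, there are just six possible orderings of the data, so a test of this kind cannot return a small p value. This is one way to see why those results call for caution.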

Canonical Correlation Analysis
The canonical correlation analysis (CCA) was carried out only for the El Vado CSO. For the Multifamiliares and Coliseo CSOs, it was not possible to perform a CCA due to the insufficient number of available trials (six trials or fewer) [43].
From the CCA, only two canonical variables L(1) and L(3) were found to be statistically significant (p value ≈ 0). For both canonical variables, the correlation is close to one. This means that two canonical correlations would be sufficient to measure the association between the rainfall and CSO variables [41]. Table 3a shows the canonical loadings (ρ) obtained between L(1) and L(3) and the rainfall characteristics (independent variables). On the other hand, Table 3b shows the canonical loadings obtained between the canonical variables L(1) and L(3) and the characteristics of the CSO (response variables).
Regarding the CCA presented for the El Vado CSO in Table 3a,b, the values of ρ greater than 0.4-0.5 between L(1) and L(3) and the characteristics of rainfall and CSO indicate the possible influence of rainfall on the CSO variables, an aspect consistent with the results found by Sandoval et al. [14]. From the analysis of L(1), the maximum intensity of rainfall (I max), the average intensity (I mean) and the total rainfall depth (R d) seem to have an influence on the variables Q max and Tur mean. These relationships agree with those obtained from the analysis of the correlation matrices. From the analysis of L(3), it was determined that the duration of the rain (D) and, to a lesser degree, the total rainfall depth (R d) are related to the mean turbidity values (Tur mean).
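A canonical loading is the correlation between an original variable and a canonical variate, which is how the ρ values discussed above can be read. The sketch below shows that calculation for one variate, assuming the canonical weights are already known. The weights, the data and the 0.4 cutoff applied to |ρ| are hypothetical illustrations and do not reproduce Table 3.

```python
import math

def standardize(col):
    """Center a column and scale it to unit sample standard deviation."""
    m = sum(col) / len(col)
    s = math.sqrt(sum((v - m) ** 2 for v in col) / (len(col) - 1))
    return [(v - m) / s for v in col]

def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def canonical_loadings(columns, weights):
    """Correlation of each variable with the variate sum(w_j * z_j).

    columns: dict name -> list of observations
    weights: dict name -> canonical weight (assumed known here)
    """
    z = {name: standardize(col) for name, col in columns.items()}
    n = len(next(iter(z.values())))
    variate = [sum(weights[name] * z[name][i] for name in z) for i in range(n)]
    return {name: pearson_r(z[name], variate) for name in z}

def influential(loadings, threshold=0.4):
    """Names of variables whose |loading| reaches the threshold."""
    return sorted(name for name, rho in loadings.items() if abs(rho) >= threshold)

# Hypothetical rainfall data for nine events and made-up weights.
rain = {
    "I_max": [12.0, 20.5, 8.1, 30.2, 25.4, 15.3, 33.0, 22.1, 10.4],
    "I_mean": [2.1, 3.4, 1.2, 5.6, 4.3, 2.8, 6.1, 3.9, 1.8],
    "D": [95, 60, 120, 45, 70, 80, 40, 65, 110],
}
weights = {"I_max": 0.7, "I_mean": 0.4, "D": -0.1}

loadings = canonical_loadings(rain, weights)
print(loadings, influential(loadings))
```

In a full CCA the weights come from the analysis itself; this sketch covers only the step that turns a variate into loadings.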

Partial Least Squares Regression (PLSR)
Next, we describe the construction of the PLSR model for the pollutant Tur mean at the El Vado CSO. The PLSR prediction models for the other dependent variables (C mean, COD mean and Q max) were determined in the same way for all CSOs. Figure A6 shows the root mean squared error of prediction (RMSEP) for the variable Tur mean as a function of the number of components used in the construction of the model. Table 4 shows the percentage of variability explained by each of the components of the model. Considering the reduction of the RMSEP and the levels of variability explained, two components were selected for the construction of the PLS model to predict Tur mean for the El Vado CSO (Figure A7). After performing the PLSR for all the dependent variables, fit equations were obtained that can estimate the output CSO characteristics (COD mean, C mean, Tur mean, Q max) based on the input rainfall characteristics (R d, I max, I mean, D and D d). The structure of the PLSR is reported in Equation (1).
The regression coefficients (C 1, C 2, C 3, C 4, C 5 and C 6) of Equation (1) for the prediction of the CSO variables in the El Vado CSO are presented in Table 5. The results of the PLSR for the Multifamiliares and Coliseo CSOs are presented in Tables A5 and A6, respectively. The determination coefficient R 2 for the prediction of the CSO parameters in the El Vado CSO ranged from 0.60 to 0.81 (Table 5); for the Multifamiliares CSO, it ranged from 0.37 to 0.91 (Table A5), and for the Coliseo CSO, from 0.18 to 0.84 (Table A6). On average, the El Vado CSO presents the highest values of R 2, which means that the equations fit best for this CSO.
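The component selection described for Tur mean, where RMSEP is examined as a function of the number of components, can be sketched with a small single-response PLS (PLS1) implementation. This is an illustration under stated assumptions, not the software or settings used in the study: predictors are only mean-centered, RMSEP is estimated by leave-one-out cross-validation, and the data and function names are hypothetical.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def fit_pls1(X, y, n_components):
    """Fit PLS1 by successive deflation of the centered predictor matrix."""
    n, p = len(X), len(X[0])
    x_means = [_mean([row[j] for row in X]) for j in range(p)]
    y_mean = _mean(y)
    Xk = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y.
        w = [sum(Xk[i][j] * yc[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in X that covaries with y
        w = [v / norm for v in w]
        t = [sum(Xk[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p_load = [sum(Xk[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(yc[i] * t[i] for i in range(n)) / tt
        # Remove the part of X explained by this component.
        Xk = [[Xk[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        components.append((w, p_load, q))
    return x_means, y_mean, components

def predict_pls1(model, x):
    """Predict y for one observation by replaying the deflation steps."""
    x_means, y_mean, components = model
    xk = [x[j] - x_means[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p_load, q in components:
        t = sum(a * b for a, b in zip(xk, w))
        y_hat += q * t
        xk = [a - t * b for a, b in zip(xk, p_load)]
    return y_hat

def rmsep_loo(X, y, n_components):
    """Leave-one-out root mean squared error of prediction."""
    squared_errors = []
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        squared_errors.append((predict_pls1(model, X[i]) - y[i]) ** 2)
    return math.sqrt(_mean(squared_errors))

# Hypothetical events: columns are [R_d (mm), I_mean (mm/h)]; y is Tur_mean (NTU).
X_demo = [[5.2, 2.1], [9.8, 3.4], [3.1, 1.2], [14.5, 5.6], [11.0, 4.3],
          [6.7, 2.8], [16.2, 6.1], [10.1, 3.9], [4.4, 1.8]]
y_demo = [150.0, 230.0, 110.0, 390.0, 300.0, 190.0, 420.0, 260.0, 140.0]

for k in (1, 2):
    print(k, "component(s): RMSEP =", round(rmsep_loo(X_demo, y_demo, k), 2))
```

One would typically retain the number of components beyond which the RMSEP no longer decreases appreciably, which corresponds to the reasoning given for the two-component Tur mean model.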
The lowest R 2 values were recorded in the determination of COD mean for the Multifamiliares and Coliseo CSOs (0.37 and 0.18, respectively). These two CSOs have a contributing area approximately three times smaller than that of the El Vado CSO. On the other hand, the R 2 value obtained for the prediction of COD mean in the El Vado CSO was 0.71 (Table 5). This result suggests that the equations obtained for COD mean fit better for a larger basin.
In the Coliseo CSO, the values of R 2 showed the widest range of variability. This may be because fewer tests were performed on this combined sewer overflow than on the other two CSOs, which may have affected the fit of the prediction equations. In the El Vado CSO, nine CSO events were registered with a total of 161 samples; in the Multifamiliares CSO, six events were registered with 124 samples; and in the Coliseo CSO, three events were registered with 46 samples (Table A2).
The R 2 values for the prediction of the variables C mean and Q max in the three CSOs present a smaller range of variation than those of the other two variables. This suggests that C mean and Q max possibly contain less uncertainty or that their predictions are less sensitive to the uncertainty of the rain. Sandoval et al. [14] found a similar result for these same variables.
For the prediction of COD mean in the Coliseo CSO, a low determination coefficient (R 2 = 0.18) was obtained. The value of R 2 measures the goodness of fit of the model to a set of data [58]. In this case, the equation obtained for COD mean in the Coliseo CSO does not have a good fit, so it is considered a low-precision model.
The PLSR results are also displayed through a triplot that represents the cases, the response variables (CSO: Q max, C mean, Tur mean, COD mean) and the predictor variables (rainfall: R d, I max, I mean, D, D d) measured for the same cases. Figure 7 shows the triplot for the El Vado CSO, while the triplots for the Multifamiliares and Coliseo CSOs are shown in Figures A8 and A9, respectively.
The triplots of the El Vado and Multifamiliares CSOs (Figures 7 and A8, respectively) showed similar relationships between the rainfall and CSO variables. These results are consistent with the relationships that were determined through the correlation analysis and CCA. In these two CSOs, a correlation was determined between the independent variables I max and R d and the dependent variable Q max. A relationship between the average intensity (I mean) and the average turbidity (Tur mean) was also observed, as well as a positive relationship between D d and COD mean. Similarly, a negative relationship between the total rainfall depth (R d) and mean conductivity (C mean) was observed.
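For reference, the goodness-of-fit measure discussed here is commonly computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses that common definition, which is assumed rather than confirmed to be the exact one behind the reported values; the observed and predicted COD values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical COD_mean values (mg/L) for three events and model predictions.
cod_observed = [310.0, 450.0, 380.0]
cod_predicted = [360.0, 400.0, 395.0]

print(f"R2 = {r_squared(cod_observed, cod_predicted):.2f}")
```

A value of 1 indicates that the predictions reproduce the observations exactly, while a value near 0 indicates that the model does little better than predicting the observed mean for every event.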
The triplot obtained for the Coliseo CSO ( Figure A9) differs from the triplots made for the El Vado ( Figure 7) and Multifamiliares ( Figure A8) CSOs. As mentioned above, this may be due to the small sample size used in the Coliseo CSO analysis.
The relationships found determine the influence of rainfall parameters on the behavior and dynamics of pollutants during CSO events. These relationships can be used in the construction of integrated ecological models for the evaluation and complete analysis of the city's sanitation systems, their impact on the receiving bodies and their restoration.

Conclusions
The El Vado, Multifamiliares and Coliseo CSOs contribute to the deterioration of the water quality of the Tomebamba River during the rainy season. There was a high variation in the type of wastewater related to the content of biodegradable material in the CSOs. There was also great temporal variability in the water quality parameters during CSO events. For conductivity, in the three CSOs, the highest values were presented between the beginning of the CSO event and the maximum discharge value obtained in the hydrograph. On the other hand, the maximum turbidity values were located relatively close to the peak flow of the discharge hydrograph.
From the analysis of the relationship between rainfall and CSO characteristics carried out with the CCA and PLSR methods, it is concluded that: (i) the maximum discharge flow of a CSO is mainly related to the maximum intensity of rainfall and, to a lesser degree, to the average intensity and total rainfall depth; (ii) the average and maximum rainfall intensities both influence the average turbidity values; (iii) the average turbidity is positively related to the duration of the dry period prior to a precipitation event and to the average intensity of rainfall; (iv) the mean turbidity is also negatively related to the duration of a precipitation event; and (v) the total rainfall depth showed a negative relationship with average conductivity.

C (conductivity); Tur (turbidity); BOD 5 (five-day biochemical oxygen demand); COD (chemical oxygen demand); FC (fecal coliforms); TC (total coliforms); NO 3 -N (nitrates); NO 2 -N (nitrites); NH 3 -N (ammoniacal nitrogen); PO 4 -P (dissolved orthophosphate) and P (total phosphorus).

Figure A1. Adjustment of the relationship between BOD 5 and COD using quadratic regression.