A New Approach to Use Load Duration Curves to Evaluate Water Quality: A Study in the Doce River Basin, Brazil

: Although water availability depends both on qualitative and quantitative aspects, most studies focus only on one of these. Therefore, the goal here is to relate water quality and quantity with the construction of Load Duration Curves (LDC) and to estimate E. coli load patterns in di ﬀ erent ﬂow conditions, seasons, and positions of two sub-basins of the Doce watershed (Brazil): Piracicaba and Piranga. A novel methodology is proposed in which the Burr XII distribution is adjusted to the LDC to compare all observed loads to their respective Total Maximum Daily Load (TMDL), allowing the estimation of the relative di ﬀ erence (RD) between these. Higher values of RD were observed for low ﬂows for the Piracicaba basin, more urbanized, where point sources of pollution are the primary concern, reaching up to 99% of needed load reduction. In the Piranga basin, more agricultural, there was a broader RD variation, from 9% to 97% load reduction needed, which is an evidence of point sources of pollution combined with non-point sources. The new methodology can be used to estimate the load reduction of any pollutant and can be used by environmental agencies to identify e ﬀ ective practices to minimize and control pollution in di ﬀ erent locations of the basins.


Introduction
Water availability depends on the quantitative and qualitative assessments of this resource. Therefore, to ensure a sustainable future, these evaluations must be taken altogether. Nevertheless, in spite of this pre-requisite, most studies only cover the quantification and/or prevision of streamflow [1,2] or the characterization and/or modeling of water quality [3][4][5].
Watershed models consist of tools that can be used to assess water quantity and quality simultaneously. However, these approaches often require a diversity of input data and information to run, which can be viewed as a limitation for their use in the developing countries where data is fragmented, uncertain, and barely available [6][7][8][9]. The limited portability of watershed models can be overcome by simpler quantity-quality methods, which are less dependent on flow and transport equations and their adjustment parameters, such as Load Duration Curves (LDC).
An LDC is a graphical representation of water quality, namely observed and total maximum daily loads. The TMDL is an upper threshold of a predefined water quality criterion. For example, the TMDL for Escherichia coli (E. coli) can be computed considering a concentration of 1000 MPN dL −1 (Most Probable Number) for most Brazilian rivers, according to the Environment National Council by the USEPA, thus providing a more representative evaluation of water quality. Thus, besides the presentation of a new method to quantify a required load reduction, the aim here was to increase our understanding of E. coli load variation and thus optimize resource allocation and selection of appropriate best management practices to reduce the percentage of impairment in the basins of study.

Study Area
The study was developed in two neighbor watersheds, the Piranga and Piracicaba basins, within the Doce river basin (MG), illustrated in Figure 1. This basin received worldwide attention when a tailing dam collapsed in 2015, destroying a small city and compromising the water quality from the basin headwaters towards the sea [26].     Table 1. Table 1. List of water quality and streamflow gauges used in the study, with the upstream drainage area. Information is also provided for the cities where the gauges are located, namely city name and their population. The basins are mostly occupied by not too populous cities. Thus, given the lack of resources from these cities, most of them lack wastewater treatment plants, which causes most of their generated sewage to be loaded into the rivers without treatment. This situation, associated with pasture and swine activities in the basins, has caused E.coli contamination to be one of their core water quality problems [27].

Basin
Regarding the loading capacities, CONAMA establishes that the loadings must respect the natural capacity of the river, which means that, after receiving the load, the water quality parameters of the river must meet its class of use. There is no maximum concentration of E. coli established in the loading legislation. Industries or cities that intend to load pollutants into rivers must apply for licenses in the responsible environmental agency and respect a list of minimum requirements, which must be fulfilled unless an unusual situation is experienced. However, given the lack of monitoring agents, situations in which this legislation is not met are frequent, compromising the water quality in many Brazilian water bodies [28].

Data Used in the Study
To understand the nexus between E.coli contamination and the hydrological patterns in the basins, the maximum allowed E. coli load was compared to the observed E.coli load in various water quality gauges. The maximum allowed load combines the water quality criteria set up by the CONAMA, which is 1000 MPN dL −1 , with the flow rate at the streamflow gauge, obtained with the development of an FDC. By multiplying the FDC with the water quality criteria, it is then possible to obtain the LDC, which relates the maximum allowed load to the percentage that can be observed in time (and flow permanence), the Total Maximum Daily Load (TMDL) [29,30]. Larger allowable loads are associated with a shorter permanence, since these are associated with higher flows when the dilution capacity of the river is more elevated. On the other hand, smaller allowable loads are associated with smaller streamflows.
Because it is necessary to measure water quality and quantity at the same location in order to estimate the TMDL, streamflow, and water quality gauges approximated from each other were selected for the study. Three and four pairs of stations were selected for Piracicaba and Piranga basins, respectfully, with data from 1997 to 2018. The MG state holds four water quality campaigns throughout the year to account for three different periods: Wet, dry, and intermediary. The wet season goes from January to March, the dry season from July to September, and the intermediary seasons from April to June and October to December. During the wet and dry seasons, 51 water quality parameters are sampled, whereas 19 parameters are sampled in the intermediary seasons. E. coli concentrations are measured in all campaigns.
In this state, the assessment of water availability and water quality occur separately. Therefore, the gauges are not installed in a way to minimize the distance between then. As a consequence, there are limited locations in which it is possible to apply an LDC-like methodology, and the amount of data is frequently limited.
First, the normalized 7-day, 10-year low flow (q7,10), in m 3 s −1 km −2 , at the selected gauges was calculated per month. This analysis improves the understanding of the hydrological dynamics in the gauges and their potential relationship with water quality. The q7,10 was selected because it is the low flow used by the state of Minas Gerais for water resources management and planning, and it was normalized by the catchment's upstream drainage area to estimate water production, and therefore, allow their comparison [31].
For the scope of this study, extreme outliers were evaluated and eliminated from the dataset. Inferior and superior extreme samples can be identified using box-plots in combination with Equations (1), and (2) [32], respectively: where LE inf and LE sup are the inferior and superior limits of the box-plot for the identification of extreme outliers. The seven time-series used in this study, which represent stream flows in the main watercourses of their respective basins, were obtained from the HidroWeb, which is operated by the Brazilian Water Agency (ANA). The FDCs were generated with the Hydrology Plus software [33].

Load Estimation
The TMDL was calculated for all gauges (Load TMDL ), to understand their water quality dynamics, and then compared with the observed E. coli load monitored in the water quality gauges (Load OBS ). When plotted together, a water quality violation is characterized when Load OBS is higher than Load TMDL . Load TMDL and Load OBS were calculated according to Equations (3) and (4), respectively: where C MAX is the maximum allowable concentration of E. coli, determined according to Class 2 of the CONAMA regulation, which gives 1000 MPN dL −1 ; Q FDC is the streamflow obtained with the flow duration curve of the gauge; C OBS is the observed concentration of E. coli sampled in the water quality gauges; Q OBS is the streamflow observed in the day in which the water quality was sampled, and CF is a conversion factor used to estimate the load in MPN day −1 (864,000 × 10 3 ). The Loess smoothing technique [23] was applied to estimate the tendency of Load OBS samples in relation to the Load TMDL estimated with the LDC. This technique highlights the tendency of observed data in relation to the maximum observed load in all gauges. Thus, it is possible to understand the expected water quality for different flow permanences. The Loess method consists of a locally weighted regression that, for each value in the x-axis, a fitted value in the y-axis is generated, the so-called x k . According to this method, the fitted values consider all values in the x-axis, but their weight varies according to the proximity of x k . The result of the smoothing technique is that it generates curves with different parameters for each value in the x-axis.
The smoothing analysis helps the visual perception of observed loads in the basin in comparison to the water quality criteria, instead of comparing the sampled points with the Load TMDL . Thus, in areas where the smoothing curve is above the TMDL, it is likely that the water quality will be impaired for the flows related to that permanence.

Burr XII Curve
The main novelty of the proposed methodology consists in the identification of the needed load reduction. To quantify the pollutant reduction for each observation and then for all classes of flow, the LDC representing the TMDL was fitted to the Burr XII distribution [24,25]. In this case, it is possible to predict the observed load for all the x-values in which the observed E. coli was sampled, unlike the LDC method proposed by USEPA, whereas the observed load is computed based on a single concentration and the middle permanence flow of each flow regime. The Burr XII is a flexible double-powered distribution and was selected because it has a precise fit for empirical data, thus representing with accuracy the extremes of an FDC, and consequently an LDC, for the scope of this study. The Burr XII distribution can also be adjusted to different streamflow regimes, such as for perennial, intermittent and ephemeral rivers. Equation (5) describes the Burr XII equation, where Load p is the observed load arranged in descending order; p is the percentage of exceedance, τ is the cease-to-flow percentile, which is the ratio between the days in which the flow was larger than zero to the total number of observations of the series. In this study, τ = 100 because all rivers are perennial; λ is the scale parameter and β and α are related to the shape of the curve.
If the curve is not adjusted, the quantification of the difference between observed and the maximum allowed E. coli load is computed through approximation, i.e, the method proposed by the USEPA, in which the observed load is set as the multiplication of the median streamflow for each flow range by the 90th percentile concentration of a given flow regime, whereas the maximum allowable is the medium load of that regime. This allowable load is compared to the existing load, in order to identify the ranges by which the pollutant contamination is more expressive [22].
In the present study, the R software was used to fit the permanence values which the observed E. coli was sampled. The equation parameters were determined by an iteration process. The fitted equations allowed to estimate the load related to each percentage in which there were sampled observations. In turn, this allowed the estimation of total E. coli contamination and its percentage observed in each flow regime: High, mid-range and low, corresponding to a percentile of 0%-30%, 30%-70%, and 70%-100%, respectively [34]. The r-squared statistics was used to evaluate the goodness-of-fit of the fitted distribution.

Estimation of the Needed E. coli Load Reduction
There are different methods to estimate pollutant reduction according to the flow regimes. Here, the reduction, or Relative Difference (RD), is calculated for all observations and compares the observed load with the maximum load, which is obtained with the adjusted curve of the FDC. The RD between the actual sampled load and the fitted load was calculated according to Equation (6), where Load OBS corresponds to the observed load of the water quality gauges, while the Load MAX represents the maximum allowed load estimated from the fitted TMDL curve. The Relative Difference (RD) that represents the difference between the observed load and the maximum accepted load considering Class 2 of the CONAMA 357/2005 legislation [35] was calculated for all observed samples. The positive values indicate that there is a need for E. coli load reduction, whereas negative values indicate that the load in the section is not causing the river section to be impaired. In the format of Equation (6), positive values range from 0% to 100%, indicating the percentage of reduction required to meet the water quality criteria. Negative values, on the other hand, can be smaller than negative 100%, depending on the magnitude of Load OBS in relation to the Load MAX . Nevertheless, these are always classified as "no-load reduction needed".
To facilitate the interpretation, the RD was also divided into three groups, representing three classes of flow regime: High (HF), mid-range (MRF), and low flows (LF). For each class, the number of impaired observations was quantified, and then it was possible to estimate the magnitude of the needed reduction for each flow regime with the use of box-plots where the load variation in each class of flow was represented. The RD observations were also divided into seasons, which represent the frequency in which the water quality samples are collected, in order to identify when the pollution is more expressive during the year, and if it matches the expected flow (high, medium or low).
The main limitation of this approach, however, is that the goodness-of-fit of the Burr XII curve to the FDC will determine the validity of the RD estimates, and hence the magnitude of the needed reduction. Additionally, the margin of safety (MOS) and allocation of loads that characterize the final aspects of the LDC methodology proposed by the USEPA were not covered here.

Water Quality and Streamflow Patterns
Prior to the estimation of the E. coli load, the pollutant concentration and streamflow patterns throughout the studied basins were investigated to increase the knowledge about their relationship with the existing load. Figure 2 illustrates the box-plots of the stations involved in the study for the Piracicaba and Piranga basins (Figure 2a,b), as well as the normalized 7-day, 10-year low flow (q7,10) of the selected gauges per month for the Piracicaba and Piranga basins (Figure 2c,d).
Water 2020, 12, x FOR PEER REVIEW 7 of 22

Water Quality and Streamflow Patterns
Prior to the estimation of the E. coli load, the pollutant concentration and streamflow patterns throughout the studied basins were investigated to increase the knowledge about their relationship with the existing load. Figure 2 illustrates the box-plots of the stations involved in the study for the Piracicaba and Piranga basins (Figure 2a,b), as well as the normalized 7-day, 10-year low flow (q7,10) of the selected gauges per month for the Piracicaba and Piranga basins (Figure 2c,d). In Figure 2, the dashed line represents the maximum pollutant concentration allowed in Class 2 rivers, 1000 MPN dL −1 . After the outlier is removed, according to Equations (1) and (2), the number of E. coli samples in the Piracicaba basin was 80, 81 and 79 for gauges RD025 (56610000), RD029 (56659998), and RD031 (56696000), respectively. For the Piranga basin, the number of samples was 72, 73, 114, and 108 for gauges RD001 (56028000), RD007 (56075000), RD013 (56110005), and RD023 (56539000), respectively.
In most stations, the water quality was impaired for more than half of the samples, in particular stations RD025 and RD029 in the Piracicaba basin and station RD013 in the Piranga basin, in which cases more than 75 percent of the data were above the maximum allowed concentration established by the legislation.
With regards to the hydrological patterns, which summarize the historical flowrates from 1989 to 2018, it is clear that these are similar: Streamflow increases from November to March and decreases In Figure 2, the dashed line represents the maximum pollutant concentration allowed in Class 2 rivers, 1000 MPN dL −1 . After the outlier is removed, according to Equations (1) and (2), the number of E. coli samples in the Piracicaba basin was 80, 81 and 79 for gauges RD025 (56610000), RD029 (56659998), and RD031 (56696000), respectively. For the Piranga basin, the number of samples was 72, 73, 114, and 108 for gauges RD001 (56028000), RD007 (56075000), RD013 (56110005), and RD023 (56539000), respectively.
In most stations, the water quality was impaired for more than half of the samples, in particular stations RD025 and RD029 in the Piracicaba basin and station RD013 in the Piranga basin, in which cases more than 75 percent of the data were above the maximum allowed concentration established by the legislation. With regards to the hydrological patterns, which summarize the historical flowrates from 1989 to 2018, it is clear that these are similar: Streamflow increases from November to March and decreases from March to October. In a study developed in the Piranga basin, streamflow regionalization techniques were applied to estimate the water availability variation throughout the network, and it was observed that, in some points of the hydrography, the difference between dry and wet seasons reaches almost 120% [31].
Given the nature of E. coli data, characterized by a wide range of values, smaller observations commonly get overwhelmed with larger ones. Therefore, log10 transformation was used in the analysis of the data patterns and generation of graphs, whereas the Burr XII was adjusted to untransformed data, and so was performed the RD computation. Figure 3 illustrates the sampled E. coli load, in comparison with the maximum allowable E. coli load (TMDL) for the stations RD025, RD029, and RD031, generated with the development of LDC (Equation (3)). The Loess curve was also represented in the map to indicate the tendency of the sampled E. coli load in comparison to the points representing the TMDL, whereas the grey area surrounding the curve represents the 95% confidence interval for the smoothing curve. The Loess covers all frequencies of permanence for which E. coli data were sampled. Thus, considering that, for some gauges, sample collection could not cover the entire frequency range (0-100%), the Loess may also not cover all the range.

Piracicaba Basin
Water 2020, 12, x FOR PEER REVIEW 8 of 22 from March to October. In a study developed in the Piranga basin, streamflow regionalization techniques were applied to estimate the water availability variation throughout the network, and it was observed that, in some points of the hydrography, the difference between dry and wet seasons reaches almost 120% [31]. Given the nature of E. coli data, characterized by a wide range of values, smaller observations commonly get overwhelmed with larger ones. Therefore, log10 transformation was used in the analysis of the data patterns and generation of graphs, whereas the Burr XII was adjusted to untransformed data, and so was performed the RD computation. Figure 3 illustrates the sampled E. coli load, in comparison with the maximum allowable E. coli load (TMDL) for the stations RD025, RD029, and RD031, generated with the development of LDC (Equation (3)). The Loess curve was also represented in the map to indicate the tendency of the sampled E. coli load in comparison to the points representing the TMDL, whereas the grey area surrounding the curve represents the 95% confidence interval for the smoothing curve. The Loess covers all frequencies of permanence for which E. coli data were sampled. Thus, considering that, for some gauges, sample collection could not cover the entire frequency range (0-100%), the Loess may also not cover all the range. In the Piracicaba basin, there is an impairment for E. coli for all gauges, at some flow range, and E. coli contamination is higher as the drainage area decreases once there is a larger distance between the Loess and the TMDL curves in upstream regions. For gauges RD025 and RD029, a similar pattern is recognized: the tendency line is distant from the TMDL and the percentage of impairment in these gauges is also higher, 90% for both RD025 and RD029 in comparison to 68.8% in gauge RD031. Figure 4 illustrates the adjusted TMDL curve according to the Burr XII distribution considering the flow duration points that represent the maximum loading capacity, as presented in Figure 3 for the gauges in the Piracicaba basin. It also illustrates a summary of the samples (a box-plot of the observed E. coli) in comparison to the TMDL curve for the high (HF), mid-range (MRF) and low flows (LF). In the Piracicaba basin, there is an impairment for E. coli for all gauges, at some flow range, and E. coli contamination is higher as the drainage area decreases once there is a larger distance between the Loess and the TMDL curves in upstream regions. For gauges RD025 and RD029, a similar pattern is recognized: the tendency line is distant from the TMDL and the percentage of impairment in these gauges is also higher, 90% for both RD025 and RD029 in comparison to 68.8% in gauge RD031. Figure 4 illustrates the adjusted TMDL curve according to the Burr XII distribution considering the flow duration points that represent the maximum loading capacity, as presented in Figure 3 for the gauges in the Piracicaba basin. It also illustrates a summary of the samples (a box-plot of the observed E. coli) in comparison to the TMDL curve for the high (HF), mid-range (MRF) and low flows (LF). Water 2020, 12, x FOR PEER REVIEW 9 of 22 The box-plots allow recognizing how extensive the impairment of each flow regime is in relation to the TMDL curve. In contrast to Figure 3, Figure 4 allows the pattern of each flow regime to be interpreted and understood. For example, as observed from the tendency curves, gauges RD025 and RD029 have similar E. coli contamination patterns throughout the three regimes of flow (Figure 4a,c). Gauge RD031, on the other hand, have higher E.coli impairment for high flows, decreasing to almost none for mid-range flows (considering the median), and increasing for low flows but in a smaller magnitude, compared to the observed for high flows.

Piracicaba Basin
Overall, fewer observations were collected in high flow regimes, which is expected given the smaller permanence in time [36]. In gauges RD025 and RD029, the higher percentage of impairment The box-plots allow recognizing how extensive the impairment of each flow regime is in relation to the TMDL curve. In contrast to Figure 3, Figure 4 allows the pattern of each flow regime to be interpreted and understood. For example, as observed from the tendency curves, gauges RD025 and RD029 have similar E. coli contamination patterns throughout the three regimes of flow (Figure 4a,c). Gauge RD031, on the other hand, have higher E.coli impairment for high flows, decreasing to almost none for mid-range flows (considering the median), and increasing for low flows but in a smaller magnitude, compared to the observed for high flows.
Overall, fewer observations were collected in high flow regimes, which is expected given the smaller permanence in time [36]. In gauges RD025 and RD029, the higher percentage of impairment is observed in mid-range conditions, while for gauge RD031 a smaller impairment was observed for this regime. Equations (7)-(9) illustrate the best-fit TMDL equations for gauges RD025, RD029, and RD031, respectively, as were illustrated in Figure 4. Figure 5 illustrates the RD between observed E. coli samples and the adjusted TMDL curve for each flow regime. When this information is combined with Figure 4, it is possible to assess the magnitude of the impairment in each flow class, and therefore an estimate for the needed E. coli load reduction. is observed in mid-range conditions, while for gauge RD031 a smaller impairment was observed for this regime. Equations (7)-(9) illustrate the best-fit TMDL equations for gauges RD025, RD029, and RD031, respectively, as were illustrated in Figure 4.  Figure 5 illustrates the RD between observed E. coli samples and the adjusted TMDL curve for each flow regime. When this information is combined with Figure 4, it is possible to assess the magnitude of the impairment in each flow class, and therefore an estimate for the needed E. coli load reduction. In all gauges in the Piracicaba basin, the RD, estimated by the median, was between 50% and 100%, indicating the need to reduce this pollutant in all flow regimes. Additionally, it was observed in gauges RD025 and RD029 that there was a higher range between the first and the third quantile for HFs, decreasing for MRFs and again increasing for LFs. It can have been caused by the constant sewage load into the rivers, which have a higher impact on the LF. The variation of RD in HF is probably due to the variability of diffuse sources of pollution. Figure 6 represents the RD divided into the seasons: January to March (JFM), April to June (AMJ), July to September (JAS) and October to December (OND). Although, the samples were divided into groups of three months, most samples were mainly collected in the first month within the season: January, April, July, and October, whereas fewer were collected in the second (February, May, August and November), and none during the last (March, June, September and December).
As observed in Figure 5, the higher E. coli load reduction in the Piracicaba basin is necessary for the LF. For gauge RD031, although the Loess indicated that the E. coli observations were similar to the TMDL, the needed reduction is still high for all flow regimes. It indicates that, although there is a higher variation within the samples, also observed in Figure 5c, there are more observations that do not match the water quality criteria. In all gauges in the Piracicaba basin, the RD, estimated by the median, was between 50% and 100%, indicating the need to reduce this pollutant in all flow regimes. Additionally, it was observed in gauges RD025 and RD029 that there was a higher range between the first and the third quantile for HFs, decreasing for MRFs and again increasing for LFs. It can have been caused by the constant sewage load into the rivers, which have a higher impact on the LF. The variation of RD in HF is probably due to the variability of diffuse sources of pollution. Figure 6 represents the RD divided into the seasons: January to March (JFM), April to June (AMJ), July to September (JAS) and October to December (OND). Although, the samples were divided into groups of three months, most samples were mainly collected in the first month within the season: January, April, July, and October, whereas fewer were collected in the second (February, May, August and November), and none during the last (March, June, September and December).
As observed in Figure 5, the higher E. coli load reduction in the Piracicaba basin is necessary for the LF. For gauge RD031, although the Loess indicated that the E. coli observations were similar to the TMDL, the needed reduction is still high for all flow regimes. It indicates that, although there is a higher variation within the samples, also observed in Figure 5c, there are more observations that do not match the water quality criteria.
In Figure 6, a smaller median is observed in the first semester of the year that represents mainly wet conditions. In all gauges, a higher RD was observed for the months with lower flows, July and October, where the RD almost reached 100% of reduction. Water 2020, 12, x FOR PEER REVIEW 11 of 22 In Figure 6, a smaller median is observed in the first semester of the year that represents mainly wet conditions. In all gauges, a higher RD was observed for the months with lower flows, July and October, where the RD almost reached 100% of reduction.
In RD031, although higher values of reduction are needed, the Loess curve was closer to the TMDL curve than for gauges RD025 and RD029. This happens because the reduction is measured by the median, considering the non-parametric distribution of the E. coli samples. Although there are many samples below the TMDL curve, the samples above it are many more, leading the needed reduction to higher levels. Table 2 illustrates the RD in the Piracicaba basin, according to the different flow regimes and seasons.  Figure 7 illustrates the sampled E. coli load in relation to the points representing the maximum allowable E. coli load (TMDL) in the stations RD001, RD007, RD013, and RD023, generated with the development of LDCs. The Loess curve indicates the tendency of sampled E. coli loads in comparison to the points representing the TMDL in the Piranga basin, within the 95% confidence interval. In RD031, although higher values of reduction are needed, the Loess curve was closer to the TMDL curve than for gauges RD025 and RD029. This happens because the reduction is measured by the median, considering the non-parametric distribution of the E. coli samples. Although there are many samples below the TMDL curve, the samples above it are many more, leading the needed reduction to higher levels. Table 2 illustrates the RD in the Piracicaba basin, according to the different flow regimes and seasons.  Figure 7 illustrates the sampled E. coli load in relation to the points representing the maximum allowable E. coli load (TMDL) in the stations RD001, RD007, RD013, and RD023, generated with the development of LDCs. The Loess curve indicates the tendency of sampled E. coli loads in comparison to the points representing the TMDL in the Piranga basin, within the 95% confidence interval.

Piranga Basin
In the Piranga basin, a similar pattern is observed for gauges RD001 and RD007 (Figure 7a,b). The impairment is more frequent in streamflow classes of higher flows, and it decreases for mid-flows. The main difference between these gauges is that the tendency line of the observed data is constant for low flows for the RD001 gauge, whereas it continues to decrease for the RD007 gauge. Nevertheless, within the 95% confidence interval (the gray area surrounding the smoothing curve) the TMDL and the Loess curve of the observed data are only different for high flows, from 0-25%. In the Piranga basin, a similar pattern is observed for gauges RD001 and RD007 (Figure 7a,b). The impairment is more frequent in streamflow classes of higher flows, and it decreases for midflows. The main difference between these gauges is that the tendency line of the observed data is constant for low flows for the RD001 gauge, whereas it continues to decrease for the RD007 gauge. Nevertheless, within the 95% confidence interval (the gray area surrounding the smoothing curve) the TMDL and the Loess curve of the observed data are only different for high flows, from 0-25%.
The RD023 gauge, which is located near the mouth of the basin, is associated with the larger drainage area when compared to the other gauges and is inside a protected area. Unlike the other gauges, where the most sampled points were above the TMDL curve, in RD023 the impairment with the water quality standards is more expected from the high flow conditions to the mid-flow conditions. This situation might have happened because the constant load, generated from cities, is smaller than the loading capacity of the river in low flow conditions. In this condition, it is also expected that the E. coli load from upland locations in the basin has decayed before reaching this gauge. On the other hand, during high flows, the streamflow is higher, resulting in a higher velocity of the contaminant from bigger cities, as Ponte Nova, which caused the E.coli to reach gauge RD023. Figure 8 illustrated the adjusted TMDL curve according to the Burr XII distribution considering the flow duration points that represent the maximum loading capacity for the Piranga basin, as well The RD023 gauge, which is located near the mouth of the basin, is associated with the larger drainage area when compared to the other gauges and is inside a protected area. Unlike the other gauges, where the most sampled points were above the TMDL curve, in RD023 the impairment with the water quality standards is more expected from the high flow conditions to the mid-flow conditions. This situation might have happened because the constant load, generated from cities, is smaller than the loading capacity of the river in low flow conditions. In this condition, it is also expected that the E. coli load from upland locations in the basin has decayed before reaching this gauge. On the other hand, during high flows, the streamflow is higher, resulting in a higher velocity of the contaminant from bigger cities, as Ponte Nova, which caused the E. coli to reach gauge RD023. Figure 8 illustrated the adjusted TMDL curve according to the Burr XII distribution considering the flow duration points that represent the maximum loading capacity for the Piranga basin, as well as box-plots of the observed E. coli, sampled points and the TMDL curve for the assessed flow regimes.
Considering the median of all observations collected in each flow regime, the pattern observed in Figure 7 persists in Figure 8. The loads for high and mid-flow conditions are similar for gauges RD001 and RD007, but for low flows they tend to maintain the same pattern of mid-flow conditions just for gauge RD001 and decrease for gauge RD007. For gauge RD013, a reduction in E. coli contamination is observed as the flow decreases. Finally, for gauge RD023 the main contamination is observed when flow conditions are high, whereas 75% of all observations are below the TMDL curve for mid-range and low flows.  Equations (10)-(13) illustrate the best-fit TMDL equation for gauges RD001, RD007, RD013, and RD023, respectively.
Load RD013 = 7.07 × 10 13 Load RD023 = 1.55 × 10 14  Figure 10 illustrates the RD variation in the Piranga basin throughout the seasons. This figure shows higher load variances, with larger differences between the first and third quantiles, unlike the situation observed in the Piracicaba basin. As perceived from previous results, a higher pollutant reduction is needed in gauge RD013, which showed higher RD for all classes of flow throughout the year. It also shows that a higher E. coli reduction is needed in dry months (July and October) in comparison to wet months (January and April). In gauge RD023, the RD is positive only in wet In gauge RD001, there is a higher need for reduction in the LFs, whereas in gauge RD007 the reduction needed in the LFs and HFs is similar. The RD is consistently high in gauge RD013. This is a consequence of sewage load from Ponte Nova. In gauge RD023, a higher reduction is needed in the HFs. The variance in the gauges is a consequence of a smaller influence of point sources of pollution, meaning the sewage load in the rivers, as was observed for the Piracicaba basin. The negative values are obtained when the observed load is smaller than the TMDL, and higher negative magnitudes indicate a smaller Load OBS (Equation (6)), which explains why, for some gauges, the box-plot is not entirely on the plot (i.e., Figure 9d). Thus, it is possible to affirm that lower values of E. coli load (in relation to the TMDL curve) were observed in gauge RD023, followed by gauge RD007 and then RD001. Gauge RD013 presented only positive values.    Figure 10 illustrates the RD variation in the Piranga basin throughout the seasons. This figure shows higher load variances, with larger differences between the first and third quantiles, unlike the situation observed in the Piracicaba basin. As perceived from previous results, a higher pollutant reduction is needed in gauge RD013, which showed higher RD for all classes of flow throughout the year. It also shows that a higher E. coli reduction is needed in dry months (July and October) in comparison to wet months (January and April). In gauge RD023, the RD is positive only in wet months, indicating a need for load reduction in diffuse sources of pollution. Table 3 illustrates the RD in the Piranga basin, according to the different flow regimes and different seasons. In general, the variation is more significant in this basin than in the Piracicaba basin. Besides, no-load reduction is needed in gauge RD023 during the dry season, but this observation is not valid for the other gauges.

E. coli Concentration and Streamflow Patterns
The E. coli concentration in the gauges is directly related to the proximity of urban centers, where the population is mostly concentrated [37]. In the Piracicaba basin, the stations RD025 and RD029 were installed in city surroundings, whereas the gauge RD031 is within the extent of Cel. Fabriciano city, but was installed upstream of its urban center. In the latter case, the station did not receive the raw (untreated) sewage generated in one of the most populated cities in the basin. This fact explains the larger E. coli concentration upstream of the Piracicaba basin in relation to downstream.
Additionally, although the impairment at gauge RD031 is greater than 50% (Figure 2a), if the confidence interval of the Loess curve is considered in relation to the TMDL curve, these overlap, resulting in an impairment percentage of about 20%, mainly observed in high flows. In this gauge, the E. coli contamination is probably attributed to runoff, cattle grazing, or/and the sewage from upstream cities that arrive faster in the gauge during events of high flows, not allowing the process of bacterial decay.
For the Piranga basin, the stations RD001 and RD007 are located close to the urban center of the cities, while station RD013 is located downstream from the city center. For the first stations, most sewage generated by the city is likely loaded after their location, which would result in sampled E. coli values that do not reflect the water quality in the cities. For station RD013, most sewage load generated by the city likely reaches the gauge, although some of the loaded E. coli can decay from the moment the sewage reaches the river. This affirmation is supported by the fact that in this gauge the smoothing curve of the observed samples of E. coli is distant from the TMDL curve for all permanences of streamflow. This situation is probably a consequence of the untreated sewage load from Ponte Nova, one of the most populated cities of this basin, with over 53,000 inhabitants.
The station RD023, on the other hand, is located in a preserved state park, the Doce State Park, away from the urban centers of both cities where it is located. Although a higher E. coli concentration is observed near the cities, it cannot be ruled out that this region is a pole of swine farming activities, which also contribute to a high concentration of pathogens, due to high percentages of waste deposition that are not respecting the environmental regulations [27].
Apart from station RD013, the tendency of the median is to reduce as the drainage area increase. The same pattern was observed in other basins, in which higher E. coli loading was observed to be higher in watersheds with smaller drainage areas and with a higher population density [38]. It is suggested that this situation probably arises because, in smaller basins, the pollutant source is likely to be closer to the streams, resulting in a faster transport rate, thus reducing the impact of E. coli decay. Similarly, in the surroundings of Chinese counties and evidenced that the arrangements of the cities play an important role in water quality. Thus, government initiatives, such as structural reforms and environmental regulations, impact water quality [37]. An example of the importance of government actions in order to improve water quality happened in the USA when a detergent ban and a total P discharge was imposed in order to reduce algal blooms [39].

E. coli Load Patterns in the Piracicaba Basin
As expected, a higher RD was observed for the seasons with lower flows, which can receive closer attention in the establishment of measurements to reduce the pollution load. In a study developed in urban catchments in Florida (USA), higher E. coli contamination was observed during the summer, with wet and hot conditions [38]. This difference in relation to the results presented for this basin is probably due to the difference in scale, which is higher here, resulting in a delay in the transport velocity, lessening the effect of the diffuse pollution.
In a different study, the dynamics of fecal indicators were analyzed according to different conditions of land use, season and water chemistry. It was observed that sometimes the indicators had different patterns throughout the basin [40]. E. coli, in agricultural basins, maintained high concentrations throughout the year, whereas, in urban cities, it tended to increase during the summer season and is smaller during the winter.
The combination of these analyses can improve the current water quality assessment method where only the pollutant concentration is monitored. For instance, it is possible to estimate the pollution pattern, i.e., the regime where it is more recurrent and when there are more samples away from water quality criteria. In gauge RD025, for instance, the percentage of impairment is higher for greater flows (almost 94% of impairment during HF). However, as it is observed in Figure 5a, the magnitude of pollution is higher for low flows, as the median RD of low flows in this class is more distant to x-axis (RD equivalent to zero). As a whole, E. coli contamination in the basin is a concern and is the reason why all classes of flow and sources of pollution (point and non-point) should be addressed.

E. coli Load Patterns in the Piranga Basin
In the Piranga basin, the total impairment is ampler than in the Piracicaba basin, varying from over 95% in gauge RD013 to below 10% in gauge RD023. The impairment is larger in RD001 than in RD007, while this was not possible to detect in the smoothing curve. In a study developed in different scales in Texas: Field, small watershed and river basin, E. coli concentrations were evaluated with different agricultural management and land use [41]. In this study, the authors also observed that the primary E. coli sources in rural basins are wildlife, streamflow resuspension, failing wastewater treatment plants and animal feeding operations. Additionally, in the river courses evaluated, E. coli contamination tends to be higher upstream and decreases downstream. Throughout the Piranga basin, the main sources of pollution are likely caused by cultivated and grazed lands and swine production [27], except by gauge RD013, in which contamination is caused by failing and non-existing wastewater treatment plants.
In the Piranga basin, there is a need for E. coli load reduction in the HFs in all gauges. However, in gauges RD001 and RD013, the magnitude of reduction is higher for the LFs (Table 3). This can be used as evidence to address these flows first when summed with the fact that low flows can represent critical conditions in the context of water supply.
In gauge RD001, E. coli load reduction is also needed in all seasons, but unlike in gauge RD013, a higher variation throughout the seasons is observed.
With the information of the flow regimes with the must recurrence of impairment and the months in which the problems are more aggravated, it is easier for the statewide environmental agencies to identify potential sources of pollution and thus impose regulations to minimize this load.

E. coli Load Variation in Piracicaba and Piranga Basins
In order to understand the dynamics of the observed E. coli load throughout the Doce river basin, the smoothing curves of the gauges within the Piracicaba and Piranga sub-basins were plotted together in Figure 11. Given the smaller size of the Piracicaba basin in comparison to the Piranga, there is not a significant load variation throughout the former basin (Figure 11a). For instance, with 95% confidence, the load in gauge RD029 is equivalent to the E. coli loads in gauges RD025 and RD031 at some flow permanence. In this basin, it is also possible to observe a steeper decrease in the E. coli load in the gauge RD031 as the permanence increases in relation to the other gauges in the basin.
In the Piranga basin (Figure 11b), a broader load variation is observed. The load in gauge RD013 is higher in all permanences of flow, as observed in other studies, where the agricultural basin presented higher and constant E. coli concentrations, in relation to the urban [40]. In gauge RD001, the load in low flow conditions is higher than the loads in the gauges RD007 and RD023 for higher permanences, even considering its smaller drainage area. The difference in the load of gauge RD013, with respect to the other gauges throughout the permanences of streamflow, can be attributed to the fact that it is located in the city with a higher amount of inhabitants, over 53,000, in relation to the others, in which the population is under 7,000 inhabitants.
The Loess curve does not present a continuum decrease in its load magnitude, as expected for the TMDL curve, because it was generated for the sampled observations, which reflected the water quality of these gauges. Therefore, the decrease in permanence does not imply in the reduction of the load, once it will follow the water quality observed in each permanence of flow and, for this reason, the Loess curve can have positive and negative slopes throughout the streamflows.
On that basis, the source control actions to be taken in order to improve water quality in the studied basins should prioritize regions where it could have more impact in the basin. In the Piracicaba basin, the gauges upstream (RD025 and RD029) are likely good candidates given the smaller drainage area and a probable high percentage of point sources. In the Piranga basin, on the other hand, the gauge where the adoption of pollution control practices would have a higher impact is the RD013. In the listed gauges, the main advantage is the existence of point sources of pollution, which are more feasible to identify and further control. Given the smaller size of the Piracicaba basin in comparison to the Piranga, there is not a significant load variation throughout the former basin (Figure 11a). For instance, with 95% confidence, the load in gauge RD029 is equivalent to the E. coli loads in gauges RD025 and RD031 at some flow permanence. In this basin, it is also possible to observe a steeper decrease in the E. coli load in the gauge RD031 as the permanence increases in relation to the other gauges in the basin.

Conclusions
In the Piranga basin (Figure 11b), a broader load variation is observed. The load in gauge RD013 is higher in all permanences of flow, as observed in other studies, where the agricultural basin presented higher and constant E. coli concentrations, in relation to the urban [40]. In gauge RD001, the load in low flow conditions is higher than the loads in the gauges RD007 and RD023 for higher permanences, even considering its smaller drainage area. The difference in the load of gauge RD013, with respect to the other gauges throughout the permanences of streamflow, can be attributed to the fact that it is located in the city with a higher amount of inhabitants, over 53,000, in relation to the others, in which the population is under 7,000 inhabitants.
The Loess curve does not present a continuum decrease in its load magnitude, as expected for the TMDL curve, because it was generated for the sampled observations, which reflected the water quality of these gauges. Therefore, the decrease in permanence does not imply in the reduction of the load, once it will follow the water quality observed in each permanence of flow and, for this reason, the Loess curve can have positive and negative slopes throughout the streamflows.
On that basis, the source control actions to be taken in order to improve water quality in the studied basins should prioritize regions where it could have more impact in the basin. In the Piracicaba basin, the gauges upstream (RD025 and RD029) are likely good candidates given the smaller drainage area and a probable high percentage of point sources. In the Piranga basin, on the other hand, the gauge where the adoption of pollution control practices would have a higher impact is the RD013. In the listed gauges, the main advantage is the existence of point sources of pollution, which are more feasible to identify and further control.

Conclusions
This study presents an alternative method to estimate E. coli load reduction in different locations of Piranga and Piracicaba basins, considering different flow regimes and seasons, using the concept of Load Duration Curves (LDC) and Total Maxim Daily Loads (TMDL). The results made evident that the loading of raw sewage in the rivers is one of the leading causes of E. coli contamination in all flow regimes for both basins. Thus, the first practice in the basins to control E. coli pollution should be the installation of wastewater treatment plants.
In the Piracicaba basin, the tendency is that the observed load is higher than the TMDL curve for all flow regimes, with a higher need for load reduction in low flows, in upstream gauges. On gauge RD031, on the other hand, located downstream of the basin, although there is an impairment of almost 70%, the TMDL load is equal to the observed load for about 80% of the streamflow permanence, considering the smoothing (Loess) curve.
For the Piranga basin, there is a higher variation in the E. coli contamination. Nevertheless, more efforts should focus on controlling the contamination in the higher flows (HF), because the RD was positive for all gauges under this condition, and the minimum percentage of impairment was over 65%.
The study proposes a different way to monitor water quality in Brazilian waters. Nowadays, the primary method used to measure water quality is through the measurement of pollutant concentration, not considering the streamflow related to that concentration. However, in order to improve water quality, it is essential to monitor both aspects, namely water quality and quantity, to propose techniques that are appropriate for the watershed and will control pollution.
The study brought different analyses that can be explored to improve water quality in the studied and other basins. For instance, with this methodology, it is possible to assess water quality and propose periodical goals in order to achieve a certain water standard. The method can also help define the needed efficiency in wastewater treatment plants, and thus, the definition of proper technology. Also, the identification of potential pollution sources can be facilitated with the seasonality approach. This type of approach can aid in the understanding of fate, transport, and survival of E. coli, which is a pollutant of many basins. Nevertheless, the methodology proposed can be applied to different contaminants.