The Role of Landscape Conﬁguration, Season, and Distance from Contaminant Sources on the Degradation of Stream Water Quality in Urban Catchments

: Water resources are threatened by many pollution sources. The harmful e ﬀ ects of pollution can be evaluated through biological indicators capable of tracing problems in life forms caused by the contaminants discharged into the streams. In the present study, the e ﬀ ects on stream water quality of landscape conﬁguration, season, and distance from contaminant emissions of di ﬀ use and point sources were accessed through the evaluation of a Portuguese macroinvertebrate index (IPtI N ) in 12 observation points distributed within the studied area (Ave River Basin, Portugal). Partial least-squares path models (PLS-PMs) were used to set up cause–e ﬀ ect relationships between this index, various metrics adapted to forest, agriculture, and artiﬁcial areas, and the aforementioned emissions, considering 13 distances from the contaminant sources ranging from 100 m to 56 km. The PLS-PM models were applied to summer and winter data to explore seasonality e ﬀ ects. The results of PLS-PM exposed signiﬁcant scale and seasonal e ﬀ ects. The harmful e ﬀ ects of artiﬁcial areas were visible for distances larger than 10 km. The impact of agriculture was also distance related, but in summer this inﬂuence was more evident. The forested areas could hold onto contamination mainly in the winter periods. The impact of di ﬀ use contaminant emissions was stronger during summer, when accessed on a short distance. The impact of e ﬄ uent discharges was small, compared to the inﬂuence of landscape metrics, and had a limited statistical signiﬁcance. Overall, the PLS-PM results evidenced signiﬁcant cause–e ﬀ ect relationships between land use metrics and stream water quality at 10 km or larger scales, regardless of the season. This result is valid for the studied catchment, but transposition to other similar catchments needs to be carefully veriﬁed given the limited, though available, number of observation points. distances, the e ﬀ ect that water quality. Peak values were found for distances of 4 km (pcw = − 0.560) and 10 km in summer (pcw = − 0.648), but for distances larger than 10 km, the changes were minimal. The results lead to the conclusion that, for the Ave River basin, agriculture is a threat to water quality, while the impact seems to be stronger during the summer period.


Introduction
The growing population and demographic expansion threaten hydric resources, not only by inducing stressful water demands but also because of the continuous surge of pollution sources. The response to anthropogenic pressures relies on proper management that should always stand on environmental research. The risks to water quality are well known by experts, but continuous research should be applied since the world is in constant change [1,2]. Effluent discharges are an undeniable

Study Area
The Ave River Basin is located in the northern region of Portugal ( Figure 1A), occupying an area of approximately 1322 km 2 . The main water course extends for 100 km, the most important tributary catchments are the Este (247 km 2 ) and Vizela (323 km 2 ) rivers. The altitude ranges from 0 m along the Atlantic coast to 1254 m at the Cabreira mountains, where the catchment headwaters are located. This river basin is surrounded to the west the by the Atlantic Ocean, to the south by the Leça River Basin, to the east by the Douro River Basin, and to the north by the Cávado River Basin. The group of three river basins, Ave, Leça and Cávado, belong to the same management unit, namely hydrographic region number 2 [53].
area of approximately 1322 km 2 . The main water course extends for 100 km, the most important tributary catchments are the Este (247 km 2 ) and Vizela (323 km 2 ) rivers. The altitude ranges from 0 m along the Atlantic coast to 1254 m at the Cabreira mountains, where the catchment headwaters are located. This river basin is surrounded to the west the by the Atlantic Ocean, to the south by the Leça River Basin, to the east by the Douro River Basin, and to the north by the Cávado River Basin. The group of three river basins, Ave, Leça and Cávado, belong to the same management unit, namely hydrographic region number 2 [53].
In the second half of the 20th century, the Ave River Basin was heavily contaminated by untreated domestic and industrial effluents, being tagged as "Europe's Great Sewer". Following the construction of public wastewater treatment plants in the 90s, water quality increased, but some microbial contamination persisted related to improper functioning of some domestic plants. The heavy pollution of Ave comprised high concentrations of heavy metals in sediments and freshwater, especially in the Este, Selho, and Vizela rivers [9,54]. This condition improved after public investment in the wastewater treatment plants [55,56]. The abundant and persistent nutrient and metal contamination deteriorated the river's ecological status at the central area and lowlands of the Ave River Basin [57,58] where industrial areas have been settled on the river banks for years. Besides the effluents from domestic and industrial origins, contributions from agriculture and livestock production have also been reported as significant causes of water quality and ecological deterioration [59][60][61].  In the second half of the 20th century, the Ave River Basin was heavily contaminated by untreated domestic and industrial effluents, being tagged as "Europe's Great Sewer". Following the construction of public wastewater treatment plants in the 90s, water quality increased, but some microbial contamination persisted related to improper functioning of some domestic plants. The heavy pollution of Ave comprised high concentrations of heavy metals in sediments and freshwater, especially in the Este, Selho, and Vizela rivers [9,54]. This condition improved after public investment in the wastewater treatment plants [55,56]. The abundant and persistent nutrient and metal contamination deteriorated the river's ecological status at the central area and lowlands of the Ave River Basin [57,58] where industrial areas have been settled on the river banks for years. Besides the effluents from domestic and industrial origins, contributions from agriculture and livestock production have also been reported as significant causes of water quality and ecological deterioration [59][60][61].

Workflow
The purpose of the present study was to show how the cause-effect relationships between ecological integrity and pollution sources/indicators changed with the season and distance from contaminant sources. Ecological integrity was assessed through the measurement of a macroinvertebrate index (IPtI N ; see Equation (1) below) in 12 sampling sites along the Ave River Basin ( Figure 1B) during the winter and summer seasons of 2017. For these sites, the entire upstream drainage area was delineated and then sectioned at predefined distances from the sampling site ( Figure 1C). Subsequent to catchment delineation and sectioning, land use and contaminant emission data were prepared for each section to be used in PLS-PM models. Two separate models were defined based on IPtI N values determined in winter and summer, respectively. The purpose was to explore how the effects of anthropogenic pressures could change as a function of season and scale. Figure 2 displays the adopted workflow, summarized in 6 steps.
was to explore how the effects of anthropogenic pressures could change as a function of season and scale. Figure 2 displays the adopted workflow, summarized in 6 steps.
For the delineation of drainage areas and drainage networks (Step 1), ArcMap [62] and its embedded module ArcHydro [63] were used. These computer packages are the key technical elements used in environmental studies with a strong spatial incidence [64][65][66][67][68][69][70][71]. As an input, a digital elevation model with a pixel resolution of 25 m was used. To design each drainage area, the tool "Batch Watershed Delineation" was used, followed by previous procedures [72]. For each sampling site, the "Buffer " ArcMap tool was used to create a circular area, while the created buffers extended  for 100, 250, 500, 1000, 2000, 3000, 4000, 5000, 7000, 10,000, 15,000, 20,000, and 56,000 m (Step 2). Each drainage area intersected with the respective buffer in order to create drainage sections (Step 3). The collected spatial data (Table 1) were processed and applied in the PLS-PM models (Step 4). The land uses and discharge emissions were collected using the ArcMap "Intersect" tool. A total of 26 PLS-PM models were created, one for each season and distance combination. The algorithm was executed in SmartPLS software [73]. PLS-PM analysis was the chosen method because it could establish cause-effect relationships for the studied latent variables, which were termed "Land Use", "Contaminant Emissions", and "Ecological Integrity". Formative models were chosen because these have prediction capabilities and, at the same time, establish the wanted cause-effect relationships [74]. The 26 model outputs were measured with weights, path coefficients, and R-squared values. The outputs were compiled and compared using graphics designed in Excel [75] (Step 6).  For the delineation of drainage areas and drainage networks (Step 1), ArcMap [62] and its embedded module ArcHydro [63] were used. These computer packages are the key technical elements used in environmental studies with a strong spatial incidence [64][65][66][67][68][69][70][71]. As an input, a digital elevation model with a pixel resolution of 25 m was used. To design each drainage area, the tool "Batch Watershed Delineation" was used, followed by previous procedures [72]. For each sampling site, the "Buffer" ArcMap tool was used to create a circular area, while the created buffers extended for 100, 250, 500, 1000, 2000, 3000, 4000, 5000, 7000, 10,000, 15,000, 20,000, and 56,000 m (Step 2). Each drainage area intersected with the respective buffer in order to create drainage sections (Step 3). The collected spatial data (Table 1) were processed and applied in the PLS-PM models (Step 4). The land uses and discharge emissions were collected using the ArcMap "Intersect" tool. A total of 26 PLS-PM models were created, one for each season and distance combination. The algorithm was executed in SmartPLS software [73]. PLS-PM analysis was the chosen method because it could establish cause-effect relationships for the studied latent variables, which were termed "Land Use", "Contaminant Emissions", and "Ecological Integrity". Formative models were chosen because these have prediction capabilities and, at the same time, establish the wanted cause-effect relationships [74]. The 26 model outputs were measured with weights, path coefficients, and R-squared values. The outputs were compiled and compared using graphics designed in Excel [75] (Step 6).

Dataset
In this study, a group of variables was gathered ( Table 1) that could be connected to the variation of IPtI N . The actual data are provided as Supplementary Material. The elevation model was used for the delineation of drainage areas. The effluent discharge points values were provided by the Agência Portuguesa do Ambiente (APA; in English, Portuguese Environmental Agency) in the form of shapefiles. Each point contained was attached to information on the total discharge of nitrogen, phosphorous, and chemical and biological oxygen demands, expressed in released kilograms during the year of 2016. The discharge of nitrogen and phosphorous from livestock production, agriculture, and forestry were also provided by the APA in the form of shapefiles. Each polygon was a catchment containing the released kilograms of nutrients during the year of 2016. The most recent Portuguese land use map refers to 2015 (COS 2015) and is available at the Portuguese Territory Planning website. This map was obtained in the form of a shapefile, containing the land use or occupation from each zone. The land use types were assembled into 4 categories: agriculture, artificial surfaces, forest, and seminatural areas and water bodies. For the calculation of IPtI N , samples were collected in situ during the summer and winter of 2017. After laboratory analyses, the index was calculated for the twelve locations.
For each drainage section, the discharging values of BOD (biochemical oxygen demand), COD (chemical oxygen demand), N, and P were summed and then divided by the drainage area, creating four variables representing the discharge of each type of nutrient and oxygen demand. For diffuse discharges, the total release discharge of N and P in the drainage sections was calculated and then divided by the respective area, resulting in 4 variables: the releases of N and P from livestock and forest/agriculture. For each section, landscape metrics were calculated using a python toolbox embedded in ArcMap [76]. A total of 17 metrics were calculated for all the drainage sections ( Table 2). The north invertebrate Portuguese index (IPtI N ) is widely used to evaluate the ecological status of stream waters in northern Portugal [77]. The IPtI N index reflects the abundance and diversity of benthic invertebrates that are sensitive to all forms of pollution [77][78][79].
For the measurement of this indicator, organism samples were collected from 12 surface water locations, illustrated in Figure 1C. For each site, the organisms were classified and counted, and then Equation (1) was used to calculate the IPtI N score. The equation is complex since it uses a variety of parameters, namely, the number of taxonomic groups present in the sample (N • taxa); the number of families that belong to Ephemeroptera, Plecoptera, and Trichoptera orders (EPT); Pioleu index or evenness [80,81]; biological monitoring working party index divided by the number of families included in this index (IASPT) [82]; and the sum of individuals belonging to Heptageniidae, Ephemeridae, Brachycentridae, Goeridae, Odontoceridae, Limnephilidae, Polycentropodidae, Athericidae, Dixidae, Dolichopodidae, Empididae, and Stratiomyidae families (Sel.ETD): Among all possible variables that could be used in this study, only 8 were chosen for the PLS-PM models. The purpose was to reach low variance inflation factors (VIFs), and hence statistical significance, making a note that variables of the same domain can be strongly correlated in raising the VIFs. To represent the effluent discharges, the released annual flow divided by the drainage area was used, naming this variable as "Point Source". For diffuse contamination, the discharges of nitrogen from livestock, forestry, and agriculture were used, naming these variables as "Livestock" and "Forest and Agriculture", respectively. The chosen landscape metrics were Shannon's diversity, the edge density of forest and seminatural areas, the number of patches of artificial surfaces that were connected at a distance of 500 m, and the percentage of "agricultural areas" that were connected at a distance of 500 m. The land use variables were named as "Diversity", "Forest", "Artificial" and "Agriculture", respectively. The Portuguese index of macroinvertebrates was also used, named as "IPtIN" for the PLS-PM models, as an evaluator of biodiversity as well as water quality. Figure 3 illustrates the spatial distribution of pressures in the Ave River Basin. Figure 3A depicts the land use map of 2015. It can be noted that 50% was occupied by the dominant land use, forest and seminatural areas. Agricultural areas occupied 30% of the river basin, 20% was artificial surfaces, and less than 0.5% was water bodies. The values of discharged nitrogen from livestock, forest, and agricultural areas in the river basin catchments can be seen in Figure 3B,D. The scattered effluent discharge sites are represented in Figure 3C. Among a total of 60 locations, 24 were discharge sites of industrial treatment plants, while 36 were from domestic sewage treatment plants. forest and seminatural areas. Agricultural areas occupied 30% of the river basin, 20% was artificial surfaces, and less than 0.5% was water bodies. The values of discharged nitrogen from livestock, forest, and agricultural areas in the river basin catchments can be seen in Figure 3B,D. The scattered effluent discharge sites are represented in Figure 3C. Among a total of 60 locations, 24 were discharge sites of industrial treatment plants, while 36 were from domestic sewage treatment plants. The twelve river sites where the IPtIN was measured are represented in Figure 1, numbered from 101 to 112. Table 2 depicts the IPtIN values and respective classification. During the winter of 2017, the ecological status was classified as "Excellent" in two sites, "Moderate" in 4, "Poor" for another 4 sites, and "Very Poor" in 2. Overall, the values increased from winter to summer, as none of the sites was classified as "Very Poor" in summer, 6 were classified as "Poor", 3 as "Moderate", and 3 as "Excellent". In the locations 103, 105, 111, 112, 107, and 108, the classification changes were minimal. Besides, in site 111 the classification changed from "Moderate" to "Poor" because the IPtIN value was very close to the class threshold. In site 106 the classification changed from "Poor" to "Moderate", but in sites 109, 110, 103, and 104 the decrease in ecological status was startling because there was a significant decrease in IPtIN and subsequent classification of a less-rich class. For site 101 there was a significant increase, from "Poor" to the maximum class "Excellent". These changes of values were dependent upon seasonal effects but also on the pressures in surface waters. The twelve river sites where the IPtI N was measured are represented in Figure 1, numbered from 101 to 112. Table 2 depicts the IPtI N values and respective classification. During the winter of 2017, the ecological status was classified as "Excellent" in two sites, "Moderate" in 4, "Poor" for another 4 sites, and "Very Poor" in 2. Overall, the values increased from winter to summer, as none of the sites was classified as "Very Poor" in summer, 6 were classified as "Poor", 3 as "Moderate", and 3 as "Excellent". In the locations 103, 105, 111, 112, 107, and 108, the classification changes were minimal. Besides, in site 111 the classification changed from "Moderate" to "Poor" because the IPtI N value was very close to the class threshold. In site 106 the classification changed from "Poor" to "Moderate", but in sites 109, 110, 103, and 104 the decrease in ecological status was startling because there was a significant decrease in IPtI N and subsequent classification of a less-rich class. For site 101 there was a significant increase, from "Poor" to the maximum class "Excellent". These changes of values were dependent upon seasonal effects but also on the pressures in surface waters.

Interpretation of a PLS-PM Example Model
The output models of SmartPLS were all similar to the one represented, as an example, in Figure 4. Each measured variable (MV) is represented as a yellow rectangle, and latent variables (LVs) as blue circles. Inside the LVs preceding other LVs, the R-squared value is portrayed. In the example model, only "Ecological Integrity" has an R-squared value, since this is the only variable that has a measured score (calculated by the sum of the product of MVs with the own weight) and a predicted score (calculated by the sum of the product between the LVs). In a PLS-PM model, weights and path coefficients are determined through an iterative process, termed the path algorithm [83], with the purpose to maximize the R-squared value.
Water 2019, 11, x FOR PEER REVIEW 8 of 19

Interpretation of a PLS-PM Example Model
The output models of SmartPLS were all similar to the one represented, as an example, in Figure 4. Each measured variable (MV) is represented as a yellow rectangle, and latent variables (LVs) as blue circles. Inside the LVs preceding other LVs, the R-squared value is portrayed. In the example model, only "Ecological Integrity" has an R-squared value, since this is the only variable that has a measured score (calculated by the sum of the product of MVs with the own weight) and a predicted score (calculated by the sum of the product between the LVs). In a PLS-PM model, weights and path coefficients are determined through an iterative process, termed the path algorithm [83], with the purpose to maximize the R-squared value. Several PLS-PM models were built and analyzed in this study. In order to exemplify how these models can be interpreted, a PLS-PM model is demonstrated in Figure 4, where the data were gathered for the drainage sections within a distance of 4 km and IPtIN values measured during the winter of 2017.
Each LV was formed by one or more MVs. For the present case study, an LV "Land Use" was created and composed of 4 MVs, namely "Diversity", "Forest", "Agriculture", and "Artificial". The LV "Contaminant Emissions" was composed of three MVs pertaining to different types of contaminant flows: "Point Source", "Livestock", and "Forest and Agriculture". "Ecological Integrity" was formed by a single MV, which is the IPtIN. This LV accumulated the effects of the other LVs that were pressures in surface waters, which is why "Contaminant Emissions" and "Land Use" were connected to "Ecological Integrity". The PLS-PM model was divided into two sub-models, inner and outer. The equations of the measured scores of each LV were calculated according to Equations (2)-(4) for "Land Use", "Contaminant Emissions", and "Ecological Integrity", respectively, and are the equations that composed the outer models. The inner model was composed of relations between latent variables, which, in this case, is solely expressed by Equation  Several PLS-PM models were built and analyzed in this study. In order to exemplify how these models can be interpreted, a PLS-PM model is demonstrated in Figure 4, where the data were gathered for the drainage sections within a distance of 4 km and IPtI N values measured during the winter of 2017.
Each LV was formed by one or more MVs. For the present case study, an LV "Land Use" was created and composed of 4 MVs, namely "Diversity", "Forest", "Agriculture", and "Artificial". The LV "Contaminant Emissions" was composed of three MVs pertaining to different types of contaminant flows: "Point Source", "Livestock", and "Forest and Agriculture". "Ecological Integrity" was formed by a single MV, which is the IPtI N . This LV accumulated the effects of the other LVs that were pressures in surface waters, which is why "Contaminant Emissions" and "Land Use" were connected to "Ecological Integrity". The PLS-PM model was divided into two sub-models, inner and outer. The equations of the measured scores of each LV were calculated according to Equations (2)-(4) for "Land Use", "Contaminant Emissions", and "Ecological Integrity", respectively, and are the equations that composed the outer models. The inner model was composed of relations between latent variables, which, in this case, is solely expressed by Equation Ecological Integrity Measured Score = IPtI N × (1.000 To interpret the example model, the weights and path coefficients should be analyzed simultaneously, which can be viewed in Equation (6), where the combination of Equations (2)-(5) is made. For example, the MV "Diversity" has a positive weight (0.208), so it increases the LV "Land Use", while the same applies to "Agriculture" (0.561). Conversely, "Forest" and "Artificial" have negative weights, namely −0.569 and −0.033, and therefore decrease "Land Use". But since the path coefficient of "Land Use" in "Ecological Integrity" is negative (−1.085), "Diversity" and "Agriculture" are variables that decrease "Ecological Integrity" because the product of the weight and path coefficient is negative: −0.226 and −0.560, respectively. On the other hand, "Forest" and "Artificial Areas" increase "Ecological Integrity", since the product between the path coefficient and weight is positive, respectively 0.617 and 0.036. Equation (6) expresses the total effect of each pressure in "Ecological Integrity". The results of this study were based on the analysis of the product between the path coefficients and weights (termed pcw) for the studied 26 models.

Results of All PLS-PM Models
As shown in Table 2, the IPtI N values were collected during two seasons, winter and summer. For this reason, the 26 PLS-PM models were divided into two groups, winter (2017) and summer (2017), and traced as two dot arrays colored as blue and red, respectively, in Figure 5. For each model in the respective group, the pressure values were used as input data, gathered from the 13 drainage sections and the IPtI N values collected in winter or summer. Figure 5 portrays the results of the PLS-PM models. Each graphic describes the pcw of a measured variable in all models (y axis). The x axis represents the logarithm of the buffer distance for the respective model. For the distances of 100, 250, 500, 1000, 2000, 3000, 4000, 5000, 7000, 10,000, 15,000, 20,000, and 56,000 meters, the log 10 scores were 2, 2.4, 2.7, 3, 3.3, 3.5, 3.6, 3.7, 3.8, 4, 4.2, 4.3, and 4.7, respectively. The purpose of the plots was to illustrate the effects of the pressures in "Ecological Integrity" ("IPtI N ").
The effect of "Artificial" was independent of the season since the variations with distance were practically identical for both winter and summer. For distances shorter than 10 km, the effect was positive, but for longer distances the effect became negative. The strongest positive effects were detected for a distance of 100 m in summer (pcw = 0.386) and for 1000 m in winter (pcw = 0.310). The strongest negative effects were detected for the maximum distance (56 km) (i.e., for the entire drainage areas) either in winter (pcw= −0.247) or summer (pcw= −0.201).
For "Agriculture" it was seen that for both winter and summer periods, the effect was practically identical, but the summer line was below the winter line for a majority of buffer distances (only between 3 km and 5 km is the red line above the blue). The effect of agriculture was practically null for a distance of 100 m in summer (pcw = 0.008). For the same distance, it was positive during winter (pcw = 0.276) and practically null for a distance of 250 m (pcw = 0.01). For longer distances, the effect was negative, which indicated that agriculture decreased water quality. Peak values were found for distances of 4 km in winter (pcw = −0.560) and 10 km in summer (pcw = −0.648), but for distances larger than 10 km, the changes were minimal. The results lead to the conclusion that, for the Ave River basin, agriculture is a threat to water quality, while the impact seems to be stronger during the summer period.
(pcw = −0.856) or in summer (pcw = −0.456). The effect was always negative for winter periods, and stronger in this season for almost all distances, except between 4 and 5 km. For the summer period, the effect was close to zero, but still negative; only distances of 500 m, 10, and 15 km were close to zero, since the pcw values were −0.024, 0.059, and 0.026, respectively. For the longest distance (56 km), the effect was practically the same for winter (pcw = −0.290) and summer (pcw = −0.298). The results provide evidence that the impact of "Diversity" is a threat in winter.  Peak values of "Diversity" were detected for the minimum distance, 100 m, either in winter (pcw = −0.856) or in summer (pcw = −0.456). The effect was always negative for winter periods, and stronger in this season for almost all distances, except between 4 and 5 km. For the summer period, the effect was close to zero, but still negative; only distances of 500 m, 10, and 15 km were close to zero, since the pcw values were −0.024, 0.059, and 0.026, respectively. For the longest distance (56 km), the effect was practically the same for winter (pcw = −0.290) and summer (pcw = −0.298). The results provide evidence that the impact of "Diversity" is a threat in winter.
The effect of "Forest" was essentially positive. For the winter period, the effect was positive for all distances, and always higher than in summer, where the effect was negative but close to zero for the distances 500 m (pcw = −0.080) and 56 km (pcw = −0.049). Peak values occurred on the shortest scale, 100 m. In winter the pcw was 0.853, and in summer it was 0.419. As the buffer distances increased, the effects changed irregularly (drop and rise), but from an overall view, values were always between 0.853 and 0.374 during winter. It can be said that globally (for winter and summer), "Forest" favors water quality.
The variable that had less effects along all distances and seasons was "Point Source". For this variable and the models that comprehended distances between 100 to 500 m, the attributed weight was 0, since for short distances there were no discharge points. Even so, compared to all the other variables, it had less impact because the pcw values were contained in a short range that varied from −0.198 to 0.226. For winter, negative values were found in 1 km (pcw = −0.198) and 7 km (pcw = −0.006), while in summer the negative values were observed for distances longer than 7 km. This indicates that the effect of effluent discharges only decreased IPtI N values during the summer period when long distances were analyzed, but with minimal impact.
The discharges of nitrogen from diffuse pressures were represented by the variables "Forest and Agriculture" and "Livestock". When both graphics were compared, it was seen that there was an inverse relationship between these two variables for all models in both seasons. When the effect "Forest and Agriculture" increased, "Livestock" decreases. For the summer period, the effect of "Livestock" was always negative, while the effect of "Forest and Agriculture" was always positive. These effects were stronger over shorter distances, since maximum values for "Forest and Agriculture" and minimum values for "Livestock" appeared over the short distances. But as the distance increased, both effects approached zero. The variations of both variables were minimal for short distances (≤1 km), positive for "Forest and Agriculture", and negative for "Livestock". At the distances 2, 3, and 4 km, the effect became positive for "Livestock" and negative for "Forest and Agriculture", with a peak at 4 km. For distances longer than 4 km, the effect tended to zero.
The analysis of the pcw variations for the 7 measured variables for all the models is crucial to comprehend the cause-effect relationship changes as function of season and distance. But the analysis of the R-squared values ( Figure 6) reveals the models' capacity to explain IPtI N variations. The effect of "Forest" was essentially positive. For the winter period, the effect was positive for all distances, and always higher than in summer, where the effect was negative but close to zero for the distances 500 m (pcw = −0.080) and 56 km (pcw = −0.049). Peak values occurred on the shortest scale, 100 m. In winter the pcw was 0.853, and in summer it was 0.419. As the buffer distances increased, the effects changed irregularly (drop and rise), but from an overall view, values were always between 0.853 and 0.374 during winter. It can be said that globally (for winter and summer), "Forest" favors water quality.
The variable that had less effects along all distances and seasons was "Point Source". For this variable and the models that comprehended distances between 100 to 500 m, the attributed weight was 0, since for short distances there were no discharge points. Even so, compared to all the other variables, it had less impact because the pcw values were contained in a short range that varied from −0.198 to 0.226. For winter, negative values were found in 1 km (pcw = −0.198) and 7 km (pcw = −0.006), while in summer the negative values were observed for distances longer than 7 km. This indicates that the effect of effluent discharges only decreased IPtIN values during the summer period when long distances were analyzed, but with minimal impact.
The discharges of nitrogen from diffuse pressures were represented by the variables "Forest and Agriculture" and "Livestock". When both graphics were compared, it was seen that there was an inverse relationship between these two variables for all models in both seasons. When the effect "Forest and Agriculture" increased, "Livestock" decreases. For the summer period, the effect of "Livestock" was always negative, while the effect of "Forest and Agriculture" was always positive. These effects were stronger over shorter distances, since maximum values for "Forest and Agriculture" and minimum values for "Livestock" appeared over the short distances. But as the distance increased, both effects approached zero. The variations of both variables were minimal for short distances (≤1 km), positive for "Forest and Agriculture", and negative for "Livestock". At the distances 2, 3, and 4 km, the effect became positive for "Livestock" and negative for "Forest and Agriculture", with a peak at 4 km. For distances longer than 4 km, the effect tended to zero.
The analysis of the pcw variations for the 7 measured variables for all the models is crucial to comprehend the cause-effect relationship changes as function of season and distance. But the analysis of the R-squared values ( Figure 6) reveals the models' capacity to explain IPtIN variations. The calculated R-squared values of summer varied less than the winter counterparts. The range of values varied from 0.75 (1 km) to 0.91 (56 km) in summer, while in winter they ranged from 0.58 (500 m) to 0.93 (7 km). For the winter period, for distances comprehended between 250 m and 3 km, the model's explicability was below 0.75, but the in winter models that comprehended distances between 4 to 20 km, the values were higher than in the corresponding summer models. The calculated R-squared values of summer varied less than the winter counterparts. The range of values varied from 0.75 (1 km) to 0.91 (56 km) in summer, while in winter they ranged from 0.58 (500 m) to 0.93 (7 km). For the winter period, for distances comprehended between 250 m and 3 km, the model's explicability was below 0.75, but the in winter models that comprehended distances between 4 to 20 km, the values were higher than in the corresponding summer models.
In order to assure that the models had no multicollinearity, it was ensured that all the VIF values were below 5 (please see Supplementary Material). The significance of weights and path coefficients was accessed through bootstrapping. By approaching the traditional threshold for statistical significance, p values larger than 0.05 were achieved for the weights (please see Supplementary Material). On the other hand, it was verified that the path coefficients of "Land Use" were significant, characterized by p values lower than 0.05 for long distances (Figure 7). In order to assure that the models had no multicollinearity, it was ensured that all the VIF values were below 5 (please see Supplementary Material). The significance of weights and path coefficients was accessed through bootstrapping. By approaching the traditional threshold for statistical significance, p values larger than 0.05 were achieved for the weights (please see Supplementary Material). On the other hand, it was verified that the path coefficients of "Land Use" were significant, characterized by p values lower than 0.05 for long distances (Figure 7). The significance of land uses seemed to follow a sigmoid pattern. For the summer period, statistical significance (p < 0.05) was achieved for distances larger than 5 km, but for the winter period, significance was achieved for distances larger than 3 km. When it comes to "Contaminant emissions" none of the models achieved statistical significance. For the summer period, p values increased with distance. For winter, p values seemed not to change for distances below 2 km, increased to 0.124 at 3 km, and then dropped consistently until a distance of 15 km. For 20 and 56 km, there was a notable loss of statistical significance.

Discussion
Before analyzing the results, some expectations regarding the effect of the variables are outlined. It was expected that all pressures had a negative impact on ecological integrity, while "Forest" was expected to have a positive effect. This was expected because in catchments with a high presence of forested areas, good water quality can be found [84]. It was also thought that the measured variables that belonged to the latent variable "Contaminant Emissions" would have a higher effect than "Land Use". This was because the discharges of COD from point sources and nitrogen from livestock, agriculture, and forestry represented the mass flow of contaminants that were transported to surface water, while land use metrics were only indicators of possible pollution. In terms of season and scale, no expectations were anticipated because authors already recognized in different studies that different conclusions can be achieved [34].
The positive effect of "Artificial" for distances below 10 km is hard to explain. By studying the impact of land uses on biological integrity, a positive response was found in urbanized areas [31] by accessing a riparian scale. Another author [85] compared the effects of land uses in the same Index, of Biotic Integrity (IBI) and noted that, in a riparian range, the impacts of urban activity were positively correlated to this index, but for the watershed scale, the outcome was negative, which is in concordance with the present study. This might happen because, at a short range, the impacts of urban presence may be hard to capture, compared to an extended scale. Possibly, urbanized areas only affect water quality when their predominance occurs over a long range.
The agricultural land uses are revealed as a threat to Ave River Basin. For distances larger than 200 m, the effect was negative and increased in scale for both winter and summer seasons. This can The significance of land uses seemed to follow a sigmoid pattern. For the summer period, statistical significance (p < 0.05) was achieved for distances larger than 5 km, but for the winter period, significance was achieved for distances larger than 3 km. When it comes to "Contaminant emissions" none of the models achieved statistical significance. For the summer period, p values increased with distance. For winter, p values seemed not to change for distances below 2 km, increased to 0.124 at 3 km, and then dropped consistently until a distance of 15 km. For 20 and 56 km, there was a notable loss of statistical significance.

Discussion
Before analyzing the results, some expectations regarding the effect of the variables are outlined. It was expected that all pressures had a negative impact on ecological integrity, while "Forest" was expected to have a positive effect. This was expected because in catchments with a high presence of forested areas, good water quality can be found [84]. It was also thought that the measured variables that belonged to the latent variable "Contaminant Emissions" would have a higher effect than "Land Use". This was because the discharges of COD from point sources and nitrogen from livestock, agriculture, and forestry represented the mass flow of contaminants that were transported to surface water, while land use metrics were only indicators of possible pollution. In terms of season and scale, no expectations were anticipated because authors already recognized in different studies that different conclusions can be achieved [34].
The positive effect of "Artificial" for distances below 10 km is hard to explain. By studying the impact of land uses on biological integrity, a positive response was found in urbanized areas [31] by accessing a riparian scale. Another author [85] compared the effects of land uses in the same Index, of Biotic Integrity (IBI) and noted that, in a riparian range, the impacts of urban activity were positively correlated to this index, but for the watershed scale, the outcome was negative, which is in concordance with the present study. This might happen because, at a short range, the impacts of urban presence may be hard to capture, compared to an extended scale. Possibly, urbanized areas only affect water quality when their predominance occurs over a long range.
The agricultural land uses are revealed as a threat to Ave River Basin. For distances larger than 200 m, the effect was negative and increased in scale for both winter and summer seasons. This can reflect that agriculture only affects water quality when the predominance is on a long scale, as there is a high accumulation of contaminants. Likewise, other studies revealed a negative impact over a long scale [24,34,39,86].
The edge density of forested areas "Forest" was the variable that had an expected impact on winter and summer (except at the distances of 56 km and 500 m, where the effects were null and negative, respectively). Many authors have concluded that the effect of forestry improves water quality. Positive impacts are found with biotic indexes [31,32], and negative correlations or effects with contaminant concentrations are found [23][24][25]28,39,86,87] independently of scale, season, and study area, or even accessed metric.
In light of the previously mentioned reasons, it was not expected that the effect of "Land use" would be greater than "Contaminant Emissions". It was noticed that in the winter period, the measured variables that belonged to "Land Use" were the ones with stronger effects (except in the model of the scale 500 m, where livestock had the strongest impact). But, in the summer period, the contaminant emissions from "Forest and Agriculture" and "Livestock" were the variables with the highest pcw. Only at a scale higher than 7 km did the effect of agricultural land use overlap contaminant emissions.
When it comes to significance, the path coefficient of "Land Use" achieved statistical significance in both seasons at a long scale, while "Contaminant emissions" did not. This leads to the conclusion that, in concordance with other authors, the effect of land use should be accessed on a long scale, also called a complete watershed. Both diffuse and point-source discharges have temporal changes [7,[88][89][90]. To access the seasonal effects of contaminant emissions, it is important to trace the temporal changes in the contaminant flow, just like it was done in other studies [91]. In the present study, the point source flow of COD and diffuse discharge resulting from livestock production, forested, and agricultural activities was in the form of annual flow. In fact, it is quite hard to access the released contamination on shorter temporal scales for a whole river basin. But, if the data of point and diffuse sources from APA were monthly, or even seasonal, it is believed that the significance of "Contaminant Emissions" would be higher in the model and could possibly reveal higher effects than landscape metrics. In previous studies where the Ave River Basin water quality was assessed, it was noted that point-source pressures [51] and livestock production [52] were major threats to water quality. But, in those studies, landscape metrics were not used in such a detailed form, only the percentage of catchments occupied by agricultural and artificial areas were used. During summer, the contribution of underground water to river discharge increased due to the lack of rainfall [92]. In the presented models this is shown because contaminant emissions from agriculture, forest, and livestock effects were indeed larger in summer periods than in winter. During winter, the runoff effect was higher due to strong rainfalls, which explains why variables related to land use metrics had a stronger impact in winter rather than in summer.
Another important aspect is that when formative PLS-PM models are adopted, the number of explaining variables cannot be large because of the shortcomings of variance inflation. One technique to reduce variance inflation is by restraining the number of variables [93]. For this reason, the effects of other land use metrics were not accessed in this study, only the edge density of forested areas, Shannon's diversity index, and connectance metrics of agricultural and artificial surfaces. However, this study succeeded in demonstrating the seasonal and scale effects in the interaction between various pollution sources and ecological integrity. The low sample size (n = 12) was a limitation. As a consequence, the weights were not significant. Despite this, it was still possible to achieve significance for the path coefficient "Land Use" latent variable. So, it is recommended to use larger sample sizes in similar and future studies. Since the weights were not significant, the tested models may not be suited for prediction purposes. At any rate, the presented study indicates that, in terms of prediction in the Ave River Basin, longer scales should be adopted, considering the consistent high R-squared values and significance of land use variables observed at these scales.
For proper river basin management, landscape metric variables should always be assessed, as it is quite easy to calculate them using computer packages such as FRAGSTATS [94] or the ArcGIS toolbox [76]. In terms of landscape management, when the consequences of anthropogenic pressures for water quality are to be assessed, the exercise should always be applied to various scales. This is recommended because this study concluded that the sense (positive or negative) of an impact can change with scale.
The most limiting factor in this study was probably the small number of sampling points used to assess the IPtI N , just 12. However, these few samples allowed us to provisionally expose the significant role (p < 0.05) of land use metrics for a satisfactory (R 2 ≈ 0.85) explanation of water quality (IPtI N ) in the long range (>4 km from the contaminant sources). This result is noteworthy. It can be (and was) argued that more samples would render the possibility to reveal the influence of other anthropogenic pressures, eventually hidden in this study by the sample's coarse resolution. Nevertheless, it would not be a surprise if the results obtained in this study were replicated with a finer resolution, as contaminant emissions are subject to larger inter annual variations than are land uses, and, hence, they could eventually be inefficient in the studied period. A larger sample would probably capture fine-resolution effects, for example, related to point-source contaminant emissions, but is not certain that would change the general outcomes and conclusions taken from this study. The main goals were achieved, which were to explore the influence of anthropogenic pressures on water quality as function of scale and season using a novel statistical method. The 26 PLS-PM models implemented in a predefined sequence were capable of identifying the most important variables and distances from contaminant sources that controlled water quality in the Ave River Basin in the studied period. The model results may not be directly used in management initiatives without prior verification using a larger sample, but they suggested how scale and season can affect the conclusions about cause-effect relationships involving anthropogenic pressures and water quality. In that context, the outcomes from this study provided interesting clues for managers of water quality at catchment scale, which are inherently an important scientific result.

Conclusions
This study has shown to be effective in demonstrating seasonal and scale impacts in the interplay between the effect of landscape metrics and contamination sources on water quality. This analysis plays an important role for decision makers to take into account that territory planning is intrinsically linked to water quality. As it was found in this study, the effect of metrics can be greater than contamination sources. From a statistical point of view, this study showed that the use of a long scale is preferable, since it obtained higher coefficients of determination in both winter and summer, but also high statistical significance for the latent variable "Land Use". Even so, it is advised that when water quality studies are carried out, effects should always be analyzed not at a long scale but also at a short scale. This is because each river basin is unique and reveals natural and anthropogenic interactions that are different in each river basin. So, in other locations, stronger effects can be found at shorter scales. In terms of water quality improvement, besides the constant monitoring and reduction of pollution sources, it is pointed out that the presence of forested areas improves the ecological integrity, not in total area but in terms of edge density. As far as agricultural areas are concerned for ecological integrity, strategic relocations can also be a key strategy in order to decrease connectance. While in urban areas, these hardly can be changed, given the enormous costs that such processes can entail.