Application of the RSPARROW Modeling Tool to Estimate Total Nitrogen Sources to Streams and Evaluate Source Reduction Management Scenarios in the Grande River Basin, Brazil

Abstract: Large-domain hydrological models are increasingly needed to support water-resource assessment and management in large river basins. Here, we describe results for the first Brazilian application of the SPAtially Referenced Regression On Watershed attributes (SPARROW) model using a new open-source modeling and interactive decision support system tool (RSPARROW) to quantify the origin, flux, and fate of total nitrogen (TN) in two sub-basins of the Grande River Basin (GRB; 43,000 km²). Land under cultivation for sugar cane, urban land, and point source inputs from wastewater treatment plants were each estimated to contribute approximately 30% of the TN load at the outlet, with pasture land contributing about 10% of the load. Hypothetical assessments of wastewater treatment plant upgrades and the building of new facilities that could treat currently untreated urban runoff suggest that these management actions could potentially reduce loading at the outlet by as much as 20–25%. This study highlights the ability of SPARROW and the RSPARROW mapping tool to assist with the development and evaluation of management actions aimed at reducing nutrient pollution and eutrophication. The freely available RSPARROW modeling tool provides new opportunities to improve understanding of the sources, delivery, and transport of water-quality contaminants in watersheds throughout the world.


Introduction
Large-domain hydrological models are needed to support water-resource assessment and management in large river basins (e.g., [1]) and coastal and estuarine waters with large contributing drainages [2]. Such models must be able to quantify the effects of a diverse range of land uses on streamflow and water quality as well as the climatic and hydrological processes that control water and contaminant transport in surface waters across large spatial scales [1]. Progress has been made in applying large-domain hydrological and water-quality models to advance understanding of land use and process effects across global (e.g., [3,4]), continental [5][6][7][8], and regional spatial scales [9,10]. sources of nitrogen in the GRB, including the contaminant mass loading contributions from major land uses (urban, pasture, sugar cane) and wastewater treatment facilities in the river basin; and (3) illustrate an application of the model to evaluate the potential stream water-quality effects of hypothetical nutrient-reduction management actions in urban areas of the watersheds. Model-derived information about the major sources of TN to streams and spatial variability in source loadings (objectives 1 and 2) can be used to prioritize watersheds for TN load mitigation, including watersheds and land areas that provide important socio-economic and ecological services. The evaluation of alternative scenarios of hypothetical changes in nitrogen loadings related to the management of contaminants in urban areas (objective 3) can also guide the setting of management priorities for specific contaminant sources and upstream watersheds (e.g., [2]).
Our study features the use of a new open-source (freely downloadable) SPARROW modeling software tool, called RSPARROW [34], which was developed by the USGS in cooperation with the Brazilian National Water and Sanitation Agency (ANA) and the Brazilian Geological Survey (CPRM).
A key component of RSPARROW is an interactive Decision Support System (DSS) mapping tool that allows water managers to evaluate the potential effects of hypothetical changes in contaminant sources on stream water quality, such as those evaluated in the urban management case study presented here.

Site Description
The GRB occupies an area of 143,437 km², with about 60% of the basin in Minas Gerais State and 40% in São Paulo State. The Grande River is 1286 km long and, in its lower reaches, forms the boundary between the states of Minas Gerais and São Paulo down to its mouth, where its confluence with the Paranaíba River forms the Paraná River [33]. The modeled area comprises two adjacent sub-basins with adequate data for model development, the Pardo and the Sapucaí, located in the southern part of the GRB (Figure 1). The total modeled area is 42,551 km². The Pardo River sub-basin is the larger of the two; it drains an area of 35,855 km² and flows into the reservoir created by the Marimbondo hydropower plant, which has a surface area of 36,271 ha. The Sapucaí River sub-basin drains an area of 6696 km² to the reservoir of the Porto Colombia hydroelectric plant, which has a surface area of 12,786 ha and is located immediately upstream of the Marimbondo Reservoir.
The study area has remaining areas of the Atlantic Forest (seasonal moist and dry broad-leaf tropical forest) and Cerrado (tropical savanna) biomes, the latter being predominant in the upper portions of the two sub-basins. The elevation of the study area ranges from 415 to 1791 m above sea level. The average annual temperature is 19.1 °C, and ranges from 14.6 °C to 23.9 °C [35]. Annual precipitation ranges from 1097 to 1812 mm/year, with an average precipitation of 1423 mm/year. The rainy period is six to seven months in duration, lasting from October to March/April, with more than 80% of the precipitation occurring during this time period [35].
The Grande River Basin (GRB), which has a population of 8.6 million, with approximately 75% concentrated in urban centers, is one of the most important water resource regions in Brazil. In order to manage water resources in the basin in a sustainable manner, the National Water and Sanitation Agency and the Grande River Basin Committee developed the Integrated Water Resources Plan for the Grande River Basin [35]. Among the actions foreseen in this plan is the implementation of water-quality studies and identification of potential mitigation actions aimed at improving water quality in the basin. Specifically, the control of water pollution from both point and non-point sources, especially pollution by constituents capable of causing eutrophication of reservoirs, such as nutrients, is a priority water-quality management issue in the GRB, as nutrients are negatively impacting water quality in the basin. Effective control of water pollution requires information about constituent sources to allow for prioritization of areas where appropriate actions can be taken to avoid and mitigate unwanted impacts on water quality at the basin scale.
The Integrated Water Resources Plan for the Grande River Basin reports that 97% of the urban population in the basin is served by sewage collection systems. However, less than half of the sewage that is collected is treated, and the sewage that is treated and discharged to water bodies generally contains high concentrations of nutrients due to low removal efficiency. The water resources plan identified the need to both expand sewage treatment services to increase the amount of sewage that is treated and increase nutrient removal capacity in the existing treatment plants to limit negative water-quality impacts on receiving waters. In addition to point sources of nutrients, spatially distributed non-point source inputs from the terrestrial environment to streams are also sources of potential concern. However, studies of non-point sources of nutrients in the GRB are limited, and little is known about the relative importance of these sources in contributing to nutrient pollution.
Filling the knowledge gap related to the shares of diverse sources of nutrients to streams and their contributions to eutrophication of downstream receiving water bodies will allow water resources managers to optimize efforts aimed at improving sanitation services and agricultural practices in the basin and prevent or mitigate eutrophication in the GRB.

Model Description and Estimation
Total nitrogen loads in surface waters of the Pardo-Sapucaí sub-basins were modeled according to the USGS SPARROW framework [8,11,36]. The model uses a spatially explicit, hybrid (statistical and mechanistic) structure to quantify the sources and transport of water-quality contaminants in watersheds. The equation underlying the SPARROW model that describes the mean annual TN load ($L_i^*$) leaving incremental stream sub-watershed $i$ is given by:

$$L_i^* = \left[\sum_{j \in J(i)} L_j\right] \delta_i \, A\left(Z_i^S, Z_i^R; \theta_S, \theta_R\right) + \left[\sum_{n=1}^{N_S} S_n \, \alpha_n \, D_n\left(Z_i^D; \theta_D\right)\right] A'\left(Z_i^S, Z_i^R; \theta_S, \theta_R\right) \qquad (1)$$

where the first summation term represents the TN load from all upstream contributing sub-watersheds $J(i)$ that is delivered to sub-watershed $i$, and $L_j$ equals the measured load if the upstream reach $j$ is monitored, or the model-estimated load ($L_j^*$) if there is no monitored value. The term $\delta_i$ is the dimensionless fraction of upstream load delivered to sub-watershed $i$. The attenuation of TN as it travels through the stream network is represented by $A$, the aquatic transport function, which defines the fraction of load entering the upstream end of sub-watershed $i$ that is delivered to the downstream end of sub-watershed $i$. $A$ is a function of stream ($S$) and reservoir ($R$) characteristics defined by vectors $Z_i^S$ and $Z_i^R$. The terms $\theta_S$ and $\theta_R$ are coefficient vectors associated with $Z_i^S$ and $Z_i^R$, respectively. The second summation term represents the incremental TN load contributed in sub-watershed $i$. Sources of TN in sub-watershed $i$ are represented by the source variables $S_n$, indexed by $n = 1, \ldots, N_S$. The term $\alpha_n$ is the source-specific coefficient associated with each source $S_n$. The term $\alpha_n$, combined with the land-to-water delivery function $D_n$, determines the TN load delivered to the stream in sub-watershed $i$. The term $D_n$ is a source-specific function of a vector of land-to-water delivery variables, $Z_i^D$. The term $\theta_D$ is a coefficient vector associated with $Z_i^D$. The term $A'$ is the aquatic transport function applied to TN transport from the midpoint of sub-watershed $i$ to the outlet of the sub-watershed. Detailed descriptions of the theory and development of the SPARROW model, including descriptions of the vectors represented in Equation (1), are available from [8,11]. The model has been widely used in the United States over the past two decades (e.g., [10,21,[37][38][39][40][41]) and in selected other countries.
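As a hedged illustration of how the two terms of the SPARROW load equation combine for a single reach, the following Python sketch adds the routed upstream load to the locally generated load. The function name, argument names, and all numbers are hypothetical and are not part of RSPARROW; the transport factors A and A' are passed in as already-evaluated fractions rather than computed from reach attributes.

```python
def reach_load(upstream_loads, delta, a_full, sources, alphas, deliveries, a_half):
    """Mean annual load leaving one reach (illustrative sketch only).

    upstream_loads: loads L_j from the reaches immediately upstream
    delta:          fraction of upstream load delivered to this reach
    a_full:         aquatic transport fraction A over the whole reach
    sources:        source variables S_n for this reach
    alphas:         source coefficients alpha_n
    deliveries:     evaluated land-to-water delivery factors D_n
    a_half:         aquatic transport fraction A' from reach midpoint to outlet
    """
    # First term: upstream load routed through the full reach.
    routed = sum(upstream_loads) * delta * a_full
    # Second term: load generated within the reach, attenuated over half of it.
    generated = sum(s * a * d for s, a, d in zip(sources, alphas, deliveries))
    return routed + generated * a_half


# Hypothetical reach with two upstream tributaries and two local sources.
example = reach_load(
    upstream_loads=[100.0, 50.0], delta=1.0, a_full=0.9,
    sources=[10.0, 2.0], alphas=[3.0, 1.0], deliveries=[1.0, 1.0], a_half=0.95,
)
print(round(example, 2))
```

Applying such a function reach by reach, from headwaters to the outlet, reproduces the upstream-to-downstream accumulation that the equation describes.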
The model structure describes three phases of nitrogen transport in watersheds: nitrogen generation, landscape and sub-surface transport to water bodies, and stream and reservoir transport and loss processes. The mechanistic features of the model include mass balance constraints and non-conservative transport, applicable to a network of one-dimensional stream segments and their contributing drainage areas. Nitrogen generation was described in the model using two types of explanatory variables: (1) Diffuse non-point sources include area-based land-cover variables that were used as surrogates for the primary nitrogen sources in watersheds associated with various types of land uses (e.g., urban and cropland areas); these sources may include nitrogen in human and animal waste runoff, atmospheric deposition of nitrogen (stationary and non-stationary sources), and farm fertilizers; and (2) Point sources include estimates of nitrogen loadings from wastewater treatment plants that are discharged directly into streams and reservoirs. Landscape and sub-surface transport of nitrogen is described in the model by climatic factors (temperature, precipitation) or other properties of the terrestrial landscape (e.g., soils, geology) that control the attenuation and delivery of nitrogen to streams. Stream and reservoir transport and nitrogen removal were described according to first-order decay kinetics [11], which assume that the loss of nutrients (the rate of removal per unit time) varies inversely in streams with streamflow magnitude [17] and in reservoirs with the areal hydraulic load [11]. These relations are consistent with previous observations of nitrogen transport and losses in streams and reservoirs regionally [21] and in experimental field investigations and literature studies (e.g., [10,17,22]).
Non-linear least squares (NLLS) methods were applied to the SPARROW model to estimate the values of the coefficients [11] associated with the diffuse and point sources, land-to-water delivery variables, and the in-stream and reservoir loss variables. The final NLLS estimates of the coefficient values provide the most accurate fit of the model to the stream monitoring observations, such that differences between the simulated and observed mean total nitrogen loads at stream monitoring stations are minimized on average. The NLLS methods and supporting measures of model accuracy (e.g., root mean square error (RMSE), R-squared (R²), bias) and statistical diagnostics (e.g., coefficient p-values and variance inflation factors, residual plots and maps) were used to compare the accuracy of alternative models (with different numbers and types of explanatory variables) and to select the final model with the most statistically important explanatory variables (e.g., see statistical methods described in [11]). A nonparametric bootstrapping procedure with 200 iterations was used to estimate uncertainty in predicted loads at the outlets of the two sub-basins [11]. The SPARROW models were statistically estimated and evaluated for accuracy using the open-source USGS RSPARROW modeling system [34].

Water 2020, 12, 2911

Evaluations of Nitrogen Management Scenarios
The calibrated total nitrogen SPARROW model was used to simulate the potential effects of hypothetical managed reductions in nutrient sources on the nitrogen loadings and concentrations in downstream water bodies within the Pardo-Sapucaí sub-basins. Using the interactive DSS feature in the USGS RSPARROW software [34], the potential downstream water-quality effects of improved management of urban diffuse and point sources of nitrogen were illustrated. The hypothetical management scenarios included assessments of the effects of reductions in nitrogen loads associated with improved levels of wastewater treatment (e.g., primary to secondary treatment) at established sewage plants as well as load reductions associated with the construction of new wastewater treatment facilities. For the former case, nitrogen loads from the point source variable in the model (wastewater treatment plant loads) were reduced by selected percentages, whereas for the latter case, the nitrogen loading per unit area estimated for the urban diffuse source in the model was reduced by selected percentages. The hypothetical scenarios of nitrogen reduction were illustrated for various sub-watersheds of the Pardo-Sapucaí River sub-basins, using control settings in the RSPARROW DSS tool that allow users to customize the geographic areas for applying the scenarios.

Stream Reach Network
The SPARROW model is based on a synthetic stream reach network and associated drainage area polygons or "reaches". The stream network and reaches are constructed to allow for the upstream-to-downstream transport of loads through the network. Characteristics for each reach are determined by combining the reach data with geospatial data representing source and landscape transport variables. The synthetic stream reach network and reach relations were created using the 1:50,000-scale Brazilian cartography drainage lines [42] and the 1-arc second Shuttle Radar Topography Mission (SRTM) Digital Elevation Model [43]. All reach characteristics included in the model input file were calculated using the "PgHydro" extension [44] for PostGIS/PostgreSQL. The stream reach network consisted of 68,586 reaches, with a mean reach area of 0.62 km² and a mean reach length of 0.72 km.

Calibration Data and Load Estimation
Bi-monthly discrete TN concentration data were obtained from the Environmental Company of São Paulo State (CETESB) and daily discharge data were obtained from the Brazilian National Hydrometeorological Network (RHN). Data collected between January 2001 and June 2017 were used for all subsequent analyses. Discharge and water-quality data were not collected at the same sites. Discharge was therefore estimated at water-quality sampling locations by applying regionalization methods based on polynomial regression models. A drainage area-discharge relationship was developed using discharge data from 88 of the RHN sites and applied to estimate daily discharge at the CETESB monitoring locations. The discrete TN concentration data and daily discharge estimates were applied to estimate TN loads for calibration of the SPARROW model.
Long-term, time-detrended mean annual TN loads were computed at each site using a 5-parameter regression, Beale's Ratio Estimator, and an algorithm to select the better of those two estimates at each site. Model estimates and flow corrections were centered on 1 October 2010 by fixing the time input at that date when generating predictions. The 5-parameter regressions (RL5) were implemented within the "rloadest" R package [45]. These linear regressions take the following form:

$$\ln(L) = \beta_0 + \beta_1 \, c(\ln Q) + \beta_2 \, c(T) + \beta_3 \sin(2\pi T) + \beta_4 \cos(2\pi T) \qquad (2)$$

where $L$ is the load in kg d⁻¹, $Q$ is discharge in m³ s⁻¹, $T$ is decimal time in years, $c$ is a centering function that subtracts the mean, and $\beta$ are the fitted parameters. Each model was fitted by adjusted maximum likelihood estimation [46], then predictions were retransformed by exponentiation and the retransformation bias was corrected as in [47]. Beale's Ratio Estimator (BRE) identifies a representative ratio L:Q for each of eight conditions: low or high discharge crossed with four seasons per year. Daily discharge and day-of-year information are then used to generate an estimate of load on each day, and loads are summed to estimate the multi-year mean load. Despite its simplicity, BRE is frequently more reliable than alternatives including RL5 for decadal mean load estimates under certain conditions [48].
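To make the structure of the 5-parameter regression more concrete, the sketch below builds its predictor terms in Python, assuming the usual form with an intercept, centered log discharge, centered decimal time, and annual sine and cosine terms. It only illustrates the model terms; it is not the rloadest implementation, it does not perform adjusted maximum likelihood fitting or retransformation bias correction, and the function names are hypothetical.

```python
import math


def rl5_terms(discharge, decimal_time):
    """Return one tuple of four predictors per observation.

    Predictors: centered ln(Q), centered T, sin(2*pi*T), cos(2*pi*T).
    Together with the intercept these give five fitted parameters.
    """
    ln_q = [math.log(q) for q in discharge]
    mean_ln_q = sum(ln_q) / len(ln_q)
    mean_t = sum(decimal_time) / len(decimal_time)
    return [
        (lq - mean_ln_q, t - mean_t,
         math.sin(2 * math.pi * t), math.cos(2 * math.pi * t))
        for lq, t in zip(ln_q, decimal_time)
    ]


def predict_log_load(beta, row):
    """Evaluate ln(L) for one row of predictors given beta = (b0, ..., b4)."""
    b0, b1, b2, b3, b4 = beta
    x1, x2, x3, x4 = row
    return b0 + b1 * x1 + b2 * x2 + b3 * x3 + b4 * x4


# Hypothetical discharges (m3/s) and sample dates (decimal years).
rows = rl5_terms([1.0, math.e, math.e ** 2], [2009.0, 2010.0, 2011.0])
```

Centering the discharge and time terms reduces the correlation between the intercept and slope estimates, which is why the time coefficient can be tested on its own as a trend indicator.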
To select between RL5 and BRE estimates for each site, we used the following decision algorithm. If RL5 identified no significant time trend (significance defined as a p-value < 0.1 for the time coefficient $\beta_2$) or the center of the observation time window was within 2 years of 10 October 2010, then the BRE estimate was preferred. If the time trend was significant and the observation window was not centered on 2010, then RL5 was preferred as long as the ratio of the RL5 estimate to the BRE estimate was between 0.667 and 1.5; if the ratio fell outside these arbitrary but commonsense thresholds, we rejected both load estimates for that site. After identifying a preference for either BRE or RL5, we used the preferred model at that site as long as its standard error was less than 20% of the estimate; otherwise, we rejected both load estimates for that site. Of the 43 sites for which TN loads were estimated, this algorithm selected the BRE estimate for 29 sites and the RL5 estimate for 14 sites. Both estimates were rejected for 2 sites, resulting in 41 sites for which TN load estimates were generated to calibrate the SPARROW model.
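The decision rules above can be sketched as a small Python function. This is a minimal illustration under stated assumptions, not the authors' code: the names are hypothetical, "within 2 years" is treated as inclusive, and the thresholds are passed as defaults taken from the text.

```python
from typing import NamedTuple, Optional, Tuple


class LoadEstimate(NamedTuple):
    load: float       # mean annual load estimate
    std_error: float  # standard error in the same units


def select_load_estimate(
    rl5: LoadEstimate,
    bre: LoadEstimate,
    trend_p_value: float,
    years_from_center: float,
    alpha: float = 0.1,
    max_offset_years: float = 2.0,
    ratio_bounds: Tuple[float, float] = (0.667, 1.5),
    max_relative_se: float = 0.20,
) -> Optional[Tuple[str, LoadEstimate]]:
    """Return ("BRE" | "RL5", estimate), or None if both are rejected."""
    trend_significant = trend_p_value < alpha
    near_center = abs(years_from_center) <= max_offset_years  # assumed inclusive

    if not trend_significant or near_center:
        name, chosen = "BRE", bre
    else:
        # RL5 is preferred only if it roughly agrees with BRE.
        ratio = rl5.load / bre.load
        if not (ratio_bounds[0] <= ratio <= ratio_bounds[1]):
            return None
        name, chosen = "RL5", rl5

    # Final screen: the preferred estimate must be precise enough.
    if chosen.std_error >= max_relative_se * chosen.load:
        return None
    return name, chosen
```

For example, a site with a significant trend, an observation window centered five years from the reference date, and an RL5 estimate 20% larger than the BRE estimate would keep the RL5 estimate, provided its standard error is below 20% of the load.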
We also used estimates from three other methods to screen for any poor matches between model structure and the available data at each site. RL7 was a 7-parameter regression with terms for squared discharge and squared decimal time in addition to the terms of the RL5 regression, also implemented in the "rloadest" package. The composite method (CMP) was the sum of the RL5 concentration predictions and a rectangular interpolation of the residuals, as implemented in the "loadflex" R package [49]. A simple rectangular interpolation (INT) of the concentration observations, multiplied by discharge and summed to estimate multi-year average loads, was also implemented using loadflex. While the RL7, CMP, and INT model comparisons did help us identify and correct a few data errors, we ultimately found no reasons for concern about the selected BRE or RL5 models. We only used estimates from the RL5 or BRE models in the SPARROW model because RL5 and BRE include uncertainty estimates (CMP and INT do not) and were less prone to overfitting than RL7.

Explanatory Data
Sources and Landscape Transport Characteristics
Input data to the SPARROW model included data that describe sources of TN to streams and watershed characteristics that influence the transport of TN from the landscape to streams (i.e., landscape transport characteristics). A variety of sources were tested for potential inclusion in the model, including multiple agricultural land area sources, urban land area, population, and point source inputs from wastewater treatment plants (Table 1). The land use data used in the model were developed specifically for the Integrated Water Resources Plan for the Grande River Basin [35]. All source variables were constrained to be non-negative in the model [11]. Mean annual air temperature and precipitation were tested for potential inclusion in the model as landscape transport characteristics. Landscape transport characteristics were not constrained and were mean-adjusted and log-transformed as needed to approximate normal distributions [11].

In-Stream and Reservoir Transport
In-stream loss of TN in flowing waters was modeled as a first-order decay process, specified as a function of estimated travel time. Travel time was estimated as reach length divided by stream velocity, which was estimated using the Jobson equation [53]. The potential effect of reservoir storage on TN loads was estimated using a reservoir decay term, estimated as a function of areal hydraulic load in reservoirs (mean annual flow divided by reservoir surface area). This term was not identified as being statistically significant and was therefore not included in the final model.
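As a rough illustration of first-order decay as a function of travel time, the Python sketch below computes the fraction of load that survives transport through one reach. The decay rate of 0.08 d⁻¹ and the reach length of 0.72 km are values reported elsewhere in this paper; the stream velocity is a made-up constant standing in for the Jobson equation output, and the function names are hypothetical.

```python
import math


def travel_time_days(reach_length_km, velocity_m_per_s):
    """Travel time through a reach: length divided by velocity, in days."""
    seconds = reach_length_km * 1000.0 / velocity_m_per_s
    return seconds / 86400.0


def fraction_delivered(decay_rate_per_day, time_days):
    """First-order decay: fraction of load remaining after time_days."""
    return math.exp(-decay_rate_per_day * time_days)


# Mean reach length from the stream network; velocity is an assumed value.
t = travel_time_days(reach_length_km=0.72, velocity_m_per_s=0.5)
frac = fraction_delivered(decay_rate_per_day=0.08, time_days=t)
print(f"travel time = {t:.4f} d, fraction delivered = {frac:.5f}")
```

Loss over a single short reach is small, but the fractions multiply along a flow path, so loads generated far from the outlet are attenuated more than loads generated near it.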

Model Assessment
Exploratory models were developed to test different model configurations. The final model configuration selected contained four source variables and an in-stream decay term (Table 2). All model input and output data as well as model code are available [54]. None of the tested landscape transport variables were identified as being statistically significant. Calibration data were weighted using a nonlinear weighted least-squares analysis [11] (the "NLLS_weights <- 'lnload'" option in RSPARROW [34]). Model estimation used the conditioned model predictions, which are obtained by substituting the predicted load on reaches with monitoring stations with the observed station load where available [11]. RSPARROW reports model performance for both conditioned and unconditioned (simulated) model predictions. Here, we report the conditioned model predictions, as they provide the most accurate reach predictions for model estimation and reduce the downstream propagation of errors [11]. However, for assessing model performance, we report the model performance metrics associated with both conditioned and unconditioned predictions, as the unconditioned predictions are most representative of the model skill and performance at monitored and unmonitored stream locations.
Model statistics indicate a good fit of the model to the calibration data (Table 2). The yield R² values indicate that the model explained 68% of the variability in mean annual TN yield for conditioned predictions and 60% of the variability in mean annual TN yield for unconditioned predictions. RMSE values were 0.43 (conditioned) and 0.48 (unconditioned), indicating that the average error in model predictions (associated with one standard deviation of the model error) was 43% (conditioned) and 48% (unconditioned). Percent bias values for conditioned and unconditioned predictions were −3.94% and −6.80%, respectively. Negative percent bias values signify that, on average, the model tended to overestimate observed loads. Residuals were approximately normally distributed as a function of mean annual TN yield (Figure 2). The Moran's I standard deviate was −1.15 (p = 0.25), indicating that there was no statistically significant spatial autocorrelation in the residuals.
The four source variables identified in the final model configuration included the area of pasture land, the area of urban land, the area of land under cultivation for sugar cane, and point source inputs of total nitrogen from wastewater treatment plants (Table 2). The p-values for all source variables were less than 0.05. Source coefficients for the three land area-based sources can be interpreted as the mean TN yield (mass per unit area) from a given source. The dimensionless wastewater treatment plant source coefficient was 1.15. A value of 1.0 for point sources is expected if the estimates of point-source loads are accurate and the model representation of the nitrogen sources and stream transport is properly specified. The estimated mean coefficient value of 1.15 and standard error of 0.41 (Table 2) give a t-statistic of (1.15 − 1.0)/0.41 ≈ 0.37 with 36 degrees of freedom, which indicates that the coefficient is not statistically different from the expected value of one (two-tailed t-test); thus, assuming that the model is reasonably specified, the estimates of wastewater nitrogen load that are input to the model may be slightly underestimated and are subsequently corrected by the model estimation. Variance inflation factor (VIF) values for all source coefficients were low (<2; [11]), indicating that there was no statistical evidence for multicollinearity among explanatory variables.
A variety of climatic and terrestrial variables (e.g., temperature, precipitation, water runoff, soil physical and chemical properties) have been previously identified in SPARROW studies (e.g., [10,21]) and other literature studies [55][56][57][58] as being important landscape transport variables influencing the delivery of TN from the landscape to streams in other watersheds. In this study, only landscape transport variables for climatic conditions were tested for potential inclusion in the model; none were identified as being statistically significant, and they were therefore not included in the final model configuration. This finding does not indicate that temperature and precipitation (or other landscape transport variables) are unimportant in determining delivery of TN to streams in the Pardo-Sapucaí sub-basins, but only that at the regional scale, a simple model with sources and in-stream loss terms has been found to sufficiently describe the amount and spatial variability of TN, as indicated by model fit statistics. In-stream loss, modeled as a first-order decay process as a function of travel time, was identified as being weakly significant (p = 0.13); there was also little evidence of collinearity with the source variables (VIF = 3.34). The estimated coefficient for in-stream nitrogen loss of 0.08 d⁻¹ is representative of estimated loss rates reported in large temperate streams [17,59].

TN Loads and Yields to Streams
The model was calibrated to TN loads for the base year of 2010 under long-term mean hydrologic conditions. The flux estimates presented here are therefore representative of typical TN fluxes under mean streamflow conditions and are expected to reflect the average of temporally varying annual fluxes. The SPARROW model estimated that approximately 20,000 tons/yr of TN was delivered to the outlet of the Pardo-Sapucaí sub-basins (Table 3). SPARROW allows for the apportionment of loads from different sources; this is illustrated for the Pardo-Sapucaí sub-basins in Table 3. The model estimates that of the 20,000 tons/yr of TN delivered to the outlet of the watershed, approximately 2200 tons (11%) originated from pasture land, ~5500 tons (28%) originated from urban landscapes, 6400 tons (32%) originated from land under cultivation for sugar cane, and ~6000 tons (30%) originated from wastewater treatment plants (Table 3). While only 11% of the total load was estimated to be derived from pasture land, this land use makes up only 2.1% of the land area in the watershed. It is of note that the estimated 5500 tons of TN from urban land is not inclusive of the nitrogen loading from wastewater treatment plants, which is separately estimated by the model. This finding suggests that the additional capture and routing of urban waters through wastewater treatment plants may be an effective approach for decreasing total TN loading to streams in the basin. The effects of such an approach on TN loading to streams are discussed in the following section.

Table 3. Estimated annual mass and share of total nitrogen from each source delivered to the outlet from the two sub-basins combined, estimated using the standard (i.e., non-bootstrapping) SPARROW modeling approach. Estimated mass ± one standard error and share of load delivered from each source ± one standard error for each source in the individual sub-basins, as estimated by the bootstrapping process, are also included.

Table 3 column headings: Area of Watershed (km²); Mass Delivered to Outlet (tons); Share of Total Load (%). First row label: Combined.

Mean incremental and mean delivered incremental TN yields predicted by the model are shown in Figure 3. The incremental yields are the yields generated within each incremental catchment, and the delivered incremental yields are the yields generated within each catchment that are delivered to the outlet of the Pardo-Sapucaí sub-basins after accounting for nitrogen losses in streams. The median incremental TN yield from the 68,856 reaches in the watershed was 4.66 kg/ha/year (interquartile range: 1.97-6.10 kg/ha/year). Some of the largest incremental yields were estimated for streams in the southeast portion of the watershed (Figure 3a), where pasture land is prevalent. However, due in part to the fact that this part of the watershed is furthest from the outlet, with the nitrogen loads subject to larger losses during transport to the outlet, this portion of the watershed displays lower delivered incremental yields relative to other parts of the watershed (Figure 3b).

Wastewater Treatment Plant Source Reduction Scenarios
Nearly one-third of the TN load delivered to the outlet of the Pardo-Sapucaí sub-basins was estimated to originate from wastewater treatment plants (Table 3). Although economic considerations would need to be taken into account, this point source input could be reduced, for example, by upgrading wastewater treatment from primary to secondary or tertiary treatment.
Hypothetical source reduction scenarios were implemented using the DSS feature in the USGS RSPARROW software to assess the potential impact of such actions on in-stream TN loads. The DSS feature was applied to estimate in-stream TN loading in response to wastewater point source input reductions of 20%, 40%, 60%, and 80%. The results of this exercise are shown in Figure 4. The SPARROW model predicts that, relative to baseline (i.e., current long-term average) conditions, TN loading at the outlet of the Pardo-Sapucaí sub-basins would decrease by 1200 tons (6%) for the 20% reduction in wastewater point source inputs, 2400 tons (12%) for the 40% reduction scenario, 3600 tons (18%) for the 60% reduction scenario, and 4800 tons (24%) for the 80% reduction scenario.
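The reported scenario results are consistent with the delivered load from a source scaling linearly with its input. A minimal sketch of that arithmetic follows, using the rounded values quoted in the text (about 6000 tons/yr from wastewater treatment plants out of about 20,000 tons/yr at the outlet). It is an illustrative check under that linearity assumption, not the RSPARROW DSS implementation, and the function name is hypothetical.

```python
def outlet_reduction(source_delivered_tons, outlet_total_tons, input_reduction):
    """Return (tons removed at the outlet, percent of outlet load removed).

    Assumes the delivered load from a source is proportional to its input,
    so cutting the input by a fraction cuts the delivered load by the same fraction.
    """
    removed = source_delivered_tons * input_reduction
    return removed, 100.0 * removed / outlet_total_tons


wastewater_tons = 6000  # rounded delivered load from wastewater treatment plants
outlet_tons = 20_000    # rounded total delivered load at the outlet

for pct in (20, 40, 60, 80):
    removed, share = outlet_reduction(wastewater_tons, outlet_tons, pct / 100)
    print(f"{pct}% input reduction: {removed:.0f} tons ({share:.0f}% of outlet load)")
```

With these inputs the loop reproduces the 1200, 2400, 3600, and 4800 ton reductions (6%, 12%, 18%, and 24% of the outlet load) reported for the four scenarios.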

Another management-relevant result obtained from the SPARROW model is that both urban land area and wastewater treatment plants were identified as being significant sources of TN to streams in the Pardo-Sapucaí sub-basins. Given that wastewater treatment plants are generally co-located with urban areas, the ability of the model to distinguish between these two sources indicates the strength of these source signals. From a management perspective, this result suggests that the capture of urban runoff (e.g., unsewered residences, stormwater runoff) and the construction of new wastewater treatment facilities, or other technologies such as green infrastructure, to remove TN from the currently untreated urban runoff may be an effective approach for decreasing TN loading to streams in the basin. To test the potential effectiveness of such an approach, we used the DSS feature in the USGS RSPARROW software to reduce the TN loading per unit area from the diffuse urban land source. Specifically, the urban land source coefficient was reduced by 20%, 40%, 60%, and 80% to represent the construction of new wastewater treatment plants that would remove the specified percentage of TN loading from the current, baseline urban land source. Relative to baseline conditions, TN loading at the outlet of the Pardo-Sapucaí sub-basins decreased by 1100 tons (6%) for the 20% reduction in loading per unit area from diffuse urban land sources, 2200 tons (11%) for the 40% reduction scenario, 3300 tons (17%) for the 60% reduction scenario, and 4400 tons (22%) for the 80% reduction scenario (Figure 5). These percentage reductions in the TN loads at the outlet of the Pardo-Sapucaí sub-basins, resulting from improved management of urban runoff, are generally similar to those associated with the reduction in point source loads resulting from improved treatment at wastewater plants.
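To make the coefficient-based scenario concrete, the sketch below treats the diffuse urban source in a simplified SPARROW-like form, in which each catchment's delivered urban load is the source coefficient times urban area times a delivery fraction. The catchment areas, delivery fractions, and coefficient value are hypothetical; the only point is the proportional response of the delivered load to scaling the coefficient.

```python
def delivered_urban_load(coefficient, urban_areas_km2, delivery_fractions):
    """Sum of coefficient * urban area * delivery fraction over catchments.

    The coefficient is the load generated per unit urban area; all values
    used here are illustrative, not calibrated model output.
    """
    return sum(
        coefficient * area * frac
        for area, frac in zip(urban_areas_km2, delivery_fractions)
    )


# Hypothetical catchments: urban area (km^2) and fraction delivered to the outlet.
areas = [12.0, 30.0, 5.5, 48.0]
delivery = [0.9, 0.6, 0.8, 0.4]
beta_urban = 2.0  # hypothetical load per km^2 of urban land

baseline = delivered_urban_load(beta_urban, areas, delivery)

for pct in (20, 40, 60, 80):
    scaled = delivered_urban_load(beta_urban * (1 - pct / 100), areas, delivery)
    drop = 100.0 * (baseline - scaled) / baseline
    print(f"coefficient reduced {pct}%: urban delivered load falls {drop:.1f}%")
```

Because the coefficient multiplies every catchment's contribution in this simplified form, reducing it by a given percentage lowers the delivered urban load by the same percentage wherever the urban land is located. Applied to the roughly 5500 tons/yr of urban-derived TN at the outlet, that proportionality is consistent with the 1100 to 4400 ton reductions reported above.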

Model Limitations and Future Applications
We applied the new open-source USGS RSPARROW model, developed in cooperation with ANA and CPRM, to identify major sources of TN in the watershed and to quantify the spatial variability of TN in all streams in the basin. Model results indicate that pasture land contributes approximately 10% of the TN load estimated at the outlet of the watershed, with the remaining sources (land under cultivation for sugar cane, urban land, and point source inputs from wastewater treatment plants) each contributing approximately 30% of the TN load estimated at the watershed outlet. Knowledge of the contributions of contaminant sources to in-stream TN loadings, and of the locations of major sources, as demonstrated in this study for the Pardo-Sapucaí sub-basins, provides key information that can be used to prioritize sub-basins for TN load mitigation across large regional basins, as has been done in the Chesapeake Bay watershed, USA [2]. The DSS feature in the USGS RSPARROW software also provides a complementary tool for assessing the impacts on in-stream TN loading of hypothetical source reductions that could be achieved through management actions. Here we demonstrated how this tool could be applied to assess in-stream TN response to (1) point source reductions from wastewater treatment plants that could be achieved, for example, by upgrading treatment facilities to secondary or tertiary treatment, and (2) reductions in TN loading per unit area from diffuse urban land sources that could be obtained by building new facilities to treat currently untreated urban runoff. The results of these analyses suggest that, depending on the magnitude of the management action (i.e., degree of wastewater treatment or number of new wastewater treatment plants), it may be possible to reduce TN loading at the outlet of the watershed by up to 20–25%.
The SPARROW model documented here is a relatively simple water-quality model, in that it contains only four source terms and a single in-stream decay term. Despite the simplicity of the model, assessment metrics indicate that model performance is on par with that reported by other TN SPARROW modeling efforts [10,21,59], and model specification is similar to simple land-use based SPARROW source transport models (e.g., [22–24,60,61]). Furthermore, the statistical and mass balance components of the model are important features that help ensure that the model estimates of nitrogen loadings from sources are generally consistent with the observations of nitrogen at stream monitoring locations throughout the Pardo-Sapucaí sub-basins. While the results presented here show promise for identifying TN sources for potential mitigation, there is a need to recognize model limitations. For example, the model does not provide a detailed description of the specific sources of TN in the watershed, given that land use (i.e., urban, pasture, and sugar cane land) is used as a surrogate for a variety of diffuse nitrogen sources (e.g., human and animal wastes, atmospheric deposition, fertilizer applications). Additionally, there are important landscape properties that can influence the delivery of TN from the landscape to streams (e.g., soil properties) that remain to be evaluated. Although the effects of climatic factors (precipitation, temperature) on stream nutrients were not found to be statistically significant in the model, the expansion of the geographic domain of the model to include data from nearby river basins could potentially broaden the range of climatic conditions represented in the model, thereby improving the sensitivity of the model [11] to these factors. These limitations on the number and types of nutrient sources and other predictor variables in the model are largely a function of the quantity and quality of explanatory and calibration data.
Expanding the spatial domain of the model can increase both the range of variation in the model ("data quality") and the number of calibration sites ("data quantity"), thereby improving the statistical power of the model to detect process effects. Improvements in data quality and quantity have been previously shown to be associated with more complex and interpretable SPARROW models [11], as reflected by a larger number of statistically significant explanatory variables. The model could also be improved through the regular collection of new water-quality and streamflow data at existing and/or new monitoring locations. Finally, access to potential predictor variables including soil characteristics, geology, and industrial plant discharges may allow for improvements in model performance and prediction accuracy as well as the process interpretability of the model.
Continued improvements in the Brazil TN model will enhance its ability to assist water management planning [35] in the Grande River Basin. The SPARROW modeling approach has been previously shown to provide an informative approach for water-quality assessment and management across large spatial domains. The model provides an efficient spatial framework for advancing the large-scale understanding of streamflow processes [7] and the major contaminant sources and watershed properties that affect transport over large scales in river networks [8,10,23], information that can assist water resource management of regional river basins and inland and coastal water bodies with large drainages [2,27,28,62]. These capabilities are enabled by the hybrid characteristics of the model [7,8,11] that include a simple mechanistic structure, with mass balance and non-conservative transport constraints, and statistically estimated model parameters and their associated uncertainties.
Our study of the Brazilian Grande River Basin has extended the previous management support capabilities of SPARROW (e.g., [2,63]) by demonstrating the use of the RSPARROW integrated modeling and decision support system [34]. Although progress was previously made in developing SPARROW decision support tools [63], the lack of integrated modeling components has caused inefficiencies related to the considerable manual labor required to link existing SPARROW models to the decision tools. By contrast, the RSPARROW software developed for this project in cooperation with Brazilian water agencies (ANA, CPRM) provides an automated, fully customized decision support tool that is enabled simultaneously with the development and calibration of a SPARROW model. This tool and its capabilities can be readily applied to large river basins in other countries where stream water monitoring data and geospatial information on watershed characteristics are available.

Conclusions
Given the importance of water resources in this region of Brazil, as indicated by the prioritization of actions aimed at improving water quality outlined by the Integrated Water Resources Plan for the Grande River Basin [35], there is a need for continued efforts to quantify human-natural systems to understand nutrient sources, delivery, and transport. This application of SPARROW, facilitated by the use of the new interactive modeling and management RSPARROW mapping tool developed by the USGS in coordination with ANA and CPRM, provides a first assessment of present-day TN sources and transport in an important Brazilian watershed. Results of the example management scenarios presented here demonstrate the ability of SPARROW and the RSPARROW mapping tool to inform management efforts aimed at controlling nutrient loads and preventing eutrophication. Remotely sensed land-use data, the primary predictor variable in the SPARROW nitrogen model, are currently available nationally in Brazilian river basins and could provide a foundation for future model-based investigations of contaminant sources in major rivers. These efforts could be supported by the use of the historical data from existing Brazilian water-quality and -quantity monitoring networks, coupled with the strategic development of new stream monitoring information. The open-source nature of RSPARROW and the availability and ease of use of the interactive DSS tool also provide new opportunities to develop SPARROW model-based water-quality assessments in additional sub-basins of the GRB as well as other river basins of Brazil and other countries. The model-based water resource assessment approach that we illustrate here could also be expanded and applied to other water contaminants (e.g., phosphorus, sediment, biochemical oxygen demand, fecal bacteria, and pathogens).

Funding: This work was funded by a cooperative agreement between ANA and the USGS.