Assessing the Effects of Climate Change on Water Quantity and Quality in an Urban Watershed Using a Calibrated Stormwater Model

Assessing climate change (CC) impacts on urban watersheds is difficult due to differences in model spatial and temporal scales, making prediction of hydrologic restoration a challenge. A methodology was developed using an autocalibration tool to calibrate a previously developed Storm Water Management Model (SWMM) of Difficult Run in Fairfax, Virginia. Calibration was assisted by use of multi-objective optimization. Results showed a good agreement between simulated and observed data. Simulations of CC for the 2041–2068 period were developed using dynamically downscaled North American Regional CC Assessment Program models. Washoff loads were used to simulate water quality, and a method was developed to estimate treatment performed in stormwater control measures (SCMs) to assess water quality impacts from CC. CC simulations indicated that annual runoff volume would increase by 6.5%, while total suspended solids, total nitrogen, and total phosphorus would increase by 7.6%, 7.1%, and 8.1%, respectively. The simulations also indicated that within season variability would increase by a larger percentage. Treatment practices (e.g., bioswale) that were intended to mitigate the negative effects of urban development will need to deal with additional runoff volumes and nutrient loads from CC to achieve the required water quality goals.


Introduction
Historical evaluations of the U.S. climate ) revealed significant temperature increases for nearly all US cities, which was attributed to climate change (CC) as opposed to "heat island" effects caused by urban development [1]. Nearly 30% of the urban areas exhibited a significant increase in extreme precipitation. Hayhoe et al. [2] predicted an increase in precipitation during winter and spring under higher and lower emissions by the end of the 21st century. Najjar et al. [3] found that, in the mid-Atlantic region, precipitation magnitude and intensity, CO 2 concentrations, sea level, and water temperatures are likely to increase by the end of the 21st century. These predicted increases in rainfall magnitude and intensity could cause infrastructure failures due to increased runoff volumes and rates [4][5][6][7], overwhelming systems designed for much less. Limited studies using design storms and intensity-duration-frequency (IDF) curves have been conducted to assess the relative impact of CC on stormwater infrastructure. For example, Madsen and Figdor [8] found that CC in the Soils are classified by NRCS into four soil hydrologic groups ranging from A to D, where A has the least runoff potential and D has the greatest runoff potential. The majority of the soils in the watershed (40.5%) are Glenelg, soil hydrologic group B, which signifies a moderate infiltration capacity, but is also highly susceptible to erosion [44]. A small proportion (5%) of the soils adjacent to streambeds are subject to inundation during high flows. Current land uses in the Difficult Run watershed range from forest to urban; of the latter, the most predominant subtype is urban residential development at approximately 57% [44]; shown in Table 1. The percentage of impervious surfaces in the watershed is approximately 18.4%, which is about 1/3rd of the urban area [44]. Soils are classified by NRCS into four soil hydrologic groups ranging from A to D, where A has the least runoff potential and D has the greatest runoff potential. The majority of the soils in the watershed (40.5%) are Glenelg, soil hydrologic group B, which signifies a moderate infiltration capacity, but is also highly susceptible to erosion [44]. A small proportion (5%) of the soils adjacent to streambeds are subject to inundation during high flows. Current land uses in the Difficult Run watershed range from forest to urban; of the latter, the most predominant subtype is urban residential development at approximately 57% [44]; shown in Table 1. The percentage of impervious surfaces in the watershed is approximately 18.4%, which is about 1/3rd of the urban area [44]. Like much of Fairfax County and metropolitan D.C., the Difficult Run watershed has experienced significant urban development, which has degraded water quality, reduced aquatic habitat, and increased flooding not unlike other streams draining rapidly urbanized watersheds [45]. After conducting a detailed hydrologic and water quality study of the Difficult Run watershed, a comprehensive watershed management plan (WMP) was adopted by Fairfax County Fairfax County [44] to address water quality and quantity issues. In the development of the WMP, over 900 existing SCMs were assessed. These SCMs provide peak flow storage and varying degrees of treatment for urban runoff. The WMP assessed stream conditions for the mainstem and 18 main tributaries. Using geographic information system (GIS) data provided by the county, hydraulic/hydrologic (H/H) and water quality models were developed for the Difficult Run watershed [44]. These models were used to assess current conditions, and to guide future management planning to meet water quantity and quality goals under anticipated future conditions.
After the study was complete, the U.S. Geological Survey (USGS), in partnership with Fairfax County, conducted a comprehensive, seven-year monitoring program of Difficult Run [46]. This study provided the hydrologic and water quality data used for calibration and verification. Difficult Run sediment yields ranged from 378.3 to 9939.2 kg per ha (kg/ha) annually; with higher loads associated with the degree of urbanization. Total N loads ranged from 4.1 to 18.2 metric tons and corresponding yields ranged from 5.9 to 14.7 kg per ha (kg/ha), annually. Total P loads ranged from 0.190 to 3.79 tones and corresponding yields ranged from 0.24 to 2.8 kg per ha (kg/ha), annually. Monitoring stations in Difficult Run showed higher TSS loading in 2008, and 2011 associated with Tropical Storms Hanna and Lee, respectively. Dissolved N composed 60 to 85 percent of the total annual N load, and the annual dissolved N load was correlated with runoff volume [46]. Total P loads were composed of about 74% sediment-associated P, resulting in a strong correlation between total and dissolved P and annual peak runoff. Dissolved TN and TP sources in Difficult Run were attributed to application of fertilizer to residential lawns, and human and animal waste from inefficient septic systems [46].

Hydraulic/Hydrology (H/H) Modeling
Urban H/H models use climatological data (precipitation, temperature), land use data, and hydraulic system data to estimate water and pollutant flux across the landscape by simulating hydrologic and water quality processes. Key hydrologic processes include rainfall-runoff, infiltration, evapotranspiration, and flow routing. In this study, the U.S. EPA's SWMM version 5.1.010 was employed for H/H modeling (U.S. Environmental Protection Agency, Edison, NJ, USA). SWMM can perform both single event and continuous simulation and has been widely used in urban areas [24,25,31,47]. Outputs of the model include runoff and/or streamflow and water quality constituent loads and/or concentrations. SWMM simulates groundwater flow for each subcatchment through a single aquifer, which is defined by the depth of its unsaturated upper zone and lower saturated zone, bottom of aquifer, groundwater flow parameters, porosity, wilting point, field capacity and saturated hydraulic conductivity. This data was obtained from the geologic map [48,49] and SSURGO database [50]. These parameters were taken from raster grids or attributes associated with a specific area. GIS spatial operations were conducted to create a weighted area average for each subcatchment.
Evaporation from subcatchment surfaces, from subsurface water in aquifers, and from streams was modeled using methods outlined in [51]. SWMM uses the Hargreaves equation for evaporation simulation [31]. Evaporation in the SWMM models used in this study was computed from daily minimum and maximum temperatures, including historical or projected. The Green-Ampt infiltration method was used to estimate infiltration and excess rainfall. This method was selected due to the physical basis of its parameters (suction head, hydraulic conductivity, and initial moisture deficit), and their availability in the Soil Survey Geographic database (SSURGO) database [50]. Flow routing used SWMM's dynamic wave option was selected due to its accuracy and ability to simulate non-uniform, unsteady state flow conditions. Land use and soil data for the watershed were obtained from the WMP [44]. Two USGS stream gauges draining areas of 7 km 2 and 150 km 2 , respectively, are located within the watershed. Data from these stations were used to calibrate and verify the model. Then, projections for precipitation, and temperature across the watershed from the NARCCAP [52] for historical and future conditions was used in the calibrated and verified model to predict streamflow, and TSS, TN and TP concentrations for historical and projected CC conditions.

Water Quality Modeling
Water quality was modeled using an estimated event mean concentration (EMC) washoff loading during runoff events. A simple treatment functions which is commonly used in many urban models were employed in this study as follows: where: C t+1 is concentration of pollutant at time t + 1 in mg/L, C 0 is initial concentration of pollutant in mg/L, and k is removal constant. Water quality characteristics were developed based upon EMCs, which were set to 40 mg/L for TSS, 2.9 mg/L for TN, and 0.27 mg/L for TP, based upon average values for Virginia [53].
In SWMM, water quality treatment expressions have the form of: where: R is fractional removal, C is concentration of pollutant in mg/L, P is the pollutant, R p is the pollutant removal for pollutant, and V is a process variable. Treatment expressions in SWMM were developed by assuming irreducible concentrations [54]. This concept holds that treatment effectiveness is reduced once a low level of pollutant concentration has been reached, i.e., it gets more difficult to remove the last fraction of pollutant. The irreducible concentration relationship is often used to characterize runoff quality data [55], and was recently applied by Chretien et al. to characterize TSS, TN, and TP behavior within an agricultural BMP [56]. TSS and TN concentrations were estimated by: where: C TSS is the concentration of TSS, and C TN is the concentration of TN, D is water depth in m, DT is time step in seconds, i and i + 1 represent the previous and current time.
TP was split into two fractions, soluble phosphorus (SP) and particulate phosphorus (PP): The SP reduction was estimated by: The PP reduction was computed as a function of removal of TSS:

R for Stormwater Management Model (RSWMM)
In order to efficiently calibrate the model without altering the SWMM source code, an external, freely available control program was needed. SWMM is constantly being updated; so, use of a non-altered SWMM source code ensures compatibility with future updates. R [57] (RCD Team, 2015) is an open source, freely available system that can be used for statistical analysis and programming. R has been successfully used in hydrological modeling and its capabilities are well recognized [58]. R is compatible with most operating systems. Because of these capabilities, R was chosen for use in this study for development of a control module that could execute SWMM simulations repetitively, changing key parameters according to user needs. An existing R code known as RSWMM (https://www.openswmm.org/Topic/4390/rswmm-autocalibration-of-swmm-in-r) was identified and was used for this study [59]. Several enhancements were developed as part of this project, including separation of events, incorporating an autocalibration tool described in a later section, and a post-processor to view exceedance curves of the output. The reader is referred to Table A1 in the Appendix A for further information on the RSWMM code.

Calibration
A typical SWMM model application contains many input parameters. As mentioned previously, SWMM model outputs are sensitive to hydraulic width, imperviousness and depression storage depth [34,60], the first two parameters being positively correlated to peak flow, the latter, negatively correlated. Calibration of the Difficult Run watershed model was conducted by varying hydraulic width (the sub-catchment area divided by the maximum overland flow length), imperviousness and depression storage depth (a depth that must be filled prior to the runoff occurs) for 2010 data. Hourly rainfall data were disaggregated to 15-min time step for the proposed H/H simulations using NETSTORM [61]. Then, the simulation results were aggregated to an hourly time series and SWMM was calibrated with respect to peak flows and runoff volume during all events in upstream and downstream catchments. A 6 h minimum inter-event period was used to classify data into events. Two gauging stations located in upstream and downstream sub-watersheds (location shown in Figure 1) were used for calibration and verification. Calibration and verification were performed using events from 2010 and 2013, to calibrate and verify the model performance, respectively.

Autocalibration Methods
The autocalibration tool within RSWMM utilizes a multi-objective optimization package, NSGA-II, which provides functions for box-constrained multi-objective optimization. Three objective functions were used. These included: coefficient of determination (R 2 ), Nash Sutcliffe Efficiency (NSE), and Percent Bias (PBIAS). R 2 describe the degree of collinearity between simulated and observed data and ranges from 0 to 1. R 2 close to 1 indicate less error variance and is desirable. Values greater than 0.5 are considered satisfactory in the study by Santhi et al. [62] and Van Liew et al. [63].
NSE is a normalized statistic which indicates how well the plot of observed versus simulated data fits and ranges from −∞ to 1.0 [64]. A value of NSE close to 1 is considered as an excellent level of performance. NSE was calculated as: where Y i obs is observed data values, Y i sim is simulated data values, and Y mean is the mean of observed data.
Percent bias (PBIAS) shows the deviation of data and expressed as a percentage. It measures the average tendency of the simulated data to be different from observed data [65]. PBIAS close to 0 is considered to be an excellent level of performance. PBIAS was calculated as: where Y i obs is observed data values, and Y i sim is simulated data values.
SWMM was calibrated using an autocalibration procedure, adjusting specific model parameters described within a defined range. Observed and simulated outputs were compared at upstream and downstream gauging stations to determine the parameter set that provide the best model simulated runoff peak and volume. With the completion of a given optimization, sets of calibrated parameters were obtained through optimization. Default values that were used from the previously developed model as initial model parameters. Based on the study by Ref. [62,63], R 2 and NSE ≥ 0.75 and PBIAS ≤ ±10%, which are considered to be very good calibration performance. R 2 and NSE ≥ 0.65 and ±10% ≤ PBIAS ≤ ±15%, which are considered to be good calibration performance. R 2 and NSE ≥ 0. 5 and ±15% ≤ PBIAS ≤ ±20% are considered as satisfactory calibration performance. The autocalibration approach aims to minimize the difference between measured and simulated values. Initial parameter values were allowed to vary by ±20% to create parameter sets to initialize the autocalibration procedure. Calibration and verification were performed on the 2010 and 2013 data, respectively, with a maximum number of iterations set at 500. Because the SWMM model was based on subhourly rainfall data, the run time for each iteration was large.

Climate Modeling
To predict future CC impacts on runoff quantity and quality, projections are required for precipitation and temperature across the watershed. In this study, simulations from the North American Regional CC assessment program (NARCCAP) [52] were used. Projections were based upon dynamical downscaling; embedding a regional climate model (RCM) with 50 km 2 spatial resolution into a global climate model (GCM). Two time periods, 1971-1998 and 2041-2068 were used for historical and future CC conditions, respectively. The latter scenario represents a medium-high greenhouse gas emissions assumption, A2 [66].
Data were obtained from one GCM-RCM combination, the MM5I-CCSM, which was selected based on its ability to accurately simulate historical temperature and precipitation. Annual mean temperature and precipitation were averaged over the region for nine of the NARCCAP models including CRCM_cgcm3, ECP2_gfdl, HRM3_gfdl, HRM3_hadcm3, MM5I-CCSM, MM5I_hadcm3, RCM3_cgcm3, RCM3_gfdl, WRFG_cgcm3, and compared the means to the mean obtained from an observation-based dataset. Figure 2a,b show the annual mean temperature and precipitation of 9 models, respectively. Although the historical performance of the model was reasonable, bias was still evident, so, biascorrection was conducted. Bias correction was applied using a modified version of the equiratio cumulative distribution function matching method [20]. This method corrects model data using multiplicative scaling factors. The scaling factor applied to each model data point is the ratio of the observed value at the data point's quantile to the historical-period model value at the data point's quantile. Using the notation of Wang and Chen [20], this is expressed mathematically as: where is a value from the model during the prediction period (which may be either the historical or future period), is the empirical cumulative distribution function (CDF) of the model prediction data, is the inverse CDF of the observed data, and is the inverse CDF of the model data during the historical (or current) period. This correction method was applied to the model-simulated precipitation and temperature data for both time periods. The observed CDFs in the numerator of Equation (10) were determined using forcing data from Phase 2 of the North American Land Data Assimilation System (NLDAS-2) [67]. This dataset was selected since it has Although the historical performance of the model was reasonable, bias was still evident, so, bias-correction was conducted. Bias correction was applied using a modified version of the equiratio cumulative distribution function matching method [20]. This method corrects model data using multiplicative scaling factors. The scaling factor applied to each model data point is the ratio of the observed value at the data point's quantile to the historical-period model value at the data point's quantile. Using the notation of Wang and Chen [20], this is expressed mathematically as: where x m−p is a value from the model during the prediction period (which may be either the historical or future period), F m−p is the empirical cumulative distribution function (CDF) of the model prediction data, F −1 o−c is the inverse CDF of the observed data, and F −1 m−c is the inverse CDF of the model data during the historical (or current) period. This correction method was applied to the model-simulated precipitation and temperature data for both time periods. The observed CDFs in the numerator of Equation (10) were determined using forcing data from Phase 2 of the North American Land Data Assimilation System (NLDAS-2) [67]. This dataset was selected since it has matching variables for all climate model data, it has a relatively high spatial (1/8 degree) and temporal (hourly) resolution, and did not have missing data.
Several modifications were applied to the correction algorithm. For precipitation, use of Equation (10) would set the frequency of zero precipitation to be equal to the observed frequency in the historical and future periods. To relax this restriction, model data with a CDF value less than the frequency of no precipitation in the observed dataset were set to zero. For the historical period, this results in the frequency of corrected model dry values equaling the frequency of observed dry values. However, for the future period the frequency of dry values in the corrected data may change. The correction in Equation (10) was then applied using only values greater than zero in each dataset. For downwelling shortwave radiation, the bias correction was only applied using modeled and observed values greater than zero. Model air temperature predictions were corrected using an additive shift rather than a multiplicative ratio. Unlike precipitation and other variables, an additive correction applied to temperature is unlikely to produce values that are below zero. The bias correction and temporal disaggregation were applied separately for each calendar month.
NARCCAP provides model output at three-hour intervals. This temporal resolution was deemed sufficient for all variables except precipitation. Temporal disaggregation was applied to convert precipitation data to hourly frequency using a method developed to create input data for the Variable Infiltration Capacity model [68]. This method starts with daily precipitation totals (produced from upscaled and subsequently bias-corrected 3 h NARCCAP precipitation) and uses a CDF mapping approach to assign the daily amounts to fall during hours such that the disaggregated model CDFs of precipitation duration and hour of occurrence match the observed CDFs, using hourly precipitation totals at the Ronald Reagan National Airport in Washington, D.C. Separate CDFs and corrections were applied for each calendar month and for five daily-total precipitation bins: 0-5, 5-10, 10-15, 15-20, and 20+ mm. Unlike the bias correction method, this temporal disaggregation method does not allow the distributions of duration and time of occurrence to change from the historical to the future period. However, because the disaggregation step preserves the daily total amounts, which it receives from the bias correction step, the daily total precipitation amounts may still change from the historical to the future CC projection.

Statistical Analysis
Changes in precipitation, runoff, TSS, TN, and TP in the watershed were investigated using the student's t-test to find whether there was a significant difference between the historical and future conditions. The two-sample t-test was used to test the null hypothesis that the population means of two groups are the same. The t-test assumes the underlying distribution is normal, therefore the normality of the datasets was tested using Shapiro-Wilk test [69]. In addition, flow statistics, i.e., Q95, Q50 and Q10, which represent low, median, and high flows, respectively, were computed and compared to historical and predicted conditions using exceedance probability plots.

Calibration and Verification
Calibration and verification were performed using the data from upstream and downstream gauging stations. Performance metrics of the model with respect to peak flow and runoff volume after calibration and verification for upstream and downstream gauges are provided in Tables 2 and 3 [62,63], and [70], simulated and observed runoff volume showed a good level of agreement on an event basis.
The most sensitive parameters identified in previous studies were hydraulic width, imperviousness, and depression storage for impervious and previous areas. During calibration, each parameter was adjusted to improve agreement between observed and predicted peak flows and runoff volume. However, adjusting the hydraulic width of some of the smaller catchments in the watershed did not have much impact on runoff. Depression storage for impervious and pervious did not have significant effects on the results. Calibration and verification results based on the peak flows are shown in Figures 3 and 4. The most sensitive parameters identified in previous studies were hydraulic width, imperviousness, and depression storage for impervious and previous areas. During calibration, each parameter was adjusted to improve agreement between observed and predicted peak flows and runoff volume. However, adjusting the hydraulic width of some of the smaller catchments in the watershed did not have much impact on runoff. Depression storage for impervious and pervious did not have significant effects on the results. Calibration and verification results based on the peak flows are shown in Figures 3 and 4.    The results indicate an underestimation of peak flow in the verification period, especially in wet events that may be due to missing of some processes in the SWMM model such as groundwater recharge. While SWMM has a groundwater flow component, it is primarily a surface runoff model, and some subsurface processes are neglected [31]. Flows that are produced via infiltration loss and routed through subsurface may not be fully captured and simulated by the model.
Performance metrics of the model with respect to runoff volume for upstream and downstream gauges are provided in Tables 4 and 5, respectively. The results revealed that simulated and observed runoff volume showed a good level of agreement on an event basis. Simulated and observed hydrographs for some selective events including dry and wet events are shown in Figure 5. The results indicate an underestimation of peak flow in the verification period, especially in wet events that may be due to missing of some processes in the SWMM model such as groundwater recharge. While SWMM has a groundwater flow component, it is primarily a surface runoff model, and some subsurface processes are neglected [31]. Flows that are produced via infiltration loss and routed through subsurface may not be fully captured and simulated by the model.
Performance metrics of the model with respect to runoff volume for upstream and downstream gauges are provided in Tables 4 and 5, respectively. The results revealed that simulated and observed runoff volume showed a good level of agreement on an event basis. Simulated and observed hydrographs for some selective events including dry and wet events are shown in Figure 5. Average annual flow in the calibration and verification period at upstream and downstream gauging stations are provided in Tables 6 and 7, respectively. The results indicate good performance of the calibration in simulating mean annual flow during calibration and verification periods. A map of the model calibrated parameters by sub-catchments in the watershed are provided in Figure 6.    A simplified assessment of performance was conducted by comparing predicted values with station annual loads. Model results at the upstream gauging station for TSS, TN, and TP showed an R 2 of 0.61, 0.57, and 0.58, respectively, during the calibration; and an R 2 of 0.65, 0.59, and 0.61, during the calibration at the downstream gauging station. The performance of the model through calibration for TSS, TN, and TP is illustrated in Figure 7a,b for upstream and downstream gauge locations, respectively. The final model parameters for each sub-catchment are provided in Table 8. A simplified assessment of performance was conducted by comparing predicted values with station annual loads. Model results at the upstream gauging station for TSS, TN, and TP showed an R 2 of 0.61, 0.57, and 0.58, respectively, during the calibration; and an R 2 of 0.65, 0.59, and 0.61, during the calibration at the downstream gauging station. The performance of the model through calibration for TSS, TN, and TP is illustrated in Figure 7a,b for upstream and downstream gauge locations, respectively. The final model parameters for each sub-catchment are provided in Table 8.

Climate Change Impacts
After calibration and verification, the flow and water quality in the watershed was simulated for the 2041-2068 period. The annual average flow and water quality during historical and future periods were calculated and compared. The simulation results indicate that the mean annual runoff is predicted to increase by 6.5% with CC. The mean annual TSS, TN, and TP were predicted to increase by 7.7%, 7.0%, and 8.1%, for upstream and downstream locations, respectively.
Seasonal changes in runoff, TSS, TN, and TP are shown in Figure 8a-c, respectively.

Climate Change Impacts
After calibration and verification, the flow and water quality in the watershed was simulated for the 2041-2068 period. The annual average flow and water quality during historical and future periods were calculated and compared. The simulation results indicate that the mean annual runoff is predicted to increase by 6.5% with CC. The mean annual TSS, TN, and TP were predicted to increase by 7.7%, 7.0%, and 8.1%, for upstream and downstream locations, respectively.
Seasonal changes in runoff, TSS, TN, and TP are shown in Figure 8a-c, respectively.  Annual precipitation for 2041-2068 increased by 6.1%, while temperature rose by 1.6 • C relative to 1971-1998. Seasonal changes in runoff during 2041-2068 ranged from −22.3% to 32.3%, with the highest increase occurring from October to December. Analysis of seasonal runoff variations indicates increasing runoff in fall and summer and decreasing in winter and spring. Smaller increases in runoff were predicted during the summer, which could be due to increased evapotranspiration and less groundwater recharge. In the CC projections, summer and fall are associated with more precipitation and higher temperatures, with more frequent occurrences of heavy precipitation. Higher temperatures will lead to higher evapotranspiration rates and lower soil moisture. Lower rainfall and higher temperatures in the winter and spring will result in less runoff volume because the water can be absorbed by the soil, whereas higher rainfall and temperature in summer and fall will lead to increased runoff volume due to more precipitation.
Precipitation during the CC projection was characterized by more frequent occurrences of heavy precipitation. Annual mean temperature during the CC projection was 1.6 • C warmer than during the historical period; the warming is relatively uniform throughout the seasons. During September-December, the model predicted runoff, TSS, TN, and TP to increase by 22.5%, 13.2%, 12.8%, and 14.1%, respectively, relative to the historic time period. On a monthly basis, the greatest expected increase was December, with runoff volume, TSS, TN, and TP projected to increase by 32.3%, 29.7%, 24.8%, and 37.7%, respectively. April shows the greatest decrease, with runoff volume, TSS, TN, and TP projected to decrease by 22.3%, 25.7%, 24.7%, and 27.3%, respectively. Imperviousness in highly urbanized watersheds like Difficult Run has been identified as one of the main drivers of runoff pollution and stream degradation [46] from this watershed. Increased precipitation resulting from CC would result in increased streamflow which will likely lead to further stream instability and erosion, resulting in further water quality degradation from anticipated higher level of imperviousness, temperature, and precipitation.
Pair-wise comparison to determine if there was a significant difference between the mean monthly historical and CC projections was conducted using a t-test for runoff volume, TSS, TN, and TP. A Shapiro-Wilk test [69] indicated that, at p-value > 0.05, the dataset follows a normal distribution. The results of t-test are shown in Table 9. Table 9. Student's t-test pairwise results. The results of the t-test indicated that the differences between the mean monthly values of historical and predicted CC runoff volume, TSS, and TN were not significant at the 95% confidence level. TP values were significantly different at 95% confidence interval, which implies that TP may be more sensitive to CC. In general, TP loads are composed of soluble phosphorus (SP) and particulate phosphorus (PP), both of which are strongly correlated with TSS and peak runoff. As a result, increases in peak runoff and TSS may lead to higher levels of TP compared to runoff, TSS, and TN in the watershed under anticipated CC.

t-Calculated t-Critical p-Value
The mean, median, and maximum values, and the range between the 10th and 95th percentiles (Q10 and Q95) for the future period, were compared with statistical parameters for the historical period (Table 10). Compared with the historical period, the mean, median, maximum runoff volume, TSS, TN, and TP loads during the events increase under CC. In addition to the increase in the mean, median, maximum, runoff, TSS, TN, and TP, the interannual variability increased in most months. Interannual variability of precipitation, temperature, streamflow, TSS, TN, and TP are shown in Figure 9a-f, respectively.
Water 2017, 9,464 18 of 24 In addition to the increase in the mean, median, maximum, runoff, TSS, TN, and TP, the interannual variability increased in most months. Interannual variability of precipitation, temperature, streamflow, TSS, TN, and TP are shown in Figure 9a-f, respectively. To analyze the effect of CC on runoff volume and water quality parameters, exceedance probability curves were developed, as shown in Figure 10a-d, for runoff volume, TSS, TN, and TP, respectively. The curves were created by plotting the simulated results by event for the historical and future periods. Values of runoff volume, TSS, TN, and TP for the 10th, the 50th, and 95th percentiles are provided in Table 11. To analyze the effect of CC on runoff volume and water quality parameters, exceedance probability curves were developed, as shown in Figure 10a-d, for runoff volume, TSS, TN, and TP, respectively. The curves were created by plotting the simulated results by event for the historical and future periods. Values of runoff volume, TSS, TN, and TP for the 10th, the 50th, and 95th percentiles are provided in Table 11.  Figure 10a shows a significant difference between the historical and future runoff volume during larger storm events occurring at the 10% probability. Historical and future exceedance curves for TSS and TN track very closely, except for small differences apparent in mid-range events (Figure 10b,c).
Of the water quality constituents, TP shows the most difference, and there are more frequent events (Figure 10d). Results in Table 11 show that all the flow statistics (Q95, Q50, and Q10) increase under CC. The results also indicate a 30% to 158% increase in Q10.
Water 2017, 9,464 19 of 24 Table 11. Statistics during events for the historical and future periods. Figure 10a shows a significant difference between the historical and future runoff volume during larger storm events occurring at the 10% probability. Historical and future exceedance curves for TSS and TN track very closely, except for small differences apparent in mid-range events (Figure 10b,c).
Of the water quality constituents, TP shows the most difference, and there are more frequent events (Figure 10d). Results in Table 11 show that all the flow statistics (Q95, Q50, and Q10) increase under CC. The results also indicate a 30% to 158% increase in Q10. The results of this study provide a guidance to better understand which pollutants are most critical and pose the greatest variability under CC projections. Therefore, it helps managers and decisionmakers develop better mitigation actions and select the most appropriate SCMs. It may  The results of this study provide a guidance to better understand which pollutants are most critical and pose the greatest variability under CC projections. Therefore, it helps managers and decisionmakers develop better mitigation actions and select the most appropriate SCMs. It may enable a more reliable and rational CC decision-making process for this specific case study watershed. Overlooking the impact of CC on water quantity and quality could result in failing to achieve water quality goals, may result in unanticipated flooding, and likely will result in damages and waste.

Conclusions
A hydrologic/hydraulic/water quality model (SWMM) was used to simulate a continuous rainfall-runoff response in an urban watershed, Difficult Run, Fairfax, VA. The model was then calibrated to observed conditions for peak flows, runoff volume, and water quality using two gauging stations in the watershed. The calibrated model was verified by assessing its performance against historical data. After calibration and verification, the impact of CC on the runoff volume and water quality in the watershed was assessed using downscaled precipitation and temperature data. Recent simulations from the NARCCAP were adapted for this purpose, using dynamic downscaling. The hydrology of the Difficult Run watershed was then simulated using SWMM, with the assistance of RSWMM for calibration, event processing and exceedance curve plotting for historical and future conditions. The following conclusions can be drawn from the study:

1.
Three SWMM parameters, including hydraulic width, imperviousness, and depression storage, were altered during calibration. Hydraulic width and imperviousness were the most sensitive parameters affecting peak flows. 2.
NARCCAP and other dynamically downscaled model datasets are particularly useful for hydrological modeling because they provide downscaled data for all of the necessary variables from multiple global and regional models and are openly available online. Dynamical downscaling of the CMIP5 models is being performed by the North American Coordinated Regional Climate Downscaling Experiment (NA-CORDEX), but this project is not yet complete. Temperature and precipitation data from NA-CORDEX have been recently published, but other data that are needed for hydrological simulations, such as wind speed and radiation, are not yet available. Thus, the NARCCAP dataset is still the most current set of complete downscaled model data for our study area.

3.
The hydrological impact of the CC projection indicates that the mean annual runoff is predicted to increase by 6.5%, and TSS, TN, and TP are predicted to increase by 7.66%, 6.99%, and 8.1%, respectively. Statistically, only projected TP loads were significantly different than historical TP loads. 4.
The simulation results demonstrate that the mean, median, maximum, and the range between the 10th and 95th percentiles were projected to increase in the future with CC conditions. The interannual variability in the future is also projected to increase.
Q10, Q50, and Q95 were compared using exceedance probability curves. Results show that these flow statistics are projected to increase in the future with CC; the greatest difference occurred at the 10th percentile for runoff volume.
The limitations of this analysis stem from the sources of uncertainty, which include: uncertainty in GCMs, including their inherent assumptions regarding future emission of greenhouse gases, uncertainty in the representation of climatology at regional and local scales, and uncertainty of parameters required for input in development of hydrological models. Analysis of uncertainty propagating through the multiple models and processes covered in this study was beyond the scope of this case study, but is a research recommendation. A key limitation of the case study analysis is the use of a single greenhouse emission scenario. However, the contribution of this paper is its development of methods for water quality modeling of an urban watershed subjected to CC. Methods used in this paper can be extended to incorporate a wide range of CC models.
Understanding CC effects on water quantity and quality in urban watersheds will help water resource managers and planners make better decisions in managing water resources issues. Studies such as these can assist them as they evaluate the complex physical, social and economic impacts of CC on urban communities. In addition, making predictions of the impact of CC will help improve the management of stormwater systems to reduce flooding and to produce better water quality through appropriate selection of SCMs.