Predicting Event-Based Sediment and Heavy Metal Loads in Untreated Urban Runoff from Impermeable Surfaces

Understanding the amount of pollutants contributed by impermeable urban surfaces during rain events is necessary for developing effective stormwater management. A process-based pollutant load model, named Modelled Estimates of Discharges for Urban Stormwater Assessments (MEDUSA), was further developed (MEDUSA2.0; Christchurch, New Zealand) to include simulations of dissolved metal loadings and improve total suspended solids (TSS) loading estimations. The model uses antecedent dry days, rainfall pH, average event intensity and duration to predict sediment and heavy metal loads generated by individual surfaces. The MEDUSA2.0 improvements provided a moderate to strong degree of fit to observed sediment, copper, and zinc loads for each modelled road and roof surface type. The individual surface-scale modelling performed by MEDUSA2.0 allows for identification of specific source areas of high pollution for targeted surface management within urban catchments.


Introduction
The accumulation, or build-up, of pollutants on an impermeable surface during dry periods is the result of interactions between several processes, including atmospheric deposition, wind erosion, surface material breakdown due to weathering, and direct deposition of particles from vehicle wear [1][2][3]. During rain events, kinetic energy in the raindrops enables the entrainment and transportation of pollutants from the impermeable surfaces. Simultaneously, additional pollutants may enter the runoff from wet deposition, where the raindrops scavenge particles from the air as they fall [4], or via dissolution of the surface material due to acidity of the rainfall [5,6]. Rainfall characteristics, such as rainfall pH, rainfall intensity (average event intensity (INTavg) and peak intensity (INTpeak)), duration (Dur), depth, and the length of the dry period between rain events (antecedent dry days: ADD), therefore influence the amount of pollutants that build up and are washed off urban surfaces. Several studies have found correlations between pollutant build-up and wash-off and rainfall characteristics for both total suspended solids (TSS) and heavy metals (Table 1), and these relationships vary with both pollutant and surface type.
Table 1. Reported relationships between pollutants in untreated runoff and rainfall characteristics (chronological order).
Many stormwater quality models describe pollutant build-up and wash-off processes using rainfall characteristics to predict the resultant pollutant contribution from impermeable surfaces [23][24][25]. Model simulation structures can vary both spatially and temporally. Some models simulate individual surfaces, whereas other models are structured to simulate spatially distributed surfaces in a catchment or provide lumped representations of catchments. In terms of temporal scale, models can predict pollutant loads or concentrations either as continuously simulated values, on a single rain event basis or as an annual load [26,27]. Continuous models simulate pollutant load over a long time period, and account for the continuous build-up and wash-off of pollutants across each time step using the principles of mass balance [26,28]. An event model simulates results for an individual storm event. Annual load models use unit area pollutant load factors (based on published literature) to estimate the annual load for each pollutant.
Although some available models are sophisticated and many are comprehensive in their scope, their complexity and need for detailed input data (e.g., catchment hydraulics), or their aggregation of surfaces to a subcatchment level, pose a restriction to their use. A balance is needed between accuracy and reliability of the model outputs against the time and cost of obtaining the required input data.
This gap has led to the development of a process-based model framework, the Modelled Estimates of Discharges for Urban Stormwater Assessments (MEDUSA) model (first introduced in Fraga et al. (2016) [29]). However, the original model form did not include predictions of dissolved metals. Heavy metal partitioning in dissolved and particulate form is not only an important indicator of the potential environmental effects of the stormwater runoff, but it also directs the type of treatment that would be effective at reducing the heavy metal load. Furthermore, new field data enabled better representation of the relationships between TSS and rainfall characteristics to be incorporated into MEDUSA2.0 for improved TSS predictions. MEDUSA2.0 now predicts the TSS, total copper (TCu), dissolved copper (DCu), total zinc (TZn), and dissolved zinc (DZn) loads contributed by individual surfaces for individual rain events, using rainfall parameters as the independent variables. Different equations are used for each pollutant, with TSS load most related to antecedent dry days (ADD), rainfall intensity and duration; TCu and TZn loads most closely related to rainfall pH, ADD, average intensity, and duration; and DCu and DZn related back to the total metals loads. The equation coefficients are also specific to each surface type, thereby enabling dynamic relationships between various rainfall parameters and different impervious surfaces to be quantified in terms of pollutant generation in runoff. The MEDUSA2.0 framework assumes that the rate at which the material is washed from a surface is proportional to the amount of material built up on the surface at the start of a rain event and that the rate can be described by an exponential equation [7,24,30].
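The exponential wash-off assumption described above can be written out as a short derivation. This is a generic first-order sketch rather than the calibrated MEDUSA2.0 equation: the symbols $M(t)$ (mass remaining on the surface at time $t$ into the event), $M_0$ (mass built up at the start of the event), $k$ (a wash-off rate constant), and $W(t)$ (cumulative mass washed off) are illustrative assumptions, not names taken from the model.

```latex
\frac{dM}{dt} = -k\,M(t), \qquad M(0) = M_0
\quad\Longrightarrow\quad
M(t) = M_0\,e^{-kt},
\qquad
W(t) = M_0 - M(t) = M_0\left(1 - e^{-kt}\right)
```

Under this form the wash-off rate is highest at the start of an event and declines as the available material is depleted, which is consistent with the behaviour described for event duration.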
The objective of this paper was to present the MEDUSA2.0 physical process-based (i.e., build-up and wash-off processes) model. Firstly, the need for improvements from the previous version of MEDUSA is informed through a review of the literature and a new dataset of untreated runoff quality from various impermeable urban surfaces. The improvements are recorded in terms of model framework and equations. Model calibration and validation are also presented using the extensive field data. The paper also provides a catchment-wide application of MEDUSA2.0, and reports on the usability, benefits, and future needs of the model.

MEDUSA2.0 Model Framework
MEDUSA2.0 is an event-based pollutant load process model. It predicts the amount of TSS, TCu, DCu, TZn, and DZn contributed by an individual impermeable surface during a rain event, on the basis of the surface area, material type, and the surface's relationship to specific rainfall characteristics, namely, rainfall pH, average intensity, duration, and length of antecedent dry period (updated from Fraga et al. (2016) [29]) (Figure 1 and Table 2). These particular pollutants were prioritized for modelling as previous receiving environment monitoring had identified these pollutants as elevated and of concern [31,32].

Rainfall pH
Rainfall is naturally acidic due to raindrops' scavenging of carbon dioxide to form carbonic acid as they fall. The low pH can dissolve metallic components of a surface [33,34].

Rainfall intensity (mm/h)
Average event intensity was characterized for each sampled event. The intensity is an indication of the kinetic energy present that allows the entrainment and transport of particles in runoff from a surface [24].

Length of antecedent dry period (days)
Pollutants accumulate on impermeable surfaces during the dry periods between rain events due to atmospheric deposition, direct deposition from vehicles, and weathering of the surface. Studies have shown a log or arctan relationship where pollutant build-up rates are most rapid at the start of the antecedent dry period and then slow over time [6].

Event duration (h)
Pollutant wash-off continues throughout a rain event; however, the rate of wash-off is generally expected to reduce (exponentially) over the course of a rain event due to the decreasing amount of available material remaining on the surface [23].

Depth of event (mm)
Greater rainfall depths, and therefore larger rainfall volumes, will generate larger total amounts of pollutants [35].

MEDUSA2.0 predicts the TSS event load from each contributing road and roof surface ($TSS_{Road}$ and $TSS_{Roof}$) using a relationship defined in Egodawatta et al. (2009) [1]: where $A$ is the surface area, $ADD$ is the length of antecedent dry period (days), $INT$ is the average rainfall intensity of the event (mm/h), $DUR$ is the event duration (h), $C_f$ is the capacity factor (simplified to 0.75 for roofs, 0.25 for roads), and $a_1$ to $a_3$ are empirically derived coefficient values. The capacity factor is a measure of a specific rainfall intensity's ability to mobilize sediment available on the impermeable surface (1).

Total metal loads from roof surfaces ($TCu_{Roof}$ and $TZn_{Roof}$; µg) are predicted using the following relationship: where $X_0$ is the initial copper or zinc concentration (µg/L); $X_{est}$ is the second stage copper or zinc concentration (µg/L) that the runoff drops to over a transition period, $Z$ (hours), as observed from the intra-event concentration sampling; $A$ is surface area (m²); $k$ is the wash-off coefficient (i.e., based on the rate of decay to second stage concentrations from initial concentrations); $INT$ is average rainfall intensity (mm/h); and $DUR$ is event duration (h).

$X_0$ and $X_{est}$ can be described by the following relationships to rainfall characteristics (where $X_0 = Cu_0$ and $X_{est} = Cu_{est}$ for copper, and $X_0 = Zn_0$ and $X_{est} = Zn_{est}$ for zinc):

$$Zn_{est} = c_7\,PH + c_8,$$

where $PH$ is rainfall pH, $ADD$ is antecedent dry period (days), $INT$ is average rainfall intensity (mm/h), and $b_1$ to $b_8$ and $c_1$ to $c_8$ are empirically derived coefficient values. Note that the relationship of copper to pH is a power relationship, whereas zinc has a linear relationship with pH that is based on empirical observations (14).
Total metal loads for road and carpark surfaces were found to strongly correlate with the TSS load in the calibration data. The model therefore predicts total metal loads for all road and carpark surfaces as a proportion of the TSS load, as follows:

$$TCu_{Road,Carpark} = d_1 \times TSS_{Road,Carpark},$$

$$TZn_{Road,Carpark} = e_1 \times TSS_{Road,Carpark},$$

where $d_1$ and $e_1$ are dimensionless proportionality coefficients derived from experimental data. As the heavy metal build-up and wash-off processes remain the same for each surface type across multiple rain events, the model assumes that the ratio of dissolved to particulate metals is relatively constant for any given surface type (as evidenced by untreated runoff quality data; see Appendix A). Accordingly, dissolved metal loads for all roof, road, and carpark surfaces ($DCu_{Surface}$ and $DZn_{Surface}$) are calculated as a proportion of the total metal load in the model, as follows:

$$DCu_{Surface} = f_1 \times TCu_{Surface},$$

$$DZn_{Surface} = g_1 \times TZn_{Surface},$$

where $f_1$ and $g_1$ are dimensionless proportionality coefficients derived from experimental data.
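The proportional relationships for road and carpark metals and for dissolved metals can be sketched in a few lines of Python. This is a minimal illustration, not the MEDUSA2.0 implementation: the function names are hypothetical, and the coefficient values in the usage lines are placeholders chosen for easy arithmetic, not calibrated values from Appendix D.

```python
def total_metals_from_tss(tss_load, d1, e1):
    """Return (TCu, TZn) loads for a road or carpark surface.

    Both are taken as fixed proportions of the TSS load, so the result
    is in the same mass units as tss_load.
    """
    return d1 * tss_load, e1 * tss_load


def dissolved_from_total(total_load, proportion):
    """Return the dissolved load as a fixed proportion of the total load."""
    return proportion * total_load


# Placeholder coefficients for illustration only (assumed, not calibrated).
tcu, tzn = total_metals_from_tss(1000.0, d1=0.001, e1=0.002)
dcu = dissolved_from_total(tcu, proportion=0.5)   # plays the role of f1
dzn = dissolved_from_total(tzn, proportion=0.25)  # plays the role of g1
```

For roof surfaces, only `dissolved_from_total` would apply, since the total roof loads come from the separate roof relationship rather than from TSS.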

Sample Collection and Analysis for Calibration Data
Runoff quality data were collected from the Okeover catchment in western Christchurch, New Zealand, to calibrate the MEDUSA2.0 model coefficients for the local (low-intensity) rainfall conditions. Untreated runoff samples were collected from four different urban surfaces within the Okeover catchment: a concrete tile roof (a common residential roofing material), a copper roof (used primarily as an architectural material), a galvanized roof (a common industrial, commercial, and residential roofing material), and a coarse asphalt road (the most common road surface in the city). Further details of the sampled surfaces' characteristics are found in Charters et al. (2015) [36].
Time-series samples were collected throughout 25 rain events (see Table A1 in Appendix B) using a combination of grab sampling and automatic sampling (ISCO 6712C Compact Portable Automatic Sampler). Samples were analysed for TSS (as per American Public Health Association (APHA) method 2540 D (APHA, 2005)), and TCu, DCu, TZn, and DZn concentrations (as per APHA method 3125 B), as previous receiving environment monitoring had identified these particular pollutants as elevated and of concern [31,32].

Sample Collection and Analysis for Validation Data
First flush (FF; defined as the first 1 L of runoff for this study) and second stage (SS; taken at least 1 h after FF capture) samples from the Heathcote catchment, adjacent to the Okeover catchment, were used for validating the models. The validation dataset included a painted galvanized roof and collector road that were sampled over nine rain events (see Table A2 in Appendix B). Thermo Scientific Nalgene Storm Water Sampler bottles (1 L high density polyethylene (HDPE)) were used to collect the FF samples. For the collector road site, they were deployed by suspending the bottle from the sump grate in the corner of the sump where the initial runoff would flow in. For the galvanized roof site, the bottle was fitted within a Thermo Scientific Nalgene Storm Water Mounting Kit and fixed under the downpipe. Grab sampling (1 L HDPE) was used for all SS samples. Samples were analysed for TSS, TCu, DCu, TZn, and DZn, as per the Okeover samples.

Sampled Event Rainfall Data
For each sampled event, rainfall was collected and its pH measured during sample collection. For the Okeover calibration data, rainfall data were sourced from Campbell weather station data at the University of Canterbury within the Okeover catchment. For the Heathcote validation data, rainfall data were sourced from the National Institute of Water and Atmospheric Research's (NIWA) Kyle St Weather Station, within 2.5 km of the sampling sites. Rainfall data were compared between the University of Canterbury station and the NIWA station for the same rain events, as they are 2.2 km apart, and little difference was found in recorded amount and time of rain start and end, confirming the appropriateness of using the NIWA station data for the Heathcote sites. The 5 min interval data were aggregated into average event intensity (mm/h), antecedent dry period (days), event duration (h), and total event depth (mm) for each sampled event (Table 2 and Appendix B). Rainfall within 6 h of the previous rain was considered as being part of the same event.
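The aggregation of 5 min rainfall records into events can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' processing script: the function name and record layout are hypothetical, each timestamp is assumed to mark the start of its interval, event duration is taken from the first to the last wet interval (including short dry spells inside the event), and the antecedent dry period of the first event in a record is left unknown.

```python
from datetime import datetime, timedelta


def delineate_events(records, gap=timedelta(hours=6), step=timedelta(minutes=5)):
    """Group (timestamp, depth_mm) records, sorted by time, into rain events.

    Rain starting within `gap` of the end of the previous wet interval is
    treated as part of the same event.
    """
    events = []
    current = None
    for ts, depth in records:
        if depth <= 0:
            continue  # dry interval, nothing to accumulate
        if current is not None and ts - current["end"] <= gap:
            # Still the same event: extend it and add the depth.
            current["end"] = ts + step
            current["depth_mm"] += depth
        else:
            add_days = None
            if current is not None:
                events.append(current)
                # Dry time between the end of the last event and this one.
                add_days = (ts - current["end"]) / timedelta(days=1)
            current = {"start": ts, "end": ts + step,
                       "depth_mm": depth, "add_days": add_days}
    if current is not None:
        events.append(current)
    for ev in events:
        ev["duration_h"] = (ev["end"] - ev["start"]) / timedelta(hours=1)
        ev["int_avg_mm_h"] = ev["depth_mm"] / ev["duration_h"]
    return events
```

Each returned dictionary then carries the four event descriptors used by the model (average intensity, antecedent dry period, duration, and depth); rainfall pH would be attached separately from the field measurements.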

Total Event Loads from Observed Concentrations
Event pollutant loads (g/m²/event or mg/m²/event) for each surface were calculated on a per area basis using the measured pollutant concentrations for each sample and the rainfall depth accumulated over the time interval between samples. These event loads derived from observed concentration data (hereafter, observed loads) were used both to calibrate the model, achieving as close a fit as possible to these event loads, and for validation. Optimal calibration was achieved by adjusting the model coefficient values (i.e., $a_1$ to $a_3$ in Equation (1), $b_1$ to $b_8$ in Equations (4) and (5), $c_1$ to $c_8$ in Equations (6) and (7), $d_1$ and $e_1$ in Equations (8) and (9), and $f_1$ and $g_1$ in Equations (10) and (11)) to obtain the best predictive accuracy and goodness of fit.
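The per-area load calculation can be illustrated with a short sketch. It relies on the unit identity that 1 mm of rain over 1 m² is 1 L of water, so a concentration in mg/L multiplied by a depth in mm gives mg/m². The function name is hypothetical, and two simplifications are assumed here: all rainfall on the impermeable surface becomes runoff, and each sample's concentration is applied to the rainfall depth of the interval it represents.

```python
def event_load_per_area(samples):
    """Return the event load in mg/m2.

    `samples` is a list of (concentration_mg_per_L, rainfall_depth_mm)
    pairs, one per sampled interval of the event.
    """
    # mg/L x mm = mg/m2, because 1 mm of rain on 1 m2 is 1 L of water.
    return sum(conc * depth for conc, depth in samples)


# Illustrative values only: concentration falling through the event.
example_load = event_load_per_area([(120.0, 0.5), (40.0, 1.5), (10.0, 3.0)])
```

In this made-up example the first 0.5 mm of rain contributes as much load as the following 1.5 mm, which is the kind of intra-event pattern the time-series sampling is designed to capture.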

Assessing Model Fit to Case Study Observed Data
The Nash-Sutcliffe model efficiency (NSE) and percent bias (PBIAS) statistics were used to assess the predictive power of the model and its goodness of fit to observed data for calibration and validation. The NSE was developed for assessing hydrological models [37], but has also been employed for modelling sediment and nutrient loadings [38]. It describes the predictive accuracy of the model in comparison to the observed data. The NSE is defined as:

$$NSE = 1 - \frac{\sum_{j} \left(x_j^{o} - x_j^{m}\right)^2}{\sum_{j} \left(x_j^{o} - \bar{x}^{o}\right)^2},$$

where $\bar{x}^{o}$ is the mean of the observed pollutant loads, $x_j^{m}$ is the modelled load, and $x_j^{o}$ is the observed load for rain event $j$. An NSE value of 1 indicates a perfect fit between the modelled and observed loads, a value of 0 < NSE < 1 indicates the model is a better predictor than the observed mean, and NSE = 0 indicates the model is only as accurate as the observed mean, whereas NSE < 0 indicates the observed mean is a better predictor than the model. Modelled and observed loads were log-transformed before the NSE was applied to reduce the influence of any peak events, as they increase the sensitivity of NSE to systematic over- or under-prediction [39].
The percent bias (PBIAS) is a measure of the average tendency of the model-predicted values to be greater or smaller than their observed values [40]. It has been commonly used for hydrological models and is recognized for its ability to clearly identify poor model performance [40]. The PBIAS is defined as: where $x^{m}$ is the modelled load and $x^{o}$ is the observed load. The optimal value is 0; the smaller the magnitude of the PBIAS value, the better the performance of the model.
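Both statistics are simple to compute, and a minimal sketch is given here. The function names are hypothetical. The NSE follows the standard Nash-Sutcliffe form applied to log-transformed loads (the choice of log base does not affect the result, because it scales numerator and denominator equally). The sign convention for PBIAS is an assumption in this sketch: positive values mean the model over-predicts on average, and some references use the opposite sign.

```python
import math


def nse_log(observed, modelled):
    """Nash-Sutcliffe efficiency on log10-transformed loads (all loads > 0)."""
    obs = [math.log10(x) for x in observed]
    mod = [math.log10(x) for x in modelled]
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - m) ** 2 for o, m in zip(obs, mod))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def pbias(observed, modelled):
    """Percent bias; positive means over-prediction (assumed convention)."""
    total_error = sum(m - o for o, m in zip(observed, modelled))
    return 100.0 * total_error / sum(observed)
```

A model that reproduces every observed load exactly gives `nse_log` of 1 and `pbias` of 0, while a model that always returns the mean of the log-transformed observations gives `nse_log` of 0.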

Case Study Application: Okeover Catchment Description
The Okeover Catchment is a mixed residential/institutional catchment in western Christchurch, New Zealand. A GIS map of the 61 ha catchment (of which 40% is impermeable) was developed that delineated all individual roof, road, and carpark surfaces contributing runoff to the stormwater network and ultimately to the Okeover Stream via multiple discharge points (Figure 2). Each surface was classified on the basis of material type and assigned appropriate model properties (Figure 3 and Appendix C). Hardstand areas such as driveways on private residential property were not included in the modelling as their pollutant loads to the net stormwater runoff were considered to be negligible. Roofs are the largest surface type in the catchment (58% of impermeable surfaces), with 51% of roofs galvanized and 25% of roofs concrete tile (Figure 3). MEDUSA2.0 was run for each of the delineated surfaces.
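Because the model is run per surface and per event, catchment-scale results are obtained by summing the individual loads. The sketch below shows one way such an aggregation by surface type might look; the function name and input layout are hypothetical, and the loads in the usage line are invented for illustration rather than model output.

```python
from collections import defaultdict


def aggregate_loads(surface_event_loads):
    """Sum loads by surface type and return (totals, shares).

    `surface_event_loads` is an iterable of (surface_type, load) pairs,
    one per delineated surface and rain event, all in the same mass unit.
    """
    totals = defaultdict(float)
    for surface_type, load in surface_event_loads:
        totals[surface_type] += load
    grand_total = sum(totals.values())
    # Share of the catchment load contributed by each surface type.
    shares = ({k: v / grand_total for k, v in totals.items()}
              if grand_total > 0 else {})
    return dict(totals), shares


# Invented loads for illustration only.
totals, shares = aggregate_loads([("road", 30.0), ("roof", 10.0), ("road", 60.0)])
```

Keeping the per-surface records and aggregating afterwards preserves the ability to rank individual surfaces, which a lumped catchment calculation would lose.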

Optimized Model Fits
The MEDUSA2.0 model produced moderate to strong goodness of fit NSE values for all pollutants, for all four surfaces (Table 3). MEDUSA2.0 also produced moderate to strong NSEs when applied to untreated runoff data from the same surface types in the adjacent Heathcote catchment used for validation (Table 3), with the sole exception being total and dissolved copper prediction for the galvanized roof. It should be noted that the copper concentrations on the sampled galvanized roofs were found to be low, as the sole source was atmospheric deposition.

Example Application of MEDUSA2.0 to a Case Study Catchment: Okeover Catchment, Christchurch, New Zealand
Once the model was calibrated and its goodness of fit assessed, the calibrated MEDUSA2.0 model was applied to the local Okeover catchment, where the untreated runoff samples used for model calibration had been collected. The model was run for a full year of rain events from the year 2012 (see Appendix E for summarized event details), as researchers at the University of Canterbury had measured rainfall pH for several rain events in 2012. Therefore, a complete set of characterized rainfall events was available, with minimal assumptions required for rainfall pH. Although there will be variation from year to year, 2012 had relatively typical annual rainfall for Christchurch (the Christchurch Botanic Gardens weather station recorded 631 mm of rainfall for 2012 [41], against a mean annual rainfall of 647 mm [42]), and it therefore provides an indication of the expected variation of rain events across a year. Average event loads were derived from the average of all 88 rain events of 2012.
The modelling predicted substantial annual pollutant loads (Table 4), with roads and carparks contributing most of the predicted 4.9 t/year of TSS, but roofs contributing most of the predicted 54 kg/year of zinc. There was also significant load variation (multiple orders of magnitude) for each pollutant across the 88 modelled events (Figure 7).

MEDUSA2.0 Model Performance
MEDUSA2.0's predictive performance was good for all the modelled pollutants, with NSE values of ≥0.43 for TSS, ≥0.46 for TCu, and ≥0.63 for TZn. MEDUSA2.0 offers a process model that can be calibrated to a particular catchment using local runoff quality data, for estimating pollutant loads from individual impermeable surfaces during each rain event. Where MEDUSA2.0 was able to be applied to surfaces outside the Okeover catchment (i.e., with the Heathcote catchment data), the model produced moderate to strong NSEs with the sole exception of copper loads from the painted galvanised roof. However, as the only source of copper on such a roof is atmospheric deposition, which is of low magnitude, it is not surprising that an effective model fit cannot be readily found. Rather than directly modelling the copper load on such a low-copper surface using rainfall characteristics, a mean value could be derived from local runoff data and applied.

Influence of Rainfall Characteristics on Pollutant Loads
In the MEDUSA2.0 model, ADD is used to predict TSS loads, regardless of surface type, which is based on observations of dry weather pollutant build-up on surfaces [1,43]. The MEDUSA2.0 model also assumes rainfall pH is a significant variable relating to both initial and second stage copper concentrations, and the initial concentrations are also dependent on ADD and average intensity. For the TCu model, the copper roof was substantially more influenced by rainfall pH than the other two roof surfaces and, to a lesser extent, by average rainfall intensity (see Appendix D for coefficient values). For the TZn model, the galvanized roof was more influenced by rainfall pH (particularly during second stage conditions) than the other two roof surfaces. It was also more influenced by ADD and, to a lesser extent, average rainfall intensity. This could be expected, as the natural (slight) acidity of rainfall is a key driver of the copper and zinc generation (via dissolution) on the copper and galvanized roofs, respectively. Likewise, the influence of ADD could be expected as the dry weather period allows build-up of pollutants from atmospheric deposition, but also, importantly, contributes to weathering and degradation of the metallic surfaces with a corresponding increased leaching rate.
Gnecco et al. [11] did not find any correlation between runoff pollutant amounts such as event mean concentration (EMC) and ADD, and attributed this to the low-to-medium rainfall intensity and low total rainfall volume of their sampled events, where the amount of pollutant wash-off was too low to show differences in dry weather TSS build-up between each event. This study's rainfall is similarly of low intensity. A study within the same Okeover catchment of atmospherically deposited TSS, copper, and zinc did show that pollutant build-up was significantly influenced by ADD; however, pollutant wash-off processes had a stronger influence than build-up processes on the overall pollutant loads generated from atmospheric deposition [18].

Individual Model Limitations
Currently, MEDUSA2.0 has a restricted number of pollutant-rainfall parameter relationships, which may inadvertently present inaccurate model coefficient values. For example, the calibrated TSS coefficient values for MEDUSA2.0 suggest that the copper roof was substantially more influenced by ADD than the concrete and galvanised roofs (i.e., higher ADD coefficient values; see Appendix D). However, because MEDUSA2.0's framework is restricted to relating TSS loads to ADD only, it is likely that the high TSS loads from copper roofs are not due directly to ADD (patination is largely driven by water, carbon dioxide, and time), but simply a means by which the model can reproduce high TSS loads. Furthermore, the NSE values of the calibrated MEDUSA2.0 model were lower for TSS than for metal predictions, suggesting that factors beyond rainfall characteristics are important drivers of pollutant build-up and wash-off. As always, there is a balance sought in the model framework between achieving a reasonable fit and avoiding over-parameterization of the model or an overly complex running process. Further research is needed to characterize a wider variety of surface types (such as commercial and industrial carparks and roads of different traffic characteristics) to better represent them in the model and incorporate surface factors such as aspect and slope as appropriate. MEDUSA2.0's use of generalized build-up and wash-off equations allows the model to be applied to catchments outside of Christchurch's low intensity rainfall climate, and further studies are presently underway to assess the ease and effectiveness of recalibrating MEDUSA2.0 to other catchments, which will also serve as further model validation.
MEDUSA2.0 predicts pollutant loads as they are generated at each surface, and therefore does not account for pollutant changes as runoff is conveyed through the stormwater network or discharged into the receiving waterway. Additionally, the presence of sumps (and other (pre)treatment systems), for example, may induce settling of coarse sediment and associated metals, which are removed from the stormwater runoff.
Within its current scope, MEDUSA2.0 allows stormwater managers to identify individual or aggregated impermeable surfaces that can be targeted for optimized stormwater management. However, further research is needed to incorporate multiple stormwater management options and pollutant transformation processes into the MEDUSA2.0 model framework so that different management strategies can be computed within the modelled scenarios.

Data Limitations
As the available dataset for MEDUSA2.0 model calibration was limited, the entire Okeover dataset was used for calibration instead of truncating the dataset for calibration and having a small validation dataset (following similar approaches by hydrologic modellers, for example Shrestha et al. (2016) [44]). Limited validation data were then provided from untreated runoff sampling in an adjacent catchment for two of the same surface types sampled and modelled in the Okeover catchment. These showed good model performance for the key pollutants of concern from those particular surface types. Further validation data should now be collected in different geographical locations, from the same surface types for which the model has been calibrated, to expand the conditions under which the model performance is validated. The significant load variation observed in the modelled results for the Okeover case study application indicates that validation data would be particularly valuable for events with rainfall characteristics that produce high event loads (e.g., long duration rain events).
The sampling dataset (that was used to calibrate the model and apply it to the Okeover catchment) was restricted to the four most common impermeable surface types within the catchment. Therefore, assumptions were made by assigning model coefficient values to all the other surface types present in the catchment (Appendix C). Further untreated runoff water quality data are needed from different surface types, particularly carparks, unpainted galvanised roofs, and new painted Zincalume roofs (e.g., Colorsteel) to enable the calibration of model coefficients for a wider range of surface types.

Implications of Model Predictions on Management Approaches
The variation in MEDUSA2.0 pollutant loading from individual surfaces confirms that it is necessary to apply pollutant load models at an individual surface scale rather than a catchment or land use scale, as surface type is a key driver of pollutant generation. The variations for each pollutant type, for the same surface, also reinforce the need to use specific relationships for each pollutant and rain event, rather than assuming metal loads are proportional to sediment loads as is done with many current models. Metals in dissolved form were a major contributor to metal loads, and therefore it is important that any pollutant model framework does incorporate dissolved metals predictions, as understanding the proportion of metals in dissolved form influences the treatment system selection and source reduction decisions for most effective pollution management. The predicted load variation across multiple rain events must be considered when selecting and designing pollutant treatment systems to ensure effective sizing and maintenance planning.

Conclusions
The improvements made to develop MEDUSA2.0 have resulted in a model that performs well in predicting TSS, total and dissolved copper, and total and dissolved zinc loads generated from individual impermeable urban surfaces, with common rainfall parameters being used as the predictor variables. Moderate to strong NSE goodness of fit values were achieved in most instances.
Quantifying the amount of pollutants contributed by each surface during rain events is necessary for the development and implementation of effective and efficient stormwater management options. Furthermore, adopting a modelling approach to predict pollutant loads derived from different impermeable surfaces offers a more cost-effective and reliable tool for understanding pollutant "hotspots" than can be achieved through water quality sampling alone.

Total and Dissolved Metal Relationships from Untreated Runoff Sampling Data
Comparison of observed dissolved loads against total loads for copper and zinc shows a strong linear relationship, particularly for the surfaces with the highest loads (Figure A1). This provides a sound basis for the MEDUSA2.0 framework to use a proportional relationship in calculating dissolved metals from total metal loads.

Appendix C
Notes: Calibration data were available for four different surface types: a concrete tile roof (Cr), a copper roof (Cu), a galvanized roof (Gv), and an asphalt road (Rd). Validation data were available for Gv and Rd.
See Equations (1) to (11) for coefficient relationships to model variables.