1. Introduction
Stormwater quality models are essential tools to support the planning of urban water infrastructure. Having reliable model outputs is of high relevance since infrastructural stormwater measures are cost-intensive and have a long service life. Available stormwater quality models still replicate natural pollutant processes in a simplified manner, which in turn leads to uncertain model results [1,2].
Pollutant processes are commonly differentiated into two conceptual phases, (i) buildup and (ii) wash-off, which are both deterministically described by empirical formulas [3]. These model concepts assume that the amount of pollutant masses at the surface generally increases to a maximum as a function of antecedent dry weather periods and decreases as a consequence of rainfall/runoff.
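As an illustration of this concept, the following Python sketch assumes an exponential buildup towards a maximum surface mass and a first-order exponential wash-off. These particular equations, the function names and all parameter values are assumptions chosen for illustration; they are not taken from the models discussed in this paper.

```python
import math

def buildup(mass0, dry_days, max_mass, k_build):
    """Surface mass after a dry period (assumed exponential buildup).

    The mass approaches max_mass asymptotically as dry_days increases.
    """
    return max_mass - (max_mass - mass0) * math.exp(-k_build * dry_days)

def washoff(mass0, runoff_mm_per_h, duration_h, k_wash):
    """Mass remaining on the surface after an event with constant runoff
    (assumed first-order exponential wash-off)."""
    return mass0 * math.exp(-k_wash * runoff_mm_per_h * duration_h)

# Hypothetical values: 5 dry days followed by a 1.5 h event (masses in g/m2).
mass = buildup(0.0, dry_days=5.0, max_mass=2.0, k_build=0.4)
remaining = washoff(mass, runoff_mm_per_h=4.0, duration_h=1.5, k_wash=0.1)
event_load = mass - remaining  # pollutant mass washed off during the event
```

Both functions are fully deterministic: identical dry periods and runoff always produce identical event loads, which is the property questioned by the studies discussed in this section.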
Previous studies, however, demonstrated the inadequacy of this simplified concept to continuously model pollutant concentrations. For example, Muschalla et al. [4] calibrated a buildup/wash-off approach of a stormwater quality model to simulate chemical oxygen demand (COD) concentrations in stormwater discharges by means of a multi-objective auto-calibration scheme. The results obtained did not outperform a model employing a constant stormwater concentration approach. Sage et al. [5] applied a Bayesian calibration scheme based on the Markov chain Monte-Carlo (MCMC) method to assess how well the buildup/wash-off model replicates continuous total suspended solid (TSS) concentrations and event loads. The authors confirmed the poor predictive power of the model applied and generally questioned the buildup/wash-off approach.
Bonhomme and Petrucci [6] indicate that pollutant models and their parameters lack physical meaning and, thus, represent rather black-box models. In fact, numerous authors propose a modified wash-off equation to appropriately account for rainfall characteristics. Egodawatta et al. [7] and Muthusamy et al. [8], for example, introduce a capacity factor to reflect the impact of rainfall intensity and the fact that only a fraction of pollutants is mobilized during storm events. Both rainfall intensity and a ratio of sediment mass per unit catchment area to rainfall intensity are also considered in a modification suggested by Zhao et al. [9]. In addition to the sensitivity of the wash-off process to rainfall intensity, Alias et al. [10] highlight the intra-event variability of rainfall as another influential factor. Obviously, wash-off is also influenced by particle characteristics and environmental variables, such as surface type and land use, as pointed out by Egodawatta et al. [7] and Zhao et al. [11].
While a more physically based description of rainfall-induced wash-off that also appropriately respects environmental conditions would clearly improve the representativeness of model outputs, both pollutant buildup and wash-off are significantly affected by stochastic inputs [12], which in turn can hardly be predicted. As a consequence, Sage et al. [5] stress the need for an alternative modelling approach that also incorporates the effects of stochasticity on pollutant buildup and wash-off. This aligns with Harremoës [13], who already called for stochasticity to be respected when using stormwater quality data.
The calibration of stormwater quality models conventionally aims to minimize the difference between observed and simulated pollutographs. While this allows intra-event variability to be incorporated, pollutant stochasticity is rarely taken into account because goodness-of-fit is calculated for each event individually.
Several studies in the past decades have respected probabilistic pollutant characteristics. Scholz [14] applied an autoregressive moving-average modelling approach for both continuous buildup and wash-off of pollutant concentrations to account for unpredictable environmental impacts. However, the approach could not be appropriately calibrated because of a lack of data. Motivated by unavailable urban storm runoff quality data, Osman Akan [15] analytically derived a frequency distribution to predict annual solids wash-off from impervious urban areas. His concept includes rainfall characteristics and catchment parameters for buildup and wash-off and is exemplified for an artificial industrial catchment. Due to a lack of data, the approach could not be validated. A probabilistic approach to model TSS loads and dynamics of urban areas has also been proposed by Rossi et al. [16]. Their concept uses (i) a parameterized power function with a normally distributed exponent to approximate intra-event TSS dynamics; (ii) log-normally distributed event mean concentrations (EMC) to estimate total TSS masses per event; and (iii) a uniformly distributed daily wastewater discharge combined with a constant TSS concentration. While the practical benefit of the model is clearly highlighted, the authors point out the simplified process description and its limited predictive power. Chen and Adams [17] introduce a general probability distribution approach in which cumulative distribution functions for pollutant loads and event mean concentrations are obtained from probabilistic rainfall-runoff transformation. Sharifi et al. [18] performed Monte-Carlo simulations and used corresponding results to assess the effects of stormwater best management practices on water quality for six toxic metals. Rossi et al. [16] also assumed a power-law relationship between runoff and pollutant concentrations during an event. However, they treated the exponent of the power equation for the intra-event relationship as a random variable, which, in turn, led to a large number of pollutographs to be analyzed.
A refinement of the exponential wash-off equation by incorporating stochastic fluctuations is analyzed by Daly et al. [19]. Here, the coefficient dominating the wash-off process is assumed to be random and consequently addressed by adding Gaussian noise. Good agreement with empirical distributions for TSS and total nitrogen (TN) is reported, but this required a large amount of data. Qin et al. [20] obtained the frequency distributions of (i) the event pollutant load; (ii) the event mean concentration; and (iii) the peak concentration of COD from a continuous simulation of an urbanized catchment. Exponential equations for buildup and wash-off were employed and calibrated with regard to continuous COD concentration measurement data using a genetic algorithm. It is, however, mentioned that the predictive power is limited because the study site undergoes further developments. Annual loads for micropollutants have been estimated based on theoretical distribution functions of the event mean concentration for three residential catchments by Hannouche et al. [21].
The literature shows various approaches to take the stochasticity of pollutant processes into account. While early studies primarily used probabilistic methods to overcome the scarcity of stormwater quality data, recent studies using continuous quality data tend to admit the variability of natural pollutant processes by employing stochastic concepts. With regard to continuous long-term stormwater quality simulations, the presented alternative modelling approaches incorporate stochasticity through (i) probabilistic description and transformation of model input data (rainfall-runoff); (ii) modification of empirical pollutant buildup/wash-off equations; (iii) distribution-based parameterization of intra-event dynamics; and (iv) probabilistic analysis of model results after calibration (post-processing).
It has, however, not been investigated whether available stormwater quality models can be calibrated towards probabilistic pollutant characteristics. A distribution-based calibration offers an additional way to incorporate pollutant stochasticity. In contrast to approaches already introduced, this method maintains existing model concepts and avoids expensive post-processing.
The present paper reports on the development of an innovative stormwater quality model calibration approach using TSS event load distribution. The approach is demonstrated on two real-world models for which reliable distributions are available. Calibrated models are finally used to estimate annual TSS loads, which is a key parameter for emission control in several stormwater management guidelines.
3. Results and Discussion
Calibration results for both sites are shown in Table 6. Statistics for both model parameters and the Kolmogorov–Smirnov-based objective function are given. The best-fit parameter sets yielded an objective function value of roughly 0.05 for both models.
Given the low Kolmogorov–Smirnov statistic Dn of approx. 0.05 for both sites, the best-fit parameter sets obtained by the distribution-based calibration approach reproduce the parameterized lognormal distributions well. From a statistical perspective, which also takes the number of samples into account, the hypothesis that the simulated TSS event loads follow the lognormal distribution cannot be rejected: both KS statistics are below the critical values at the 90% confidence level (0.082 for site FR and 0.118 for site PL).
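The comparison described above can be sketched in Python as follows. The sketch computes the one-sample KS statistic Dn between a set of event loads and a lognormal distribution and compares it with a critical value. The event loads are synthetic, the lognormal parameters are hypothetical, and the large-sample approximation 1.22/sqrt(n) for the critical value is an assumption; the paper does not state how its critical values were obtained.

```python
import math
import random

def lognormal_cdf(x, mu, sigma):
    """CDF of a lognormal distribution with log-mean mu and log-sd sigma."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

def ks_statistic(sample, cdf):
    """Maximum vertical distance between the empirical CDF and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_critical(n, c_alpha=1.22):
    """Asymptotic critical value; c_alpha = 1.22 corresponds to alpha = 0.10."""
    return c_alpha / math.sqrt(n)

# Synthetic "simulated" event loads in g/m2 (hypothetical parameters).
random.seed(1)
mu, sigma = -2.0, 1.0
loads = [random.lognormvariate(mu, sigma) for _ in range(200)]

d_n = ks_statistic(loads, lambda x: lognormal_cdf(x, mu, sigma))
not_rejected = d_n < ks_critical(len(loads))
print(f"Dn = {d_n:.3f}, critical = {ks_critical(len(loads)):.3f}, "
      f"not rejected: {not_rejected}")
```

In a distribution-based calibration, a statistic of this kind would serve as the objective function to be minimized over the model parameters.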
Distributions of simulated and observed event mean concentrations (EMC) are compared in Table 7. A notably high agreement of mean EMC is obtained for both sites (FR: 33 mg L−1, PL: 62 mg L−1). It can also be observed that simulated EMC percentiles for site FR are slightly higher than the observed percentiles up to the 0.75 percentile. Site PL shows the opposite behavior: simulated EMC percentiles are slightly lower than observed percentiles up to the 0.5 percentile. However, in both cases, the maximum observed EMC percentiles are strongly underestimated, which suggests that the accumulation process model does not appropriately account for random influences (e.g., traffic-induced pollutant emissions [33]).
The fact that events with high TSS event mean concentrations are underestimated affects the goodness-of-fit concerning the total TSS event load of the events observed (Table 8). This is especially evident at site PL, where the total TSS event load is underestimated by roughly 28%. Events with more than 0.5 g m−2 are poorly represented (cf. Figure 2).
At site FR, the relative deviation is only about 5%. This signals that the error is compensated by events whose simulated TSS event load is higher than the observed load (intersection at approx. 0.1 g m−2, cf. Figure 2).
Cumulative distribution functions of simulated TSS event loads are depicted for both models in Figure 2. Simulation results are compared with the parameterized lognormal distribution function used for calibration and the original empirical distribution function from observation. Additionally, absolute residuals between observed and simulated TSS event loads are presented on the right-hand side of the figure (FR: subplot (b), PL: subplot (d)). At site FR, the mean of TSS event load residuals is −0.0087 g m−2 (sd: 0.19; min: −0.41; max: 0.94). At site PL, the mean of the TSS event load residuals is 0.065 g m−2 (sd: 0.19; min: −0.27; max: 0.74).
At site FR, the calibrated model replicates the distribution function up to the 0.8 percentile with a high goodness-of-fit. Events exceeding this value are generally underestimated by the model and lead to lower simulated event loads than suggested by the lognormal distribution. Since the KS statistic represents the maximum distance between two cumulative distribution functions, a maximum of 5% of the events with event loads above the 0.8 percentile are underestimated.
The results for site PL show a similar effect. Here, the model fits the distribution function well up to the 0.9 percentile, which accordingly implies that a maximum of 5% of the events with event loads above the 0.9 percentile are underestimated.
Both calibrated models tend to underestimate events with high TSS loads. This indicates that the calibration approach and the objective function applied are heavily influenced by events with low TSS event loads, which make up the majority of events at both sites. Applying an alternative goodness-of-fit measure as an objective function that also emphasizes the upper tail of the distribution function could lead to superior model performance. This, however, remains unclear as the applied pollutant model itself also has limitations in replicating natural pollutant processes [5,12,34].
Observed and simulated MV curves are shown in Figure 3. Simulated MV curves are calculated for both the stormwater quality observation period and the five-year period using all available rainfall data.
Mass-volume curves for site FR reveal that simulated intra-event processes do not reflect the observed dynamics in general. In particular, the prevailing first-flush characteristic is not appropriately replicated. Instead, simulated wash-off tends to occur proportionally to runoff.
In contrast, statistics of simulated intra-event processes at site PL correspond well to the data observed. It can be seen that the calibrated model also tends to generate wash-off proportional to runoff. The high agreement of observed and simulated MV curves at site PL is obtained because the observed MV curves already show a more runoff-proportional wash-off behavior. Although the general characteristic at site PL is satisfactorily represented, the results from both sites indicate that the observed intra-event dynamic can hardly be deterministically described by the model for a continuous simulation period. As pointed out in previous studies [5,12], pollutant buildup and wash-off are highly affected by stochastic inputs, which consequently limits the goodness-of-fit of replicating intra-event dynamics.
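For illustration, an MV curve for a single event can be derived from a flow and a concentration series as in the following Python sketch. The function name, the constant time step and the example values are assumptions; the data are synthetic and do not stem from sites FR or PL.

```python
def mv_curve(flows, concentrations, dt=1.0):
    """Return (cumulative volume fraction, cumulative mass fraction) pairs.

    flows and concentrations are equally long series for one event,
    sampled at a constant time step dt.
    """
    volumes = [q * dt for q in flows]
    masses = [q * c * dt for q, c in zip(flows, concentrations)]
    total_volume, total_mass = sum(volumes), sum(masses)
    cum_volume = cum_mass = 0.0
    curve = [(0.0, 0.0)]
    for v, m in zip(volumes, masses):
        cum_volume += v
        cum_mass += m
        curve.append((cum_volume / total_volume, cum_mass / total_mass))
    return curve

flows = [1.0, 2.0, 3.0, 2.0, 1.0, 1.0]  # synthetic hydrograph

# Concentration decreasing over the event: the curve lies above the
# bisector, which corresponds to a first-flush characteristic.
first_flush = mv_curve(flows, [120.0, 80.0, 40.0, 30.0, 20.0, 20.0])

# Constant concentration: wash-off proportional to runoff, so the curve
# follows the bisector.
proportional = mv_curve(flows, [50.0] * len(flows))
```

A model whose simulated curves resemble the second case for all events reproduces runoff-proportional wash-off but not a first flush.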
Simulated distribution functions from the observation period (used for calibration) are compared to the results using the five-year period (validation) in Figure 4. Corresponding goodness-of-fit is given in Table 9.
At site FR, the KS statistic between both distributions is 0.035, implying that the observation period is highly representative. The KS statistic of 0.062 from validation only slightly differs from calibration (KS: 0.053), which indicates a successful model validation.
In contrast, the distribution function from validation at site PL constantly underestimates the assumed lognormal distribution. This is also expressed by a higher KS statistic of 0.073. The distance between calibration and the validation period is slightly higher (KS: 0.083), indicating a less successful model validation. However, it is noticeable that the simulated TSS event distribution of the observation period falls below the lognormal distribution between 0.25 g m−2 and 0.4 g m−2 and exceeds the lognormal distribution for event loads higher than 0.5 g m−2. This indicates that the observation period is less representative as the number of events is significantly lower.
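The distance between two simulated distributions, such as those of the observation period and the five-year period, can be expressed by a two-sample KS statistic. A minimal Python sketch is given below; the function name is hypothetical and the short input lists are arbitrary values chosen only to show the behavior.

```python
from bisect import bisect_right

def ks_two_sample(a, b):
    """Maximum vertical distance between the empirical CDFs of a and b."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    # The distance can only change at observed values, so it suffices
    # to evaluate both empirical CDFs at every sample point.
    for x in a + b:
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d

identical = ks_two_sample([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])
shifted = ks_two_sample([1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0])
```

A small value indicates that the event load distribution of the shorter period is representative of the longer period.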
The validated models were finally used to estimate annual TSS loads (Table 10), which is of special interest for practical purposes. In the present study, the estimated mean annual TSS load for site FR is 9.9 g m−2 a−1, which according to Dierschke [35] represents a roof with “low to normal” load contribution. The annual TSS load for site PL was estimated at 13.7 g m−2 a−1, which is significantly lower than reported from measurements by Burton and Pitt [36] (~40 g m−2 a−1). As already stated, the model disregards traffic-related stochastic inputs, which could explain the low annual TSS loads estimated. Consequently, the result must be carefully interpreted. This highlights the need to especially account for load-intensive events either through an alternative objective function or modification of the model concept which, e.g., occasionally allows the incorporation of pollutants from additional sources.
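One straightforward way to obtain a mean annual load from a continuous simulation is to sum the simulated event loads per calendar year and average the yearly totals. The following Python sketch assumes this procedure, which the text does not describe in detail; the function name and the event values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def mean_annual_load(events):
    """Mean of yearly load totals from (year, event_load) pairs.

    Years that contain no events do not appear in the input and are
    therefore not counted.
    """
    per_year = defaultdict(float)
    for year, load in events:
        per_year[year] += load
    return mean(list(per_year.values()))

# Hypothetical simulated event loads in g/m2.
events = [(2015, 1.0), (2015, 2.0), (2016, 5.0)]
annual = mean_annual_load(events)  # yearly totals are 3.0 and 5.0
```

Because the yearly totals are sums over all events, a systematic underestimation of a few load-intensive events carries over directly to the annual estimate.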
Generally, the distribution-based calibration approach allows stormwater quality models to be calibrated even if data are incomplete, but it tends to underestimate events with high TSS loads. However, compared to the conventional calibration, the approach has two clear advantages. First, the occurrence of events and their corresponding pollutant contribution is probabilistically considered, which implies that stochasticity is taken into account. Second, measurement data of stormwater quality processes are rarely completely available for continuous periods, which complicates the application of a conventional calibration approach and could result in misleading model outputs. Theoretical distribution functions, in contrast, are continuously defined.