This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

There is a growing requirement to generate more precise model simulations and forecasts of flows in urban drainage systems in both offline and online situations. Data assimilation tools are hence needed to make it possible to include system measurements in distributed, physically-based urban drainage models and reduce a number of unavoidable discrepancies between the model and reality. The latter can be achieved partly by inserting measured water levels from the sewer system into the model. This article describes how deterministic updating of model states in this manner affects a simulation, and then evaluates and documents the performance of this particular updating procedure for flow forecasting. A hypothetical case study and synthetic observations are used to illustrate how the Update method works and affects downstream nodes. A real case study in a 544 ha urban catchment furthermore shows that it is possible to improve the 20-min forecast of water levels in an updated node and the three-hour forecast of flow through a downstream node, compared to simulations without updating. Deterministic water level updating produces better forecasts when implemented in large networks with slow flow dynamics and with measurements from upstream basins that contribute significantly to the flow at the forecast location.

The increasing richness of hydrological data from cities leads to an increasing use of spatially distributed continuous hydrologic simulation models [

For any online model that is used for real-time decision-making it is crucial to keep the model in touch with reality, e.g., by assimilating measurements into the model. For simple, linear models this can be achieved using a version of the Kalman Filter [^{2} and includes 5000 nodes [

Perhaps due to the limitations mentioned above there are no references in the open literature to applications of data assimilation methods to the hydrodynamic part of distributed urban drainage models. The only available data assimilation tool for this kind of model that the authors have come across is MOUSE UPDATE, a pragmatic tool that inserts measured water levels or flows directly into the hydrodynamic module of the MIKE URBAN software. This alternative updating method (in the following referred to as “deterministic updating”, or the “Update” method) ensures that the simulations are in accordance with the available measurements, and it should thereby result in an improvement in model forecasting performance. The Update method has been used in a few practical urban drainage projects, but their results are not publicly accessible. The authors have presented initial studies of the Update method in conference contributions [

The article is divided into six sections: 1: Introduction; 2: Update Procedure; 3: Evaluation Procedure; 4: Case Studies; 5: Discussion; and 6: Conclusions.

Physically based, distributed hydrodynamic urban drainage models, such as the MIKE URBAN model, are the most detailed type of models available for urban drainage systems. They are divided into two main components: a surface module and a hydrodynamic model. The surface module converts precipitation data into inflow to the pipe system for each sub-catchment in the system, while the hydrodynamic model calculates the flow in the pipe system using the flow from the surface module as model forcing. Additional model forcing components can furthermore be defined, such as infiltration-inflow or pumped flows. The surface module is lumped-conceptual on the sub-catchment scale, but since the sub-catchments are distributed in space the surface module itself can be regarded as a distributed model. Several types of surface modules are available; the one used here is based on a time-area principle. The hydrodynamic model is physically distributed and, as for the surface module, several versions are available; the one used here is the MOUSE hydrodynamic engine.

There are several examples in the literature of updating the lumped-conceptual part of MOUSE models. In [^{2} suburban catchment at 30 min time-intervals by adjusting an overall system scaling factor, the surface concentration time and the dry weather flow, and in [

The computation technique applied in the MOUSE hydrodynamic (HD) engine for solving the Saint Venant Equations uses a double sweep algorithm, which solves two sets of equations that are set up using a “branch” and a “nodal” matrix [_{in}_{out}_{cr}^{n+1}

Normal (

The idea behind the Update procedure is that the new water level in the node is known from a measurement produced by a level sensor located in the node. This means that the so far unknown variable in Equation (1), H^{n+1}, is now known, and a correction flow, Q_{correction}, can be solved for instead, so that the computed H^{n+1} matches the measured water level.

A similar method has been implemented for the use of measured flow data for updating in computational grid points in the pipes where flow gauges are located. To make the computation match the measured flow, a correction flow is introduced that adds or extracts water from the pipe at the location of the flow gauge. In this case, another set of equations is rearranged to make the update computation, but the basic principle is the same as in the water level updating procedure described above.
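The water level update described above can be sketched with a simple node mass balance. This is an illustration only, assuming an explicit balance of the form A_{cr}(H^{n+1} - H^{n})/Δt = ΣQ_{in} - ΣQ_{out} + Q_{correction}; the MOUSE engine solves the node balance implicitly together with the pipe equations, and the function names and numbers below are hypothetical.

```python
def step_level(h_n, q_in, q_out, a_cr, dt, q_correction=0.0):
    """Advance the node water level one time step (assumed explicit balance)."""
    return h_n + (q_in - q_out + q_correction) * dt / a_cr


def correction_flow(h_n, h_measured, q_in, q_out, a_cr, dt):
    """Flow (m3/s) to add (+) or extract (-) so that the level at step n+1
    equals the measured level."""
    return a_cr * (h_measured - h_n) / dt - (q_in - q_out)


# Hypothetical numbers: 50 m2 node surface area, 60 s time step
h_n, q_in, q_out, a_cr, dt = 1.20, 0.30, 0.25, 50.0, 60.0

h_free = step_level(h_n, q_in, q_out, a_cr, dt)            # without updating
q_corr = correction_flow(h_n, 1.35, q_in, q_out, a_cr, dt)  # measured level 1.35 m
h_upd = step_level(h_n, q_in, q_out, a_cr, dt, q_corr)      # with updating
```

With these numbers the free-running level would be 1.26 m, so the sketch has to add water to reach the measured 1.35 m. A measured level below the free-running level would give a negative correction flow, that is, water extracted from the node.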

When using a model in real time, measured data can only be available until the present time

The flow added or extracted in the computation in order to make the resulting water level match the measured water level is fully reported to the results as a correction flow time series and as the accumulated volume of the correction flow, together with a logging of the periods where the update function has been active. This means that updating does not ruin or violate the water balance of the simulation.
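The bookkeeping described above can be sketched as follows. The function names, the fixed time step and the sample series are assumptions made for illustration and do not reflect the result file format of the software.

```python
def correction_volumes(q_correction, dt):
    """Return (inserted, extracted, net) volumes in m3 from a correction flow
    series in m3/s sampled at a fixed time step dt in seconds."""
    inserted = sum(q * dt for q in q_correction if q > 0)
    extracted = sum(-q * dt for q in q_correction if q < 0)
    return inserted, extracted, inserted - extracted


def active_periods(active_flags):
    """Return (start, end) index pairs, end exclusive, for each run of steps
    where updating was active."""
    periods, start = [], None
    for i, flag in enumerate(active_flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            periods.append((start, i))
            start = None
    if start is not None:
        periods.append((start, len(active_flags)))
    return periods


# Hypothetical correction flow series (m3/s) at 60 s resolution
q_corr = [0.0, 0.5, 0.25, -0.25, 0.0]
inserted, extracted, net = correction_volumes(q_corr, dt=60.0)
periods = active_periods([False, True, True, True, False])
```

Adding the net correction volume to the modelled inflow volume gives the total volume that entered the simulation, which is what keeps the water balance traceable.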

The updating feature can be configured to work only when the applied sensor signal is within a specified range. This is relevant when using sensors with known maximum or minimum limits. Pressure-based water level gauges can, for instance, only measure levels that are above the level of the sensor, while an ultrasonic water level gauge only measures correctly when the water level is below the sensor.

An adjustment factor, by which the calculated correction flow is multiplied before it enters the model, can furthermore be specified for each location where the update feature is applied. This is useful in situations with noisy measurements or when big and sudden corrections are prone to create problems for the model. An adjustment factor less than one will make the corrections smoother and thus limit the risk of introducing large gradients between the state value in the updated node and its neighbors; however, it will also make updating react more slowly. Since the purpose of the presented work has been to study the maximum effect of the update feature, the adjustment factor is set to one in all examples presented in the paper.
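The two configuration options above, the valid sensor range and the adjustment factor, can be illustrated together. This is a minimal sketch assuming the same explicit node balance as a stand-in for the engine's own scheme; `apply_update` and its parameters are hypothetical names.

```python
def apply_update(h_free, h_measured, a_cr, dt, factor=1.0,
                 valid_range=(float("-inf"), float("inf"))):
    """Return (new_level, applied_correction_flow) for one time step.

    h_free is the level the model would compute without updating.
    """
    lower, upper = valid_range
    if not (lower <= h_measured <= upper):
        return h_free, 0.0                     # signal outside range: no update
    q_corr = factor * a_cr * (h_measured - h_free) / dt
    return h_free + q_corr * dt / a_cr, q_corr


# Hypothetical node: 50 m2 surface area, 60 s time step
full = apply_update(1.0, 1.6, a_cr=50.0, dt=60.0)
damped = apply_update(1.0, 1.6, a_cr=50.0, dt=60.0, factor=0.5)
skipped = apply_update(1.0, 1.6, a_cr=50.0, dt=60.0, valid_range=(0.3, 1.5))
```

With a factor of one the level jumps straight to the measurement; with 0.5 it closes only half of the gap in this step, and a measurement outside the valid range leaves the model untouched.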

When a model’s upstream water levels are updated to more true-to-life values, the impact will propagate with the flow down through the system, in which case downstream flow simulations are likely to improve as well, given that the behavior of the physical system is described correctly by the model. In the following examples, water levels are updated in one or several nodes at a time. The effect of each update is examined by comparing measured downstream flows with the corresponding flows simulated with and without updating in upstream nodes. The forecast potential from updating is also examined. This is achieved by comparing the simulation results _{f}

Good rainfall forecasts up to a few hours into the future (nowcasts) are likely to become available within the next few years as a result of utilizing improvements within radar forecasting in combination with numerical weather prediction models [

The _{f}_{f}_{f1} and t_{f2}).

After updating stops, the change to state variables incurred by the updating algorithm will gradually lose its importance and the simulated values will converge towards the simulation run without updating, as illustrated by the dashed lines on _{f}_{i}

Sketch of how the forecasting time series is produced. The black and red solid curves represent simulation results when the model is run without and with updating, respectively. The dashed curves represent simulation results from a model run with updating (red solid curve) until the dot, after which the simulation continues without updating. The time between these dots is t_{i}, and the forecast horizon is t_{f}, shown here for a short (t_{f1}, green squares) and a long (t_{f2}, blue squares) forecast horizon.

The process of generating the forecast time series for the purpose of testing the Update algorithm, as illustrated in
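The construction of a forecast time series can be sketched with a toy model. Everything below is an assumption made for illustration: a linear reservoir stands in for the hydrodynamic model, the inflow known to the model is treated as a perfect forecast over the horizon, and t_i and t_f are expressed in model time steps.

```python
def step(h, q, a=0.9, b=0.1):
    """Toy linear reservoir standing in for the hydrodynamic model."""
    return a * h + b * q


def run(h0, inflow):
    """Free-running simulation; returns the level at every step, h0 included."""
    levels = [h0]
    for q in inflow:
        levels.append(step(levels[-1], q))
    return levels


def forecast_series(observed, inflow, t_i, t_f):
    """Forecast time series as (valid_step, level) pairs.

    Forecast origins are spaced t_i steps apart. At each origin the state is
    the updated one, which with an adjustment factor of one equals the
    observation; the model then runs t_f steps without updating.
    """
    series = []
    for origin in range(0, len(inflow) - t_f + 1, t_i):
        h = observed[origin]
        for m in range(origin, origin + t_f):
            h = step(h, inflow[m])
        series.append((origin + t_f, h))
    return series


known = [0.0] * 40                            # inflow the model knows about
truth = run(0.5, [q + 1.0 for q in known])    # reality has an unknown inflow of 1.0
no_update = run(0.5, known)                   # simulation without updating
fc_short = forecast_series(truth, known, t_i=5, t_f=2)
fc_long = forecast_series(truth, known, t_i=5, t_f=10)
```

In this toy setting both forecast series are closer to the synthetic truth than the run without updating, and the short horizon is closer than the long one, because the effect of the update decays once updating stops.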

A simple hypothetical urban drainage system and a case with a defined “unknown input” are set up to illustrate the impact of updating, how level measurements with a limited range can be utilized, and that the water balance of the model is preserved when updating is active. The system consists of six nodes with a storage pipe section in the middle, as illustrated in

Illustration of the model setup for the hypothetical example along with input and output locations.

To generate artificial observations at S and O a rain event measured by a rain gauge is used along with an inserted flow located in the pipe just before the storage pipe, as shown in

The rainfall-runoff process is modelled using a simple time area method, and the inserted flow represents water unaccounted for in the model due to, for example, infiltration inflow or deviations between actual and measured rainfall. Rain input and additional inflow, which can be seen in

Rain input and the additional inflow inserted upstream from node S in the hypothetical example, also referred to as the “unknown flow input”.

Three graphs are joined into one in this figure. The top panel shows the correction flow (Q_{correction}, solid curve) as well as the accumulated volumes inserted and extracted (dotted curves) in node S by updating. In the middle panel, the water level in node S is shown. Both the observed and simulated water levels with and without updating are shown. Rain input can also be seen in the middle panel. The bottom panel shows the observed and the simulated flow with and without updating in node O. In all three graphs the shaded background indicates when updating is activated.

The accumulated correction flow in the top panel in ^{3} in total), which is expected because the update algorithm has to compensate for the additional inflow representing unaccounted water, as highlighted in

The volumes inserted and extracted by the updating illustrated in ^{3} and 652 m^{3}, respectively, resulting in a net volume of 3419 m^{3} being inserted into the model. The total modelled inflow volume from the rain event without updating is 2430 m^{3}. The “unknown” flow input in this simulation experiment is actually 3450 m^{3} until the last point in time with active updating. This magnitude matches the net volume generated by the Update method.
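The volume comparison can be checked with a few lines of arithmetic. The net, extracted, unknown and modelled volumes are the values quoted above; the gross inserted volume is inferred here as net plus extracted and is therefore an assumption rather than a reported figure.

```python
extracted = 652.0               # m3, quoted above
net_inserted = 3419.0           # m3, quoted above
unknown_input = 3450.0          # m3, quoted above
modelled_rain_inflow = 2430.0   # m3, quoted above

# Inferred, not reported: gross volume inserted by the updating
implied_inserted = net_inserted + extracted

# How closely the net correction volume reproduces the unknown input
mismatch = unknown_input - net_inserted
mismatch_pct = 100.0 * mismatch / unknown_input

# Net correction volume relative to the modelled rain-induced inflow
correction_ratio = net_inserted / modelled_rain_inflow
```

The net correction volume differs from the unknown input by 31 m^{3}, or just under 1%, while being roughly 1.4 times the modelled rain-induced inflow.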

A model of the urban drainage system in the city of Kolding, Denmark, is used to examine the effects of updating water levels in multiple upstream nodes on the simulation result and the forecast quality at the catchment outlet. All data and model details were provided by the consultancy currently investigating the feasibility of using advanced real time control in the catchment (see Acknowledgements). The model consists of 2303 nodes, 76 pumps, 94 weirs and 1223 sub-catchments with a total impermeable area of 544 ha.

Water levels are updated using measured data in eight different locations, namely six basins and two manholes, as illustrated in

The event used in the simulations is from December 2009 (00:00 25 December to 12:00 26 December), and it represents events that occur very commonly, with a return period below 0.3 years for durations between 1 and 360 min (

Overview of the distributed urban drainage model for Kolding. The squares, dots and triangles indicate where the observations from the six basins, two manholes and the outlet are located. The green and red areas indicate whether stormwater and wastewater flow in the same pipe (green, the system is combined) or in different pipes (red, the system is separated). Small villages that contribute to the flow are not included on the map.

The fact that the measured flow continues to be rather high for at least half a day after the rain has stopped indicates that the model error is due to a slow-changing process, such as infiltration or snowmelt, neither of which is included in the model. The graph for the updated simulation is much closer to the measured values for a large part of the event, showing that updating in eight upstream locations (six basins and two manholes, as indicated on

Rainfall (from the top, black curve) and observed flow (grey curve) at the outlet in Kolding on 25 and 26 December 2009, along with the simulated flow with (red curve, Update) and without (black curve, No Update) updating in the upstream nodes.

The period where updating improves the result is between 11:00, 25 December and 03:00, 26 December (_{,} the forecasted flow is very similar to the simulated flow when updating. This means that for an almost 12-h period the 3 h forecast (with updating prior to the forecast period) provides results that are closer to the observations than a forecast made without updating. Some discrepancies between the forecasted and observed flows are, however, visible, especially for peak flows.
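One way to put a number on "closer to the observations" is a root mean square error computed over the same time stamps for both series. The choice of metric and the flow values below are assumptions made for illustration; they are not results from the Kolding case.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between two equally long series."""
    if len(observed) != len(simulated):
        raise ValueError("series must have the same length")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical outlet flows (m3/s) at matching time stamps
observed = [1.0, 1.2, 1.1, 0.9]
forecast_with_update = [0.9, 1.1, 1.0, 0.9]
simulation_no_update = [0.5, 0.6, 0.5, 0.4]

rmse_update = rmse(observed, forecast_with_update)
rmse_no_update = rmse(observed, simulation_no_update)
improved = rmse_update < rmse_no_update
```

Evaluating the metric separately for the peak and the tail of the hydrograph would show where the forecast benefits most from updating.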

Rainfall (from the top, black curve) and observed flows (grey curves) at the outlet on 25 and 26 December 2009 (part of the hydrograph shown in

The results show that updating the water level in eight upstream nodes first improves the simulation results for the tail of the hydrograph (

The simulation of both the peak and tail could most likely be improved further by updating in more nodes, but there is a limitation to how much the forecasting of the peak flow can be improved by employing this method. This is due to a lack of basins near the outlet and because the majority of the impermeable area in Kolding is close to the outlet. However, updating in more basins over the entire model area should improve the tail of the hydrograph for the model simulation and the forecast time series.

The investigated update algorithm is a fairly simple data assimilation tool and has the advantage of being easy to use and being computationally efficient. Since measured water levels are assimilated directly into the model, measurement uncertainties are however transferred to the model. This means that the measurements have to be of good quality, and the use of updating should thus preferably be combined with automated quality control of the measurements, where the raw data from routine monitoring programs are filtered using heuristic or statistical methods e.g., [

The fact that the updating method only updates in the few selected points can give rise to some curious and undesirable effects, due to the possible introduction of large local gradients. If the update scheme is to be used in pipes or storage pipes, care must be taken not to induce oscillations into the system, though this can be handled by setting the adjustment factor to a sufficiently small value.

If the reported correction flows and volumes are large compared to the results from normal simulations, this may be due to a poor calibration of the model or to errors in applied input or measurement data. Updating is thus not a shortcut to bypassing the work on creating a well-calibrated model. If, for instance, the response time of the model upstream of the updating point is far from reality, it will affect the updated model in a way similar to using time-displaced rain data (see e.g., [

Deterministic updating as investigated here is intended to improve the accuracy of model forecasts in online applications. The success of the method depends on the quality of the model, including its calibration, on the quality of data from the applied level and flow sensors and on the characteristics of the system under consideration. The method introduces changes to the calculations compared with a normal simulation, but all modifications are reported to the results as correction flows and accumulated correction volumes, thus ensuring that water balance validation of the model can still be performed. The only difference is that part of the inflow (or outflow) is not accounted for by physical process description in the model. Nonetheless, this is not a major concern when using a model online, in which case the main priority is to make the model fit to reality.

The proposed updating method is not in any way optimal, but it is feasible and easy to use. A state-of-the-art updating procedure would be capable of weighing up the uncertainty of the measurements against the uncertainty of the modelled values and instantaneously make a system-wide correction to the entire model while producing corresponding uncertainty estimates. Ensemble-based data assimilation methods, such as variants of the Ensemble Kalman Filter, are capable of doing this for non-linear models, but these depend on running large ensembles of models that ideally represent all uncertainty in model and input data. It will presumably take many years before ensemble-based data assimilation methods can be applied successfully to large operational distributed urban drainage models, which are inherently slow and furthermore tend to grow in complexity with the increase in computer power. Until then, the deterministic updating method described in this paper seems like a reasonable compromise. Analysis of further case studies with different characteristics (size, flow time, type and location of measurements) as well as rain storms of varying type (intensity, duration, return period) will contribute to further understanding of the method as well as the potential and disadvantages of using data assimilation in connection with distributed hydrodynamic urban drainage models.
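The difference between weighing uncertainties and inserting a measurement directly can be shown for a single state variable. This is a textbook scalar Kalman analysis step, not an ensemble filter and not part of the Update method; the variances and levels are hypothetical.

```python
def kalman_update(x_model, var_model, y_meas, var_meas):
    """Scalar Kalman analysis step: returns (updated state, updated variance)."""
    gain = var_model / (var_model + var_meas)
    return x_model + gain * (y_meas - x_model), (1.0 - gain) * var_model


def direct_insertion(x_model, y_meas):
    """Deterministic updating with an adjustment factor of one."""
    return y_meas


# Hypothetical modelled and measured water levels (m) and variances (m2)
weighted, weighted_var = kalman_update(1.0, 0.04, 1.6, 0.04)
trusted, trusted_var = kalman_update(1.0, 0.04, 1.6, 0.0)
inserted = direct_insertion(1.0, 1.6)
```

With equal variances the weighted update lands halfway between model and measurement. Direct insertion corresponds to the limiting case in which the measurement is treated as error-free, which is why measurement quality matters so much for deterministic updating.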

The deterministic updating of water levels, as implemented in the MOUSE UPDATE tool investigated here, is a simple tool that works by inserting or extracting enough water at the point of update in the model, at every computational time step, to make the modelled value fit the measured value at this specific location. Updating can improve simulation results for the updated node as well as for downstream nodes, as exemplified here by both a hypothetical and a real case study.

The results from a hypothetical test catchment with an unknown infiltration inflow illustrated how the updating works and affects downstream nodes. The results from a real 544 ha catchment showed improved forecasts when updating water levels in six basins and two nodes, even though these represent only flow from 33% of the impermeable catchment area. Our main conclusions are that:

Point-wise deterministic updating of water levels in a distributed hydraulic urban drainage model improves model simulations, even in locations where measurements are unavailable, and can thus be used to give a better evaluation of the state of the system than traditional simulations without updating;

Updating works best in systems with slow flow dynamics and where updating occurs in multiple upstream basins with slow water level variations, which represent a dominant part of the contributing area;

Updating improves forecasts compared to not updating, and there is hence some potential for using updated models in model-based warning and control systems;

An example, based on a real full-scale system, shows that a 3 h forecast with updating provides flow predictions closer to the measured flow than a traditional simulation without updating;

Updating of water levels is a pragmatic tool that can help to compensate for the ever-present deviations between model simulations and measured data.

In the future, when computer power has hopefully increased manifold, a simplified data assimilation tool like the one examined in this paper may be replaced by ensemble-based data assimilation methods that additionally produce uncertainty estimates for the entire system. Until then, the investigated Update method may be sufficient for many online applications.

A_{cr}: Wetted horizontal cross-sectional area of a model node (m^{2});

ΣQ_{in}: Sum of flows into a node (m^{3}/s);

ΣQ_{out}: Sum of flows out of a node (m^{3}/s);

Q_{correction}: Correction flow in or out of a node (m^{3}/s);

H: Water level in a node (m);

n: Time step index of hydrodynamic computations (-);

Δt: Time step size of the hydrodynamic computations (s);

t_{f}: Forecast horizon (min);

t_{i}: Time step size of the forecast time series (min).

This research was funded partly by the Danish Council for Strategic Research, Program Commission on Sustainable Energy and Environment, through the Storm- and Wastewater Informatics (SWI) project [

The authors declare no conflict of interest.