An Efficient and Structured Procedure to Develop Conceptual Catchment and Sewer Models from Their Detailed Counterparts

Abstract: Modelling flow rates in catchments and sewers with a conceptual, also known as hydrological, approach is widely applied if fast simulations are important. In cases where a detailed hydrodynamic model exists, it is common to start conceptualizing from this detailed counterpart. Unfortunately, no generalized procedure exists, which is surprising as this can be a complex and time-consuming task. This research work proposes a procedure that is validated with two independent combined sewer case studies. The conceptual models provide the targeted results with respect to representation of the flow rates and reduction in the computational time. As the desired performance could be reached for different levels of model aggregation, it is concluded that the conceptual model can be tailored to the points where accurate flow rates need to be predicted. Furthermore, the comparison of the conceptual model results with flow measurements highlights the importance of analyzing and eventually compensating for the limitations of the detailed model.
Investigation, J.M.L., P.A.V.; administration, A.C., Supervision, P.A.V.; Validation, J.M.L. L.P.; J.M.L.; L.P., D.M. and P.A.V.


Introduction
The use of lumped conceptual models is widespread in urban drainage modelling where fast calculations are necessary for multiple model evaluations, such as sensitivity or uncertainty analysis and optimization questions [1], or for simulations of long time series with complex models, such as integrated models where multiple sub-system models are evaluated at the same time [2,3]. The potential beneficial use of lumped conceptual models was proven with several successful case studies over the last decades. Some examples of their application include sensitivity analysis [4,5], uncertainty analysis [6], real-time control (RTC) and model predictive control [7,8] or optimization and integration [9-11].
Hydrodynamic routing, also known as distributed flow routing, calculates the flow based on a time and space component using the de Saint-Venant equations [12]. The evaluation of these equations is, however, computationally demanding, and different approaches exist for simplification or [...]
In the studied area, rain data from seven rain gauges are available [26]. A map of the considered sewer system is shown in Figure 1.

Case Study 2: Bordeaux, France
The second case study is located in the southern part of Bordeaux Métropole, France, and covers the catchment of the Clos de Hilde (CdH) wastewater resource recovery facility (WRRF, see Figure 2). The WRRF has four different sewer tributaries, which together cover an area of approximately 8000 ha. The catchment contains both combined and separate sewer systems. Rain data are available for four rain gauges and, in contrast to the first case study, flow rate measurements are also available and provided by the local utility. The relevant flow rate measurements are located at the four different tributaries of the WRRF and at the pumping stations Jourde, Carle Vernet and Noutary (see Figure 2 for the locations of the WRRF and pumping stations).

Modelling Approach and Software
The catchment model is based on the KOSIM catchment model implemented in the software WEST (DHI, Horsholm, Denmark) [27]. The model couples a module for wet weather flow (WWF) and a module for dry weather flow (DWF), as indicated in Figure 3. In comparison to the original KOSIM-WEST model, the WWF can be split into fast and slow flow concentration and local routing through a series of reservoirs to represent the fast and slow response characteristics of inflow and infiltration, respectively [26]. As conceptual catchments are generally aggregated over several detailed catchments, this process represents both flow concentration and routing through the local sewer network that is no longer explicitly modelled. Inputs to the DWF module are the number of people equivalents and their average wastewater generation rates, as well as the average wastewater production by local industry. In comparison to the original KOSIM-WEST model, the DWF is also routed with a series of linear reservoirs representing the local sewer network [26].
The conceptual model of the sewer is based on the reservoirs-in-series approach, also known as a cascade of reservoirs. Equation (1) represents the principle of mass conservation, which requires the difference between the inflow Q_in and the outflow Q_out to be equal to the change of the storage volume V(t) [12]. Equation (2) relates the outflow Q_out to the storage volume V(t) and the storage constant of the reservoir k, also known as residence time [12].
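Written out under the definitions above, the two relations take the following form. The placement of the exponent p is an assumption made here for illustration, since the literature uses more than one convention (some texts write the storage as a power function of the outflow instead):

```latex
\begin{align}
\frac{dV(t)}{dt} &= Q_{in}(t) - Q_{out}(t) \tag{1} \\
Q_{out}(t) &= \frac{1}{k}\, V(t)^{p} \tag{2}
\end{align}
```

In this form and with p = 1, k has the unit of time, which is consistent with its interpretation as a residence time.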
If the value of the constant p equals one, Equation (2) corresponds to a linear reservoir, otherwise it is a non-linear reservoir [12]. For both approaches, methods exist to determine the reservoir parameters from the detailed model. Euler [28] adapted the Kalinin-Miljukov method to define the linear reservoir parameters from the pipe characteristics. An alternative method to determine the parameters is the Muskingum method [12]. In the non-linear case the parameters can also be defined using the pipe characteristics from the detailed model, maximum flows and volume-outflow gradients [29].
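The reservoir cascade can be illustrated with a minimal simulation sketch. It assumes the storage-outflow relation Q_out = V^p / k (the placement of p is an assumption), identical reservoirs, and an explicit Euler time step. The function name and arguments are hypothetical and are not part of WEST or KOSIM.

```python
def simulate_reservoir_cascade(inflow, k, n_reservoirs=1, p=1.0, dt=1.0):
    """Route an inflow series through identical reservoirs in series.

    Assumed relations (illustrative only):
        dV/dt = Q_in - Q_out      (mass conservation)
        Q_out = V**p / k          (p = 1 gives a linear reservoir)
    Explicit Euler integration, so dt should be small compared with k.
    Returns the outflow series of the last reservoir and the final volumes.
    """
    volumes = [0.0] * n_reservoirs
    outflow = []
    for q_in in inflow:
        q = q_in  # inflow to the first reservoir
        for i in range(n_reservoirs):
            q_out = volumes[i] ** p / k
            # Never release more than what is stored plus what arrives in this step.
            q_out = min(q_out, volumes[i] / dt + q)
            # Guard against tiny negative volumes caused by rounding.
            volumes[i] = max(0.0, volumes[i] + (q - q_out) * dt)
            q = q_out  # outflow becomes inflow of the next reservoir
        outflow.append(q)
    return outflow, volumes
```

Calling `simulate_reservoir_cascade([1.0] * 5 + [0.0] * 50, k=5.0, n_reservoirs=2)` returns an attenuated and delayed hydrograph; the inflow volume equals the outflow volume plus the water still stored in the reservoirs.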
To approximate backwater phenomena in conceptual modelling, an approach using a sequence of splitters and combiners has been developed and tested for the linear reservoir model [15].
All modelling and simulation work was performed utilizing the software WEST (DHI, Horsholm, Denmark), which is a general modelling and simulation environment [30].

Model Performance Criteria
The model performance criteria chosen for this study are the percent volume error (PVE) in Equation (3), the percent error in peak (PEP) in Equation (4) and the Nash-Sutcliffe efficiency (NSE) in Equation (5). The PVE, also known as the percent bias, measures the overall agreement between predicted (P_i) and observed (O_i) data. The PEP characterizes the difference between the observed peak (max({O_i})) and the modelled peak (max({P_i})) for a single event but does not evaluate the timing of the peak. The NSE compares the squared residuals of the model with the squared residuals that a model always predicting the mean of the observations (Ō) would create. The optimal value equals one, zero means that the model is as good as a mean value model, and a negative value means that the model performs worse than the mean value of the observations. Because it is based on squared residuals, the NSE is comparable to the well-known root mean square error (RMSE) model performance criterion used in other disciplines. The NSE is sensitive to extreme values [31].
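The three criteria can be sketched in a few lines of Python. The formulas follow the verbal definitions above; the sign convention (predicted minus observed, relative to observed) is an assumption, and the function names are hypothetical.

```python
def pve(predicted, observed):
    """Percent volume error: 100 * (sum(P) - sum(O)) / sum(O) (sign convention assumed)."""
    return 100.0 * (sum(predicted) - sum(observed)) / sum(observed)


def pep(predicted, observed):
    """Percent error in peak: 100 * (max(P) - max(O)) / max(O); ignores peak timing."""
    return 100.0 * (max(predicted) - max(observed)) / max(observed)


def nse(predicted, observed):
    """Nash-Sutcliffe efficiency: 1 - sum((O - P)^2) / sum((O - mean(O))^2)."""
    mean_obs = sum(observed) / len(observed)
    ss_model = sum((o - p) ** 2 for p, o in zip(predicted, observed))
    ss_mean = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_model / ss_mean
```

All three functions divide by a quantity derived from the observations, so they raise a ZeroDivisionError when the observed series is zero throughout, for example at an overflow that never spilled during the evaluated period.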

Proposed Methodology
The proposed procedure to develop a conceptual model from its detailed counterpart consists of four main stages (Figure 4): project definition, model development, calibration, and validation. Each of the stages will be explained in more detail in a dedicated sub-section.

Project Definition
In the project definition stage, the first step is to determine the conceptual model's objectives. These objectives usually combine a required level of model performance with the need for fast calculations, for example for sensitivity or uncertainty analysis or model predictive control. A measure for calculation time is the speed-up factor that needs to be attained for a case study, which is calculated by dividing the simulation time of the detailed model by the simulation time of the conceptual model. The objectives determine whether the development of the conceptual model is an appropriate solution.
The second step is the review of the available data and the detailed hydrodynamic model from which the conceptual model can be developed. The quality of those data and the detailed model have to be assessed. Special attention should be given to the purpose for which the detailed model was built. This influences the limitations and assumptions of the detailed model and therefore also of the conceptual model. Depending on the objectives of the conceptual model, simplifications of the detailed model might be considered to facilitate conceptualization. Possible simplifications affect hydraulic structures where complexity can be reduced, for instance by replacing complex hydraulic relationships and/or RTC rules by simplified overflow structures.

Model Development
When developing a conceptual model, it is crucial to identify the comparison points, i.e. the points where the conceptual model should predict accurate flow rates and is therefore compared to the detailed model. To do so, it is important that the locations of rain gauges, overflows and key hydraulic structures are known. Because conceptual models only predict flow rates at the outlet of a catchment or sewer conduit but not within, no aggregation of catchments and sewers should take place across the comparison points. The selected comparison points therefore have to be calibrated and validated with a corresponding point in the detailed model.
The next step is the delineation and aggregation of catchments and sewers in accordance with the previously identified comparison points. The delineation of catchments and sewers has to be carried out simultaneously as they are directly linked. Figure 5 illustrates a simple sewer system and its conceptualization. In the example sewer system, two points are identified as comparison points where flow rates have to be predicted. The illustration shows that the local sewers (dotted lines) are represented as sewer conduits in the detailed model. In the aggregated conceptual model, however, they are no longer represented as a sewer conduit model but are incorporated in the catchment model. Only the main sewer trunk between comparison points 1 and 2 is represented in a specific sewer conduit model. Special attention must be paid to catchments through which a conceptual sewer flows, as the parameters of the catchment and the sewer model cannot be calibrated independently at the downstream comparison point. In Figure 5, this situation corresponds to comparison point 2, where the flow rate represents both the flow from the sewer conduit and the flow from catchment 2. The parameters of the sewer model can be identified by using the methods described in Section 2.2. Therefore, the flow rate at point 2 can be used to calibrate and validate the catchment parameters of catchment 2 after having calibrated and validated catchment 1. It might be that structural properties of the detailed catchment models are too different and do not allow for aggregation. However, if possible, it is suggested that only one conceptual catchment model is calibrated per comparison point to avoid overparameterization. The catchments therefore have to be delineated accordingly.
Figure 5. Conceptualization schema. Schema illustrating a detailed model and its conceptual counterpart with two comparison points resulting in a conceptual model of two catchments and one sewer (inspired by [18]). The labelling of the comparison points indicates the calibration order.
The conceptual catchment and sewer models have to be parameterized in the next step. The parameters that can be directly parameterized depend on the model structure of both the detailed and the conceptual model. A comparison of the modelled processes will reveal which parameters can be directly aggregated or translated from the detailed to the conceptual model and which parameters need calibration and validation. The illustration of the catchment model in Figure 3 shows typical model assumptions for input and generation of both WWF and DWF. The parameters related to model input can usually be aggregated directly from the detailed model, e.g. if a conceptualized catchment contains several detailed catchments, its total person equivalents or effective area corresponds to the sum of the person equivalents or effective areas of the detailed catchments. The parameters regarding flow concentration and routing, however, will usually need calibration and validation as no direct translation is possible. Figure 3 shows that the concentration and routing module of the conceptual model differs from the detailed model as the concentration and routing processes are lumped together in the conceptual model (see also Figure 5). The parameters of the conceptual sewer model, representing the routing in the main sewer trunks, can be estimated using the established methods mentioned in Section 2.2 such as the Kalinin-Miljukov method [28], as all the necessary pipe characteristics are available from the detailed model.
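The direct aggregation of input parameters described above amounts to summing the detailed values per conceptual catchment. The following sketch illustrates this; the record layout, field names and comparison-point labels are hypothetical and do not correspond to any particular detailed-model export format.

```python
from dataclasses import dataclass


@dataclass
class DetailedCatchment:
    name: str
    person_equivalents: float
    effective_area_ha: float
    comparison_point: str  # hypothetical label of the next downstream comparison point


def aggregate_inputs(catchments):
    """Sum person equivalents and effective area per comparison point."""
    totals = {}
    for c in catchments:
        agg = totals.setdefault(
            c.comparison_point,
            {"person_equivalents": 0.0, "effective_area_ha": 0.0},
        )
        agg["person_equivalents"] += c.person_equivalents
        agg["effective_area_ha"] += c.effective_area_ha
    return totals
```

Only the input and generation parameters are treated this way; the concentration and routing parameters of the aggregated catchment still have to be calibrated.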

Calibration
For the calibration of a conceptual model from a detailed model, two approaches are possible. The first is the parallel calibration of all sub-models [21]. For each of the conceptual sub-models, the detailed model results serve as input at the upstream comparison point. The parameters of the conceptual model are then calibrated at the downstream comparison point by fitting to the detailed model output. This allows independent calibration of all sub-models, which thus permits parallelisation of the calibration task. The second approach is the sequential calibration of the conceptual model, where the output of the previously calibrated upstream conceptual model serves as input to the conceptual model to be calibrated, and not the simulation results of the detailed model. For the proposed procedure, the second approach is adopted, as this approach allows for correction of inevitable model structure errors that occur during conceptualization. Even though the performance of each sub-model might be lower due to the substitution of the detailed model's input with the upstream conceptual model, it is assumed that the overall performance at the downstream emission point is better, since upstream errors can be compensated for.
The order in which the parameters are calibrated is important and should be established prior to performing a calibration, as this will ensure that upstream model parameters are calibrated before the downstream parameters. In Figure 5, comparison point 1 is a first order comparison point, as it is further upstream, whereas comparison point 2 is a second order comparison point. Assigning a calibration order to each comparison point also allows for parallel calibration of comparison points with the same order of calibration and therefore speeds up the calibration process.
Once the calibration order is identified, the actual calibration is performed. If the catchment model is not known to the modeller, it is suggested to carry out a sensitivity analysis of the catchment model prior to calibration to determine the impact of the available model parameters. The calibration is first carried out for DWF and then for WWF, respecting the calibration order in both cases. For both DWF and WWF, the flow volume is calibrated before the flow dynamics. The previous step identified the parameters that can directly be translated from the detailed to the conceptual model. If the input and generation parameters can be translated directly, volume calibration is not necessary, but validation is recommended. The concentration and routing parameters representing the dynamics of the conceptual catchment model, however, are to be calibrated. Depending on the objective of the conceptual model, different performance criteria can be selected to assess the goodness of fit between the detailed and the conceptual model, see also Section 2.3. If the targeted calibration performance cannot be reached, it is suggested to go back to the previous stage of model development and refine the structure of the model.
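One way to derive such a calibration order from the network layout is sketched below. It assumes that the order of a comparison point is one more than the highest order among the comparison points directly upstream of it, which is an interpretation of the description above rather than a rule stated in the procedure. The function names and the input format are hypothetical.

```python
def calibration_order(upstream):
    """Assign a calibration order to each comparison point.

    upstream: dict mapping a comparison point to the list of comparison
    points directly upstream of it (empty list for headwater points).
    Assumed rule: order = 1 + highest order among the upstream points.
    """
    order = {}

    def visit(point, path=()):
        if point in order:
            return order[point]
        if point in path:
            raise ValueError(f"cycle detected at {point!r}")
        ups = upstream.get(point, [])
        order[point] = 1 + max((visit(u, path + (point,)) for u in ups), default=0)
        return order[point]

    for point in upstream:
        visit(point)
    return order


def parallel_batches(order):
    """Group comparison points of equal order; each group can be calibrated in parallel."""
    batches = {}
    for point, rank in order.items():
        batches.setdefault(rank, []).append(point)
    return [sorted(batches[rank]) for rank in sorted(batches)]
```

For the two-point layout of Figure 5, `calibration_order({"CP1": [], "CP2": ["CP1"]})` gives order 1 for the upstream point and order 2 for the downstream point.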

Validation
In the last stage, the conceptual model is validated using a different rain time series. The rain data are used as an input to both the detailed model and the conceptual model. Comparing the flow rates at the identified comparison points with the chosen performance criteria will either validate the model or reveal that a recalibration of the conceptual model is necessary. If flow rate measurements at some points are available, it is strongly suggested to also validate the conceptual model with actual flow rate measurements. If the model validation is not successful it is suggested to go back to the stage of model development and refine the model structure.

Developed Conceptual Models
The first stage, the project definition, is summarized for both case studies in Table 1.
Table 1. Project definition of case studies. Objectives of conceptual models, detailed models and available data. Columns: Step, Ottawa, Bordeaux.
To identify the comparison points, the locations of the rain gauges, overflows, and key hydraulic structures are indicated in Figure 1 for Ottawa and Figure 2 for Bordeaux. The chosen comparison points are shown in the same figures. The input and generation sub-models of the catchment were parameterized by aggregation and translation of the detailed model information. The sewer routing parameters were calculated for both cases by using the Kalinin-Miljukov method [28] mentioned in Section 2.2. The flow concentration and routing parameters in the catchment could not be derived from the detailed model and are therefore calibrated and validated in the next stage.
As a first step in the calibration stage, the calibration order was determined for both case studies. This process is illustrated in Figure 6 for one of the tributaries of the Bordeaux case study. Catchments of the same order of calibration were calibrated in parallel. Following the procedure, the models were first calibrated for DWF and then for WWF. The calibration procedure applied was a grid search, where the best performing set of parameters was chosen if the performance objectives were met. Otherwise, the grid was refined. If this did not lead to the desired results, the model structure had to be adapted. The summary of the results given in Table 2 shows that the calibration objective is met for all comparison points. The full results are provided in Table A1 for Ottawa and Table A2 for Bordeaux.
For the fourth and last stage (model validation), a summary of the attainment of the objectives is given in Table 2. It shows that the objective for the NSE is met in both case studies. The full validation results can likewise be found in Tables A1 and A2. The results of the additional criteria for the simulated overall flow volume and peak flow values show that the conceptual model is not performing as well as during the calibration phase. Nevertheless, the values are still considered acceptable for the current case studies.
A summary of the developed models can be found in Table 3, which indicates the number of catchments and sewer conduits for both the detailed and the conceptual models, as well as the calculation time needed for the same flow rate simulations. The speed-up factor was calculated by dividing the simulation time of the detailed model by that of the conceptual model. It is to be noted that the conceptual model for both case studies already includes advective transport for water quality components, in contrast to the detailed model, where this feature was deactivated as these models were never meant to be used for water quality. Nevertheless, a speed-up factor of over 10 could be reached for all studied flow conditions. The objective of simulating a WWF event within one minute (Table 1) is met for both case studies.
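The grid search with refinement used in the calibration stage can be sketched as follows. The refinement rule (zooming in by one grid spacing around the best point), the parameter names and the objective function are assumptions for illustration; the study does not specify how the grid was refined.

```python
import itertools


def grid_search(objective, bounds, points_per_axis=5, target=0.9, max_refinements=3):
    """Maximize objective(params) on a regular grid, refining around the best point.

    bounds: dict mapping parameter name -> (low, high); points_per_axis >= 2.
    Returns (best_params, best_score, target_met). If target_met is False after
    all refinements, the model structure should be reconsidered.
    """
    best_params, best_score = None, float("-inf")
    for _ in range(max_refinements + 1):
        axes = {
            name: [lo + i * (hi - lo) / (points_per_axis - 1) for i in range(points_per_axis)]
            for name, (lo, hi) in bounds.items()
        }
        for combo in itertools.product(*axes.values()):
            params = dict(zip(axes.keys(), combo))
            score = objective(params)
            if score > best_score:
                best_params, best_score = params, score
        if best_score >= target:
            return best_params, best_score, True
        # Assumed refinement rule: shrink each interval to one grid spacing
        # on either side of the best value found so far.
        refined = {}
        for name, (lo, hi) in bounds.items():
            step = (hi - lo) / (points_per_axis - 1)
            centre = best_params[name]
            refined[name] = (max(lo, centre - step), min(hi, centre + step))
        bounds = refined
    return best_params, best_score, False
```

In a calibration setting, `objective` would run the conceptual sub-model with the candidate parameters and return, for example, the NSE against the detailed model output at the comparison point.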

Level of Aggregation
For the Ottawa case study, the influence of the level of aggregation on model performance was evaluated. For the previously developed model (V1), it was ensured that catchments and sewer conduits were of similar size and that the catchments were not too elongated. To do so, large catchments and sewers were further divided to avoid a large variation in size and shape. For the further aggregated model V2, this was no longer considered. This means that, with model V2, the maximum level of aggregation for the chosen comparison points was attained. The resulting characteristics of the catchment and sewer sub-models of the two different aggregation levels are indicated in Table 4. The more aggregated model V2 has approximately half the number of sub-catchments and sewer conduits of model V1 and thus shows an increased range of size and length parameters.
Figure 7.
A summary of both calibration and validation results is given in Table 4, while the full calibration and validation results are provided in Table A1. The validation results indicate that the performance of the model V2 is generally lower, but the validation objective for the NSE (NSE > 0.65) is met at all comparison points. The observation that the dynamics of the flow are generally a little less well represented in the model V2 makes sense, as the further aggregation results in a loss of resolution. The results of the comparison of the simulation times between both levels of aggregation are also summarized in Table 4. As expected, the further aggregated model V2 is faster than V1, using approximately 2/3 of the simulation time of model V1.

Comparison of Conceptual Model with Actual Flow Rate Data
As mentioned in Section 4.1, flow rate measurements are available for the Bordeaux case study. The model can thus be compared to actual measurement data and not only to simulation results of the detailed model. This was first done without any further model parameter adjustments after validation with the detailed model and is thus a true validation with respect to the model's capability to represent reality. Figure 8 shows the total influent rate at the WRRF CdH inlet (a) and one of the four tributary branches (b) for 9-13 May 2017. From the left-hand side illustration, it can be concluded that the overall average DWF (DWF volume) is approximately correct, but that the dynamics are not well represented (dry weather day 8). The WWF, as such, seems underestimated (wet weather days 9-10) but this, as will be demonstrated later, is mainly due to the errors in the DWF. Furthermore, observations at one particular tributary to the WRRF CdH (right-hand side) show that not only do the dynamics not match, but the average DWF for this tributary is also clearly underestimated.
Table 5 summarizes the comparison of the conceptual model results (developed solely based on the detailed model) with the available flow rate measurements, the location of which is indicated in Figure 2. While the overall percentage volume error (CdH total) lies almost within an acceptable 10% error, the errors for each of the individual tributary branches at the inlet of the CdH WRRF are mostly higher. The NSE values demonstrate that the dynamics are poorly represented. Visual analysis of the results indicates that this is mainly caused by the poorly calibrated DWF volume. Good performance under DWF conditions was, however, never the intention of the detailed model.
As the results indicate shortcomings under DWF conditions, the DWF flow generation in the catchment was recalibrated based on the available flow rate measurements. The parameters changed were the number of people equivalents per catchment and the hourly representation of the daily DWF profile. In addition, it was recognized that some WWF pumping capacities in the system were increased in the time period between the development of the hydrodynamic model (2012) and the collection of more recent flow measurements (2017). These modelled capacities were revised to reflect current maximum pumping capacities.
The results of the recalibrated DWF model are shown in Figure 9 for the same validation period as in Figure 8. The total inflow to the WRRF CdH is shown in (a), while the flows within one of the four tributary branches are depicted in (b). The example shows that both the average flow rate and the dynamics of the hydrograph match much better, even though shifts in time can be observed. This is due to the fact that the DWF profile in the conceptual model is now calibrated based on representative data, but the reality is that the system does not have such a consistent DWF pattern at all locations where it is applied in the model. With respect to the WWF response, one can observe that the measurements and the conceptual model simulation results match much better, even though no WWF parameters were changed.
Table 5. Comparison of conceptual models with all available actual flow rate measurements. The location of the flow rate measurements is indicated in Figure 2. First the conceptual model built using the detailed model only is compared to the measurements and then the conceptual model with measurement-based recalibrated DWF is compared to the measurements.
Figure 9. Comparison of conceptual model with recalibrated DWF contributions with actual flow rate measurements. Actual flow rate measurements were used to recalibrate the conceptual model for comparison with measured influent flows for the total influent flow rate at the WRRF CdH (a) and one of the four tributary branches (b).
The performance of the conceptual model with the recalibrated DWF contributions in comparison to the available flow measurements is also summarized in Table 5. It can be noted that the recalibration of the DWF greatly improved the performance. However, the conceptual model indicates a small overflow at Jourde, whereas the flow measurements show no such overflow. For this comparison point, the performance criteria could not be calculated (division by zero). However, the actual volume of the overflow reported by the conceptual model is comparatively small.

Development of Conceptual Models
The proposed procedure has been successfully tested with two independent case studies. The results demonstrated that the conceptual models represent the detailed model with the desired level of accuracy and result in considerably shorter simulation times compared to the detailed models.
The question may arise why a conceptual model is better developed from a detailed model and not directly from information about the sewer system and flow rate measurements. The sewer system can be conceptualized with information about its physical properties only (see methods in Section 2.2), but the concentration and routing parameters in the catchment models need to be calibrated and validated on the basis of dynamic flow rate data (see the explanation in Section 3.2). Even though, in general, flow rate measurements can be available at several measurement points throughout the system, they are rarely available at every identified comparison point of the system. A detailed hydraulic model provides the best estimate for this non-existing data. In addition, the detailed model already and inherently contains a significant amount of characteristic data related to the catchment and the sewer system that are needed for the conceptualization, such as the people equivalents per catchment and the physical properties of the sewer pipes. Conceptualization is thus made more efficient by the fact that this data does not need to be collected from other sources.

Level of Aggregation
Comparing a less aggregated conceptual model (V1) to a maximally aggregated model (V2) for the Ottawa case study showed that model V2 was still able to represent the flow dynamics of the detailed model at the comparison points (Table 4) even though only about half the number of sub-catchments and conduits were used. This means that catchments and sewers can be aggregated to their maximum regarding the comparison points that are to be represented, as long as the special structures' locations are taken into account. While the number of sub-models was halved, the calculation time dropped only by about one third. This is due to the fixed overhead calculations (e.g. reading input files or plotting), which are independent of the number of sub-models used.
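A rough worked estimate shows how large the fixed overhead would have to be to explain these figures. It assumes that the simulation time T grows linearly with the number of sub-models n on top of a fixed overhead a; this linear cost model and the symbols a and b are simplifications introduced here, not results of the study.

```latex
\begin{align*}
T(n) &= a + b\,n, \qquad
T_{V2} = T\!\left(\tfrac{n}{2}\right) \approx \tfrac{2}{3}\,T_{V1} = \tfrac{2}{3}\,T(n) \\
a + \tfrac{1}{2}\,b\,n &= \tfrac{2}{3}\,a + \tfrac{2}{3}\,b\,n
\;\Longrightarrow\; \tfrac{1}{3}\,a = \tfrac{1}{6}\,b\,n
\;\Longrightarrow\; a = \tfrac{1}{2}\,b\,n \\
\frac{a}{T_{V1}} &= \frac{\tfrac{1}{2}\,b\,n}{\tfrac{1}{2}\,b\,n + b\,n} = \frac{1}{3}, \qquad
\frac{a}{T_{V2}} = \frac{\tfrac{1}{2}\,b\,n}{\tfrac{1}{2}\,b\,n + \tfrac{1}{2}\,b\,n} = \frac{1}{2}
\end{align*}
```

Under this assumption, the overhead would account for about one third of the simulation time of V1 and about half of that of V2, which illustrates why further aggregation yields diminishing returns in speed.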
Further model aggregation results in faster simulations and less work to be spent on calibration and validation of the model. However, it also comes at the cost of lost information at the intermediate points that are no longer simulated, and with a potential loss of accuracy if the model structure is oversimplified and rain gauge influence zones are no longer respected.

Comparison to Flow Rate Measurement Data
Comparing the performance of a conceptual model purely built from a detailed model with real flow rate measurements highlighted two important points. First, the performance of the conceptual model with respect to replicating flow measurements is limited by the performance that the detailed model has with respect to the same flow measurements. This highlights the importance of the project definition stage, where the limitations and assumptions of the detailed model are analysed (see Section 3.1). For the Bordeaux case study, the detailed model was built to evaluate the sewer system under WWF conditions for current and future scenarios (see Table 1). Therefore, average flow rate approximation under DWF conditions was deemed sufficient. The poor performance of the conceptual model with respect to flow rate measurements was therefore caused by the purposeful omission of a DWF calibration and validation of the detailed model. These findings highlight the different purpose for which the detailed model was developed. In addition, it should be noted that the detailed model was developed in 2012, whereas the conceptual model was validated with 2017 flow rate measurements. A part of the discrepancy under DWF conditions might therefore also be caused by additional housing and industrial developments in specific sub-catchments over these 5 years. It is important to note, however, that the sewer network itself was not substantially changed or upgraded during this period.
Second, if the limitations of the detailed model are accounted for and/or rectified (in this case, the recalibration of the DWF model), the conceptual model can perform well in comparison to flow rate measurements without any further adjustments (see Table 5). It can therefore more generally be concluded that if the purpose of the detailed and the conceptual model are not identical, one has to carefully identify the assumptions underlying the detailed model and compensate for them when developing the conceptual model. Nevertheless, the advantages of developing the conceptual model by leveraging the modelling efforts already invested in the development of the detailed model remain very strong.

Conclusions
A four-stage modelling procedure was established to develop conceptual catchment and sewer models by maximizing the reuse of information and efforts invested in the development of a detailed hydraulic model. It was applied by different modelers on independent case studies. The procedure resulted in the successful validation of conceptual models for both cases, providing a speed-up factor of 10 to 80 for all comparison conditions. Thus, the conceptual models provide similar results to the detailed models at the selected comparison points but at a simulation rate that is at least ten times faster. It can therefore be concluded that, by applying the procedure, a faster conceptual model can be developed in a structured way. At the same time, it can be concluded that the procedure is sufficiently generic and transportable for application to different case studies. The developed procedure follows similar stages as the Good Modelling Practice protocols reviewed for other disciplines, but is tailored to conceptualization, focusing on aggregation of catchments and sewers.
The study of additional aggregation showed the advantages and disadvantages of further aggregated models. A significant decrease in simulation time (33%) was obtained for an increased level of aggregation, but it was not found to be directly proportional to the level of aggregation. This can be expected for other case studies as well, but the reduction in simulation time is likely to depend on the modelling approach of the conceptual models and certainly also depends on the computational efficiency of the software chosen for both the detailed and the conceptual model.
From the validation of the conceptual model with actual flow rate measurements it could be concluded that the detailed and the conceptual model's objectives, and with this the modelling assumptions, need to be aligned. If they differ, the differences reveal where recalibration on actual measurements may be useful. Challenging the Bordeaux model with actual flow rate data, however, also demonstrated that, if the assumptions of the detailed model are corrected, the conceptual model performs very well without further adjustments.
It is overall concluded that the proposed procedure provides a structured way to use the detailed model to develop the conceptual model. The procedure helps modelers to systematize the modelling process. The suggested procedure therefore improves the current situation in conceptual modelling, for which such a generally applicable procedure was missing.