(2018). Impact of the Storm Sewer Network Complexity on Flood Simulations According to the Stroke Scaling Method. Water (5),

Abstract: For urban watersheds, the storm sewer network provides indispensable data for flood modeling but often needs to be simplified to balance the conflict between the large amount of data and current computing power. The sensitivity of a flood simulation to the data precision of a storm sewer network needs to be explored to develop reasonable generalization strategies. In this study, the impact of using the stroke scaling method to generalize a storm sewer network on a flood simulation was analyzed in terms of the total inflow of the outfalls and flood results. The results of the three study basins showed that different complexities of a sewer network did not have a significant effect on the outfall’s total inflow for an area with a single drainage system but did for an area with multiple drainage systems. In addition, serious flooding was mainly distributed at the backbone pipes, which can be identified with the simplified sewer network. Several effective generalization strategies were developed for sewer networks that consider the distribution characteristics of the drainage system and application requirements. This study is theoretically important for better understanding the data sensitivity of flood modeling and simulation and practically important for improving the modeling efficiency and the accuracy of urban flood simulation.


Introduction
Because of global climate change and accelerating urbanization, extreme weather events, especially extreme precipitation and flood hazards, are becoming increasingly frequent and serious. They are a major global problem facing humans [1,2]. There are several engineering and non-engineering measures for dealing with heavy rainfall and flood hazards by predicting, preventing, and alleviating the effects of urban floods on urban systems to some extent [3]. Among them, non-engineering measures against flood hazards frequently use the simulated and predicted results of hydrologic models [4][5][6].
However, flood modeling and simulation are more challenging for urban areas than for rural areas because of the higher requirements for input data, the more complex simulation of the drainage system, and the more sophisticated exchange between surface and groundwater flows [7][8][9]. In particular, the quantity and quality of data are significant factors that limit the accuracy of urban flood simulations [10][11][12][13]. Because of the high imperviousness of urban surfaces, storm runoff that cannot infiltrate underground is mainly discharged into rivers through the storm sewer network [14][15][16]. Therefore, the quantity and quality of data on the storm sewer system are indispensable for flood simulation.
Given the limits of data availability and current computing power, simplified or low-precision data are usually used for urban hydrologic modeling. However, low-precision data may reduce the accuracy of the flood simulation. Many studies in the literature have used multiscale data to examine the impact of the data precision or resolution on the accuracy or sensitivity of flood modeling and simulation [17][18][19]. Some researchers have investigated the influence of different resolutions of topographic data on the simulation results of hydrologic models [19][20][21][22][23]. For example, Chaubey et al. [21] evaluated the effect of the digital elevation model (DEM) resolution on the output uncertainty of the Soil and Water Assessment Tool (SWAT) model in seven scenarios and found that the DEM resolution affects the model in terms of watershed delineation, the stream network, and sub-basin classification. Several studies have also explored the effect of the degree of aggregation on hydrologic simulations by varying the number of watershed subdivisions [17,18,24]. For example, Cleveland et al. [17] simulated runoff with the Hydrologic Modeling System (HEC-HMS) and discovered that the size or number of sub-watersheds has little influence on the computed runoff hydrographs. Carpenter et al. [25] studied the effects of model parameters and precipitation uncertainty on streamflow simulations of a distributed hydrologic model, and Yu et al. [9] explored the effect of hydrologic parameters for a catchment on urban surface water flooding based on a hydro-inundation model.
For urban watersheds, it is also important to explore the effect of the degree of generalization of a storm sewer network on the accuracy or sensitivity of flood modeling and simulation [26][27][28]. Park et al. [28] investigated the effect of the spatial resolution of the sewer network on Storm Water Management Model (SWMM) simulations and found that it did not have a significant effect on the simulated runoff volume. Krebs et al. [27] demonstrated that, while the runoff volume is almost unaffected by the spatial resolution of a sewer network, lower resolutions lead to the overestimation of peak flow because of the excessively rapid catchment response to storm events. Ghosh et al. [26] examined actual and artificial drainage networks and found that peak flows show a dual-scale effect, where aggregation reduces peak flows for larger storms and increases them for smaller storms.
However, most related studies have not explored the mechanism by which the complexity of urban storm sewers affects the results of hydrologic models. In addition, there has been little research on developing efficient or automatic generalization methods for storm sewer networks, even though this process is necessary to improve the simulation efficiency. The appropriate degree or approach of generalization can differ depending on the precision requirements, topographic features, and so on. Thus, different generalization strategies or methods need to be developed for different application requirements, regional features, or structural characteristics of storm sewers while finding a balance between the simulation accuracy and efficiency.
The objectives of this study were as follows: (1) design an automatic and high-efficiency method for storm sewer network generalization; (2) explore the impact of urban drainage system complexity on flooding simulation by using multiple basins with different drainage system structures and summarize the laws and driving mechanisms for the flood simulation sensitivity induced by the spatial resolution of the sewer network; and (3) propose different generalization methods of the storm sewer network for different application purposes, geographic features, and drainage structures.

Storm Water Management Model
The open-source hydrologic model SWMM, which was developed by the US Environmental Protection Agency [29], was selected as the modeling platform for flood simulation in this study. The SWMM is a 1D dynamic rainfall-runoff simulation model for urban watersheds and combined sewer overflow phenomena [30]. Its main implementation principles and ideas fully consider the features of the urban watershed environment [29]. The SWMM includes four calculation modules: flow generation, transport, extended delivery, and storage/processing. These are used to simulate the surface runoff generation and concentration of urban watersheds, storm sewer network convergence, and water storage and sewage treatment processes [29]. The rainfall runoff produced in subcatchments flows into the storm sewer network through manholes and combines with dry-weather flow and groundwater infiltration [30]. Flow routing between the upstream and downstream boundaries of the storm sewer system is governed by the conservation of mass and momentum equations for gradually varied, turbulent, and unsteady flow (the Saint-Venant equations) [30,31]. For more details, see [30]. For overland flooding simulations with SWMM, the excess volume at overflowing junctions spreads evenly over a ponded area, which can be set to the part of the subcatchment that can be flooded (i.e., its area minus the building footprint) [29].
In this study, flow routing was computed according to dynamic wave theory, and infiltration was computed with the Green-Ampt method [27]. For the main estimation procedure, the measured parameters [32,33] (e.g., area, width, and slope) were obtained by measurement and spatial analysis with a geographic information system (GIS) [34,35], and the inferred parameters [32,33] (e.g., roughness coefficients) were based on empirical estimation [35]. Parameters were calibrated and verified by trial and error with an automatic optimization procedure, where the objective was to maximize the Nash-Sutcliffe (NS) coefficient of efficiency. Details are presented in [35].
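To make the calibration objective concrete, the following is a minimal sketch of the standard Nash-Sutcliffe efficiency, NS = 1 - Σ(Q_obs - Q_sim)² / Σ(Q_obs - mean(Q_obs))². The function name and the sample series are illustrative assumptions; the calibration procedure actually used is the one described in [35].

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency; 1.0 is a perfect fit, 0.0 matches the observed mean."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared model errors
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Variance of the observations around their mean (not normalized)
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed series is constant; NS is undefined")
    return 1.0 - residual / spread


# Hypothetical flow series (arbitrary units), for illustration only
perfect = nash_sutcliffe([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = nash_sutcliffe([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A simulation that merely reproduces the observed mean scores 0, so values close to 1 indicate that the calibrated parameters explain most of the observed variability.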
The catchment discretization procedure [36] is mainly based on [34] and [35] and uses GIS technology to fully consider the various factors that affect the urban flow process, including rivers, roads, storm sewer networks, and buildings. As shown in Figure 1, DEM data are first used to analyze the flow paths of urban surface water and obtain the natural basins with GIS hydrologic tools [34,37]. Second, given that rivers and main roads prevent surface runoff from crossing them, the centerlines of rivers and main roads are used to split the natural basins into catchments with the GIS clip tools. Finally, the catchments are divided into subcatchments based on the Thiessen polygon method to help assign rainfall data from each rain gage to the nearest subcatchment [38,39]. The shapes of the subcatchments are then revised by considering the effect of buildings on the direction of storm runoff.
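The Thiessen polygon step is equivalent to a nearest-neighbor rule: every location belongs to the polygon of its closest generator point. The sketch below illustrates only that rule for subcatchment centroids and rain gages; the gage names and coordinates are hypothetical, and the study itself performs this step with GIS tools.

```python
import math


def nearest_gage(point, gages):
    """Return the name of the rain gage closest to `point` (Thiessen rule).

    `gages` maps a gage name to its (x, y) coordinates in a projected system.
    """
    return min(gages, key=lambda name: math.dist(point, gages[name]))


# Hypothetical gages and subcatchment centroids (metres)
gages = {"G1": (0.0, 0.0), "G2": (10.0, 0.0)}
assignment = {
    "sub_a": nearest_gage((2.0, 1.0), gages),
    "sub_b": nearest_gage((8.0, 3.0), gages),
}
```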

The Stroke Scaling Method
The main approach to expressing a storm sewer network is based on the node-arc structure. This structure separates conduits and has poor visual coherence, which makes it difficult to match a human's overall perception. Moreover, the large number of junctions and conduits in a storm sewer network makes generalization or classification methods inefficient or inaccurate, even within a small area. Therefore, we used the stroke technique [40][41][42] to design a stroke scaling method for simplifying and classifying a storm sewer network in a highly efficient and accurate manner.
The stroke technique concatenates separate line segments (e.g., conduits) into longer lines to detect and resolve spatial inconsistencies [40][41][42]; this provides a more integrated structure that improves the efficiency of subsequent processing. During the generalization procedure for a storm sewer network, we usually keep or remove conduits and connected junctions with similar shapes or attributes. For example, we may remove conduits with a diameter of less than 400 cm or that are not laid on the main roads. Therefore, we can consolidate connected conduits with similar shapes or attributes into one pipe stroke to simplify the structure of the storm sewer network and speed up later processing.
The main idea of pipe stroke construction is to convert L conduits into S pipe strokes (S ≤ L) based on certain constraints. This can be expressed as follows:

$$L = \{l_1, l_2, \ldots, l_i\}, \qquad S = \{s_1, s_2, \ldots, s_j\}, \qquad s_x \subseteq L, \qquad j \le i$$

where L is all conduits of the storm sewer network, i is the number of conduits, S represents all pipe strokes of the storm sewer network, j is the number of all pipe strokes, and $s_x$ is one of S. We select the angle (a) between a conduit and the horizontal direction, the road level (r) that the conduit is laid on, and the diameter (d) of the conduit as constraints for combining two conduits ($l_1$, $l_2$) that are directly connected to each other into one pipe stroke. The following conditions need to be met:

$$|a_{l_1} - a_{l_2}| \le T_a \tag{5}$$

$$r_{l_1} = r_{l_2} \tag{6}$$

$$|d_{l_1} - d_{l_2}| \le T_d \tag{7}$$

where $T_a$ is the difference threshold between the a values of $l_1$ and $l_2$, and $T_d$ is the difference threshold between the d values of $l_1$ and $l_2$. The values of $T_a$ and $T_d$ depend on the shape and attribute characteristics of the sewer network in the study area.

For convenient generalization or classification of the sewer network, in addition to the conduits, we also store the junction information, the start search junction, and the level of the pipe stroke. Thus, the pipe stroke comprises all of the information of the original components but with more integrity. The data structure of the pipe stroke can be described as follows:

$$PS = \{L, P, StartP, Le\} \tag{8}$$

where PS is the pipe stroke, L is the set of conduits of the pipe stroke, P is the set of junctions of the pipe stroke, StartP is the start search junction of the pipe stroke, and Le is the level of the pipe stroke. The level is assigned according to the distance from the pipe stroke to the outfall. The level of the pipe stroke ($l_1$) that is directly connected to the outfall is set to 1, the level of a pipe stroke that is directly connected to $l_1$ is set to 2, and so on. In addition, a pipe stroke that lies at the end of the sewer network and whose total length is less than a given threshold (which depends on the actual situation) has its Le set to the last level. This ensures that the unimportant segments are deleted during the generalization procedure.

The complete construction algorithm of pipe strokes is displayed in Figures 2 and 3. PSs is the set of all pipe strokes of the storm sewer network. AL and AP represent the shape and attribute information of all conduits and junctions of the original storm sewer network. Lset_1 represents the conduits in the set of all conduits (AL) that connect directly to StartP. Lset_2 represents the conduits in AL that connect directly to $l_1$. Pset is P of the PS for which construction was just finished. FindSS() is the function for searching the set of pipe strokes of the original sewer network. FindNextL() is the function for finding the next conduit and junction of a pipe stroke.

The main idea of this algorithm is to proceed from the outfall to the upstream pipes to find each pipe stroke and assign Le to the strokes based on the upstream and downstream relationships of the storm sewer network (i.e., function FindSS()), as shown in Figure 2. The main idea of the construction algorithm for each pipe stroke is to start from the given junction, which is set as StartP, and then select one conduit that connects directly with StartP as the seed conduit ($l_1$). Then, a traversal search is performed for all conduits (L) and other junctions (P) that belong to this pipe stroke based on the connectivity between conduits and the constraints (Equations (5)-(7)) to determine whether two conduits belong to one pipe stroke (i.e., function FindNextL()), as shown in Figure 3.

As an example, the left side of Figure 4 shows a storm sewer network and the pipe diameter of each pipeline. We can get seven pipe strokes (PS_1-PS_7) based on the algorithm of the pipe stroke construction (Figure 2). The right side of Figure 4 displays the P, L, StartP, and Le of each stroke. As shown in Figure 5, we can divide the structure of the pipe system into three levels from simple to complex. Level 1 contains PS_1; Level 2 contains PS_1, PS_2, and PS_3; Level 3 contains PS_1, PS_2, PS_3, PS_4, PS_5, PS_6, and PS_7.
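The construction described above can be sketched in a few lines of Python. This is a simplified illustration rather than the algorithm of Figures 2 and 3: it assumes a tree-like network draining to a single outfall, treats the thresholds as inclusive, compares angles without wrap-around, and omits the rule that short end strokes are moved to the last level. The class names, function names, and sample network are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Conduit:
    name: str
    up: str            # upstream junction
    down: str          # downstream junction
    angle: float       # a: angle to the horizontal, degrees
    road_level: int    # r: level of the road the conduit is laid on
    diameter: float    # d: same unit as the threshold t_d


@dataclass
class PipeStroke:
    conduits: list     # L
    junctions: list    # P
    start: str         # StartP
    level: int         # Le


def can_merge(l1, l2, t_a, t_d):
    """Angle, road-level, and diameter constraints for joining two conduits."""
    return (abs(l1.angle - l2.angle) <= t_a
            and l1.road_level == l2.road_level
            and abs(l1.diameter - l2.diameter) <= t_d)


def build_strokes(conduits, outfall, t_a, t_d):
    """Walk upstream from the outfall, growing one stroke per seed conduit."""
    by_down = {}
    for c in conduits:
        by_down.setdefault(c.down, []).append(c)
    used, strokes = set(), []
    queue = deque([(outfall, 1)])          # (StartP, Le)
    while queue:
        start, level = queue.popleft()
        for seed in by_down.get(start, []):
            if seed.name in used:
                continue
            stroke = PipeStroke([seed], [start, seed.up], start, level)
            used.add(seed.name)
            current = seed
            while True:                    # extend the stroke upstream
                nxt = next((c for c in by_down.get(current.up, [])
                            if c.name not in used
                            and can_merge(current, c, t_a, t_d)), None)
                if nxt is None:
                    break
                stroke.conduits.append(nxt)
                stroke.junctions.append(nxt.up)
                used.add(nxt.name)
                current = nxt
            strokes.append(stroke)
            # Branches leaving this stroke belong to the next level
            queue.extend((j, level + 1) for j in stroke.junctions[1:])
    return strokes


def network_at_level(strokes, k):
    """Conduits of the generalized network that keeps strokes with Le <= k."""
    return [c for s in strokes if s.level <= k for c in s.conduits]


# Hypothetical network: a main line (c1, c2), a branch (c3, c4), a twig (c5)
sample = [
    Conduit("c1", "J1", "OUT", 0.0, 1, 800.0),
    Conduit("c2", "J2", "J1", 5.0, 1, 800.0),
    Conduit("c3", "J3", "J1", 90.0, 2, 400.0),
    Conduit("c4", "J4", "J3", 90.0, 2, 400.0),
    Conduit("c5", "J5", "J3", 0.0, 3, 300.0),
]
strokes = build_strokes(sample, "OUT", t_a=15.0, t_d=100.0)
```

In this sample, the main line forms one Level 1 stroke, the branch one Level 2 stroke, and the twig one Level 3 stroke, so `network_at_level(strokes, 2)` drops only the twig.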

Study Area
Nanjing is one of the biggest cities in China, and its terrain is dominated by low hills. Its main water system comes from the Yangtze River and three urban rivers: the Moat, the Qinhuai River, and the Jinchuan River. In the rainy season, the city is frequently hit by extreme precipitation; this, combined with the increasing imperviousness and low capacity of the drainage system, results in waterlogging or flood hazards. Heavy rainfall-driven hazards cause shutdowns of the traffic system, inconvenient travel, building damage, and so on; thus, they have become a significant factor affecting the socioeconomic development of Nanjing.
Nanjing Normal University is located in the northeast of the city, in an area where mountains meet the plains. The terrain of the campus is undulating and complex. We selected the campus of Nanjing Normal University as a study area to identify the impact of urban drainage system complexity on flood simulation. We divided the study area into three sub-basins: the northern district, the convention district, and the main district. These sub-basins have different terrain characteristics and distributions of the construction area, as shown in Figure 6. The drainage systems of these sub-basins are isolated because of walls, roads, and cutoff ditches at their boundaries. The walls and roads obstruct the exchange of water between the sub-basins and the outside, and the cutoff ditches prevent storm runoff from the hillsides from flowing into the sub-basins. In this study, the main datasets included DEM data with a resolution of 1 m as well as data on rivers, buildings, green land, roads, and the storm sewer network. As listed in Table 1, the area, impermeability, total conduit length, and numbers of junctions and outfalls were 15.83 ha, 70.84%, 9.15 km, 1057, and 1, respectively, for the northern district; 3.47 ha, 28.56%, 0.60 km, 97, and 2, respectively, for the convention district; and 60.00 ha, 64.62%, 26.22 km, 2701, and 11, respectively, for the main district. Figure 7 shows the distribution of the storm sewer networks for the three sub-basins. In the northern district, one drainage system discharges rainwater into the drainage system outside the campus through outfall 1Y108; in the convention district, two drainage systems discharge into the lake through outfalls 4Y438 and 4Y455; and in the main district, 11 drainage systems discharge into the river.

Scenario Design
In order to consider the structure and complexity of the storm sewer system in the study area, we classified the networks in the three sub-basins (northern district, convention district, and main district) into four levels (1-4) based on the stroke scaling method presented in Section 2.2. The complexity of the sewer network increased from Level 1 to Level 4. The four levels of the sewer network were combined with the DEM, rivers, buildings, green land, and roads dataset for four subdivisions (1-4) based on the catchment discretization techniques described in Section 2.1.
We constructed Chicago design storms [43,44] with four different return periods (2, 10, 50, and 100 years) and a rainfall duration of 2 h based on the following Nanjing storm intensity equation, which was derived from more than 60 years of local rainfall records:

$$i = \frac{64.300 + 53.800 \lg P}{(t + 32.900)^{1.011}} \tag{9}$$

where i is the rainfall intensity (mm/min), t is the rainfall duration (min), and P is the return period (year) of the rainfall event. We chose a rainfall duration of 2 h for the modeling and simulation because the Chicago hyetograph with a rainfall duration of 2 h is considered an appropriate way to simulate extreme rainfall in China [45,46]. The error would increase if a storm longer than 2 h was simulated [45,46]. Therefore, we designed 16 scenarios for the three sub-basins, as listed in Table 2. S4, S8, S12, and S16 were the control groups, in which the sewer network was not simplified, and the rest of the scenarios were experimental groups.
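As a quick numerical check, the storm intensity equation and the scenario grid can be written out as follows. The function and variable names are illustrative, and the mapping of scenario numbers to (return period, level) pairs is inferred from the text (S9-S12 use the 50-year storm; S4, S8, S12, and S16 use the unsimplified Level 4 network) rather than copied from Table 2. The sketch evaluates only the intensity for a given duration; it does not build the Chicago hyetograph, which additionally needs a peak-position ratio that is not given here.

```python
import math
from itertools import product


def nanjing_intensity(t_min: float, p_years: float) -> float:
    """Rainfall intensity in mm/min for duration t (min) and return period P (years)."""
    return (64.300 + 53.800 * math.log10(p_years)) / (t_min + 32.900) ** 1.011


RETURN_PERIODS = (2, 10, 50, 100)   # years
LEVELS = (1, 2, 3, 4)               # sewer network complexity levels

# S1..S16: return period varies slowest, complexity level fastest (inferred)
SCENARIOS = {
    f"S{n}": combo
    for n, combo in enumerate(product(RETURN_PERIODS, LEVELS), start=1)
}

# Rainfall depth (mm) of the 2 h design storm for each return period
DEPTH_2H = {p: nanjing_intensity(120, p) * 120 for p in RETURN_PERIODS}
```

For the 2-year storm, the equation gives an intensity of roughly 0.5 mm/min over 2 h, that is, a depth of about 60 mm.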

Classification and Discretization Results
Figures 8-10 display the classification results for the four complexity levels of the storm sewer networks and the corresponding discretization results for the three sub-basins. For the northern district, the total conduit lengths at Levels 1-4 were 0.08, 1.54, 6.58, and 9.15 km, respectively; this corresponded to 3, 79, 531, and 1055 subcatchments, respectively. For the convention district, the total conduit lengths were 0.07, 0.23, 0.43, and 0.60 km, which corresponded to 5, 22, 47, and 96 subcatchments, respectively. For the main district, the total conduit lengths were 2.90, 9.80, 18.70, and 27.00 km, which corresponded to 114, 519, 1336, and 2717 subcatchments, respectively. The details are presented in Table 3.

Table 3. Conduit length and subcatchment counts for different drainage system complexities of the three sub-basins.

The Level 1 sewer network only contained the pipe stroke with StartP as the outfall; it covered one part of the main roads of the study area. The Level 2 sewer network covered almost all of the main roads of the study area. The complexity of the Level 3 network was between those of Level 2 and the original sewer network (Level 4).

Results of Outfalls' Total Inflow
We built SWMM models for the 16 scenarios (S1-S16) based on the classification and discretization results for the three sub-basins. We then ran the models to obtain 16 flood simulation results for each sub-basin. Table 4 displays the time cost of simulating the four storm sewer networks with different levels of complexity for the storm with a 50-year return period. The simulation times for the other return periods were similar. Unsurprisingly, the simulation time increased significantly with the complexity of the pipe network, which means that simplifying the sewer network can clearly improve the simulation speed. Figures 11-13 show the time-total inflow hydrographs of S1-S16 for the 1Y108 outfall in the northern district, the 4Y455 outfall in the convention district, and the 3Y80 outfall in the main district, respectively.

For the time-total inflow hydrographs of outfall 1Y108 in the northern district (Figure 11), the hydrographs of Levels 1-3 were very close to those of Level 4 and to each other for the 2-year storm. Similarly, for the 10-, 50-, and 100-year storms, the hydrographs of the experimental groups were very close to those of the control groups, although the Level 1 hydrographs tended to deviate from the Level 4 hydrographs (i.e., control groups) as the rainfall intensity increased. However, the differences between Levels 1 and 4 were not great.
For the time-total inflow hydrographs of outfall 4Y455 in the convention district (Figure 12), the differences between the hydrographs for the same storm were clearer than those in the northern district. In addition, the Level 1 and Level 2 hydrographs were very close to each other for each storm, while the Level 3 and Level 4 hydrographs were very close to each other. The difference between Levels 3 and 4 increased with rainfall but only marginally. The differences between Levels 1 or 2 and Level 4 showed a subtle increasing trend.
Among all outfalls, the time-total inflow hydrographs of outfall 3Y80 in the main district differed the most across the four levels (Figure 13). The differences between the hydrographs for the same storm were significant compared with those in the northern and convention districts. Overall, the Level 3 hydrographs were very close to the Level 4 hydrographs, but the Level 1 and Level 2 hydrographs were far from the Level 4 hydrographs. This means that the simulation results will be unreliable if a Level 1 or 2 generalization is adopted but will be acceptable with a Level 3 generalization.
Notably, the differences among the time-total inflow hydrographs of the four levels varied clearly among the three districts, which have different numbers of outfalls. This is because, if a certain area has more than one drainage system, simplification of the drainage system may lead to erroneous division of the subcatchments. This can result in storm runoff that should originally flow into one drainage system being assigned to another drainage system. Therefore, when the study area had only one drainage system, the time-total inflow hydrograph was almost unaffected by the generalization degree of the sewer network. However, when the study area had more than one drainage system, the time-total inflow hydrograph could deviate from the actual results, depending on how far the generalization of the sewer network led to the erroneous division of subcatchments. A higher degree of simplification of the sewer network would clearly increase the probability of erroneous division of subcatchments for an area with multiple drainage systems. Thus, the solution is to extend the range of the generalized sewer network to be as similar as possible to that of the original network to minimize the error (e.g., Level 3 in the main district).

Flood Results
We analyzed the distribution of overflow points, total overflow, flood spreading area, and accumulated flood depth to further explore the impact of the generalization degree of the sewer network on the flood simulation results. The accumulated flood depth was calculated by using the total overflow of the overflow points divided by the area of the corresponding subcatchments.
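The accumulated flood depth defined above is a single division once the units are aligned. The sketch below assumes that overflow volumes are expressed in ML (1 ML = 1000 m³) and subcatchment areas in ha (1 ha = 10,000 m²); the function name and the sample values are illustrative.

```python
def accumulated_depth_m(overflow_ml: float, area_ha: float) -> float:
    """Accumulated flood depth (m) = total overflow volume / subcatchment area."""
    if area_ha <= 0:
        raise ValueError("subcatchment area must be positive")
    volume_m3 = overflow_ml * 1000.0     # ML -> m^3
    area_m2 = area_ha * 10_000.0         # ha -> m^2
    return volume_m3 / area_m2


# Hypothetical overflow point: 0.5 ML spread over a 0.25 ha subcatchment
example_depth = accumulated_depth_m(0.5, 0.25)
```

Because the same overflow volume is spread over a larger subcatchment in a coarser discretization, the computed depth depends on the generalization level as well as on the simulated overflow.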
We only analyzed the flood simulation results for the storm with a 50-year return period (Scenarios S9-S12) because the simulation results were similar for the other return periods. Figures 14-16 show the distribution of overflow points and the accumulated flood depth of Scenarios S9-S12 for the three sub-basins. The statistical results are presented in Table 5 for the northern district, Table 6 for the convention district, and Table 7 for the main district. N is the number of overflow points, TN is the total number of overflow points, SLM is the number of overflow points at the same locations as in S12 with a Level 4 network complexity (i.e., control group), A is the inundated area, and TA is the total inundated area.

Table 5. The total number of overflow points (TN), the number of overflow points (N), the number of overflow points at the same locations as in the control group (SLM), and the total inundated area (TA) of Scenarios S9-S12 for the northern district.

Table 6. TN, N, SLM, and TA of Scenarios S9-S12 for the convention district.

The simulation results showed that the number of overflow points increased with the complexity of the sewer network in the northern district (see Figure 14 and Table 5). The other districts showed a similar trend, as presented in Figures 15 and 16 and Tables 6 and 7. This means that increasing the degree of generalization of the sewer network reduces the accuracy regarding the location of overflow points.

However, Figures 14-16 show that overflow points with serious flooding (i.e., a flood volume greater than 0.3 ML) were mainly distributed on the backbone conduits. When the flood volume of overflow points was 0.31-0.70 ML, N and SLM of Level 1 and Level 2 were quite different from the values of Level 4, but the values of Level 3 were very close to those of Level 4. When the flood volume of the overflow points was 0.71-1.50 ML or ≥1.51 ML, the differences in values between the experimental and control groups were reduced. The results showed that a simplified sewer network that keeps the backbone conduits of the drainage system can identify serious overflow points to some extent. In other words, the simplified sewer network can catch similar serious overflow points provided that it extends over a similar region as the original sewer network.
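The comparison by flood volume class can be sketched as a simple tally. This is one possible reading of N and SLM, assuming that an overflow point counts toward SLM when the same junction also overflows in the control scenario, regardless of its volume class there; the class labels, function names, and junction identifiers are hypothetical.

```python
def volume_class(volume_ml: float) -> str:
    """Flood volume classes used in the discussion (volumes in ML)."""
    if volume_ml >= 1.51:
        return ">=1.51"
    if volume_ml >= 0.71:
        return "0.71-1.50"
    if volume_ml >= 0.31:
        return "0.31-0.70"
    return "<=0.30"


def count_n_and_slm(experiment: dict, control: dict) -> dict:
    """Per volume class, return (N, SLM) for an experimental scenario.

    Both arguments map a junction id to its overflow volume in ML.
    """
    result = {}
    for junction, volume in experiment.items():
        cls = volume_class(volume)
        n, slm = result.get(cls, (0, 0))
        same_location = junction in control   # assumption: location match only
        result[cls] = (n + 1, slm + (1 if same_location else 0))
    return result


# Hypothetical overflow results for a simplified and the original network
simplified = {"J1": 0.5, "J2": 0.9, "J3": 2.0}
original = {"J1": 0.4, "J3": 1.8, "J4": 0.35}
tally = count_n_and_slm(simplified, original)
```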
The total inundated area (TA) in the northern district decreased with increasing complexity for S9-S12, as presented in Table 5. The same situation was observed in the main district, as presented in Table 7. This indicates that increasing the degree of generalization of the sewer network makes it more difficult to obtain specific locations of the inundated area, and only a rough range can be obtained.
In general, the total overflow, flood spreading area, and accumulated flood depth of Level 3 were quite close to those of Level 4. Thus, we can use a Level 3 generalization to simulate floods in the study area instead of Level 4. However, if we do not require high precision in the simulation and only need to identify areas with serious overflow, Level 2 can be considered to replace Level 4, but Level 1 is inappropriate owing to critical error.

Conclusions
In this study, we designed 16 scenarios for three sub-basins based on four different complexities for sewer networks and four storms with different return periods to explore the impact of the urban drainage system complexity on the flood simulation accuracy in terms of the total inflow of outfall and flood results. We also proposed the stroke scaling method for automatic generalization to overcome the low efficiency and accuracy of manual generalization.
The results of the three specific case studies showed that the complexity of the urban storm sewer network does not have a significant impact on the total inflow of outfalls for a single drainage system but does have a clear impact for multiple drainage systems. The driving factor is that simplifying a region with multiple drainage systems may induce the storm runoff to flow into the wrong drainage systems. The results also showed that serious flooding is mainly distributed on backbone pipes. This means that, if the simplified sewer network extends over a range similar to that of the original network, the flood simulation results for the two networks will be similar. The results of the research basins can be used to formulate the following generalization strategies for a storm sewer network.
(1) Extreme rainfall intensity does not need to be considered for storm sewer network generalization because the differences in the flood simulation results for different complexities of the sewer network with different storm intensities are not quite clear.
(2) If the whole area for flood simulation is large and there is no strict requirement for the accuracy of inundated locations in the small sub-basins, sub-basins with a single drainage system can be considerably simplified to only keep conduits and junctions that are directly connected with the outfall (i.e., keep pipe strokes with Le = 1). Sub-basins with multiple drainage systems should also keep the conduits and junctions that make up the backbone of the original sewer network. However, if there are strict requirements regarding the quality of the flood simulation results, the conduits and junctions making up the backbone of the original sewer network should also be kept for sub-basins with a single drainage system.
(3) If the flood simulation area is small and locations with serious inundation need to be located, the simplified sewer network should be extended to have a range similar to that of the original sewer network. Therefore, the generalization strategy should be removing the branch sections of the sewer network first rather than deleting upstream segments without considering the range of the drainage system.
In summary, the storm sewer network should be simplified depending on the distribution characteristics of the drainage system and the required precision of the flood simulation results. We believe that this study is significant for understanding the sensitivity of flood simulation to data precision. In addition, the proposed generalization strategies and automatic generalization method for storm sewer networks are highly applicable to improving the quality and efficiency of flood modeling and simulation.
However, this study focused only on the effect of the precision of the storm sewer network on flood simulations in an urban area. Future studies will explore the precision of other datasets and their effects on the sensitivity of hydrologic modeling or simulation in different scenarios. Different data precisions of different datasets can be compared to discover laws or patterns for the model sensitivity. Then, a series of simplification strategies can be proposed for different data types or combinations of different datasets to improve modeling and simulation efficiency.