Methodology of Sub-Catchment Division Considering Land Uses and Flow Directions

Catchment division constitutes the foundation for urban water flood forecasting but represents a technically challenging task. The accurate division of catchments is significant for precisely forecasting urban waterlogging. However, existing catchment division methods usually lead to produce results that do not accurately reflect the actual land-use distributions. In recent years, most research has been performed in smaller study areas (less than 10 km2), in residential areas, parks and campuses, and usually focused on a single landscape type. However, for large highly urbanized areas with complex land uses, due to the spatial heterogeneity and complexity of such areas in terms of building, traffic network and hydrology, etc., there is few studies on sub-catchment division. Moreover, the division results by using existing method usually have deviate with the actual land-type distributions. To address the above-mentioned issues, a sub-catchment division method was here proposed that accounts for land-use types and flow directions, and it is suitable for large urban areas by introducing an auto-adaptive threshold adjustment in a novel algorithm. First, the study area is divided into first- and second-level (FL and SL, respectively) catchments according to the macroscale features such as natural landforms, canals, and pipe network. Second, an amended DEM (Digital Elevation Model) and flow direction data are used to divide the SL catchments into third-level direction-based (D-B) catchments. Finally, a novel land use-based algorithm is proposed to divide the D-B catchments into the “smallest” catchments (S-catchments). A large-scale area (44 km2) in Dongying City of China was employed to validate the proposed method. The experiment showed that the proposed method is suitable for subcatchment divisions in large regions and can ensure that the subcatchments are consistent with the actual distribution of land uses and runoff directions.


Introduction
The intensification of global climate change and the rapidity of urbanization in recent years has increased the frequency and intensity of urban waterlogging, which seriously threatens the safety of urban public infrastructure and urban residents' lives and property. Therefore, it has become increasingly important to strengthen urban rainstorm warning, management, and forecast systems [1][2][3]. In order to simulate and evaluate the urban waterlogging caused by insufficient drainage capacity under rainstorm conditions, it is necessary to know where the water that can't be discharged comes from, which can be defined the 'catchment'. So for urban waterlogging prediction, catchment division is an important process in the forecasting of urban floods and has a significant impact on the forecast accuracy [4,5], it is also an important input index for hydrological models, such as Storm Water Management Model (SWMM) [6]. For natural basin waterlogging prediction, due to that it mainly considers the impact of rivers, which may not account for the urban underlying surfaces [20]. Therefore, to address these problems, a method capable of finely partitioning catchment areas over large regions while accounting for land types and flow directions is proposed in this study.
The paper is organized as follows. Section 2 reviews the existing catchment division methods and their limitations. Section 3 provides a detailed description of the proposed catchment division method, which accounts for land types and flow directions. Section 4 presents an analysis and discussion of the experimental results. Finally, the conclusions are provided in Section 5.

Conventional Catchment Division Methods
A catchment division method can be categorized as either a DEM-based hydrological method or a node-based geometric method. The DEM-based hydrological method uses mainly DEM data and hydrological algorithms to partition catchment areas; examples include the D8 algorithm proposed by Callaghan and Mark [12], the Digital Elevation Model Networks (DEMON) algorithm developed by Costa-Cabral and Burges [21], and the RIDEM model [15]. The most recent hydrological method utilizes ancillary data such as road, building and drainage data to amend DEMs in dividing catchment areas. The basic D8 algorithm is probably the most popular method for automated drainage recognition, but also has a number of deficiencies, and there are several approaches have been proposed to improve the D8 algorithm, such as a unified algorithm for the determination of flow directions [22], a two-parameter method for drainage network extraction [23] and GeoNet method which do not need a strict flow accumulation threshold [24]. Noted that for the division method in this paper, the basic D8 algorithm is used due to two reasons. One is the simplicity and reliability of this algorithm, the other is that the aim of this paper is to determine the primary catchment based on flow direction by considering buildings, the basic D8 algorithm can meet this need to some extent. Next, the node-based geometric method uses Voronoi algorithms to extract catchment areas based on the spatial distribution of catch basins in drainage networks. With the Voronoi method, each Voronoi diagram consists of a set of nodes, and each node represents a catchment area [17].

Hybrid Catchment Division Method
Urban catchment areas constitute amalgams comprising both natural terrain and artificial drainage systems; unfortunately, in the abovementioned methods, catchment areas are divided based on only one of these factors. To address this issue, Huang et al. [18] used geographic information system (GIS) technology to create a high-precision method that integrates these two methods. The procedures is as follows: Step 1: A model of the urban drainage network is constructed based on the actual layout of the drainage network (green lines in Figure 1) by processing its conduits and nodes by importance. This produces the pipe network and its important nodes (Nodes 1). Step 3: Nodes 2 are compared against the actual drainage pipe network. The nodes that overlap with drainage pipes are removed, whereas the nodes that supplement those of the drainage pipe network are retained, and by using the spatial overlay function of GIS to generate the final nodes ( Figure 2).
Step 4: All of the nodes remaining after Step 3 are used to create a Voronoi diagram (composed of Thiessen polygons, where each polygon represents a catchment area).

Limitations of Current Catchment Division Methods
Huang et al. [18] proposed a highly precise catchment division method that fully accounts for the effects of both natural terrain and drainage infrastructure on urban catchment areas. However, this approach simply uses the Voronoi algorithm to geometrically partition a catchment area based on the nodes formed by natural catchment points and catch basin points, which have different catchment capacities. Hence, this technique overlooks the distribution of land types throughout the catchment, resulting in deviations between the real overland flows and the modelling results. This problem can be described: (1) The distribution of urban catch basins is usually heterogeneous (e.g., the red and yellow rectangles in Figure 3). In areas with a very low density of catch basin nodes, the method developed by Huang et al. [18] will directly superimpose natural catchment nodes with catch basin nodes, but these nodes have different water catchment capacities when drawing the Thiessen polygons; this discrepancy often results in inconsistence of flow accumulation, thus lead to inconsistence of overland flows. Furthermore, when this method is applied to a large area, areas with a high catch basin density (such as urban centers) will have an excessive number of Thiessen polygons, which will directly affect the application of subsequent hydrological models. In addition, manual selection of the nodes is necessary to reduce the number of nodes, but this usually introduces uncertainty into the results. Step 2: Due to that land type affects the flow routing, for instance, the building and road are usually higher than the land surface, which will obstruct the overland flows routing, and change their flow direction. In addition, due to that different land uses have different water absorption capacity, such as impervious surface and forest land, so the land types can also affect surface flow velocity (the flow velocity on impervious surface is usually faster than that on natural land surface). On this basis, the DEM is amended by incorporating land types, and the D8 drainage algorithm is then used to extract the drainage network. Next, the drainage network is processed to obtain the overland flows and their important nodes (Nodes 2).
Step 3: Nodes 2 are compared against the actual drainage pipe network. The nodes that overlap with drainage pipes are removed, whereas the nodes that supplement those of the drainage pipe network are retained, and by using the spatial overlay function of GIS to generate the final nodes ( Figure 2). Step 3: Nodes 2 are compared against the actual drainage pipe network. The nodes that overlap with drainage pipes are removed, whereas the nodes that supplement those of the drainage pipe network are retained, and by using the spatial overlay function of GIS to generate the final nodes ( Figure 2).
Step 4: All of the nodes remaining after Step 3 are used to create a Voronoi diagram (composed of Thiessen polygons, where each polygon represents a catchment area).

Limitations of Current Catchment Division Methods
Huang et al. [18] proposed a highly precise catchment division method that fully accounts for the effects of both natural terrain and drainage infrastructure on urban catchment areas. However, this approach simply uses the Voronoi algorithm to geometrically partition a catchment area based on the nodes formed by natural catchment points and catch basin points, which have different catchment capacities. Hence, this technique overlooks the distribution of land types throughout the catchment, resulting in deviations between the real overland flows and the modelling results. This problem can be described: (1) The distribution of urban catch basins is usually heterogeneous (e.g., the red and yellow rectangles in Figure 3). In areas with a very low density of catch basin nodes, the method developed by Huang et al. [18] will directly superimpose natural catchment nodes with catch basin nodes, but these nodes have different water catchment capacities when drawing the Thiessen polygons; this discrepancy often results in inconsistence of flow accumulation, thus lead to inconsistence of overland flows. Furthermore, when this method is applied to a large area, areas with a high catch basin density (such as urban centers) will have an excessive number of Thiessen polygons, which will directly affect the application of subsequent hydrological models. In addition, manual selection of the nodes is necessary to reduce the number of nodes, but this usually introduces uncertainty into Step 4: All of the nodes remaining after Step 3 are used to create a Voronoi diagram (composed of Thiessen polygons, where each polygon represents a catchment area).

Limitations of Current Catchment Division Methods
Huang et al. [18] proposed a highly precise catchment division method that fully accounts for the effects of both natural terrain and drainage infrastructure on urban catchment areas. However, this approach simply uses the Voronoi algorithm to geometrically partition a catchment area based on the nodes formed by natural catchment points and catch basin points, which have different catchment capacities. Hence, this technique overlooks the distribution of land types throughout the catchment, resulting in deviations between the real overland flows and the modelling results. This problem can be described: (1) The distribution of urban catch basins is usually heterogeneous (e.g., the red and yellow rectangles in Figure 3). In areas with a very low density of catch basin nodes, the method developed by Huang et al. [18] will directly superimpose natural catchment nodes with catch basin nodes, but these nodes have different water catchment capacities when drawing the Thiessen polygons; this discrepancy often results in inconsistence of flow accumulation, thus lead to inconsistence of overland flows. Furthermore, when this method is applied to a large area, areas with a high catch basin density (such as urban centers) will have an excessive number of Thiessen polygons, which will directly affect the application of subsequent hydrological models. In addition, manual selection of the nodes is necessary to reduce the number of nodes, but this usually introduces uncertainty into the results. ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW 5 of 20 (2) As shown in Figure 3(b), a land type (such as building or grassland, indicated in green) is

179
The existing catchment division methods usually produce catchment areas that are inconsistent 180 with the real distribution of land types and are not very suitable for large areas [18]. To address these 181 issues, a catchment division method that accounts for land types and flow directions and is suitable 182 for large areas was proposed in this study. First, the primary catchments, otherwise known as first-   Figure 3a denotes the situation with low density of catch basin nodes, whereas the yellow panel denotes the situation with high density of nodes).
(2) As shown in Figure 3b, a land type (such as building or grassland, indicated in green) is irregularly partitioned into multiple Thiessen polygons, which is inconsistent with the real overland flows.

Catchment Division Method That Accounts for Land Type and Flow Direction
The existing catchment division methods usually produce catchment areas that are inconsistent with the real distribution of land types and are not very suitable for large areas [18]. To address these issues, a catchment division method that accounts for land types and flow directions and is suitable for large areas was proposed in this study. First, the primary catchments, otherwise known as first-level (FL) catchments, were determined by using the terrain, canals, and drainage pipes. Then, based on the FL catchments, an iterative fine division algorithm was proposed to reveal the fine division of catchments. The proposed method consists of three parts: the macropartitioning of FL catchments based on the natural topography, the division of second-level (SL) catchments based on canals and drainage pipes, and the micropartitioning of third-level catchments based on land types and flow directions. A flowchart of the proposed method is illustrated in Figure 4.

First-Level Catchment Division Based on the Natural Topography
The FL drainage system of a city is responsible for draining stormwater from a relatively large area into rivers outside the city; this system is usually combined with trunk rivers [25]. The trunk river system can be used to divide a city into several independent natural catchment areas. Therefore, the FL catchments are partitioned according to the city's natural terrain and the confluences between the rivers within the region, and the trunk rivers capable of draining water from the watershed (stream-M) are used as the partitioning boundaries. This roughly partitions the region into a few subcatchments and ensures that the catchment areas are consistent with the trunk river-delineated boundaries of the city's natural topography, as illustrated in Figure 5.

First-Level Catchment Division Based on the Natural Topography
The FL drainage system of a city is responsible for draining stormwater from a relatively large area into rivers outside the city; this system is usually combined with trunk rivers [25]. The trunk river system can be used to divide a city into several independent natural catchment areas. Therefore, the FL catchments are partitioned according to the city's natural terrain and the confluences between the rivers within the region, and the trunk rivers capable of draining water from the watershed (stream-M) are used as the partitioning boundaries. This roughly partitions the region into a few subcatchments and ensures that the catchment areas are consistent with the trunk river-delineated boundaries of the city's natural topography, as illustrated in Figure 5.

Second-Level Catchment Division Based on Canals and Pipe Network
The FL catchments account for the influences of the city's natural topography. However, due to the influence of artificial construction, artificial buildings and structures changed the natural terrain. Therefore, a city's catchment characteristics are dependent on topographic factors and are also constrained and guided by the existing drainage systems. Compared with natural overland flows, overland flows inside a city are highly heterogeneous due to the presence of impermeable artificial surfaces. Thus, the secondary drainage systems of a city are responsible for draining rainwater from the streets into the trunk river system, and these secondary systems usually consist of stormwater pipe network or artificial canals. In particular, the underground pipe network laid along the sides of roads usually act as the main channels for local surface runoff and underground drains and play an important role in local drainage processes similar to that of, natural rivers or main canals. In this paper, SL catchments are partitioned along drainage canals and along primary and secondary roads lined by drainage pipes on either side. Figure 6 demonstrates the partitioning of FL catchment into SL catchments. Using canals and pipe networks for the division of SL catchments ensures that the catchments will be partitioned in a uniform manner.

Second-Level Catchment Division Based on Canals and Pipe Network
The FL catchments account for the influences of the city's natural topography. However, due to the influence of artificial construction, artificial buildings and structures changed the natural terrain. Therefore, a city's catchment characteristics are dependent on topographic factors and are also constrained and guided by the existing drainage systems. Compared with natural overland flows, overland flows inside a city are highly heterogeneous due to the presence of impermeable artificial surfaces. Thus, the secondary drainage systems of a city are responsible for draining rainwater from the streets into the trunk river system, and these secondary systems usually consist of stormwater pipe network or artificial canals. In particular, the underground pipe network laid along the sides of roads usually act as the main channels for local surface runoff and underground drains and play an important role in local drainage processes similar to that of, natural rivers or main canals. In this paper, SL catchments are partitioned along drainage canals and along primary and secondary roads lined by drainage pipes on either side. Figure 6 demonstrates the partitioning of FL catchment into SL catchments. Using canals and pipe networks for the division of SL catchments ensures that the catchments will be partitioned in a uniform manner.

Third-Level Catchment Division Based on Land Types and Flow Directions
The third-level catchment (S-catchment) represents the smallest element of a hydrological model, and the S-catchment division accuracy directly determines the accuracy of subsequent hydrologic analyses. To ensure that the catchment areas are consistent with land types, the division of S-catchments is based on topographic flow directions and fine-resolution land-use types. First, the D8 algorithm is applied to perform flow analysis based on an amended DEM to determine the main flow directions. The SL catchments are then roughly partitioned into direction-based (D-B) catchments based on these flow directions. An iterative S-catchment division algorithm is finally proposed to finely partition these D-B catchments based on their fine-resolution land-use types to produce S-catchments.

Partitioning of D-B Catchments Based on Flow Directions
(1) DEM Amendment The surface runoff processes occurring within an urban area are controlled not only by the natural topography but also by buildings, roads, and canals. In the method proposed, the original DEM is amended to account for the influences of urban underlying surfaces on runoff; this is performed by incorporating information about land objects that influence the overland flow routing into the DEM. The elevation values at DEM grid points containing land object information are then increased or decreased to alter their catchment capacities. It should be noted that to avoid the artificial depressions caused by DEM amendment, the building data are preprocessed to avoid the building with holes. Since the influences of roads and canals on runoff have already been accounted for during the partitioning of SL catchments, when amending the DEM, it is necessary only to account for the impacts of buildings in obstructing runoff. The procedure and equation for DEM amendment are shown in Figure 7 and Equation (1).
where Ea is the amended DEM elevation value, Ei is the original DEM elevation value, and H is the corresponding building's height.

Third-Level Catchment Division Based on Land Types and Flow Directions
The third-level catchment (S-catchment) represents the smallest element of a hydrological model, and the S-catchment division accuracy directly determines the accuracy of subsequent hydrologic analyses. To ensure that the catchment areas are consistent with land types, the division of S-catchments is based on topographic flow directions and fine-resolution land-use types. First, the D8 algorithm is applied to perform flow analysis based on an amended DEM to determine the main flow directions. The SL catchments are then roughly partitioned into direction-based (D-B) catchments based on these flow directions. An iterative S-catchment division algorithm is finally proposed to finely partition these D-B catchments based on their fine-resolution land-use types to produce S-catchments.

Partitioning of D-B Catchments Based on Flow Directions
(1) DEM Amendment The surface runoff processes occurring within an urban area are controlled not only by the natural topography but also by buildings, roads, and canals. In the method proposed, the original DEM is amended to account for the influences of urban underlying surfaces on runoff; this is performed by incorporating information about land objects that influence the overland flow routing into the DEM. The elevation values at DEM grid points containing land object information are then increased or decreased to alter their catchment capacities. It should be noted that to avoid the artificial depressions caused by DEM amendment, the building data are preprocessed to avoid the building with holes. Since the influences of roads and canals on runoff have already been accounted for during the partitioning of SL catchments, when amending the DEM, it is necessary only to account for the impacts of buildings in obstructing runoff. The procedure and equation for DEM amendment are shown in Figure 7 and Equation (1).
where E a is the amended DEM elevation value, E i is the original DEM elevation value, and H is the corresponding building's height. (2) D-B Catchment Division Based on Flow Directions Step 1: Flow analysis. The hydrologic analysis tool in ArcGIS 10.2 is used to fill sinks in the amended DEM. It should be noted that if there exist artificial depressions caused by DEM amendment, this paper will first judge whether it is a depression surrounded by buildings. If so, the continuous iteration will be applied to fill the depression until there is no depression caused by buildings.
The D8 algorithm is then used to calculate the DEM elevation differences in eight directions from the central grid (north, west, east, south, southeast, northeast, southwest, and northwest), and the descending direction with the greatest elevation difference is defined as the flow direction (purple line in Figure 8). The basic D8 algorithm is probably the most popular method for automated drainage recognition and catchment area determinations [14]. During the extraction of flow directions, the catchment threshold will affect the selection of trunk flows and tributaries. Based on multiple experiments, the optimal threshold was determined to be 1000 in this paper.
Noted that the threshold can affect the results of flow directions. The larger the threshold, the less tributary at the end of the stream is extracted, leading to the main flow direction is clearer, whereas the details of tributaries are missing. The threshold for the formation of flow in each region is different. Since this paper is mainly for plain urban areas, the difference in surface slope is small, the threshold should be as small as possible to describe the surface runoff in detail, but it is necessary to avoid excessive fragmentation. Considering the fine division of catchment is based on land use data, the threshold whose flow results are more consistent with the land use distribution is selected as the optimal threshold in this paper. Taking about 5.2 km 2 as the sample area, the thresholds of 500, 1000, 2000, 5000, and 8000 are used to generate the flow direction, and the number of main flow directions generated by each threshold is obtained. Then, the ratio of the number of flows to land use patches are calculated, as illustrated in Figure 8. It can be found that as the threshold increases, this ratio gradually decreases. When the threshold is 1000, the ratio is the closest to 1, suggesting the flow results are more consistent with the land use distribution. Therefore, the threshold of 1000 in this paper is selected as the optimal threshold of the study area. It should be noted that the flows under different thresholds do not denote true geomorphological significance, so it is necessary to select the optimal threshold according to regional characteristics and data conditions. Step 1: Flow analysis. The hydrologic analysis tool in ArcGIS 10.2 is used to fill sinks in the amended DEM. It should be noted that if there exist artificial depressions caused by DEM amendment, this paper will first judge whether it is a depression surrounded by buildings. If so, the continuous iteration will be applied to fill the depression until there is no depression caused by buildings.
The D8 algorithm is then used to calculate the DEM elevation differences in eight directions from the central grid (north, west, east, south, southeast, northeast, southwest, and northwest), and the descending direction with the greatest elevation difference is defined as the flow direction (purple line in Figure 8). The basic D8 algorithm is probably the most popular method for automated drainage recognition and catchment area determinations [14]. During the extraction of flow directions, the catchment threshold will affect the selection of trunk flows and tributaries. Based on multiple experiments, the optimal threshold was determined to be 1000 in this paper. Step 2: Extraction of catchment points. The flow accumulation of each grid is determined by calculating the number of grids upstream whose flows pass through this grid. The grid elements with high flow accumulation levels will be used to form catchment ridge lines (green lines in Figure 9), and the intersections between catchment ridge lines are catchment points (blue points in Figure 9).
Step 3: Preliminary division of catchment areas. Based on the catchment points obtained in Step 2, all the grid elements with the same flow direction are partitioned into the same independent catchment area. These catchment areas are the third-level D-B catchments. Noted that the threshold can affect the results of flow directions. The larger the threshold, the less tributary at the end of the stream is extracted, leading to the main flow direction is clearer, whereas the details of tributaries are missing. The threshold for the formation of flow in each region is different. Since this paper is mainly for plain urban areas, the difference in surface slope is small, the threshold should be as small as possible to describe the surface runoff in detail, but it is necessary to avoid excessive fragmentation. Considering the fine division of catchment is based on land use data, the threshold whose flow results are more consistent with the land use distribution is selected as the optimal threshold in this paper. Taking about 5.2 km 2 as the sample area, the thresholds of 500, 1000, 2000, 5000, and 8000 are used to generate the flow direction, and the number of main flow directions generated by each threshold is obtained. Then, the ratio of the number of flows to land use patches are calculated, as illustrated in Figure 8. It can be found that as the threshold increases, this ratio gradually decreases. When the threshold is 1000, the ratio is the closest to 1, suggesting the flow results are more consistent with the land use distribution. Therefore, the threshold of 1000 in this paper is selected as the optimal threshold of the study area. It should be noted that the flows under different thresholds do not denote true geomorphological significance, so it is necessary to select the optimal threshold according to regional characteristics and data conditions.
Step 2: Extraction of catchment points. The flow accumulation of each grid is determined by calculating the number of grids upstream whose flows pass through this grid. The grid elements with high flow accumulation levels will be used to form catchment ridge lines (green lines in Figure 9), and the intersections between catchment ridge lines are catchment points (blue points in Figure 9). Step 2: Extraction of catchment points. The flow accumulation of each grid is determined by calculating the number of grids upstream whose flows pass through this grid. The grid elements with high flow accumulation levels will be used to form catchment ridge lines (green lines in Figure 9), and the intersections between catchment ridge lines are catchment points (blue points in Figure 9).
Step 3: Preliminary division of catchment areas. Based on the catchment points obtained in Step 2, all the grid elements with the same flow direction are partitioned into the same independent catchment area. These catchment areas are the third-level D-B catchments.

The Smallest Catchment Division Based on Land Type Data
To accurately determine urban catchments, an iterative fine division algorithm was proposed to partition D-B catchments based on high-precision land-use type data and catch basin data, thus obtaining the smallest catchments (S-catchments). It should be noted that due to that different land uses have different infiltration intensity, generally, the higher the precision of land uses data, the more accurate the catchment area will be. However, high precision land uses data are usually fragmented, which will lead to the difficulty of automatic determination of catchments in large-scale urban area. The proposed algorithm can address this issue. To guarantee the accuracy of the subcatchments, the land use data used in this paper is derived from the Department of Natural Resources of Shandong Province, China. The data accuracy and quality conformed the standard specification DB37T 2761.2-2016 (Technical specification of geographic information public service platform, Part 2: Framework Data) issued by Shandong Bureau of Quality and Technical Supervision. The land use Step 3: Preliminary division of catchment areas. Based on the catchment points obtained in Step 2, all the grid elements with the same flow direction are partitioned into the same independent catchment area. These catchment areas are the third-level D-B catchments.

The Smallest Catchment Division Based on Land Type Data
To accurately determine urban catchments, an iterative fine division algorithm was proposed to partition D-B catchments based on high-precision land-use type data and catch basin data, thus obtaining the smallest catchments (S-catchments). It should be noted that due to that different land uses have different infiltration intensity, generally, the higher the precision of land uses data, the more accurate the catchment area will be. However, high precision land uses data are usually fragmented, which will lead to the difficulty of automatic determination of catchments in large-scale urban area. The proposed algorithm can address this issue. To guarantee the accuracy of the sub-catchments, the land use data used in this paper is derived from the Department of Natural Resources of Shandong Province, China. The data accuracy and quality conformed the standard specification DB37T 2761.2-2016 (Technical specification of geographic information public service platform, Part 2: Framework Data) issued by Shandong Bureau of Quality and Technical Supervision. The land use patch with area larger than 100 m 2 can be obtained, which resolution is about 10m. The data used in this paper include lawn, forest, cropland, unused land, buildings, and plazas/roads. The area of each land-use patch, Area i (i-th patch), is calculated by superimposing the land-use type data onto the D-B catchment data ( Figure 10). results. To prevent subcatchments from being overly partitioned and fragmented, which would affect subsequent hydrologic analyses, an iteration algorithm was proposed in this paper.
The threshold value δ for the land-use patch area is defined as follows: when ℎ and the number of catch basins in the patch is greater than 0, land-use patch i is deemed a single subcatchment. For instance, as illustrated in Figure 10, due to that the area share of patch A , the land use patch A is determined as a single sub-catchment. While the area share of patch B , it is determined a mixed sub-catchment. Based on the characteristics of the study area and a large number of tests, the value of the threshold, δ, was determined to be 0.12. (3) Iterative algorithm for the determination of mixed catchment areas When AreaSharei ≤ δ, land-use patch i is topologically associated and merged with its adjacent land-use patches (j) to form a mixed catchment area (Panel B in Figure 10). The function for the topological association and merging of land-use patch i with its adjacent patches may be expressed as: where n is the total number of patches adjacent to land-use patch i, AreaSNeiij is the ratio between the area of patch i and that of (adjacent) patch j, Directionij is the angle between the flow directions of patch i and patch j, Rainwaterij is the number of catch basins and catchment points between patch i and its adjacent patch, and SemNeiij is the semantic similarity between patch i and patch j. The (2) Determination of a Single Catchment The ratio between the area of each land-use patch and the total area of their D-B catchment, AreaShare i , is calculated as follows: where m is the total number of land-use patches and Area is the total area of the D-B catchment in which the i-th land-use patch is located. The single catchment areas and mixed catchment areas are determined based on the AreaShare i results. To prevent subcatchments from being overly partitioned and fragmented, which would affect subsequent hydrologic analyses, an iteration algorithm was proposed in this paper.
The threshold value δ for the land-use patch area is defined as follows: when AreaShare i ≥ δ and the number of catch basins in the patch is greater than 0, land-use patch i is deemed a single subcatchment. For instance, as illustrated in Figure 10, due to that the area share of patch A ≥ δ, the land use patch A is determined as a single sub-catchment. While the area share of patch B < δ, it is determined a mixed sub-catchment. Based on the characteristics of the study area and a large number of tests, the value of the threshold, δ, was determined to be 0.12.
(3) Iterative algorithm for the determination of mixed catchment areas When AreaShare i ≤ δ, land-use patch i is topologically associated and merged with its adjacent land-use patches (j) to form a mixed catchment area (Panel B in Figure 10). The function for the topological association and merging of land-use patch i with its adjacent patches may be expressed as: Merge(i, j) = f (AreaSNei ij , Rainwater ij , Direction ij , SemNei ij ), i = 1, 2 . . . ., m; j = 1, 2 . . . ., n; where n is the total number of patches adjacent to land-use patch i, AreaSNei ij is the ratio between the area of patch i and that of (adjacent) patch j, Direction ij is the angle between the flow directions of patch i and patch j, Rainwater ij is the number of catch basins and catchment points between patch i and its adjacent patch, and SemNei ij is the semantic similarity between patch i and patch j. The mathematical model for semantic similarity can be calculated based on Li et al. [26] with the following expression: where L i and L j are the land-use types of land-use patch i and its adjacent patch j, respectively; m is the number of land-use types that are semantically adjacent to L i , for the land use data used in this paper, m is 6; and Distance (L i , L j ) is the distance between the positions of i and j in the set of semantically adjacent land types. The semantic positions of the land types are based on the arrangement of permeating coefficient. Patches with similar permeating coefficients are considered more compatible and have priority to merge. The similarity is reflected by a value between 0 and 1, where 0 means not compatible at all and 1 means totally compatible. For instance, the semantic positions are plazas/roads, buildings, unused land, cropland, lawn and forest, their permeating coefficients are 0.011, 0.012, 0.06, 0.17, 0.20, and 0.50, respectively. According to Li et al. [26], define the semantic distance between adjacent land uses as 1 unit, if L i and L j are unused land and plaza/roads, the Distance (L i , L j ) is 2 unit, so the SemNei ij equals 2/3 (1-2/6). The merging function for land-use patch i and its adjacent patch j is where w 1 and w 2 are the weights of each index, and the weights are given by the criteria importance through intercriteria correlation (CRITIC) method. Next, all of the land use patches are iteratively processed, and a merging priority is used to determine the optimal adjacent patch j to merge with land-use patch i. This process continues until all S-catchments have areas greater than δ, or the number of catch basins in each individual land patch is greater than 0, or the flow directions between all adjacent patches are different. The end of this iterative process marks the completion of the S-catchment division process.
Finally, the flow directions of the S-catchments are used to add upstream/downstream relationships between the S-catchments, which will be used in subsequent hydrologic analyses ( Figure 11). This process helps to improve the catchment division accuracy, and the thresholds and iterative processes of this method also ensure that the catchment areas will not be so fragmented as to preclude their application in large-scale hydrological models. where w1 and w2 are the weights of each index, and the weights are given by the criteria importance through intercriteria correlation (CRITIC) method. Next, all of the land use patches are iteratively processed, and a merging priority is used to determine the optimal adjacent patch j to merge with land-use patch i. This process continues until all S-catchments have areas greater than δ, or the number of catch basins in each individual land patch is greater than 0, or the flow directions between all adjacent patches are different. The end of this iterative process marks the completion of the S-catchment division process.
Finally, the flow directions of the S-catchments are used to add upstream/downstream relationships between the S-catchments, which will be used in subsequent hydrologic analyses ( Figure 11). This process helps to improve the catchment division accuracy, and the thresholds and iterative processes of this method also ensure that the catchment areas will not be so fragmented as to preclude their application in large-scale hydrological models.

Experimental Analyses and Discussion
The Dongcheng area of Dongying District in Dongying City of China (Figure 12) was selected as the study area to validate the proposed method. The study area covers of 44 km 2 , which is relatively

Experimental Analyses and Discussion
The Dongcheng area of Dongying District in Dongying City of China ( Figure 12) was selected as the study area to validate the proposed method. The study area covers of 44 km 2 , which is relatively large for performing a catchment division task. Dongying District (118 • 12 42"-118 • 59 52" East longitude and 37 • 14 13"-37 • 31 57" North latitude), the central region of Dongying City, is located in the northeastern part of Shandong Province. The district is bordered by the Bohai Sea to the east and the Yellow River to the west and is divided into two major areas: Dongcheng and Xicheng; Dongcheng is the main area of this study. Because Dongying District is located in the middle latitudes, its climate is influenced simultaneously by the Eurasian continent and West Pacific Ocean, resulting in a warm and temperate continental monsoon climate. Sixty-five percent of its annual precipitation falls in the summer months, making this region highly susceptible to flash flooding. Moreover, the terrain of the study area is relatively flat, and its average elevation is 6-8 m. The study area includes a large variety of coexisting land-use types, including highly developed urban areas and rural croplands. Noted that due to the elevation fluctuation is not obvious in this area, high resolution topographic data (such as LIDAR, Light Detection And Ranging) are necessary, in order to obtain realistic flow directions [27,28].
Two types of experimental data were used. (1) Basic geographic data: land-uses data, road vector data, river system vector data, these vector data are derived from the Department of Natural Resources of Shandong Province. It also includes 0.05 m high-resolution remote sensing images and 0.5 m resolution DEM data, which are derived from Dongying Natural Resources Bureau. (2) Pipe network data: data about the drainage pipes (15151), manholes (15791), and discharge outlets (147) within Dongying City from the Dongying Meteorological Bureau. Figure 12 illustrates the division results for FL catchments, SL catchments and S-catchments in the Dongcheng region by using the proposed method. The catchment area boundaries produced by the proposed method are relatively regular in shape, and their distribution is consistent with the rivers, road networks, drainage pipe networks, and building boundaries within the study area. This agreement suggests that the proposed method successfully accounted for the effects of the natural terrain and drainage systems on the drainage network and the effects of the real land use distribution on surface runoff. As a result, the catchments obtained by the proposed method are consistent with the real drainage patterns. The study area was ultimately partitioned into four FL catchments, 111 SL catchments and 804 S-catchments. The largest and smallest catchments are 72.86 hm 2 and 0.28 hm 2 , respectively. It is worth noting that the larger catchments consist mainly of land-use types such as ponds and wetlands. Excluding these special land types, the areas of the other catchments vary between 0.28 hm 2 and 25.63 hm 2 with an SD (standard deviation) of 3.18, indicating that the geometric shapes and sizes of most Two types of experimental data were used. (1) Basic geographic data: land-uses data, road vector data, river system vector data, these vector data are derived from the Department of Natural Resources of Shandong Province. It also includes 0.05 m high-resolution remote sensing images and 0.5 m resolution DEM data, which are derived from Dongying Natural Resources Bureau. (2) Pipe network data: data about the drainage pipes (15151), manholes (15791), and discharge outlets (147) within Dongying City from the Dongying Meteorological Bureau. Figure 12 illustrates the division results for FL catchments, SL catchments and S-catchments in the Dongcheng region by using the proposed method. The catchment area boundaries produced by the proposed method are relatively regular in shape, and their distribution is consistent with the rivers, road networks, drainage pipe networks, and building boundaries within the study area. This agreement suggests that the proposed method successfully accounted for the effects of the natural terrain and drainage systems on the drainage network and the effects of the real land use distribution on surface runoff. As a result, the catchments obtained by the proposed method are consistent with the real drainage patterns.

Catchment Division of the Study Area
The study area was ultimately partitioned into four FL catchments, 111 SL catchments and 804 S-catchments. The largest and smallest catchments are 72.86 hm 2 and 0.28 hm 2 , respectively. It is worth noting that the larger catchments consist mainly of land-use types such as ponds and wetlands. Excluding these special land types, the areas of the other catchments vary between 0.28 hm 2 and 25.63 hm 2 with an SD (standard deviation) of 3.18, indicating that the geometric shapes and sizes of most catchments are relatively uniform. In addition, the catchments in the city center are relatively small, whereas those in rural areas are relatively large, suggesting that the catchments partitioned by the proposed method reflect the spatial heterogeneity of surface runoff from the visual analysis.

Qualitative Comparison
To further validate the reliability of the proposed method, comparisons were conducted between the original DEM and land type-amended DEM in terms of their flow directions. Comparisons were also performed between the results of the catchment division method proposed in this study and those of the method developed by Huang et al. [18].
(1) Comparison between the flow directions based on the original DEM and amended DEM An area that contains artificial drainage facilities was used to illustrate the effects of the amended DEM on the flow direction analyses. Figure 13a shows the flow directions generated from the original DEM, while Figure 13b shows the flow directions generated from the land use-amended DEM. The red rectangles reveal that the flow directions based on the original DEM pass through buildings, which is clearly inconsistent with real overland flows. (2) Catchment division comparison based on the Voronoi method and the proposed method Since the whole study area is relatively large and contains a large number of catch basins, it involves a huge computational effort when using the method of Huang et al. [18] to partition the catchment area. Therefore, a 1.6 km 2 subregion of the study area was selected to compare the proposed method with that of Huang et al. [18]. Figure 14a shows the catchment areas partitioned by the method of Huang et al. [18]; each catchment area contains a minimum of one catch basin or natural catchment point as its drainage point, showing that this method accounts for the influences of natural terrain and drainage facilities on overland flows. However, this method often irregularly divides a single building or forest into multiple catchment areas, resulting in inconsistencies with actual land use distributions (as shown in the yellow and white panels). Furthermore, since the catch basins are not distributed uniformly, the geometric shapes of the catchments tend to be irregular. Furthermore, although the method of Huang et al. [18] considers natural catchment points, simple Voronoi partitioning tends to result in a single catchment having different flow directions, which is also inconsistent with actual surface runoff.
The catchment division results of the proposed method are shown in Figure 14b; clearly, the resulting catchment areas are similar to those partitioned by the method of Huang et al. [18], and all of the catchment areas have a drainage point. Unlike the method of Huang et al. [18]; however, the catchments derived by the proposed method are consistent with the boundaries of trunk rivers and roads, and continuous land-use types are not partitioned into multiple catchment areas, suggesting that the derived catchment areas are consistent with the complex underlying surfaces and land-use distribution. In addition, the proposed method accurately accounts for the directions of real overland flows. For instance, if two adjacent land uses feature different surface flow directions, they will still be divided into two different catchments due to their different directions (such as the two land-use patches in Figure 15, which have opposing flow directions). Furthermore, by setting patch area thresholds and accounting for factors such as the numbers of catch basins in adjacent patches, the proposed method ensures that the partitioned catchments have regular geometric shapes; consequently, the proposed method can capture the spatial heterogeneity of overland flows with different land uses. The iterative approach adopted by the proposed method also ensures that the ultimate catchment areas will not be so fragmented as to prohibit their use in hydrological models for large areas. Since the original DEM does not account for building heights, the fact that buildings obstruct flows is not reflected in the flow direction analysis. In contrast, the land use-amended DEM considers the heights of buildings, and thus, when the proposed method is applied to the flow direction analysis, the obstruction of overland flows by buildings can be captured.
(2) Catchment division comparison based on the Voronoi method and the proposed method Since the whole study area is relatively large and contains a large number of catch basins, it involves a huge computational effort when using the method of Huang et al. [18] to partition the catchment area. Therefore, a 1.6 km 2 subregion of the study area was selected to compare the proposed method with that of Huang et al. [18]. Figure 14a shows the catchment areas partitioned by the method of Huang et al. [18]; each catchment area contains a minimum of one catch basin or natural catchment point as its drainage point, showing that this method accounts for the influences of natural terrain and drainage facilities on overland flows. However, this method often irregularly divides a single building or forest into multiple catchment areas, resulting in inconsistencies with actual land use distributions (as shown in the yellow and white panels). Furthermore, since the catch basins are not distributed uniformly, the geometric shapes of the catchments tend to be irregular. Furthermore, although the method of Huang et al. [18] considers natural catchment points, simple Voronoi partitioning tends to result in a single catchment having different flow directions, which is also inconsistent with actual surface runoff.   The catchment division results of the proposed method are shown in Figure 14b; clearly, the resulting catchment areas are similar to those partitioned by the method of Huang et al. [18], and all of the catchment areas have a drainage point. Unlike the method of Huang et al. [18]; however, the catchments derived by the proposed method are consistent with the boundaries of trunk rivers and roads, and continuous land-use types are not partitioned into multiple catchment areas, suggesting that the derived catchment areas are consistent with the complex underlying surfaces and land-use distribution. In addition, the proposed method accurately accounts for the directions of real overland flows. For instance, if two adjacent land uses feature different surface flow directions, they will still be divided into two different catchments due to their different directions (such as the two land-use patches in Figure 15, which have opposing flow directions). Furthermore, by setting patch area thresholds and accounting for factors such as the numbers of catch basins in adjacent patches, the proposed method ensures that the partitioned catchments have regular geometric shapes; consequently, the proposed method can capture the spatial heterogeneity of overland flows with different land uses. The iterative approach adopted by the proposed method also ensures that the ultimate catchment areas will not be so fragmented as to prohibit their use in hydrological models for large areas.

Quantitative Analysis
In order to validate the accuracy of the proposed method quantitatively, indices of flow direction error rate (Error-FD) and fragmentation of the land uses (Frag), which can reflect the flow direction consistency between real situation and situation within divided catchments, and land use distribution consistency of divided catchments, are selected and used to make a quantitative comparison to the result by Huang et al. [18].
As shown in Figure 16, yellow color polygons mean the sub-catchment by the proposed method, green Thiessen polygons denote the sub-catchment by Huang et al. [18], blue lines are the real main flow direction (R FD ). According to the principle of the method of Huang et al. [18], that is, the water in each catchment flows to the pour point (such as Point A and B in Figure 16), the simulated flow directions by Huang et al. can be determined. However, this may lead to the consistency with the realistic flows. For instance, in the area with orange shaded in Figure 16, the red dashed lines are the wrong flows (E FD ) by using Huang et al. method, due to that the included angle between the wrong flow direction and real flow direction is larger than 90 degrees. On this basis, the flow direction error number (Error-FD) can be counted. Due to that our method consider the indicator of the real flow direction, the flow routing by using our method is similar to the real flows, the Error-FD is 0, suggesting that our method performed better in real flow directions maintenance. Error-FD of the two methods are calculated for the study area, the Error-FD number by using the method of Huang et al. [18] are 76 in the sample area (1.6 km 2 ).

Quantitative Analysis
In order to validate the accuracy of the proposed method quantitatively, indices of flow direction error rate (Error-FD) and fragmentation of the land uses (Frag), which can reflect the flow direction consistency between real situation and situation within divided catchments, and land use distribution consistency of divided catchments, are selected and used to make a quantitative comparison to the result by Huang et al. [18].
As shown in Figure 16, yellow color polygons mean the sub-catchment by the proposed method, green Thiessen polygons denote the sub-catchment by Huang et al. [18], blue lines are the real main flow direction (RFD). According to the principle of the method of Huang et al. [18], that is, the water in each catchment flows to the pour point (such as Point A and B in Figure 16), the simulated flow directions by Huang et al. can be determined. However, this may lead to the consistency with the realistic flows. For instance, in the area with orange shaded in Figure 16, the red dashed lines are the wrong flows (EFD) by using Huang et al. method, due to that the included angle between the wrong flow direction and real flow direction is larger than 90 degrees. On this basis, the flow direction error number (Error-FD) can be counted. Due to that our method consider the indicator of the real flow direction, the flow routing by using our method is similar to the real flows, the Error-FD is 0, suggesting that our method performed better in real flow directions maintenance. Error-FD of the two methods are calculated for the study area, the Error-FD number by using the method of Huang et al. [18] are 76 in the sample area (1.6 km 2 ). As shown in Figure 17, yellow color lines mean the sub-catchment by the proposed method, green Thiessen polygons denote the sub-catchment by Huang et al., green and blue filled polygons mean the land use patches (Patch A and B). On this basis, the fragmentation of land use patch can be determined by calculating the number of divided polygons. For instance, for patch A, it was separated by 4 Thiessen sub-catchments based on the method of Huang et al. [18], the fragmentation of patch A is 4, similarly, the fragmentation of patch B by Huang et al. is 2. While the fragmentation of patch A and B are both 1 by using our method. Frag by using the proposed method and Huang et As shown in Figure 17, yellow color lines mean the sub-catchment by the proposed method, green Thiessen polygons denote the sub-catchment by Huang et al., green and blue filled polygons mean the land use patches (Patch A and B). On this basis, the fragmentation of land use patch can be determined by calculating the number of divided polygons. For instance, for patch A, it was separated by 4 Thiessen sub-catchments based on the method of Huang et al. [18], the fragmentation of patch A is 4, similarly, the fragmentation of patch B by Huang et al. is 2. While the fragmentation of patch A and B are both 1 by using our method. Frag by using the proposed method and Huang et al. [18] in the sample study area are calculated, which are 731 and 1680, respectively. The ratio of Frag between the traditional method and the proposed method is 2.3, indicating that the sub-catchments division by using proposed method is consistent with the spatial distribution of land uses. According to the analysis, the sub-catchments division results by using the proposed method and method of Huang et al. [18] are summarized and compared, as illustrated by Table 1. It was found that our method performs better than the method of Huang et al. [18] in terms of land-use distribution consistency and flow directions consistency. In summary, our method is applicable for the division of sub-catchments for large urban areas.

Conclusions
Catchment division is a prerequisite for the forecasting and early warning of urban waterlogging, and accurate divisions are of utmost importance for urban disaster prevention. Although the state-of-the-art method accounts for the influences of natural terrain and drainage pipes on the subdivision of a catchment, it uses the Voronoi algorithm in a rather simplistic manner; as a result, the catchments partitioned by the existing method are inconsistent with the real distribution of land uses. Furthermore, because this method is mainly applied to divide catchments in small regions, it may not be suitable for use in large regions. To address these issues, this study proposed a subcatchment division method that accounts for land uses and flow directions and is suitable for large areas based on introduction of auto-adaptive threshold adjustment process. In addition to accounting for the natural topography and drainage pipe network, the proposed method also uses land-use patches and flow directions to improve the accuracy and consistency of its results. Furthermore, an iterative algorithm was leveraged to ensure that the results of the proposed method are compatible with the needs of large-scale flooding analyses. The following conclusions were According to the analysis, the sub-catchments division results by using the proposed method and method of Huang et al. [18] are summarized and compared, as illustrated by Table 1. It was found that our method performs better than the method of Huang et al. [18] in terms of land-use distribution consistency and flow directions consistency. In summary, our method is applicable for the division of sub-catchments for large urban areas.

Conclusions
Catchment division is a prerequisite for the forecasting and early warning of urban waterlogging, and accurate divisions are of utmost importance for urban disaster prevention. Although the state-of-the-art method accounts for the influences of natural terrain and drainage pipes on the subdivision of a catchment, it uses the Voronoi algorithm in a rather simplistic manner; as a result, the catchments partitioned by the existing method are inconsistent with the real distribution of land uses. Furthermore, because this method is mainly applied to divide catchments in small regions, it may not be suitable for use in large regions. To address these issues, this study proposed a subcatchment division method that accounts for land uses and flow directions and is suitable for large areas based on introduction of auto-adaptive threshold adjustment process. In addition to accounting for the natural topography and drainage pipe network, the proposed method also uses land-use patches and flow directions to improve the accuracy and consistency of its results. Furthermore, an iterative algorithm was leveraged to ensure that the results of the proposed method are compatible with the needs of large-scale flooding analyses. The following conclusions were drawn from a large-scale (44 km 2 ) case analysis of Dongying City in China.
(1) Following the subcatchment division of the whole study area, 804 catchments were obtained. The catchments were regular in shape, and their boundaries were consistent with the natural rivers, artificial road networks, drainage pipe networks and buildings within the region. This agreement suggests that the proposed method adequately accounts for the influences of natural terrain and artificial facilities on overland flows. With the exception of ponds and wetlands, the SD of the catchment area was 3.18, and the catchment areas in the city center were much smaller than those in the rural areas. This finding indicates that the proposed method can represent the spatial heterogeneity of catchments over a large area and can ensure that the catchment elements are, overall, geometrically regular in shape.
(2) Regarding the local comparison, qualitative and quantitative analyses were conducted to validate the accuracy of the proposed method. For the visual qualitative analysis, catchments partitioned by the proposed method were similar to those of the method of Huang et al. (2019), and all of the catchments contained at least one drainage point (catch basin or natural catchment point). Therefore, both of these methods can realistically reflect of urban drainage patterns. However, the method of this paper is superior to the method of Huang et al. (2019) in terms of its consistency with the real distribution of land uses and overland flow directions. For the quantitative analysis, the Error-FD of the proposed method is 0 and the ratio of Frag between the traditional method and the proposed method is 2.3, suggesting our method performed better in real flow directions maintenance and land use spatial distribution.
The method proposed in this paper is both suitable and feasible for catchment division tasks in large areas, and the catchments partitioned by this method are generally consistent with the real distribution of land uses, which is important for the division of high precision catchment and realistic overland flows, thus have significance for precisely forecasting urban waterlogging. However, there still exist limitations. For instance, as it is difficult to extract the "true" catchments of a region, this method has been validated only through a comparative analysis with existing state-of-the-art method. In the future, the proposed method will be validated by integrating data such as drainage flow volumes. In addition, a land-use patch area threshold (0.12) was used in this method to ensure that the algorithm would be suitable for dividing large areas; however, it has yet to be determined whether this threshold value is suitable for other regions. Moreover, due to that the data definitions and classifications in different cities/countries are varied, the selection of the trunk river and canals may be different, but this has a small impact on sub-catchment division. In future work, the improved flow analysis algorithms, such as a two-parameter method for drainage network extraction and the GeoNet method, will be integrated to determine the D-B catchment. Finally, future studies will focus on the integration of other data (e.g., drainage pump station data) to partition catchments more accurately.

Conflicts of Interest:
The authors declare no conflict of interest.