Urban Flood Risk Assessment in Zhengzhou, China, Based on a D-Number-Improved Analytic Hierarchy Process and a Self-Organizing Map Algorithm

Abstract: Flood risk assessment is an important tool for disaster warning and prevention. In this study, an integrated approach based on a D-number-improved analytic hierarchy process (D-AHP) and a self-organizing map (SOM) clustering algorithm is proposed for urban flood risk assessment. An urban flood inundation model and geographic information system (GIS) technology were used to quantify the assessment indices of urban flood risk. The D-AHP approach was adopted to determine the weights of the indices, which compensates for the shortcomings of the AHP in dealing with uncertain evaluation information (such as fuzzy and incomplete information). In addition, the SOM clustering algorithm was applied to determine the flood risk level. It is a data-driven approach that avoids the subjective determination of a flood risk classification threshold. The proposed approach was implemented in Zhengzhou, China. The flood risk was classified into five levels: highest risk, higher risk, medium risk, lower risk, and lowest risk. The highest risk areas accounted for 9.86% of the study area; they were mainly distributed in the central and eastern parts of the Jinshui District, the eastern part of the Huiji District, and the northeastern part of the Guancheng District, where the terrain is low and waterlogging is severe. The higher risk areas accounted for 24.26% of the study area and were mainly distributed in the western and southern parts of the Jinshui District, the southern part of the Huiji District, the middle and eastern parts of the Zhongyuan District, the northeastern part of the Erqi District, and the northwestern part of the Guancheng District. These are economically developed areas with dense population and buildings, and they match well with historical flooding events. To verify the effectiveness of the proposed approach, it was compared with traditional approaches for risk assessment. The comparison indicated that the proposed approach is more reasonable and accurate than the traditional approaches. This study showed the potential of a novel approach to flood risk assessment. The results can provide a reference for urban flood management and disaster reduction in the study area.


Introduction
Floods are among the most serious and frequent natural disasters in cities. With rapid urbanization, the populations and economies of urban areas are highly concentrated, resulting in social and economic flood damage that is more severe than it was prior to urbanization [1,2]. Reducing the adverse effects of urban floods has become one of the priorities in urban disaster management [3,4]. Urban flood risk assessment is an effective tool for flood disaster management; it can identify the corresponding risk levels and the main causes of flooding in different regions and provide a basis for the prevention and reduction of urban floods [5][6][7]. Notably, large-scale flood mapping and flood risk assessment, combined with machine learning techniques, are increasingly pursued in [...] them to low-dimensional space. Therefore, it can better process nonlinear data [40,41]. Accordingly, the SOM algorithm was adopted in this study to determine the urban flood risk level.
In this study, we developed a novel approach for urban flood risk assessment by integrating the D-AHP and the SOM algorithm. First, urban flood risk assessment indices were established based on four aspects: a disaster-causing factor (DCF), the disaster environment (DE), the disaster bearing body (DBB), and disaster prevention and mitigation capability (DPMC). Second, the D-number theory was combined with the AHP to calculate the weight of each index, which mitigates the problem of incomplete expert evaluation information and subjective preference in the AHP. Then, the SOM algorithm was applied in determining the urban flood risk level. Finally, a case study from Zhengzhou, China, was conducted to evaluate the validity of the proposed method by comparing it with the results of traditional flood risk assessment methods.

Study Area
This study was conducted in five main urban districts of Zhengzhou, China: the Jinshui District, the Zhongyuan District, the Erqi District, the Guancheng District, and the Huiji District (see Figure 1). Zhengzhou is located in the central part of Henan Province in China. It is the political, economic, and cultural center of Henan Province. Geographically, Zhengzhou is located between 112°42′ and 114°14′ east longitude and between 34°16′ and 34°58′ north latitude, with higher terrain in the southwest and lower terrain in the northeast. The average annual precipitation in Zhengzhou is 636 mm; the average rainfall in summer is 352 mm, accounting for 55.3% of the annual precipitation. The maximum daily rainfall is 553 mm, based on rainfall records from 1961 to 2021; 63% of the maximum annual rainfall occurs in August. Under the combined influence of the climate conditions and the characteristics of the underlying surface, the study area is prone to flood disasters, and Zhengzhou is one of the key flood control cities in China. From 19 July 2021 to 21 July 2021, Zhengzhou was hit by a heavy rain disaster. The daily rainfall reached 553 mm from 8:00 p.m. on 19 July 2021 to 8:00 p.m. on 20 July 2021. The maximum rainfall in 1 h was 201.9 mm, which was the largest hourly rainfall observed in China. The flood killed 380 people and caused an economic loss of USD 6.3 billion. It had a serious impact on residents' lives and on social production and traffic in Zhengzhou.
Considering the most dangerous scenario and the integrity of data, the rainfall event from 19 July 2021 to 21 July 2021, which had the greatest impact on the study area in recent years, was selected to conduct urban flood risk assessment.

Data
The data used in this study are shown in Table 1. The basic data used to establish the urban flood inundation model of Zhengzhou were mainly based on a digital elevation model (DEM), and included slope, river distribution, building distribution, road distribution, and conduit distribution. The DEM was derived from the Shuttle Radar Topography Mission (SRTM) data of the US Space Shuttle Endeavour (https://www.resdc.cn/data.aspx?DATAID=217 (accessed on 15 August 2021)), with 30 m spatial resolution. The slope was derived from the DEM with the slope analysis tool of ArcGIS software. The remote sensing data were acquired from the Gaofen-1 satellite of China (https://www.resdc.cn/data.aspx?DATAID=285 (accessed on 15 August 2021)), with 16 m spatial resolution. The data for river distribution and building distribution were obtained from remote sensing data. The road distribution information and the conduit distribution information were provided by the Zhengzhou Municipal Administration Bureau. The calibration data for the urban flood inundation model mainly included the distribution of flood-prone areas, historical rainfall data, and historical inundation data. The distribution of flood-prone areas was provided by the Zhengzhou Municipal Administration Bureau. The historical rainfall data were collected from the Zhengzhou Meteorological Bureau. The historical inundation depth information was obtained via field investigation and a web crawler. The historical inundation area information was obtained from remote sensing data.
The flood risk assessment index data comprised maximum inundation depth (MD), maximum inundation volume (MVO), maximum inundation velocity (MVE), DEM, slope (SL), distance to the river (DRI), density of population (DP), density of buildings (DB), average area GDP (AGDP), density of conduits (DC), density of roads (DRO), and distance to the hospital (DH). The data on MD, MVO, and MVE were obtained from the urban flood inundation model. The data on AGDP and DP were obtained from the Zhengzhou Statistical Yearbook. The data on DB, DC, and DRO were calculated with the density analysis tool of ArcGIS software. Hospital distribution data were obtained from Baidu Map points of interest. The data on DRI and DH were derived from the river and hospital distributions, respectively, with the distance analysis tool of ArcGIS software.

Framework
The framework of the proposed approach is illustrated in Figure 2, which includes the construction of an urban flood risk assessment index system, a quantification of risk assessment index, an index weight calculation based on the D-AHP method, a flood risk classification based on the SOM clustering algorithm, and a comparative analysis of different risk assessment approaches.

Construction of Urban Flood Risk Assessment Index System
The commonly used urban flood risk assessment indices (DCF, DE, DBB, and DPMC) were used for reference to establish the assessment system according to the previous literature [29,42,43].
The flood risk index system used in this study is presented in Figure 3. MD, MVO, and MVE were selected to represent the disaster-causing factor. These three indices were all positive indices; the higher the values were, the higher the urban flood risk. DEM, SL, and DRI were used to evaluate the disaster environment, and they were negative indices. In flood disasters, population, buildings, and property are important disaster-bearing bodies. Therefore, DP, DB, and AGDP were selected to characterize the disaster bearing body, and they were positively correlated with the risk value. DC, DRO, and DH were selected to evaluate disaster prevention and mitigation capabilities. These three indices were negative indices, and their values were negatively correlated with the impact of flood disaster.
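As a compact summary of the index system described above, the following minimal Python sketch records each index with its criterion and its direction. The dictionary name `INDEX_SYSTEM` and the helper `indices_by_criterion` are hypothetical names introduced for illustration; the directions simply follow the text ("+" for positive indices, "-" for negative indices).

```python
# Hypothetical encoding of the index system: name -> (criterion, direction).
# "+" means a higher value raises flood risk; "-" means it lowers flood risk,
# following the positive/negative classification stated in the text.
INDEX_SYSTEM = {
    "MD": ("DCF", "+"),
    "MVO": ("DCF", "+"),
    "MVE": ("DCF", "+"),
    "DEM": ("DE", "-"),
    "SL": ("DE", "-"),
    "DRI": ("DE", "-"),
    "DP": ("DBB", "+"),
    "DB": ("DBB", "+"),
    "AGDP": ("DBB", "+"),
    "DC": ("DPMC", "-"),
    "DRO": ("DPMC", "-"),
    "DH": ("DPMC", "-"),
}


def indices_by_criterion(system):
    """Group index names by their criterion (DCF, DE, DBB, DPMC)."""
    groups = {}
    for name, (criterion, _direction) in system.items():
        groups.setdefault(criterion, []).append(name)
    return groups


groups = indices_by_criterion(INDEX_SYSTEM)
```

Keeping the direction next to each index makes it straightforward to choose the matching standardization formula for positive and negative indices later in the workflow.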

Quantification of the Risk Assessment Index
The spatial distributions of the DEM, SL, DRI, DP, DB, AGDP, DC, DRO, and DH indices were obtained by ArcGIS software (see Section 2.2). MD, MVO, and MVE were obtained from the urban flood inundation model, which was introduced as set out below.

Urban Flood Inundation Model
The personal computer storm water management model (PCSWMM), developed by Computational Hydraulics International (CHI) in Canada and based on the storm water management model (SWMM), was used to establish the urban flood simulation model. The PCSWMM is combined with an independent geographic information system (GIS) to integrate one-dimensional (1D) and two-dimensional (2D) modeling, which overcomes the limitation that the SWMM can only simulate 1D pipeline and river flow but not the 2D distribution of surface flooding [44,45]. More importantly, the simulation results provide 2D data on inundation depth, water volume, and velocity for later processing [46][47][48]. Urban flood inundation models established with the PCSWMM have been applied in many regions, and the simulation results have been satisfactory [44][45][46][47][48]. Therefore, this study used the PCSWMM to establish an urban flood inundation model.

Calibration of Urban Flood Inundation Model
The urban hydrodynamic model was calibrated against the flood-prone areas and the inundation observed during historical rainfall events. First, the inundation distributions of rainfall events with different return periods were compared with the flood-prone areas in Zhengzhou, which were provided by the Zhengzhou Municipal Administration Bureau. Second, the inundation depths at observation points were compared with the values simulated by the PCSWMM during historical rainfall events. Third, the simulated inundation area was compared with the inundation area derived from remote sensing data for the rainfall event that occurred on 20 July 2021; the remote sensing data were reported in Reference [49].

Remote Sens. 2022, 14, 4777

Weight Calculation of Index Based on the D-AHP Method
The D-AHP method extends the AHP via the D-number to compensate for the inability of the AHP to deal with uncertain evaluation information (such as inaccurate, fuzzy, and incomplete information). In addition, the optimized method reduces subjectivity and can handle decision-making problems based on fuzzy and incomplete information [36]. For example, suppose that 10 experts assess two indices. In the first situation, six experts consider that index 1 is preferred to index 2 with a degree of 0.65, and four experts consider that index 1 is preferred to index 2 with a degree of 0.75. In the second situation, seven experts consider that index 1 is preferred to index 2 with a degree of 0.85, and three experts do not provide any information, due to their lack of professional knowledge. The extended fuzzy preference relationship with the D-number can represent both cases well, which the pairwise comparison relationship of the AHP cannot do [36].

D-number Theory
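Deng et al. [31] and Deng and Deng [37] defined the D-number theory as follows:
Definition 1. Let $\Omega$ be a finite nonempty set; the D-number is a mapping formulated by
$$D: 2^{\Omega} \rightarrow [0, 1]$$
with
$$\sum_{B \subseteq \Omega} D(B) \le 1 \quad \text{and} \quad D(\emptyset) = 0,$$
where $\emptyset$ is an empty set and $B$ is a subset of $\Omega$.
From Definition 1, the completeness constraint is released by the D-number. If $\sum_{B \subseteq \Omega} D(B) = 1$, then the information is complete; if $\sum_{B \subseteq \Omega} D(B) < 1$, the information is incomplete.
Definition 2. Let $D = \{(b_1, v_1), (b_2, v_2), \ldots, (b_n, v_n)\}$ be a D-number; then, it can be represented as
$$I(D) = \sum_{i=1}^{n} b_i v_i \tag{3}$$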

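Definition 3. Suppose that set $U$ consists of $n$ assessment indices, $U = \{U_1, U_2, \ldots, U_n\}$. Its fuzzy preference relation is as follows:
$$R: U \times U \rightarrow [0, 1]$$
The matrix form can be expressed as $R = [r_{ij}]_{n \times n}$; that is,
$$R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{bmatrix}$$
where $r_{ij}$ represents the degree of preference for $U_i$ relative to $U_j$. The assignment of $r_{ij}$ and its corresponding meanings are as follows:
$$r_{ij} = \begin{cases} 0, & \text{when } U_j \text{ is absolutely more important than } U_i \\ \in (0, 0.5), & \text{when } U_j \text{ is more important than } U_i \\ 0.5, & \text{when } U_j \text{ is just as important as } U_i \\ \in (0.5, 1), & \text{when } U_i \text{ is more important than } U_j \\ 1, & \text{when } U_i \text{ is absolutely more important than } U_j \end{cases} \tag{6}$$
Definition 4. Let $U = \{U_1, U_2, \ldots, U_n\}$ be the set of assessment indices, and let the preference relationship of the D-number be expressed in the form of a matrix:
$$R_D = [D_{ij}]_{n \times n}, \quad D_{ij} = \{(b^{ij}_1, v^{ij}_1), (b^{ij}_2, v^{ij}_2), \ldots, (b^{ij}_m, v^{ij}_m)\}$$
where $b^{ij}_k$ represents the degree of importance that the $k$th expert considers index $i$ to have relative to index $j$, and $v^{ij}_k$ represents the expert's support for that degree of importance.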
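Consequently, the preference relationships of case 1 and case 2 can be expressed as Equations (9) and (10), respectively, following Definition 4:
$$R_D = \begin{bmatrix} \{(0.5, 1.0)\} & \{(0.65, 0.6), (0.75, 0.4)\} \\ \{(0.35, 0.6), (0.25, 0.4)\} & \{(0.5, 1.0)\} \end{bmatrix} \tag{9}$$
$$R_D = \begin{bmatrix} \{(0.5, 1.0)\} & \{(0.85, 0.7)\} \\ \{(0.15, 0.7)\} & \{(0.5, 1.0)\} \end{bmatrix} \tag{10}$$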
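The two expert-panel cases can be worked through numerically. The following minimal Python sketch assumes the usual D-number integration, $I(D) = \sum_i b_i v_i$, and assumes that the reverse preference is obtained by replacing each degree $b$ with $1 - b$ while keeping the support $v$; the function names `integrate` and `reverse` are hypothetical names introduced for illustration.

```python
def integrate(d_number):
    """Integration of a D-number given as (b, v) pairs: sum of b * v."""
    return sum(b * v for b, v in d_number)


def reverse(d_number):
    """Assumed reverse preference: degree 1 - b with the same support v."""
    return [(1.0 - b, v) for b, v in d_number]


# Case 1: 6 of 10 experts give 0.65, 4 of 10 give 0.75 (complete information).
case1 = [(0.65, 0.6), (0.75, 0.4)]
# Case 2: 7 of 10 experts give 0.85, 3 give no opinion (incomplete information).
case2 = [(0.85, 0.7)]

c12_case1 = integrate(case1)            # about 0.69
c21_case1 = integrate(reverse(case1))   # about 0.31
c12_case2 = integrate(case2)            # about 0.595
c21_case2 = integrate(reverse(case2))   # about 0.105
```

In case 1 the two crisp values sum to 1, whereas in case 2 they sum to about 0.7, which reflects the three experts who gave no opinion. This is the situation in which the crisp matrix requires further normalization before weights are derived.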

Steps in Calculating Index Weight by the D-AHP Method
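Step 1. Equations (11) and (12) were used to standardize positive and negative indices, respectively:
$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \tag{11}$$
$$x_i' = \frac{x_{\max} - x_i}{x_{\max} - x_{\min}} \tag{12}$$
where $x_i'$ is the value of index $i$ after normalization; $x_i$ is the original value of index $i$; $x_{\min}$ is the minimum value of the corresponding index; and $x_{\max}$ is the maximum value of the corresponding index.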
Step 2. Experts were organized to measure the importance of each index in terms of the D-number preference relationship to establish the D-number preference matrix, R D .
Step 3. The crisp matrix R C was calculated by the D-number preference matrix R D according to Equation (3).
Step 4. The probability matrix $R_P$, which represents the preference probability between the indices compared in pairs, was constructed from the crisp matrix $R_C$ [50]. The elements of $R_C$ were recorded as $c_{ij}$, while the elements of $R_P$ were denoted as $p_{ij} = \Pr(U_i > U_j), \forall i, j \in \{1, 2, \ldots, n\}$, that is, the degree of preference for considering one index to be more important than another.
Step 5. The triangulation matrix $R_P^T$ was obtained by adjusting the order of the rows and columns in $R_P$ according to the row sums of $R_P$. Equation (15) was used to check whether the inconsistency coefficient was within the acceptable range, where I.D. is the inconsistency coefficient, $R_P^T(i, j)$ is the element in row $i$ and column $j$ of the triangulation matrix $R_P^T$, and $n$ represents the number of indices compared in pairs.
Step 6. The crisp matrix $R_C$ was rearranged into $R_C^T$ in the same way as $R_P^T$ described above. If the elements in $R_C^T$ satisfied $R_C^T(i, j) + R_C^T(j, i) < 1$ (incomplete information), then $R_C^T$ needed to be further normalized to the matrix $\bar{R}_C^T$ according to Equation (16).
Step 7. The weight of each index was calculated based on the $R_C^T$ (or $\bar{R}_C^T$) matrix.
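Two of the steps above lend themselves to a short illustration: the standardization in Step 1 and the row-sum reordering in Step 5. The following minimal Python sketch assumes that the standardization is the usual min-max form (with the direction reversed for negative indices) and that rows and columns are reordered by descending row sum; the function names and the example matrix are hypothetical and are not taken from the study.

```python
def normalize(values, positive=True):
    """Min-max standardization; negative indices are reversed so that a
    larger standardized value always means higher risk."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: a constant index carries no information.
        return [0.0 for _ in values]
    if positive:
        return [(x - lo) / (hi - lo) for x in values]
    return [(hi - x) / (hi - lo) for x in values]


def triangularize(matrix):
    """Reorder rows and columns by descending row sum (assumed ordering).
    Returns the new index order and the reordered matrix."""
    n = len(matrix)
    order = sorted(range(n), key=lambda i: sum(matrix[i]), reverse=True)
    reordered = [[matrix[i][j] for j in order] for i in order]
    return order, reordered


# Hypothetical 3 x 3 probability matrix: index 1 is preferred to indices 0
# and 2, and index 0 is preferred to index 2.
r_p = [
    [0, 0, 1],
    [1, 0, 1],
    [0, 0, 0],
]
order, r_p_t = triangularize(r_p)
```

With this hypothetical input, the reordered matrix has all of its ones above the diagonal, which corresponds to a fully consistent ranking; any ones that remain below the diagonal after reordering would indicate inconsistent pairwise judgments.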

Flood Risk Classification Based on the SOM Clustering Algorithm
The SOM clustering algorithm was proposed by Kohonen [51]. It can classify data with similar quality without the need to specify a cluster number and a cluster center in advance. Therefore, this paper adopted the SOM clustering algorithm to carry out risk classification. The SOM structure contains components called nodes and consists of two parts: an input layer and a clustering layer. The method uses an iterative process to identify weight vectors during the training phase [51]. The Euclidean distance between indices is used as the input vector in the SOM training algorithm [51,52]. The size of the neural network was calculated by Equation (17) [52,53]. The number of clusters was determined by Equation (18) [52,53].
The heuristic equation was proposed to determine the optimal map size. Taking both the quantization error and the topographic error into account, it was calculated as follows:
$$M \approx 5\sqrt{N} \tag{17}$$
where $M$ is the number of map units and $N$ is the number of samples of the training data. The number of clusters can be determined by the Davies-Bouldin index (DBI). The smaller the DBI is, the better the clustering effect. It can be calculated by the following equation:
$$\mathrm{DBI} = \frac{1}{K} \sum_{i=1}^{K} \max_{j \ne i} \frac{C_i + C_j}{d_{ij}} \tag{18}$$
where $K$ is the number of clusters, $C_i$ is the average distance from all samples of cluster $i$ to the center of that cluster, and $d_{ij}$ is the distance between the centers of clusters $i$ and $j$.
The urban flood inundation model consisted of two modules. A rainfall-runoff model and a flow-routing process model were selected for hydrological and hydraulic modeling, respectively. For hydrologic analysis, the study area was divided into 3283 sub-catchments, according to the distribution of buildings, roads, and rivers in the Zhengzhou urban area, and the flow direction of each sub-catchment was determined according to the DEM and slope (Figure 4a). The hydrologic input parameters of the sub-catchments included area, flow length, slope, and imperviousness. In addition, the Horton infiltration method was selected for the hydrology module [54]. For hydraulic simulation, the conduit distribution was generalized into the conduit model via ArcGIS software for simple data processing. First, the one-dimensional (1D) conduit model was built, consisting of 2522 junctions and 2561 conduits (Figure 4b). Then, a two-dimensional (2D) floodplain model was established in the PCSWMM, which was composed of 100,528 grids with a size of 100 m × 100 m. Finally, the 1D conduit model and the 2D floodplain model were linked via orifices (Figure 4c). In addition, the dynamic wave method was used in the hydraulic simulation [54].

Calibration of the Model
In this study, the model was calibrated against the flood-prone areas and the inundation of historical rainfall events.

• Calibration by flood-prone areas: The design rainfall events with return periods of 10 years, 50 years, and 100 years were obtained from the design rainfall equation provided by the Zhengzhou Urban and Rural Planning Bureau and were adopted as the input boundaries of the urban flood inundation model in Zhengzhou. The simulation results were compared with the flood-prone areas in Zhengzhou, and the results are shown in Figure 5 and Table 2. Considering the actual situation, areas with an inundation depth greater than 10 cm were considered to have the greatest impact on people's daily lives. For the return period of 10 years, 39 flood-prone areas generated waterlogging, 33 of which were over 10 cm (Figure 5a). For the return period of 50 years, 40 flood-prone areas generated waterlogging, 35 of which were over 10 cm (Figure 5b). For the return period of 100 years, 41 flood-prone areas generated waterlogging, 38 of which were over 10 cm (Figure 5c). It can be seen from Table 2 that in the three design rainfall events, the proportion of flood-prone areas in which flooding occurred was 78.5%, 83.3%, and 90.5%, respectively, each of which was greater than 75%. Therefore, the model was considered to have high simulation accuracy.
• Calibration by historical inundation depth and area: The inundation depths of the rainfall event from midnight to 2:00 p.m. on 20 July 2021 (hereinafter referred to as rainfall event A) and the rainfall event on 26 July 2011 (hereinafter referred to as rainfall event B) were used to verify the accuracy of the model (as shown in Tables 3 and 4). For rainfall events A and B, the average relative errors between the historical inundation depth and the simulated inundation depth were 13.89% and 17.36%, respectively. The Nash coefficient ($E_{NS}$) was 0.78 for rainfall event A and 0.75 for rainfall event B. Therefore, the model simulated the maximum inundation depth well. The historical inundation area obtained from remote sensing data for rainfall event A was used for comparison with the simulated inundation area. The detailed process for visualizing the historical inundation area was presented in Section 3.3 of [54]. The simulated inundation area for rainfall event A is shown in Figure 6b. The historical inundation area (Figure 6a) and the simulated inundation area (Figure 6b) account for 75.3% and 76.6% of the total area of the study area, respectively. The relative error was 1.7%. Notably, the spatial distributions of the two inundation areas were similar. For example, the inundation area in the Huiji District was concentrated in the south, while the inundation area in the Zhongyuan District was in the central region. In addition, the northeast corner of the Jinshui District, the southwest corner of the Erqi District, and the southeast corner of the Guancheng District were less affected by the flood. Therefore, the model simulated the inundation area well.
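The calibration metrics mentioned above can be sketched in a few lines of Python. This is a minimal illustration that assumes the relative error is defined as the absolute difference divided by the observed value and that the Nash coefficient takes the standard Nash-Sutcliffe form; the depth values below are hypothetical placeholders and are not the observations reported in Tables 3 and 4.

```python
def mean_relative_error(observed, simulated):
    """Average of |simulated - observed| / observed over all points."""
    errors = [abs(s - o) / o for o, s in zip(observed, simulated)]
    return sum(errors) / len(errors)


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the residual sum of
    squares to the variance of the observations around their mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical inundation depths in metres (illustration only).
obs_depth = [0.30, 0.45, 0.60, 0.80]
sim_depth = [0.33, 0.40, 0.66, 0.75]
mre = mean_relative_error(obs_depth, sim_depth)
e_ns = nash_sutcliffe(obs_depth, sim_depth)

# Relative error of the inundation area shares reported in the text.
observed_share, simulated_share = 75.3, 76.6
area_error = abs(simulated_share - observed_share) / observed_share * 100
```

Applied to the reported area shares of 75.3% and 76.6%, the relative error evaluates to roughly 1.7%, in line with the value stated above. A Nash coefficient of 1 would indicate a perfect match between simulated and observed depths.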

Calculation of Urban Flood Risk Assessment Indices
Twelve indices were selected to construct the urban flood risk assessment index system from four perspectives (i.e., DCF, DE, DBB, and DPMC). As the resolution of the urban flood inundation model was 100 m × 100 m, the 12 evaluation indices were also divided into 100,528 grids with the same resolution.
The disaster-causing factor included MD, MVO, and MVE, which were obtained from the model's simulation results for the rainfall event in Zhengzhou from 19 July 2021 to 21 July 2021 (see Section 4.1). As shown in Figure 7a-c, most of the severely flooded areas were located in the northeastern part of the study area, and the maximum inundation velocity was larger in the western part of the study area. The disaster environment indices included DEM, slope, and DRI. As shown in Figure 7d, the southwestern part of the study area was higher and the northeastern part was lower; the terrain of the southwestern part was steep (Figure 7e); and the channels in the eastern part were more densely distributed (Figure 7f). The disaster-bearing body indices included DP, DB, and AGDP. As shown in Figure 7g-i, the middle of the study area was well developed, with a concentrated population, a developed economy, and dense buildings. The disaster prevention and mitigation capability indices included DC, DH, and DRO. Figure 7j-l shows that there were dense roads in the middle of the study area, and the distribution of conduits and hospitals was relatively uniform.

Index Weight Calculation Based on the D-AHP
The weight of each index was calculated from the steps described in Section 3.4. As an example, we present below the weight calculation process of DCF, DE, DBB, and DPMC.
(1) The assessment information of experts on the indicators was collected through questionnaires. Based on the assessment information of experts, the D-number preference matrix $R_D$ was established.
(2) The matrix $R_D$ was converted to a crisp matrix $R_C$, according to Equation (3).
(3) The probability matrix $R_P$ was constructed based on the crisp matrix $R_C$.
(4) The probability matrix $R_P$ was converted to the matrix $R_P^T$ using the triangularization method, with the rows and columns ordered as DCF, DE, DBB, and DPMC.
The ranking of the indicators was calculated as DCF > DE > DBB > DPMC, where the symbol ">" indicates preference.
According to Equation (15) in Section 3.4, the inconsistency coefficient I.D. was 0.05, within the acceptable range.
(5) The matrices $R_C^T$ and $\bar{R}_C^T$ were obtained according to Equations (13), (14) and (18).
(6) The weight equations were constructed from the weight relationship of the indices represented in the matrix $R_C^T$ [29,43], where $a_i$ represents the weight of the $i$th index and $\lambda$ is the credibility of the information provided by experts; the higher the experts' knowledge in the field of problem assessment, the higher the credibility. The specific calculation of $\lambda$ is as follows:
$$\lambda = \begin{cases} \lceil \underline{\lambda} \rceil, & \text{for highly reliable information} \\ n, & \text{for moderately reliable information} \\ \dfrac{n^2}{2}, & \text{for low reliable information} \end{cases} \tag{26}$$
Because the experts are experienced, $\lambda = \lceil \underline{\lambda} \rceil = 1$, according to Equation (26).
Similarly, the weights of all indices could be obtained, and the results are shown in Table 5. In addition, the weights calculated by the AHP methods, which were used in Section 5.1, are also shown in Table 5.
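To make the last step of the weight calculation more concrete, the following minimal Python sketch solves for the weights under an assumed form of the weight relationship that is common in the D-AHP literature, namely $\lambda (a_i - a_{i+1}) = R_C^T(i, i+1) - 0.5$ together with $\sum_i a_i = 1$. Both this relationship and the adjacent preference values used below are assumptions made for illustration; they are not the matrices or results of this study, and `dahp_weights` is a hypothetical function name.

```python
def dahp_weights(adjacent_prefs, lam):
    """Solve for weights a_1 >= ... >= a_n under the ASSUMED relationship
    lam * (a_i - a_{i+1}) = r_i - 0.5, with the weights summing to 1.

    adjacent_prefs holds r_i = R_C^T(i, i+1), the preference of each ranked
    index over the next one in the triangularized crisp matrix."""
    gaps = [(r - 0.5) / lam for r in adjacent_prefs]
    n = len(gaps) + 1
    # offsets[i] = a_i - a_n, i.e. the sum of the gaps from position i on.
    offsets = [sum(gaps[i:]) for i in range(n)]
    last = (1.0 - sum(offsets)) / n
    return [last + off for off in offsets]


# Hypothetical adjacent preferences for four ranked criteria, with lam = 1.
weights = dahp_weights([0.60, 0.55, 0.60], lam=1.0)
```

Under this assumed relationship, a larger $\lambda$ shrinks the gaps between adjacent weights, so information of lower credibility pulls the weights toward equality, while the small $\lambda$ used for experienced experts preserves the differences expressed in the pairwise preferences.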

Flood Risk Classification of Urban Floods Based on the SOM Algorithm
A 40 × 40 SOM neural network was established, according to Equation (17) in Section 3.5. The input planes (Figure 8a) and the weight distance matrix (Figure 8b) were visualized, according to the established SOM neural network. The input planes showed the correlation of different indices. The more similar the connection pattern of the inputs, the more relevant the inputs were. For example, MD and MVO were highly related, due to their inputs (Figure 8a). The weight distance matrix showed the distance of different neurons (Figure 8b). The small hexagon represents neurons, while the large hexagon visualizes the distance between neurons. The darker the color of the large hexagon, the farther the distance. A total of 100,528 grids were stored in different neurons and were classified based on the distance between the neurons. The DBI of different clustering numbers was determined by Equation (18). As shown in Figure 8c, when the clustering number corresponded to 5, the DBI value was the smallest. Therefore, the optimal effect was achieved when the number of clusters was five. The urban flood risk was classified into five corresponding levels from high to low: highest risk, higher risk, medium risk, lower risk, and the lowest risk.
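The map size and the cluster-number criterion can be checked with a short calculation. The following minimal Python sketch assumes that the map-size heuristic is the widely used rule $M \approx 5\sqrt{N}$ and that the DBI takes its standard form based on Euclidean distances; the function names and the toy clusters are hypothetical and serve only as an illustration.

```python
import math


def som_side_length(n_samples):
    """Assumed heuristic: about 5 * sqrt(N) map units on a square map.
    Returns the suggested number of units and the rounded side length."""
    units = 5 * math.sqrt(n_samples)
    return units, round(math.sqrt(units))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters, each a list of points
    (tuples of equal length). Smaller values mean better separation."""
    centers = [[sum(col) / len(c) for col in zip(*c)] for c in clusters]
    # Scatter: average distance of the samples to their cluster center.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


units, side = som_side_length(100528)

# Toy clusters: the same within-cluster spread, different separation.
well_separated = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
poorly_separated = [[(0, 0), (0, 1)], [(1, 0), (1, 1)]]
dbi_good = davies_bouldin(well_separated)
dbi_poor = davies_bouldin(poorly_separated)
```

For the 100,528 grids of the study area, the heuristic suggests roughly 1585 map units, which is close to the 1600 units of a 40 × 40 map. In practice the DBI would be evaluated for each candidate number of clusters on the trained map, and the number with the smallest DBI would be retained.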
A 40 × 40 SOM neural network was established, according to Equation (17) in Section 3.5. The input planes (Figure 8a) and the weight distance matrix (Figure 8b) were visualized, according to the established SOM neural network. The input planes showed the correlation of different indices. The more similar the connection pattern of the inputs, the more relevant the inputs were. For example, MD and MVO were highly related, due to their inputs (Figure 8a). The weight distance matrix showed the distance of different neurons (Figure 8b). The small hexagon represents neurons, while the large hexagon visualizes the distance between neurons. The darker the color of the large hexagon, the farther the distance. A total of 100,528 grids were stored in different neurons and were classified based on the distance between the neurons. The DBI of different clustering numbers was determined by Equation (18). As shown in Figure 8c, when the clustering number corresponded to 5, the DBI value was the smallest. Therefore, the optimal effect was achieved when the number of clusters was five. The urban flood risk was classified into five corresponding levels from high to low: highest risk, higher risk, medium risk, lower risk, and the lowest risk. Figure 9a shows the classification of neurons. The radar diagram of the average values of different cluster indices (after standardization) is shown in Figure 9b. Cluster 5 had the highest disaster-causing factor index value, especially for MD and MVO, which were much higher than other clusters. DEM and DRI (after standardization) were also the largest among the disaster environment. The cluster was considered to be extremely vulnerable to flooding, so cluster 5 was the highest risk area. Cluster 4 had the largest disaster-bearing body index, and its DP, DB, and AGDP were higher than those of other clusters. MD and MVO were second only to cluster 5, so cluster 4 was considered as the higher risk area. 
The index values of cluster 1 were all relatively small; its DEM, MD, and MVO were the smallest, making it the least prone to flood disasters. Therefore, cluster 1 belonged to the lowest risk area. Cluster 2 had dense conduits and a larger DEM but the largest MVE; its other indices were greater than those of cluster 1, so cluster 2 was classified as the lower risk area. Cluster 3 had the smallest DRI, a smaller DC, and a lower DEM than those of clusters 1 and 2, so cluster 3 belonged to the medium risk area.
Figure 10 shows the spatial distribution of the urban flooding risk in Zhengzhou. The highest risk areas were mainly distributed in the central and eastern parts of the Jinshui District, the eastern part of the Huiji District, and the northeastern part of the Guancheng District, accounting for 9.86% of the total area. The higher risk areas accounted for 24.26% of the total area and were mainly located in the western and southern parts of the Jinshui District, the southern part of the Huiji District, the middle and eastern parts of the Zhongyuan District, the northeastern part of the Erqi District, and the northwestern part of the Guancheng District. The medium risk areas accounted for the largest proportion (27.57%) of the total area and were mainly located in the central and northern parts of the Huiji District, the northern part of the Zhongyuan District, and the eastern part of the Guancheng District. The lower risk areas and the lowest risk areas accounted for 24.26% and 8.95% of the total area, respectively, and were mainly distributed in the southwestern part of the study area and the western part of the Huiji District. To verify the accuracy of the risk distribution, the above assessment results were compared with the inundation data and actual economic loss data of the rainfall events from 19 July 2021 to 21 July 2021. The highest risk areas and the higher risk areas in the study area were consistent with the areas that suffered the most severe flood disaster, indicating that the application of the D-AHP method and the SOM clustering algorithm in flood risk assessment was feasible and that the calculation results were reasonable.
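The selection of the clustering number described above (computing the DBI for each candidate number of clusters and keeping the one with the smallest value) can be sketched as follows. This is only a minimal illustration: it assumes that Equation (18) follows the standard Davies-Bouldin definition with Euclidean distances, and the toy points, the candidate labelings, and the function name `davies_bouldin` are hypothetical rather than taken from the study.

```python
import math


def davies_bouldin(points, labels):
    """Standard Davies-Bouldin index; lower values mean compact, well-separated clusters."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    # Centroid of each cluster (mean of every coordinate).
    centroids = {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in clusters.items()
    }
    # Scatter of each cluster: mean distance of its members to the centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    total = 0.0
    for i in clusters:
        # Worst-case (largest) similarity ratio between cluster i and any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Hypothetical toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]

# Hypothetical candidate labelings for two different clustering numbers.
candidates = {
    2: [0, 0, 0, 0, 1, 1, 1, 1],
    4: [0, 0, 1, 1, 2, 2, 3, 3],
}

scores = {k: davies_bouldin(points, labels) for k, labels in candidates.items()}
best_k = min(scores, key=scores.get)  # clustering number with the smallest DBI
print(scores, best_k)
```

In the study, the points would be the SOM neuron weight vectors and the candidate labelings would come from clustering those neurons; here both are stand-ins chosen so that the two-cluster labeling is clearly the better one.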

Comparison with Other Methods
The results of the D-AHP method and the SOM clustering method were compared with those of the AHP-SOM clustering method (Figure 11a), the D-AHP and natural-break method (Figure 11b), and the TOPSIS-with-entropy-weighting method (Figure 11c) to examine how these algorithms differ in flood risk assessment. The flood risk distribution obtained by the AHP-SOM method is shown in Figure 11a. The index weights obtained from the AHP method are shown in Table 5. The proportions of the highest risk areas and the higher risk areas were 11.18% and 20.54%, respectively. These area proportions were broadly consistent with those determined by the D-AHP and the SOM clustering method, but their spatial distributions were quite different. The results of the AHP-SOM method were unreasonable in several regions. For example, as shown in Figure 11a, region 1 was assigned to the highest risk areas, but it has high elevation, an extremely low population density, and an underdeveloped economy; therefore, it would more reasonably belong to the lower risk area. Although the terrain of regions 2 and 3 is relatively low, these regions have large DC, low DP, and an underdeveloped economy, so they were more suitable for classification as medium risk areas. Regions 4 to 9 are low-lying and close to the river, with large MD and a relatively developed economy; their risk level should therefore be higher risk or highest risk. The main reason for the inaccuracy of the AHP-SOM method was that the traditional AHP method ignores the uncertainty of evaluation information, resulting in some index weights that do not conform to reality; as a result, grids are grouped into a class mainly according to a specific index. For example, the weights of MD and MVO in this study were too high, so the other indices had little impact on the flood risk results. Therefore, it may be concluded that the AHP-SOM method was greatly influenced by expert evaluations and had strong subjectivity, and that its evaluation result was not ideal.
Figure 11b shows the flood risk map based on the D-AHP and natural-break method. The proportions of the highest risk areas and the higher risk areas were 10.36% and 24.11%, respectively. The proportions of the lower risk areas and the lowest risk areas were 26.84% and 12.08%, respectively. The spatial distribution of the different levels of flood risk differed greatly from that of the D-AHP and the SOM clustering method, and the risk levels in some areas were obviously unreasonable. In Figure 11b, regions 10 and 11 are the banks of the Yellow River. Although the DRI in these areas was very small, the DP and DB were also very small and the economy was not developed; therefore, it was reasonable for these areas to be medium risk areas. Regions 12 and 13 are located in high-altitude areas that are sparsely populated and economically underdeveloped; therefore, it was appropriate to change their risk level to lowest risk. Although the DC in regions 14, 15, and 16 was relatively dense, these regions all had high DB, DP, AGDP, and MD and were vulnerable to flood disasters. Therefore, there was good reason to rate them as higher risk areas. It may be concluded that the flood risk classification by the SOM clustering algorithm was reasonable, and that the evaluation results of the D-AHP and natural-break method were not as accurate as those of the method proposed in this study.
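For readers unfamiliar with the natural-break classification used in the comparison above, the following sketch shows one common formulation (Fisher-Jenks): class boundaries on a one-dimensional risk score are chosen to minimize the within-class sum of squared deviations. The study does not state which implementation was used, so this is an assumption, and the toy scores and the names `natural_breaks` and `risk_level` are hypothetical.

```python
from bisect import bisect_left


def natural_breaks(values, n_classes):
    """Upper bound of each class, minimizing within-class squared deviation (Fisher-Jenks)."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and the number of values")

    # Prefix sums allow the squared deviation of any slice to be computed in O(1).
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j] (j > i)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: minimal cost of splitting the first j values into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i

    # Backtrack from the last class to recover each class's largest value.
    bounds, j = [], n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[k][j]
    return bounds[::-1]


def risk_level(score, bounds):
    """Index of the class a score falls into (0 = lowest class)."""
    return min(bisect_left(bounds, score), len(bounds) - 1)


# Hypothetical weighted risk scores for eight grid cells.
scores = [0.10, 0.12, 0.15, 0.50, 0.52, 0.55, 0.90, 0.95]
bounds = natural_breaks(scores, 3)
print(bounds, [risk_level(s, bounds) for s in scores])
```

Unlike the SOM clustering, this classification operates only on the single aggregated score, so two cells with the same score always receive the same level regardless of how their individual indices differ.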
The flood risk distribution obtained by the TOPSIS-with-entropy-weighting method is shown in Figure 11c. The evaluation result of this method differed considerably from those of the other methods. The proportions of the lowest risk areas and the lower risk areas were as high as 41.23% and 29.22%, respectively. In contrast, the highest risk areas and the higher risk areas accounted for only 0.75% and 7.35%, respectively. However, most areas of Zhengzhou were flooded on 20 July 2021, indicating that the flood risk classification by the TOPSIS method was not reasonable. In addition, few of the highest risk areas were located in the eastern part of the Jinshui District, which suffered from severe floods on 20 July 2021, so the results did not correspond to the actual situation. Therefore, the proposed method was more useful for the flood risk assessment in the study area.
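The TOPSIS-with-entropy-weighting baseline discussed above can be sketched as follows, using the textbook formulation: entropy weights are derived from the dispersion of each index, and each cell is then scored by its relative closeness to the ideal solution. The sketch assumes that all indices are positive and already oriented so that larger values mean higher risk; the exact variant used in the comparison is not given in this section, and the toy matrix and function names are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for positive indices (rows = grid cells, columns = indices)."""
    n = len(matrix)
    divergences = []
    for col in zip(*matrix):
        total = sum(col)
        probs = [v / total for v in col]
        # Normalized Shannon entropy of the column; 1 means no discriminating power.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    return [d / norm for d in divergences]


def topsis(matrix, weights):
    """Relative closeness of each row to the ideal solution (1 = ideal, 0 = anti-ideal)."""
    # Vector-normalize each column, then apply the index weights.
    norms = [math.sqrt(sum(v * v for v in col)) for col in zip(*matrix)]
    weighted = [
        [w * v / nrm for v, w, nrm in zip(row, weights, norms)] for row in matrix
    ]
    best = [max(col) for col in zip(*weighted)]
    worst = [min(col) for col in zip(*weighted)]
    closeness = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        closeness.append(d_worst / (d_best + d_worst))
    return closeness


# Hypothetical standardized index values for three grid cells and three indices.
matrix = [
    [0.9, 0.8, 0.7],
    [0.5, 0.4, 0.6],
    [0.1, 0.2, 0.1],
]
weights = entropy_weights(matrix)
closeness = topsis(matrix, weights)
print(weights, closeness)
```

Because the closeness values are relative to the best and worst cells in the data set, the resulting risk levels still depend on how the closeness range is subsequently divided into classes.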

Limitation of the Proposed Approach and Future Work
In this study, we developed a novel approach for urban flood risk assessment by integrating the D-AHP and the SOM algorithm. The D-AHP method describes uncertain evaluation information well and mitigates the subjectivity of the AHP. The SOM clustering algorithm avoids the subjectivity caused by the manual determination of the flood risk classification threshold. This study demonstrated the accuracy of the proposed method in urban flood risk assessment. However, the approach has the following limitations: (1) In this article, a case study of Zhengzhou, China, was adopted to test the applicability of the proposed D-AHP method and the SOM clustering algorithm. Due to the limitation of data, 12 indices were selected for urban flood risk assessment. In the future, a more comprehensive index system should be selected for urban flood risk assessment from the perspectives of DCF, DE, DBB, and DPMC, for example by considering the distribution of critical infrastructure, pumping stations, and flood warning stations. (2) Because the observed data of inundation depth in the study area were difficult to obtain, the study mainly adopted the data of flood-prone areas that were provided by the Zhengzhou Municipal Administration Bureau and the observed data of two rainfall events to calibrate the urban flood simulation model. In the future, with an increase in observed inundation data, the accuracy of the urban flood simulation model may be further improved, and more accurate index values, such as maximum inundation depth, maximum inundation volume, and maximum inundation velocity, can be obtained. (3) Urban flood risk distribution is the basis for determining flood reduction measures. In the future, we will conduct research on the selection, placement, and scale optimization of flood reduction measures, according to the distribution of high-risk areas.

Conclusions
In this study, an integrated approach based on the D-number-improved AHP and the self-organizing map (SOM) algorithm was proposed for urban flood risk assessment. Taking Zhengzhou in China as a case study, 12 indices were selected from the four aspects of DCF, DE, DBB, and DPMC. An urban flood inundation model and GIS technology were used to quantify the evaluation indices. Considering the uncertainty of the evaluation information, the D-AHP method was adopted to quantify the index weights. In addition, the SOM clustering algorithm was used to classify the flood risk level automatically and to avoid the subjective determination of the flood risk classification threshold.
The flood risk in Zhengzhou was classified into five levels from high to low: highest risk, higher risk, medium risk, lower risk, and lowest risk. The highest risk areas were mainly distributed in the central and eastern parts of the Jinshui District, the eastern part of the Huiji District, and the northeastern part of the Guancheng District, accounting for 9.86% of the total area. The higher risk areas accounted for 24.26% of the total area and were mainly located in the western and southern parts of the Jinshui District, the southern part of the Huiji District, the middle and eastern parts of the Zhongyuan District, the northeastern part of the Erqi District, and the northwestern part of the Guancheng District. For comparison, three other approaches (the AHP-SOM method, the D-AHP and natural-break method, and the TOPSIS-with-entropy-weighting method) were also applied to urban flood risk assessment. The results demonstrated that the integrated approach of the D-AHP method and the SOM clustering algorithm is more reasonable and scientific.
This study provides a new approach for urban flood risk assessment, which can provide valuable information for urban flood risk management, flood control, and mitigation planning in the study area and in other areas.