Hybrid Models Incorporating Bivariate Statistics and Machine Learning Methods for Flash Flood Susceptibility Assessment Based on Remote Sensing Datasets

Abstract: Flash floods are considered to be one of the most destructive natural hazards, and they are difficult to accurately model and predict. In this study, three hybrid models were proposed, evaluated, and used for flood susceptibility prediction in the Dadu River Basin. These three hybrid models integrate a bivariate statistical method of the fuzzy membership value (FMV) and three machine learning methods of support vector machine (SVM), classification and regression trees (CART), and convolutional neural network (CNN). Firstly, a geospatial database was prepared comprising nine flood conditioning factors, 485 flood locations, and 485 non-flood locations. Then, the database was used to train and test the three hybrid models. Subsequently, the receiver operating characteristic (ROC) curve, seed cell area index (SCAI), and classification accuracy were used to evaluate the performances of the models. The results reveal the following: (1) The ROC curve highlights the fact that the CNN-FMV hybrid model had the best fitting and prediction performance, and the area under the curve (AUC) values of the success rate and the prediction rate were 0.935 and 0.912, respectively. (2) Based on the results of the three model performance evaluation methods, all three hybrid models had better prediction capabilities than their respective single machine learning models. Compared with their single machine learning models, the AUC values of the SVM-FMV, CART-FMV, and CNN-FMV were 0.032, 0.005, and 0.055 higher; their SCAI values were 0.05, 0.03, and 0.02 lower; and their classification accuracies were 4.48%, 1.38%, and 5.86% higher, respectively. (3) Based on the results of the flood susceptibility indices, between 13.21% and 22.03% of the study area was characterized by high and very high flood susceptibilities. The three hybrid models proposed in this study, especially CNN-FMV, have a high potential for application in flood susceptibility assessment in specific areas in future studies.


Introduction
Flash floods arise from interactions between the hydrological and atmospheric systems. They are characterized by a runoff peak that develops within minutes to hours during or after heavy rainfall, and they generally occur in river basins smaller than 200 km² [1]. They are considered to be one of the most devastating and frequent natural disasters worldwide [2]. Over the last few decades, the number of such disasters and the damage they cause have increased significantly due to global climate change [3]. For example, ~4.63 million km² of China is susceptible to flash floods, which have threatened 560 million people [4]. In Europe, 40% of flood-related casualties between 1950 and 2006 were caused by flash floods [5], and the percentage in southern Europe has exceeded 80% [6]. Therefore, studying flood susceptibility (defined as the probability of flooding) is indispensable for developing preventive measures and reducing the impacts of flash floods. The accurate identification of flood-prone areas plays a crucial role in this task [7]. Thus, high-precision flood susceptibility maps are widely used in flood risk management as a non-structural measure [8].
In recent years, several types of methods have been applied in flood susceptibility mapping, including knowledge-driven multi-criteria decision analysis (MCDA) methods, physically based simulation methods, and statistical and machine learning methods driven by historical data. The MCDA methods (e.g., the analytical hierarchy process and analytical network process) are considered to be simple but subjective [9]. The physically based simulation methods (e.g., MIKE 11 [10] and FLO-2D) can describe the details of a flash flood, but they require large amounts of input data and substantial computational resources [11]. In contrast, the more recent statistical and machine learning methods overcome the shortcomings of the MCDA and physically based simulation methods and are able to predict flood susceptibility quickly and accurately. Therefore, they have been widely used in many recent studies [12][13][14][15]. The main statistical methods used in flood susceptibility mapping include the frequency ratio [15], weights of evidence [16], index of entropy [17], and statistical index [18]. However, due to the complex mechanism of flood occurrence, the accuracies of these statistical methods are not very high [19]. In this context, machine learning methods have been introduced to improve the accuracy of flood susceptibility predictions because they handle nonlinear problems better [20]. In recent years, many machine learning algorithms have been successfully applied in the assessment of the susceptibility to natural disasters, such as the support vector machine (SVM) [21], random forest algorithm [22], classification and regression trees (CART) [23], boosted regression trees [24], and artificial neural networks [25]. These models treat past flood points as the dependent variable and the flood conditioning factors as explanatory variables. In addition, past flood data can be used to evaluate the performance of these models, which is one of their advantages.
Although machine learning methods have significantly improved the accuracy of flood susceptibility predictions compared to statistical methods, no single method or technique is considered to be the best in all areas. One of the reasons is that every single method has shortcomings that limit the performance of the flood susceptibility prediction [17]. With a single method, it is difficult to ensure that the input data represent the flooding in the most appropriate way, which may cause the model to miss the best fit function or the true distribution of the sample set [17]. A hybrid model is considered to be an effective technique for solving this problem. Therefore, integrating statistical and machine learning methods to create hybrid models has become a popular trend in recent research. In this regard, Costache et al. used one bivariate statistical method (Statistical Index) and its novel ensemble with the following machine learning models: Logistic Regression, Classification and Regression Trees, Multilayer Perceptron, Random Forest, Support Vector Machine, and Decision Tree CART to predict the flash flood susceptibility in the Bâsca Chiojdului River Basin. Their results show that the proposed Multilayer Perceptron-Statistical Index (MLP-SI) ensemble had the highest efficiency [18]. Wang et al. integrated two independent models of frequency ratio and index of entropy with multilayer perceptron and classification and regression tree models to evaluate the flood susceptibility of Poyang County, China [17]. Tehrany  SVM showed the best performance [26]. In addition, many other studies have made contributions in this regard [27,28]. The common conclusion of the above studies is that hybrid models have improved flood susceptibility prediction capabilities. Therefore, new hybrid models need to be investigated further.
In this context, based on nine flash flood conditioning factors, three hybrid models were proposed in this study to predict the flood susceptibility in the Dadu River Basin by integrating a bivariate statistical method (FMV) and three machine learning methods (SVM, CART, and CNN). The aims of this study are as follows: (1) proposing and validating the three hybrid models to enrich the methods for predicting flood susceptibility and (2) predicting and assessing the flood susceptibility of the Dadu River Basin to mitigate the negative effects of flash flood disasters in the study area.

Study Area
The present study is focused on the Dadu River Basin (28°24′-33°65′N, 99°62′-103°77′E), which is situated on the eastern margin of the Tibetan Plateau and to the west of the Sichuan Basin (Figure 1). The Dadu River, as a tributary of the Min River and a sub-tributary of the Yangtze River, has a full length of 1062 km, an elevation drop of 4175 m, and a catchment area of 90,016 km². The terrain of the study area is highly undulating, with altitudes ranging from 337 to 7304 m. The precipitation in the Dadu River Basin increases from north to south and reaches 116 mm/day in the south. Eighty percent of the precipitation in the study area occurs from May to October [29]. Despite the high degree of afforestation (78%), the high precipitation, steep slopes, and dense ditches frequently lead to severe flash flood events [30,31].

Flash Flood Inventory Map
The inventory of the areas previously affected by flash floods is the basic information for predicting areas where flash floods could occur in the future [27]. In particular, for machine learning and statistical models, the accuracy of the historical flash flood locations significantly affects the prediction results. In this study, the flash flood inventory maps were obtained from the National Flash Flood Investigation and Evaluation Project (NFFIEP), which was launched by the Ministry of Water Resources of China and the Ministry of Finance of China in 2013 [4]. In this project, the flood disaster areas were determined through data collection and analysis and field surveys, and they were treated as points for collection [4]. Thus, the central location of the historical flash flood ditch is used to indicate the location where the flash flood occurred, because the age of many flash flood events makes it difficult to determine their boundaries and heights today. The project recorded the longitude, latitude, time, casualties, and economic losses of historical flash flood events from 1949 to 2015. More importantly, the reliability and attributes of these flash floods were strictly inspected by the experts and scholars of the China Institute of Water Resources and Hydropower [32]. Since its establishment, this database has successfully served a large number of studies [4,33,34].
Remote Sens. 2021, 13, 4945
There were a total of 485 flash floods in the Dadu River Basin (Figure 1). In addition to the flooded points, we randomly selected an equal number of non-flooded points in the study area. When applying them in the machine learning and statistical models, a value of 1 was assigned to the flooded points (i.e., the positive samples), and a value of 0 was assigned to the non-flooded points (i.e., the negative samples). Then, 70% of the positive and negative samples were combined as the training sample (340 flood points and 340 non-flood points), and the remaining 30% were used as the validation sample (145 flood points and 145 non-flood points). In order to select the best model in the training process, the training sample was further divided into a sub-training sample (80%; 272 flood points and 272 non-flood points) and a testing sample (20%; 68 flood points and 68 non-flood points). Thus, 272 flood and 272 non-flood points were used to train the models, 68 flood and 68 non-flood points were used to test the models, and 145 flood and 145 non-flood points were used to validate the models.
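The sample counts above can be reproduced with a short sketch. It assumes that each class is shuffled and split separately and that fractional counts are rounded up (485 × 0.7 = 339.5 becomes 340); the seed, the placeholder point IDs, and the function names are illustrative and not taken from the study.

```python
import random

def ceil_percent(n, pct):
    """Return ceil(n * pct / 100) using integer arithmetic only."""
    return -(-n * pct // 100)

def split_class(points, rng, train_pct=70, sub_train_pct=80):
    """Shuffle one class and split it into sub-training, testing, and validation parts."""
    pts = list(points)
    rng.shuffle(pts)
    n_train = ceil_percent(len(pts), train_pct)    # 70% training sample
    n_sub = ceil_percent(n_train, sub_train_pct)   # 80% of the training sample
    return pts[:n_sub], pts[n_sub:n_train], pts[n_train:]

rng = random.Random(42)  # fixed seed: an arbitrary choice for reproducibility
flood = [(i, 1) for i in range(485)]      # placeholder IDs, label 1 (flood)
non_flood = [(i, 0) for i in range(485)]  # placeholder IDs, label 0 (non-flood)

flood_sub, flood_test, flood_val = split_class(flood, rng)
non_sub, non_test, non_val = split_class(non_flood, rng)

sub_training = flood_sub + non_sub
testing = flood_test + non_test
validation = flood_val + non_val
print(len(flood_sub), len(flood_test), len(flood_val))  # 272 68 145
```

Splitting each class separately keeps the 1:1 ratio of flood to non-flood points in all three subsets.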

Flash Flood Conditioning Factors
Based on the formation mechanism of flash floods and on previous studies [35][36][37], we considered nine factors from two perspectives (triggering factors and the hazard-formative environment). These flash flood conditioning factors include the altitude, slope, slope aspect, topographic wetness index (TWI), maximum three-day precipitation (M3DP), land cover, soil texture, normalized difference vegetation index (NDVI), and distance to the river (DR). Each of these factors was converted into a gridded database with a spatial resolution of 1 km × 1 km in ArcGIS. The primary sources of the factors are presented in Table 1. Short descriptions of how the factors influence flash flood occurrence are provided below.
Altitude (Figure 2a) is an important factor affecting flood occurrence. In general, areas with lower altitudes experience higher river discharge and a greater likelihood of flooding [38]. The altitude of the study area increases from southeast to northwest, ranging from 337 to 7304 m. In this study, the altitude was represented by a digital elevation model (DEM) obtained from the Geospatial Data Cloud site, Computer Network Information Center, Chinese Academy of Sciences (GDC). The DEM was resampled to a spatial resolution of 1 km × 1 km using the Resample tool in ArcGIS.
Slope (Figure 2b) is considered to be one of the factors with the most influence on flash flood genesis [27]. Because water flows downhill, the slope directly affects the vertical percolation and surface runoff [39]. Generally, floods occur more frequently in areas with low slopes. The slopes in the study area are highly variable, even reaching 52.6° in some places. The slope factor used in this study was calculated using the DEM.
Slope aspect (Figure 2c) is defined as the direction of the maximum slope of the terrain surface [40], and it was also obtained from the DEM in this study. It is generally accepted that the slope aspect affects flooding indirectly by controlling various geographic and environmental factors, such as vegetation, soils, and rainfall [41]. In this study, the slope aspect map was divided into 10 categories ranging from flat to north zones.
The TWI (Figure 2d) is a significant factor in flood susceptibility mapping, which reflects the geotechnical wetness [42]. As a signal for water accumulation in a river basin, the TWI values are positively correlated with the likelihood of flooding. In this study, the TWI was also calculated from the DEM, and the detailed calculation formula was presented by Ali et al. [43].
The M3DP (Figure 2e), a representative factor for rainfall, is a triggering factor for flooding. It has been shown to have a non-negligible influence on the occurrence of floods [44]. Higher M3DP values generally signify a higher risk of flooding. The rainfall in the study area decreases from south to north, and the M3DP varies from 37 to 167 mm. The M3DP values used in this study were calculated using the Global Precipitation Measurement (GPM) database, which records the average daily precipitation across the world. As a new generation of precipitation observation satellites, GPM was launched in February 2014 by the National Aeronautics and Space Administration, integrating advanced microwave detection technology and data correction algorithms [45]. More importantly, its applicability in many regions of China has been verified [46]. The GPM data were resampled to 1 km × 1 km using the kriging interpolation method [17].
The land cover (Figure 2f) and soil texture (Figure 2g) are considered to have significant effects on flooding and drought incidences [47]. In the study area, there are six types of land cover and 12 types of soil texture. Both factors were derived from the Data Center for Resources and Environmental Sciences, Chinese Academy of Sciences. The land cover data provided by this platform were produced by human visual interpretation based on Landsat TM.
The NDVI (Figure 2h) has been used extensively for predicting flood susceptibility [48], and it represents the status of the vegetation cover. The higher the NDVI, the higher the vegetation cover and the lower the potential for flooding. The study area is largely covered with vegetation, with NDVI values of 0-0.92. The NDVI data used in this study were obtained from the National Earth System Science Data Center.
The DR (Figure 2i) is also a commonly used factor for identifying the flood susceptibility because rivers are the main pathways for flood discharge and areas near rivers are susceptible to flooding [23]. The DR values were calculated by imposing multiple buffer zones at 1000 m intervals around the four-level river systems.
To prepare the input for the bivariate statistics method, following the method used in previous studies [18,19], all five continuous numerical factors (altitude, slope, TWI, M3DP, and NDVI) were divided into five classes using the natural break method.
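As a small illustration of the M3DP factor, the sketch below assumes that M3DP is the largest precipitation total over any three consecutive days in a daily series for one grid cell. This definition, the function name, and the sample values are assumptions made for illustration; the source does not spell out the calculation.

```python
def max_three_day_precipitation(daily_mm):
    """Largest total over any three consecutive days of a daily series (mm)."""
    if len(daily_mm) < 3:
        raise ValueError("need at least three daily values")
    # Slide a three-day window over the series and keep the largest sum.
    return max(sum(daily_mm[i:i + 3]) for i in range(len(daily_mm) - 2))

# Hypothetical daily precipitation (mm) for one grid cell.
daily = [0, 10, 50, 40, 26, 0, 5]
m3dp = max_three_day_precipitation(daily)
print(m3dp)  # 116
```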

Information Gain Method
To avoid overfitting, feature selection is an essential step before applying the machine learning algorithms. This is because the use of less redundant data results in a performance boost of the model and leads to less opportunity for noise-based decisions [49]. Therefore, feature selection was used to improve the performance of the flood susceptibility model.
The information gain (IG) method is a popular feature selection method, which was proposed by Hunt et al. (1966). It was selected not only for its ability to eliminate invalid indicators, but also for its ability to rank the importance of the input variables [50]. Information theory is the basis of the IG method, which calculates the amount of gain. The IG value for each flash flood conditioning factor $F_i$ was estimated using the following formula [51]:
$$IG(Y, F_i) = H(Y) - H(Y \mid F_i) \tag{1}$$
where $H(Y)$ is the entropy of $Y$, and $H(Y \mid F_i)$ is the entropy of $Y$ after associating the values of the flash flood conditioning factor $F_i$.
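A minimal sketch of the IG calculation follows, assuming the standard form IG(Y, F_i) = H(Y) − H(Y|F_i) with Shannon entropy in bits, a binary flood label Y, and a factor that has already been divided into classes. The sample data and function names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, factor_classes):
    """IG = H(Y) - H(Y | F): entropy reduction after grouping by factor class."""
    n = len(labels)
    groups = {}
    for y, f in zip(labels, factor_classes):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical samples: 1 = flood, 0 = non-flood.
y = [1, 1, 0, 0]
informative = ["low", "low", "high", "high"]    # separates the labels perfectly
uninformative = ["low", "high", "low", "high"]  # carries no information about y
ig_informative = information_gain(y, informative)
ig_uninformative = information_gain(y, uninformative)
```

A factor whose classes carry no information about the flood labels yields an IG of 0, which is the criterion used later to exclude a factor from modeling.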

Variance Inflation and Tolerance
Multi-collinearity analysis was carried out to evaluate the correlations between the flash flood conditioning factors. In general, severe multi-collinearity prevents a statistical model from estimating the contribution of each factor accurately, because correlated factors are described incorrectly [52]. The variance inflation factor (VIF) and tolerance (TOL) were used to examine the multi-collinearity among the factors in this study. When VIF > 10 or TOL < 0.1, the factor has a multi-collinearity problem and needs to be eliminated [52].
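The two diagnostics are reciprocals of each other. The sketch below assumes the usual definitions, TOL = 1 − R² and VIF = 1/TOL, where R² comes from regressing one factor on all of the others; the regression itself is omitted, and the function names and the R² value are illustrative.

```python
def tolerance_and_vif(r_squared):
    """TOL = 1 - R^2 and VIF = 1 / TOL for one factor regressed on the others."""
    tol = 1.0 - r_squared
    return tol, 1.0 / tol

def has_collinearity_problem(tol, vif):
    """Apply the thresholds stated in the text: VIF > 10 or TOL < 0.1."""
    return vif > 10 or tol < 0.1

# Hypothetical R^2 of 0.95 gives TOL of about 0.05 and VIF of about 20.
tol, vif = tolerance_and_vif(0.95)
flagged = has_collinearity_problem(tol, vif)
```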

Bivariate Statistics Method
The FMV was selected as the bivariate statistics method in this study. Fuzzy logic was first proposed by Lotfi Zadeh in 1965 as an extension of classical set theory [53]. In reality, the relationship between variables or concepts is often imprecise and ambiguous, which makes it difficult to describe the relationship using only values of 0 (irrelevant) or 1 (relevant). Fuzzy logic addresses this problem by taking a graded view of the real world [53]. For example, if black is represented by 0 and white is represented by 1, then gray is a number between 0 and 1; that is, fuzzy logic expresses the degree of truth as a number between 0 and 1 [54]. Fuzzy principles have been implemented in several methods, such as the frequency ratio (FR) and weights of evidence [55], among which the FR is one of the most popular [56]. The FR can be calculated as follows [56]:
$$FR_{ij} = \frac{A_{ij} / A_j}{B_{ij} / B_j} \tag{2}$$
where $FR_{ij}$ is the FR value of class $i$ of factor $j$, $A_{ij}$ is the number of flash floods within class $i$ of factor $j$, $A_j$ is the total number of flash floods in factor $j$, $B_{ij}$ is the number of pixels in class $i$ of factor $j$, and $B_j$ is the total number of pixels in factor $j$.
After calculating the $FR_{ij}$, the fuzzy membership values are obtained using the following equation:
$$u_{ij} = \frac{FR_{ij} - \min(FR_j)}{\max(FR_j) - \min(FR_j)} \tag{3}$$
where $u_{ij}$ is the FMV of class $i$ of factor $j$, and $\min(FR_j)$ and $\max(FR_j)$ are the minimum and maximum FR values among the classes of factor $j$.
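The two steps can be sketched for a single factor as follows. The sketch assumes that the FMV is a min-max rescaling of the FR values within the factor, so that a class with no flood points gets 0 and the class with the highest FR gets 1. The class counts and function names are hypothetical.

```python
def frequency_ratios(flood_counts, pixel_counts):
    """FR per class: share of floods in the class divided by share of pixels."""
    total_floods = sum(flood_counts)
    total_pixels = sum(pixel_counts)
    return [(a / total_floods) / (b / total_pixels)
            for a, b in zip(flood_counts, pixel_counts)]

def fuzzy_membership_values(fr):
    """Min-max rescaling of the FR values of one factor to the range [0, 1]."""
    lo, hi = min(fr), max(fr)
    if hi == lo:
        # Degenerate case (all classes equal); returning zeros is an arbitrary choice.
        return [0.0 for _ in fr]
    return [(v - lo) / (hi - lo) for v in fr]

# Hypothetical factor with four classes.
floods = [30, 15, 5, 0]         # flood points per class
pixels = [100, 200, 300, 400]   # pixels per class
fr = frequency_ratios(floods, pixels)
fmv = fuzzy_membership_values(fr)
print([round(v, 3) for v in fmv])  # [1.0, 0.25, 0.056, 0.0]
```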

Support Vector Machine
The SVM algorithm is a robust machine learning prediction method based on the structural risk minimization principle and statistical learning theory [57]. This algorithm implements binary classification by constructing a hyperplane, which can divide the training data based on the bands and an optimization algorithm. This hyperplane is generated by transforming the original input space into a higher-dimensional feature space [58]. After finding the hyperplane, the support vectors, whose positions are closest to the hyperplane, can be identified [59]. Then, a specific kernel function is applied to transform the input data into two classes: non-flood susceptibility and flood susceptibility {0, 1} [23]. Several kernel functions are used in SVM, but numerous studies [19,24] have shown that the radial basis function (RBF) performs better than the other kernels in the context of flood prediction. Thus, the RBF was used in this study. The major steps of the algorithm are as follows:
(1) Assume that $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ is the training set of known samples, where $x_i$ is the $i$th input datum and $y_i$ is the corresponding output, with $i = 1, 2, \ldots, n$.
(2) Separate the training set into two categories using an n-dimensional hyperplane to obtain the maximum interval:
$$\min \frac{1}{2}\lVert w \rVert^2 \tag{4}$$
subject to
$$y_i\left((w \cdot x_i) + b\right) \geq 1 \tag{5}$$
where $\lVert w \rVert$ is the norm of the hyperplane normal, $b$ is a scalar base, and $(\cdot)$ represents the scalar product operation.
(3) Using the Lagrange multiplier, the cost function can be defined as follows:
$$L = \frac{1}{2}\lVert w \rVert^2 - \sum_{i=1}^{n} \lambda_i \left( y_i\left((w \cdot x_i) + b\right) - 1 \right) \tag{6}$$
where $\lambda_i$ is the Lagrangian multiplier. By using standard procedures, the solution can be obtained by dual minimization of Equation (6) with respect to $w$ and $b$ [60].
(4) For the non-separable case, the constraints can be modified by introducing slack variables $\xi_i$ [60]:
$$y_i\left((w \cdot x_i) + b\right) \geq 1 - \xi_i \tag{7}$$
Thus, Equation (6) becomes
$$L = \frac{1}{2}\lVert w \rVert^2 - \frac{1}{\nu n} \sum_{i=1}^{n} \xi_i \tag{8}$$
where $\nu \in (0, 1]$ is introduced in order to account for misclassifications [61]. The RBF is described as follows:
$$K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right) \tag{9}$$
where $\gamma$ is the parameter of the kernel function. Sometimes kernel functions are parameterized using $\gamma = 1/(2\sigma^2)$, where $\sigma$ is an adjustable parameter that governs the performance of the kernel.
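To make the kernel concrete, the sketch below evaluates the RBF in its standard form, K(x, z) = exp(−γ‖x − z‖²), together with the γ = 1/(2σ²) parameterization mentioned in the text. The feature vectors and function names are hypothetical, and the sketch covers only the kernel, not the SVM training itself.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and z)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def gamma_from_sigma(sigma):
    """Alternative parameterization: gamma = 1 / (2 * sigma^2)."""
    return 1.0 / (2.0 * sigma ** 2)

gamma = gamma_from_sigma(1.0)                         # 0.5
k_same = rbf_kernel([0.2, 0.5], [0.2, 0.5], gamma)    # identical points give 1.0
k_far = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma)     # exp(-1), about 0.368
```

Larger γ (smaller σ) makes the kernel value fall off faster with distance, so each training point influences a smaller neighborhood.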

Classification and Regression Trees
The CART, proposed by Breiman et al. in 1984 [62], is a nonparametric machine learning method that can build predictions based on the input variables [63]. Both classification and regression tasks can be accomplished by the CART algorithm. For the classification problem, the predictors can be numerical, binary, or categorical [19], which is considered to be an advantage of the CART. Another advantage is its resistance to missing data [64], because the determination of the optimal tree does not use missing values. When the CART is used to make predictions, the missing values are processed using substitutes (surrogates) [62]. The predicted values are represented by the average of the response values. The optimal splitting rule of the CART algorithm, i.e., modified twoing, is expressed as follows:
$$I(\text{split}) = \left[ 0.25 \left( q(1 - q) \right)^{u} \sum_{k} \left| P_L(k) - P_R(k) \right| \right]^2 \tag{10}$$
where $k$ denotes the target classes; $P_L(k)$ and $P_R(k)$ are the probability distributions of the targets in the left and right child nodes, respectively; $q$ is the fraction of cases sent to the left child node; and the power term $u$ embeds a user-controllable penalty on splits that generate unequal-sized child nodes [65].
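A sketch of the splitting criterion follows. It assumes the commonly cited form of the modified twoing rule, I = [0.25 (q(1 − q))^u Σ_k |P_L(k) − P_R(k)|]², where q is the fraction of cases sent to the left child node; this form, the function name, and the sample labels are assumptions rather than details confirmed by the source.

```python
def modified_twoing(left_labels, right_labels, classes, u=1.0):
    """Score a candidate split; larger values indicate better class separation."""
    n_left, n_right = len(left_labels), len(right_labels)
    if n_left == 0 or n_right == 0:
        raise ValueError("both child nodes must contain at least one case")
    q = n_left / (n_left + n_right)  # fraction of cases sent to the left child
    # Sum of absolute differences between the class distributions of the children.
    difference = sum(abs(left_labels.count(k) / n_left -
                         right_labels.count(k) / n_right) for k in classes)
    return (0.25 * (q * (1 - q)) ** u * difference) ** 2

# Hypothetical labels: 1 = flood, 0 = non-flood.
pure_split = modified_twoing([1, 1], [0, 0], classes=[0, 1])    # perfect separation
mixed_split = modified_twoing([1, 0], [1, 0], classes=[0, 1])   # no separation
```

With u > 0, a split that sends almost all cases to one side is penalized through the q(1 − q) term even if it separates the classes well.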

Convolutional Neural Network
The CNN was proposed by Bengio et al. in order to overcome the slow learning faced by traditional artificial neural networks (ANNs) when analyzing complex networks [66]. The CNN is still a neural network, and it uses local connections among the different layers. In recent years, several CNN architectures have been established to solve increasingly complicated nonlinear problems, among which the 1D-CNN is regarded as the most typical [67] and was used in this study. In general, the 1D-CNN has five neuronal layers: an input layer, a convolutional layer, a pooling layer, a fully connected neural network layer, and an output layer. To simplify the problem, small squares of input data were used to extract the features in the input layer. The convolutional layer applies a filter to the input matrix through a series of mathematical operations. The role of the pooling layer is to reduce the number of parameters while retaining the critical information. The fully connected neural network layer, which is a simple multilayer perceptron, is used to identify the object classes and to learn the weights.
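The convolution and pooling steps can be illustrated with a toy forward pass. This is a minimal sketch assuming a single filter, a ReLU activation, and non-overlapping max pooling; the input values, filter weights, and function names are hypothetical and do not reflect the architecture or parameters used in the study.

```python
def conv1d_relu(signal, kernel, bias=0.0):
    """Valid 1D convolution (no padding, stride 1) followed by a ReLU activation."""
    k = len(kernel)
    return [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)) + bias)
            for i in range(len(signal) - k + 1)]

def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing incomplete window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

# Hypothetical input: eight factor values for one pixel, scaled to [0, 1].
pixel = [1.0, 0.5, 0.0, 0.25, 0.75, 1.0, 0.5, 0.0]
features = conv1d_relu(pixel, kernel=[0.5, 0.5])  # seven values
pooled = max_pool1d(features, size=2)
print(pooled)  # [0.75, 0.5, 0.875]
```

Pooling shortens the feature sequence from seven to three values, which is how the layer reduces the number of parameters passed to the fully connected layer.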

Statistical Measures
As statistical measures, the sensitivity, specificity, and k-index were used to evaluate the performances of the models. These indexes reflect the classification accuracy of a model, which, in this study, is the ability of the model to correctly differentiate the flood pixels from the non-flood pixels [68]. Specifically, the sensitivity and specificity represent the proportion of flood events that are classified as flood pixels and the proportion of non-flood locations that are classified as non-flood pixels, respectively [69]. The statistical indexes, including the sensitivity, specificity, and accuracy, were calculated using the following equations [69]:
$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{11}$$
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{12}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{13}$$
where TP (true positive) and TN (true negative) are the numbers of pixels correctly classified, and FP (false positive) and FN (false negative) are the numbers of pixels erroneously classified. In addition, the Root Mean Squared Error (RMSE) (Equation (14)) was used to determine the difference between the expected and actual flood results:
$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( X_{\text{predicted},i} - X_{\text{actual},i} \right)^2} \tag{14}$$
where $n$ is the number of samples, and $X_{\text{predicted},i}$ and $X_{\text{actual},i}$ are the predicted and actual values of sample $i$. Although it is generally believed that the lower the RMSE, the better the model performance [70], Singh et al. claimed that RMSE values less than half the standard deviation (SD) of the measured data may be considered low and acceptable for model evaluation [71]. Therefore, the SDs of the prediction values were calculated in this study to provide a reference for interpreting the RMSE. Notably, the SDs of the real sample datasets are equal to 0.5 because these datasets contain equal numbers of 0s (non-flood points) and 1s (flood points).
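The measures above can be checked on a toy example. The sketch assumes the standard confusion-matrix definitions, a 0.5 probability threshold for assigning classes, and the population SD; the labels, scores, threshold, and function names are illustrative only.

```python
import math
import statistics

def classification_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / (tp + tn + fp + fn)

def rmse(y_true, y_score):
    """Root mean squared error between predicted scores and actual labels."""
    return math.sqrt(sum((s - t) ** 2 for t, s in zip(y_true, y_score)) / len(y_true))

# Hypothetical balanced sample: 1 = flood, 0 = non-flood.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # 0.5 threshold is an assumption

sensitivity, specificity, accuracy = classification_metrics(y_true, y_pred)
error = rmse(y_true, y_score)
sd_true = statistics.pstdev(y_true)  # 0.5 for equal numbers of 0s and 1s
```

The last line confirms the statement in the text: a dataset with equal numbers of 0s and 1s has a mean of 0.5 and every value deviates from it by 0.5, so its population SD is 0.5.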
The seed cell area index (SCAI) was used to compare the ratio of the density of the flood locations and the areas of the susceptibility classes in this study. Lower SCAI values indicate a higher flood susceptibility. This approach is helpful in evaluating the consistency and effectiveness of the models [72]. The SCAI is calculated as follows [73]:
$$\text{SCAI} = \frac{\text{Areal extent of susceptibility class (\%)}}{\text{Inventory of floods in each susceptibility class (\%)}} \tag{15}$$
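As a worked illustration, the sketch below computes the SCAI per susceptibility class as the class's share of the area divided by its share of the flood inventory. The five classes and their percentages are hypothetical, and returning infinity for a class without flood points is a choice made here, not something stated in the source.

```python
def scai(area_pct, flood_pct):
    """SCAI per class: areal extent (%) divided by share of flood inventory (%)."""
    return [a / f if f > 0 else float("inf")
            for a, f in zip(area_pct, flood_pct)]

# Hypothetical classes ordered from very low to very high susceptibility.
area_pct = [40.0, 30.0, 15.0, 10.0, 5.0]
flood_pct = [5.0, 10.0, 15.0, 30.0, 40.0]
scai_values = scai(area_pct, flood_pct)
print([round(v, 3) for v in scai_values])  # [8.0, 3.0, 1.0, 0.333, 0.125]
```

In this example the SCAI falls steadily from the very low to the very high class, meaning that the high-susceptibility classes cover little area but contain most of the flood points.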

ROC Curve
In general, the receiver operating characteristic (ROC) curve is applied to describe the performance of a statistical model that combines different clues and test results for predictive purposes [27]. The area under the curve (AUC) is the most important indicator of a ROC curve and directly reflects the accuracy of a model. The greater the AUC, the better the model. In this study, the training dataset was used to determine the success rate curve, and the validation dataset was used to determine the prediction rate curve. The success rate and the prediction rate reflect the goodness of fit and the prediction power of the models, respectively [74]. The AUC can be calculated as follows:
$$\text{AUC} = \frac{\sum TP + \sum TN}{P + N} \tag{16}$$
where TP (true positive) and TN (true negative) are the numbers of pixels that are correctly classified, P is the total number of pixels with flash floods, and N is the total number of pixels without flash floods.
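For readers who want to compute an AUC from predicted susceptibility scores, the sketch below uses the rank-based (Mann-Whitney) formulation: the share of flood/non-flood pairs in which the flood point receives the higher score, with ties counted as one half. This is a standard equivalent of the area under the ROC curve and is swapped in here for illustration; it is not the closed-form expression given in the text, and the scores and function name are hypothetical.

```python
def auc_rank(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # flood point scored higher: correct ranking
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical susceptibility scores.
flood_scores = [0.9, 0.8, 0.4]
non_flood_scores = [0.7, 0.3, 0.2]
auc = auc_rank(flood_scores, non_flood_scores)  # 8 of 9 pairs are ranked correctly
```

Applying the same function to the training scores and to the validation scores would give the success rate and prediction rate AUCs, respectively.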

Processing
The methodological workflow implemented in this study is schematically shown in Figure 3. First, a database comprising a flood inventory map and nine flash flood conditioning factors was created from different sources. Next, we applied the information gain method to extract the valid factors and used the VIF and TOL to verify that there were no serious collinear relationships between the factors. Then, based on the training dataset, we used the FMV model to calculate the FMV values of the factors' classes and input them into the three machine learning models for training. In addition, the three single machine learning models were also trained using the training dataset. Meanwhile, the test dataset was used to evaluate the training accuracy of the models and determine their optimal parameters. Finally, the flood susceptibility maps were generated, and the prediction performances of the models were evaluated using the ROC curve and several statistical methods. It should be noted that the machine learning models were run using packages in the R software.

Feature Selection
The predictive abilities of the nine flash flood conditioning factors in terms of flood susceptibility are shown in Figure 4. The altitude had the highest IG value of 0.47, followed by the M3DP (0.38), TWI (0.2), soil texture (0.19), land cover (0.18), DR (0.09), slope (0.09), slope aspect (0.05), and NDVI (0). These results indicate that all of the factors except the NDVI had IG values greater than 0. The zero IG value of the NDVI can be explained by the fact that the precipitation in this area exceeded the maximum interception capacity of the tree canopy, and thus the protective effect of the forest was eliminated from the hydrological point of view [27]. Therefore, the NDVI was not used to train the six models. Table 2 presents the results of the multi-collinearity analysis of the eight remaining flash flood conditioning factors (not including the NDVI). It can be seen that the altitude had the highest VIF (5.476) and the lowest TOL (0.183). However, neither value reached its critical threshold (VIF > 10 or TOL < 0.1), indicating the absence of serious multicollinearity among the eight flash flood conditioning factors. Thus, all eight of the flash flood conditioning factors were taken into account in the modeling.

Fuzzy Membership Value
The FMV calculation results are presented in Table 3, and the relative distribution of the flood pixels within the factor classes is shown in Figure 5. Four classes had FMV values of 0: one altitude class, two soil texture classes, and one slope aspect class. This is because there were no flood pixels in these classes.
The highest FMV values (1.00) for each factor occurred for the altitude class of 337-1494 m, the M3DP class of 116-167 mm, the TWI class of −6.50 to 0.36, the DR class of <1000 m, the slope class of 0°-5.36°, the land cover class of built-up areas, the soil texture class of sandy-clay, and the slope aspect class of southeast. These classes had significantly different area ratios and flash flood point density ratios (Figure 5). In addition, as can also be seen from Figure 5, the variation in the FMV values of the five continuous numerical factors (altitude, M3DP, TWI, DR, and slope) exhibited a clear pattern. For the M3DP and TWI, the FMV values were positively correlated with the values of these factors, while for altitude, DR, and slope, the FMV values were inversely correlated with the values of the factors. These results indicate that the incidence of flash floods increased with increasing M3DP and TWI, while it decreased with increasing altitude, DR, and slope.
After the FMV calculations, the FMV values were input to the three hybrid models (SVM-FMV, CART-FMV, and CNN-FMV) for training and prediction.
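As a rough illustration of how class-level FMV values can be derived from the frequency ratio (FR), the sketch below scales each class's FR by the largest FR of the factor. This normalization is an assumption chosen because it reproduces the two properties reported above (a maximum of 1.00 per factor and a value of 0 for classes without flood pixels); the study's own formulations (2) and (3) are not reproduced in this section and may differ. All counts are hypothetical.

```python
def frequency_ratio(flood_counts, class_areas):
    """FR per class: share of flood points divided by share of area."""
    total_floods = sum(flood_counts)
    total_area = sum(class_areas)
    return [
        (f / total_floods) / (a / total_area)
        for f, a in zip(flood_counts, class_areas)
    ]

def fuzzy_membership(fr_values):
    """Scale FR values to [0, 1] by the largest FR (assumed normalization)."""
    top = max(fr_values)
    return [fr / top for fr in fr_values]

# Hypothetical counts for one factor with four classes.
flood_counts = [60, 30, 10, 0]      # flood points per class
class_areas = [200, 300, 300, 200]  # class area in arbitrary units

fr = frequency_ratio(flood_counts, class_areas)
fmv = fuzzy_membership(fr)
```

In a hybrid workflow of this kind, each raw factor value would then be replaced by the FMV of the class it falls into before being passed to the machine learning model.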

SVM and SVM-FMV
In order to determine the best model structure, the RMSE of the testing dataset (RMSEtesting) was used to reflect the model's performance. From Table 4, it can be seen that the RMSE of each model is almost equal to the SD of the prediction values and slightly higher than half of the SD of the sample dataset (0.5). This result indicates that there is a difference between the observed and modeled flood susceptibility, but this difference can be considered acceptable [75]. After the 5-fold cross-validation procedure, the model with the lowest RMSE was identified as the best model structure. In this study, the best structure for the SVM model (RMSEtesting = 0.38) is cost = 100 and gamma = 0.1, and the best structure for the SVM-FMV model (RMSEtesting = 0.29) has the same cost but a different gamma (0.001). The statistical indices used to evaluate the performance of the SVM and SVM-FMV models on the training and testing datasets are presented in Table 4. As can be seen, 222 flood pixels and 214 non-flood pixels were classified correctly in the training dataset by the SVM model, with a sensitivity and specificity of 79.29% and 81.06%, respectively, while 229 flood pixels and 218 non-flood pixels were classified correctly by the SVM-FMV model, with a sensitivity and specificity of 80.92% and 83.52%, respectively. Overall, the classification accuracies of the SVM and SVM-FMV reached 80.15% and 82.17%, respectively, on the training dataset, and 80.88% and 88.97%, respectively, on the testing dataset. This result indicates that the SVM and SVM-FMV models had a high training accuracy.
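The RMSE-versus-SD comparison above can be checked with a short sketch. It assumes that the quoted value of 0.5 is the SD of a balanced 0/1 label vector (so that half of the SD is 0.25); this reading is an interpretation, not something the text states explicitly. The toy predictions are hypothetical.

```python
import math
import statistics

def rmse(observed, predicted):
    """Root mean square error between observed labels and predicted scores."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

# Balanced binary labels (485 flood and 485 non-flood locations).
labels = [1] * 485 + [0] * 485
sd_labels = statistics.pstdev(labels)  # population SD of a balanced 0/1 vector
half_sd = 0.5 * sd_labels

# RMSEtesting values reported in the text for the two best structures.
reported = {"SVM": 0.38, "SVM-FMV": 0.29}
exceeds_half_sd = {name: value > half_sd for name, value in reported.items()}

# Hypothetical toy predictions, only to show the RMSE function in use.
toy_rmse = rmse([1, 0, 1, 0], [0.8, 0.3, 0.6, 0.1])
```

Under this reading, both reported RMSE values lie above half of the SD but below the SD itself.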
After the normalization, the flood susceptibility indices obtained using the SVM (FSISVM) and SVM-FMV (FSISVM-FMV) models were classified into five classes using the natural break method (Figure 6). In terms of the FSISVM, the first class of values (0-0.11) identified the zones with a very low flood susceptibility, accounting for 42.79% of the study area. The low (0.11-0.

CART and CART-FMV
The CART and CART-FMV models were constructed using the training dataset with the 5-fold cross-validation method. After the cross-validation procedure, the optimal CART and CART-FMV trees were built according to the minimum RMSEtesting (Figure 7). Note that the two trees were not pruned due to their simplicity and good performance.
In terms of the training dataset (Table 4), it can be seen that the CART and CART-FMV correctly classified 253 and 245 flood pixels, with sensitivities of 84.47% and 80.59%, respectively. They correctly classified 229 and 213 non-flood pixels, with specificities of 92.34% and 88.75%, respectively. Overall, the classification accuracies of the CART and CART-FMV reached 88.60% and 84.49%, respectively. In terms of the testing dataset, the classification accuracies of the CART and CART-FMV models reached 83.09% and 88.97%, respectively. These values show that the two models had good training performance.
For the FSICART and FSICART-FMV, the natural break method was also used to reclassify the values into five classes (Figure 6). For the FSICART, the very low class accounted for 59.10% of the study area, followed by the very high (14.75%), low (11.51%), moderate (7.37%), and high (7.28%) classes. For the FSICART-FMV, the situation was somewhat different. Although the very low class again accounted for the largest area (53.78%), the moderate class (18.12%) ranked second, rather than the very high class as in the FSICART. Notably, the very high class still accounted for a large proportion (12.16%) of the study area. The low and high classes accounted for 7.04% and 8.90%, respectively.
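For readers who want to recompute indices of the kind reported in Table 4, the sketch below uses the standard textbook definitions of sensitivity, specificity, and accuracy. The study's exact formulas are not shown in this section, so these definitions are an assumption, and the confusion-matrix counts are hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard confusion-matrix indices for a binary flood/non-flood classifier."""
    return {
        "sensitivity": tp / (tp + fn),  # share of flood points predicted as flood
        "specificity": tn / (tn + fp),  # share of non-flood points predicted as non-flood
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical counts: 50 flood points and 50 non-flood points.
metrics = classification_metrics(tp=40, fn=10, tn=45, fp=5)
```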

CNN and CNN-FMV
Based on the minimum RMSEtesting, after the 5-fold cross-validation and trial-and-error procedure, we determined the final CNN and CNN-FMV model structures (Figure 8). Considering the eight flash flood conditioning factors used in this study, the input shape was set to 8 × 1. In the convolutional layer, the rectified linear unit (ReLU) function was applied, which is considered to be the most typical activation function [66]. The number of filters and the kernel size in the convolutional layer were set to 100 and 2, respectively. In addition, the pool size in the pooling layer and the number of units in the fully connected layer were set to 2 and 32, respectively.
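The layer sizes given above can be traced through the network with simple arithmetic. The sketch below assumes no padding and a stride of 1 in the convolution, a pooling stride equal to the pool size, and a flatten step directly before the fully connected layer; none of these details is stated in the text, so the resulting shapes and parameter counts are illustrative only.

```python
def conv1d_output_length(input_length, kernel_size, stride=1):
    """Length after a 1D convolution with no padding (assumed)."""
    return (input_length - kernel_size) // stride + 1

def pool1d_output_length(input_length, pool_size):
    """Length after non-overlapping pooling (stride equal to pool size, assumed)."""
    return input_length // pool_size

# Values taken from the text.
input_length, channels = 8, 1
filters, kernel_size = 100, 2
pool_size, dense_units = 2, 32

conv_len = conv1d_output_length(input_length, kernel_size)
pool_len = pool1d_output_length(conv_len, pool_size)
flattened = pool_len * filters  # features entering the fully connected layer

# Trainable parameters: weights plus one bias per filter or unit.
conv_params = (kernel_size * channels + 1) * filters
dense_params = (flattened + 1) * dense_units
```

Under these assumptions, the eight input factors become seven positions after the convolution and three after pooling, so the fully connected layer receives 300 features.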
Following the training of the two models, their classification results were tabulated (Table 4). For the CNN model, 187 flood pixels and 237 non-flood pixels were correctly classified, resulting in a sensitivity of 84.23% and a specificity of 73.60%. For the CNN-FMV model, 242 flood pixels and 216 non-flood pixels were correctly classified, resulting in a sensitivity of 81.21% and a specificity of 87.80%. Overall, the classification accuracies of the CNN and CNN-FMV reached 77.94% and 84.19%, respectively, on the training dataset, and 77.21% and 88.24%, respectively, on the testing dataset. Thus, the two models were considered to have good training accuracy.
For the flood susceptibility indices of the CNN (FSICNN) and CNN-FMV (FSICNN-FMV), the range of values was also divided into five classes using the natural break method (Figure 6). In terms of the FSICNN, over 75% of the Dadu River Basin had very low or low flood susceptibility. Notably, the very high class occupied the smallest area, just 5.73%, while the moderate and high classes accounted for 10.41% and 7.48% of the study area, respectively. With respect to the FSICNN-FMV, the very low and low classes (values between 0 and 0.28) together accounted for 66% of the Dadu River Basin, while the moderate and high classes accounted for 13.76% and 12.12%, respectively. Similar to the FSICNN, the fifth class of the FSICNN-FMV, characterized by very high flood susceptibility, accounted for the smallest area (7.74%).

Statistical Measures
In terms of the validation dataset (Table 5), the CART-FMV had the highest classification accuracy (85.86%) as well as the highest sensitivity (84.67%) and specificity (87.14%), while the SVM and CNN had the lowest classification accuracy (78.28%). Notably, the classification accuracy of each of the three individual machine learning models (SVM, CART, and CNN) was lower than that of its corresponding hybrid model (SVM-FMV, CART-FMV, and CNN-FMV).
Table 5. Prediction performances of the six models used in this study based on the validation dataset.
Based on the SCAI results, the SCAI value of the very high class was lowest for the CNN-FMV (0.11) and highest for the CART (0.19). In addition, all three individual machine learning models (SVM, CART, and CNN) had higher SCAI values than their corresponding hybrid models (SVM-FMV, CART-FMV, and CNN-FMV).
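The SCAI values discussed above can be illustrated with a minimal sketch. It assumes the commonly used definition of the SCAI as the percentage of the area in a susceptibility class divided by the percentage of flood points (seed cells) in that class; the study's own formula is not shown in this section, and the percentages below are hypothetical.

```python
def scai(area_percent, flood_percent):
    """SCAI per class: area share divided by flood-point share (assumed definition)."""
    return [
        a / f if f > 0 else float("inf")  # a class without flood points has no finite SCAI
        for a, f in zip(area_percent, flood_percent)
    ]

classes = ["very low", "low", "moderate", "high", "very high"]
area_percent = [50.0, 20.0, 12.0, 10.0, 8.0]   # hypothetical share of the study area
flood_percent = [5.0, 5.0, 10.0, 20.0, 60.0]   # hypothetical share of flood points

scai_values = dict(zip(classes, scai(area_percent, flood_percent)))
```

A well-behaved susceptibility map concentrates many flood points in a small very high class, so its SCAI decreases from the very low class to the very high class. This is why a lower SCAI in the very high class is read as better performance.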

ROC Curve
The success rate curves, which were computed using the training dataset, are shown in Figure 9a. From these success rate curves, it can be seen that the CNN-FMV hybrid model had the highest performance (AUC = 0.935).
Similar to the success rate curves, the prediction rates were plotted as curves using the validation samples (Figure 9b). Again, the CNN-FMV hybrid model had the highest performance (AUC = 0.912), followed by the CART-FMV and SVM-FMV hybrid models, both of which had an AUC value of 0.898. Consistent with the success rate results, the AUC values of the three individual machine learning models for the prediction rate were lower than the AUC values of their respective hybrid models. The AUC values of the SVM, CART, and CNN were 0.866, 0.893, and 0.857, respectively.
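For reference, an AUC value of the kind reported for the success rate and prediction rate curves can be computed directly from labels and predicted scores. The sketch below uses the rank-based interpretation of the AUC (the probability that a randomly chosen flood point receives a higher score than a randomly chosen non-flood point), which is equivalent to the area under the ROC curve. The labels and scores are hypothetical, and the study's own software is not specified here.

```python
def roc_auc(labels, scores):
    """AUC as the share of (flood, non-flood) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical validation sample: 1 = flood, 0 = non-flood.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
auc = roc_auc(labels, scores)  # 8 of the 9 pairs are ranked correctly
```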

Assessment of the Methodology
One novel element of this study is the first application of machine learning models to assess flood susceptibility within the Dadu River Basin. A more important novelty is the first application of the following ensemble models to flood susceptibility assessment: SVM-FMV, CART-FMV, and CNN-FMV. As a bivariate statistical method, the FMV has been used for landslide susceptibility mapping [56], but it has rarely been applied in flood susceptibility mapping or in ensemble modeling. In this context, exploring the applicability of the FMV in ensemble modeling is one of the novelties of this study. As shown in formulations (2) and (3), one of the advantages of the FMV is that it can be implemented simply and easily, since the calculation is based on the FR. Therefore, it has great potential for practical application. In addition, as with all bivariate statistical models, the FMV provides a good representation of the relationships between the flood conditioning factors and flood occurrence. However, bivariate statistical methods lack the ability to capture the hidden characteristics of floods because of their complex triggering mechanisms [17]. Fortunately, machine learning models can reflect more of the high-dimensional relationships between the non-linearly related input variables [76]. As a popular machine learning model, the SVM is widely used in the assessment of flood and landslide susceptibility [23,77]. This algorithm has been demonstrated to have an excellent generalization ability [23]. As a decision tree algorithm, the CART is also widely used in the assessment of natural disaster susceptibility. One of its advantages is that the data distribution and the presence of outliers do not have a large impact on its results [78].
The CNN, as one of the most popular deep learning techniques, is able to obtain reliable results comparable to or better than those of conventional machine learning methods [79]. In 2020, Wang et al. proposed two CNN frameworks for flood susceptibility prediction [80]. Gang et al. and Khosravi et al. have applied CNN to the prediction of flood susceptibility in cities and in Iran, respectively [66,67]. However, the application of CNN in flood susceptibility prediction is still rare [80], and it is unclear whether CNN can be combined with statistical models to improve the accuracy of flood susceptibility prediction. To explore this issue, CNN was combined with the FMV for the first time in this study. Overall, in the three hybrid models presented in this study, the FMV clearly depicts the relationship between the factors and flooding, and it can provide a more appropriate data representation for the machine learning methods than the raw data. Therefore, the hybrid models potentially have better accuracies and performances in flood susceptibility mapping than the single machine learning models.

Assessment of the Model Performances
According to the model training results (Table 4 and Figure 9a), the classification accuracies ranged from 77.94% for the CNN to 88.60% for the CART, and the AUC values ranged from 0.884 for the CNN to 0.935 for the CNN-FMV. According to the model validation results (Table 5 and Figure 9b), the classification accuracies ranged from 78.28% for the SVM and CNN to 85.86% for the CART-FMV, and the AUC values ranged from 0.857 for the CNN to 0.912 for the CNN-FMV. These classification accuracies and AUC values were higher than 75% and 0.80, respectively, which shows that all six models had acceptable fitting accuracies and good prediction performances. Therefore, the applications of these six models in this study were successful. However, in terms of the training dataset (Table 4 and Figure 9a), the CART had the best classification accuracy (88.60%), while the CNN-FMV hybrid model had the highest AUC value (0.935). In terms of the validation dataset (Table 5 and Figure 9b), the model with the highest classification accuracy was the CART-FMV hybrid model (85.86%), while the model with the highest AUC value was the CNN-FMV hybrid model (0.912). The reason for this is that, when calculating the classification accuracy, this study treated predictions greater than 0.5 as 1 (flood point) and all other predictions as 0 (non-flood point), whereas the AUC is calculated over all thresholds [81]. Therefore, the ranking of the models based on the classification accuracy may differ from the ranking based on the AUC value, a phenomenon that has also appeared in many studies [17,19,27,28,43,69,70,81]. In other words, the model with the maximum classification accuracy does not necessarily have the highest AUC value.
However, the ROC curve, a useful method for representing the quality of probabilistic natural disaster susceptibility classifiers, indicates that the CNN-FMV hybrid model had the highest success-rate and prediction-rate AUC values (0.935 and 0.912, respectively). This result indicates that the CNN-FMV hybrid model had the best fitting and prediction performances in this study. In addition, the SCAI results revealed that the CNN-FMV hybrid model had the lowest SCAI value (0.11), which also indicates that it performed best on the validation dataset. This is because the FMV can express the degree of correlation between each class of the factors and the development of floods. Moreover, the CNN can consider the topographical information of the surrounding environment to achieve a higher performance [67].
Another important finding was obtained from the model validation results (Table 5 and Figure 9). Compared with their single machine learning models, the AUC values of the SVM-FMV, CART-FMV, and CNN-FMV were 0.032, 0.005, and 0.055 higher; their SCAI values were 0.05, 0.03, and 0.02 lower; and their classification accuracies were 4.48%, 1.38%, and 5.86% higher, respectively. These results suggest that the three hybrid models proposed in this study achieve some degree of accuracy improvement over their respective single machine learning models. Therefore, these three models can be used as a benchmark for future studies that assess flash flood susceptibility in specific areas. However, some studies on ensemble modeling of flash flood susceptibility have reported higher AUC values for their hybrid models. For example, Costache et al. showed that the ADT-IOE model has an excellent capability for flood susceptibility prediction, with AUC = 0.972 in the Suha River Basin [27]. At first glance, the accuracy of the hybrid models proposed in this study appears lower than that of the ADT-IOE model. However, the Suha River Basin (363 km²) is much smaller than our study basin (90,016 km²), and it is reasonable for a model to have lower accuracy over a larger area. For a large area, the study of Dodangeh et al. showed that the SVR-HS model had a lower accuracy (AUC = 0.75) than the hybrid models proposed in this study [81], although the area of the basin in their research (18,644 km²) is only roughly one-fifth of that of the Dadu River Basin. Therefore, the proposed hybrid models are promising for flash flood susceptibility prediction, especially in a large basin.
The RMSE results of this study show that the models are not perfect, because the RMSE is not smaller than half of the SD. However, a review of related previous studies [14,81] shows that the SD of the prediction values was also almost equal to the RMSE in those studies. This indicates that the SD and RMSE results obtained in this study are imperfect but reasonable. This may be one of the limitations of flood susceptibility mapping based on machine learning models compared with hydrological models. As shown in the study of Kastridis et al., results based on a hydrological model may have a better RMSE [75].
In terms of the FSI results of the six models, the combined percentage of the high and very high classes ranged from 13.21% for the CNN to 22.03% for the CART. These zones were mainly distributed in the southern part of the Dadu River Basin, where the terrain is relatively flat and the precipitation is high. In addition, the valley areas in the northern part of the Dadu River Basin are similarly characterized by high or very high flood susceptibilities. These two spatial distribution characteristics of the high and very high flood susceptibility classes are similar to the results obtained by Costache and Bui [19].
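The earlier point in this section, that the ranking by classification accuracy at a 0.5 threshold can differ from the ranking by AUC, can be reproduced with a small hypothetical example. The two score vectors below are invented for illustration and do not correspond to any of the six models.

```python
def accuracy_at_threshold(labels, scores, threshold=0.5):
    """Accuracy when scores above the threshold are treated as flood (1)."""
    predicted = [1 if s > threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)

def roc_auc(labels, scores):
    """AUC as the share of (flood, non-flood) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Model A: confident and mostly right, but one flood point gets the lowest score.
model_a = [0.90, 0.80, 0.70, 0.10, 0.40, 0.30, 0.20, 0.15]
# Model B: ranks every flood point above every non-flood point, but all scores are below 0.5.
model_b = [0.45, 0.44, 0.43, 0.42, 0.41, 0.40, 0.39, 0.38]

acc_a, auc_a = accuracy_at_threshold(labels, model_a), roc_auc(labels, model_a)
acc_b, auc_b = accuracy_at_threshold(labels, model_b), roc_auc(labels, model_b)
```

Model A wins on accuracy while Model B wins on AUC, because accuracy depends on where the scores fall relative to the single 0.5 cut-off, whereas the AUC depends only on how the scores are ordered.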

Applications and Limitations
The three hybrid models (SVM-FMV, CART-FMV, and CNN-FMV) proposed in this study have been demonstrated to perform well at predicting flood susceptibility. Thus, they can be applied to other areas as an effective method of identifying flood-prone zones. In addition, these models have the potential to be applied to the susceptibility assessment of other natural disasters, such as landslides and mudslides. Furthermore, the results of this study (the flood susceptibility maps) may help the local authorities take the most appropriate measures to mitigate the negative effects of flash floods.
However, several limitations exist in this study. The results generated by these three hybrid models cannot describe the details of flooding, including the inundation extent, water depth, and velocity [82]. For more detailed scales, such as a river sector, the development of a combined hydraulic model is recommended, which can more easily take into account anthropogenic influences in order to assess the extent of flooding for different flow probabilities [43]. In the modeling process of this study, only the typical CNN architecture (1D-CNN) was used, and the use of higher dimensional architectures (2D-CNN and 3D-CNN) is suggested for future studies. In addition, using binary values (0, 1) for flood absence and presence cannot reflect the flood frequency at each point, which should be considered in future studies where possible [81]. Moreover, only the locations of the flood events were taken into account when preparing the flood sample, not the dates of occurrence. However, the date of flooding could reflect the effects of changes in certain factors, such as land use, on the occurrence of flooding. This topic is also an interesting future research direction. In terms of the flood factor selection, more factors could be considered, such as slope length and curvature, which may contribute to better prediction accuracy. In terms of the results, we did not list the importance of each factor. This is because the main purpose of this study was to compare the performances of the novel hybrid models and because, unlike the CART, neither the SVM nor the CNN algorithm can be used to directly calculate the importance of the factors. Note that the FSI results of the six models show some degree of variation (Figure 6). As described by Shafizadeh-Moghadam et al., each method creates different results [24]. Thus, the technique of combining the results of individual models, which is considered to generate more generalizable results, could be applied in future studies. In addition, integrating machine learning and physical simulation methods for flood susceptibility mapping is worth considering in future studies.

Conclusions
Flash flood events are becoming more frequent worldwide. Therefore, the accurate identification of areas prone to flash floods is particularly important in flash flood prevention and mitigation. In this study, we proposed three hybrid models (SVM-FMV, CART-FMV, and CNN-FMV) to identify the areas prone to flash floods within the Dadu River Basin. Then, we evaluated the capabilities of these three hybrid models for flood susceptibility prediction and compared them with three single machine learning models. The ROC curves revealed that the CNN-FMV hybrid model had the best fitting (AUC value = 0.935) and prediction performance (AUC value = 0.912). This result is attributed to the fact that the FMV clearly depicts the relationship between the factors and flooding, and the CNN achieves a higher performance by considering the topographical information of the surrounding environment. In addition, according to the validation results of the ROC, the statistical measures, and the SCAI, the three novel hybrid models proposed in this study all outperformed their respective single machine learning models in terms of flood susceptibility prediction. Compared with their respective single machine learning models, the AUC values of the SVM-FMV, CART-FMV, and CNN-FMV were 0.032, 0.005, and 0.055 higher; their SCAI values were 0.05, 0.03, and 0.02 lower; and their classification accuracies were 4.48%, 1.38%, and 5.86% higher, respectively. Therefore, these three hybrid models can be used as a reference for future studies involving flood susceptibility predictions and even for predicting other natural disasters in a given area. The FSI results obtained in this study revealed that the proportion of the area with high and very high flood susceptibilities ranged from 13.21% for the CNN to 22.03% for the CART. These zones were mainly distributed in the southern part of the Dadu River Basin, where the terrain is relatively flat and the precipitation is high.
However, flash flood susceptibility mapping cannot describe the details of flooding, such as the flood inundation extent, and it cannot quantify the impact of flood management measures. In future studies, the development of a combined hydraulic model is recommended.