Comparison of Machine Learning Algorithms for Flood Susceptibility Mapping

Floods are one of the most destructive natural disasters, causing financial and human losses every year. As a result, reliable Flood Susceptibility Mapping (FSM) is required for effective flood management and for reducing the harmful effects of floods. In this study, a new machine learning model based on the Cascade Forest Model (CFM) was developed for FSM. Satellite imagery, historical reports, and field data were used to determine flood-inundated areas. The database included 21 flood-conditioning factors obtained from different sources. The performance of the proposed CFM was evaluated over two study areas, and the results were compared with those of seven other machine learning methods: Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Deep Neural Network (DNN), Light Gradient Boosting Machine (LightGBM), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). The results showed that CFM produced the highest accuracy among the models over both study areas. The Overall Accuracy (OA), Kappa Coefficient (KC), and Area Under the Receiver Operating Characteristic Curve (AUC) of the proposed model were more than 95%, 0.8, and 0.95, respectively. Most of these models recognized the southwestern part of the Karun basin and the northern and northwestern regions of the Gorganrud basin as susceptible areas.


Introduction
Natural disasters such as landslides, wildfires, tsunamis, and floods cause huge financial and human losses every year [1][2][3]. The effects of flooding are detrimental to human and ecological wellbeing around the world, making it one of the most destructive disasters [4]. For example, according to statistics, floods are accountable for more than half of the damage caused by natural disasters over the last five decades [5][6][7]. There are many immediate impacts of flooding, including loss of life, damage to properties and infrastructure, as well as loss of crops and livestock. Although many efforts have been made to reduce the negative effects of floods, the number of flood events has considerably increased [8]. Thus, reliable and accurate Flood Susceptibility Mapping (FSM) is vital in flood-prone regions [9].
To quantify the associated risk, damage, vulnerability, and spatial extent of floods, researchers have focused their efforts in recent years on understanding, predicting, estimating, and explaining flood hazards [10][11][12][13]. Flood susceptibility describes the risk of flooding in a specific region based on geo-environmental factors [14,15]. FSM is based on the relationship between floods and their causes [16,17]. It provides informative guidance for decision-makers in managing and preventing floods. Generally, there are four types of
Deep learning models have provided promising results in many applications [6,40-45]. Although these methods can achieve high accuracy, they are more complicated than conventional machine learning algorithms and require a large amount of training data to produce accurate results [46]. Furthermore, deep learning frameworks contain several hyperparameters, the tuning of which can be time-consuming and challenging [47]. To tackle these limitations, a robust model called Deep Forest was proposed in [48]. In the deep forest model, the layers are similar to those in Deep Neural Networks (DNN), but instead of neurons, each layer contains many random forests [49]. The deep forest model has two main components [47]: multi-grained scanning and the Cascade Forest Model (CFM).
In this study, a CFM was developed using 21 flood-influencing factors to produce accurate FSM results over two study areas. Because the input dataset contains a limited number of features, this research used only the second component of the deep forest model (i.e., CFM). The main contributions of this study are: (1) a CFM was developed for FSM for the first time; (2) the performance of the proposed FSM model was compared to that of other advanced machine learning methods; (3) the robustness and applicability of the proposed model and the other methods were investigated over two different study areas; and (4) the most informative features for FSM were selected based on a combination of the Harris Hawks Optimization (HHO) algorithm and the deep forest model and were then applied to FSM.

Materials and Methods
FSM can be carried out in three main steps: (1) preprocessing and data preparation, (2) model training and hyperparameter tuning, and (3) applying the trained model to generate the flood susceptibility map and assessing its accuracy. The general framework of the proposed FSM method is shown in Figure 1. More details of each step are provided in the following subsections.

Study Areas
The proposed FSM model was applied to two study areas in Iran: the Gorganrud and Karun basins (Figure 2). The Karun basin originates on the western slopes of the Zagros Mountains in Iran and extends through the plain of Khuzestan. The Karun River's watershed is part of the first-class watershed of the Persian Gulf and the Sea of Oman. The Karun basin has an area of 67,297 km², which makes it the largest watershed in Iran. The basin lies within the middle Zagros highlands and is bounded by 30°00′ to 34°05′N latitudes and 48°00′ to 52°30′E longitudes.
The Gorganrud basin, located in the province of Golestan, is one of the most flood-prone watersheds in Northern Iran. It occupies an area of 11,290 km² located between 36°25′ to 38°15′N latitudes and 54°10′ to 56°26′E longitudes. The basin has semi-arid, semi-humid, humid, and Mediterranean climates, with a mean annual temperature of about 18 °C. In recent decades, there have been multiple large flood events in this basin; the largest one occurred on 11 August 2001 and killed more than 500 people. Furthermore, on 17 March 2019, a flood destroyed over 70 villages and 12,000 homes, as well as infrastructure, gardens, and agricultural lands.

Flood Samples
Flood maps are essential for generating samples of flooded and non-flooded points. In this study, the reports published by the Iranian Water Resources Department for the period 1985-2022, as well as investigative reports on disaster management for the flood events in the two study areas, were used to create the samples. A field survey was also conducted to verify the data collected on flooded locations. Areas with lower flood susceptibility ranks were identified from the flood susceptibility maps obtained by Multi-Criteria Decision Making (MCDM) analysis, and non-flooded points were randomly created in those areas in GIS software. Finally, the samples of flooded and non-flooded areas were divided into training and test datasets (Table 1).
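The division into training and test datasets can be sketched as a per-class random split. This is only an illustration: the 30% test fraction, the seed, the point identifiers, and the function name `split_samples` are assumptions, and the actual sample counts of the study are those reported in Table 1.

```python
import random


def split_samples(flooded, non_flooded, test_fraction=0.3, seed=42):
    """Split labelled points into training and test sets, class by class.

    test_fraction and seed are illustrative assumptions, not values
    taken from the study.
    """
    rng = random.Random(seed)
    train, test = [], []
    # Label 1 = flooded, label 0 = non-flooded.
    for label, points in ((1, flooded), (0, non_flooded)):
        shuffled = list(points)
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test += [(p, label) for p in shuffled[:n_test]]
        train += [(p, label) for p in shuffled[n_test:]]
    return train, test


# Hypothetical point identifiers standing in for GIS sample locations.
flooded_points = [f"F{i}" for i in range(10)]
non_flooded_points = [f"N{i}" for i in range(10)]
train_set, test_set = split_samples(flooded_points, non_flooded_points)
print(len(train_set), len(test_set))
```

Splitting each class separately keeps the flooded to non-flooded ratio the same in both subsets, which matters when the indices are later computed on imbalanced samples.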

Flood Conditioning Factors
This study considers 21 independent factors that affect flood susceptibility (Table 2). The flood conditioning factors for both study areas are illustrated in Figures 3 and 4. These factors were derived from Sentinel-1, Sentinel-2, and Landsat satellite imagery, a DEM, and in situ precipitation data. Of the 21 investigated flood conditioning factors, 17 were calculated using the DEM and 3D topography indicators.

Cascade Forest Model (CFM)
DNNs have a large number of hidden neurons, which learn representations layer-by-layer by leveraging forward and backward propagation procedures [50]. In contrast, CFM creates a cascade of Decision Tree (DT) forests to learn classification distributions (features) based on layers of input data, supervised by the input data [48]. Therefore, CFM generates more accurate predictions based on ensembles of random forests because each layer learns more discriminative representations [51,52].
As illustrated in Figure 5, a CFM consists of multiple layers, each of which consists of an ensemble module. Each layer receives features formed by concatenating the input with the output probabilistic features from the previous layer, and then feeds its results to the next level [47]. The process is repeated for each layer, and the final output is produced by averaging the forest outputs (without raw data) and applying the argmax function. To ensure that the ensemble is diverse, each layer includes different types of forests (the red and blue boxes in Figure 5). Concatenating the probabilistic features with the original input features into a single input feature vector can effectively prevent overfitting. The output of the CFM for the l-th layer, y_l, given the original input dataset, X_0, is expressed in terms of H, an individual learner (i.e., the RF algorithm), and n, the number of individual learners.
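One way to write the layer-wise rule described above is given here as a sketch. The notation is assumed for illustration and is not necessarily the authors' own: $X_l$ is the feature vector passed on by layer $l$, $L$ is the number of layers, $H^{(l)}_i$ is the $i$-th forest of layer $l$, and $H^{(L)}_{i,c}$ is its predicted probability for class $c$.

```latex
\begin{aligned}
y_l &= \left[ H^{(l)}_1(X_{l-1}),\; \dots,\; H^{(l)}_n(X_{l-1}) \right], \\
X_l &= \left[ X_0,\; y_l \right], \qquad l = 1, \dots, L-1, \\
\hat{c} &= \arg\max_{c} \; \frac{1}{n} \sum_{i=1}^{n} H^{(L)}_{i,c}(X_{L-1}).
\end{aligned}
```

The first line collects the class-probability outputs of the $n$ forests, the second line concatenates them with the original input $X_0$, and the third line averages the outputs of the last layer (without the raw data) before taking the argmax.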

Feature Selection
The feature selection process is important for reducing the dimensionality of the input datasets. It removes the less effective features from the input dataset. The main purposes of feature selection are to improve learning accuracy and to minimize the computational cost and time of model training [53].
Heuristic strategies are utilized to determine a reasonably informative feature subset from the entire solution space, which may not be the best solution but will be accepted within the constraints of computational efficiency [54]. In Ref. [55], HHO was proposed as a population-based, nature-inspired optimization paradigm to select the optimum features. This optimization algorithm has widely been used in many applications of remote sensing [56]. HHO was inspired by the Harris' hawks in nature and their cooperative behaviors, in which several hawks attack their prey from different directions to surprise them. Based on dynamic scenarios and prey escaping patterns, Harris' hawks can present a variety of chasing patterns [57].
In terms of optimization, HHO is a continuous algorithm. Some real-world problems, such as feature selection, have a binary search space. Consequently, this algorithm should be reformulated efficiently to work on binary spaces. Thus, this study used the binary version of the HHO algorithm for feature selection.
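A common way to move a continuous metaheuristic onto a binary search space is an S-shaped (sigmoid) transfer function, which turns each coordinate of a position into the probability of selecting the corresponding feature. The sketch below illustrates that idea only. The text does not state which transfer function the binary HHO used, so the sigmoid, the function names, and the example position are assumptions.

```python
import math
import random


def sigmoid_transfer(x):
    """Map a continuous coordinate to a selection probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize_position(position, rng):
    """Turn a continuous hawk position into a 0/1 feature mask."""
    return [1 if rng.random() < sigmoid_transfer(x) else 0 for x in position]


def selected_features(mask, names):
    """Return the names of the features whose mask bit is 1."""
    return [name for name, bit in zip(names, mask) if bit == 1]


rng = random.Random(0)
names = ["Slope", "TRI", "TWI"]       # factor names taken from the text
position = [50.0, -50.0, 50.0]        # hypothetical continuous position
mask = binarize_position(position, rng)
print(selected_features(mask, names))
```

Each mask produced this way defines one candidate feature subset, which a wrapper approach would then score by training and evaluating the classifier on the selected columns.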

Accuracy Assessment
The accuracy assessment was carried out by both visual interpretation of the results and statistical accuracy measures: Overall Accuracy (OA), Balanced Accuracy (BA), F1-Score, Kappa Coefficient (KC), Area Under the Receiver Operating Characteristic (ROC) Curve (AUC), and Intersection Over Union (IOU). The OA index shows the overall performance of the model. Imbalance in the sample datasets is one of the most important challenges in FSM using machine learning models. Therefore, additional indices, such as BA, KC, and F1-Score, were calculated to better evaluate the models on an imbalanced dataset.
Moreover, to assess the effectiveness of the CFM, its performance was compared to that of other advanced and well-known machine learning approaches, including SVM, DT, RF, LightGBM, DNN, XGBoost, and CatBoost.
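The indices listed above can all be derived from a binary confusion matrix, except AUC, which needs the predicted scores. The following sketch uses the standard definitions. The counts and scores are hypothetical, and treating the flooded class as the positive class (and reporting IoU for that class only) is an assumption.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute OA, BA, F1, KC and IoU from binary confusion-matrix counts.

    The flooded class is treated as the positive class (an assumption).
    """
    n = tp + fn + fp + tn
    oa = (tp + tn) / n
    recall = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)     # true negative rate
    ba = (recall + specificity) / 2
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    # Agreement expected by chance, from the row and column totals.
    pe = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kc = (oa - pe) / (1 - pe)
    return {"OA": oa, "BA": ba, "F1": f1, "KC": kc, "IoU": iou}


def auc_from_scores(labels, scores):
    """AUC as the share of (flooded, non-flooded) pairs ranked correctly.

    Tied scores count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
auc = auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
print(metrics, auc)
```

BA and KC both correct for class proportions, which is why they are more informative than OA when the flooded and non-flooded samples are unevenly represented.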

Results
All models implemented in this study have several hyperparameters that need to be tuned. Candidate values were chosen based on prior knowledge, and the optimal values were selected by trial and error (Table 3).

Variable Dependency
This study used 21 flood conditioning factors for FSM. Figure 6 shows the correlation between variables based on the Pearson Correlation Coefficient (PCC) for the Karun and Gorganrud basins. Based on the results, the correlation between most pairs of variables was low (under 0.5). However, the correlation between closely related variables (e.g., the TRI and Slope factors) was more than 0.7.
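The pairwise screening described above can be sketched as follows. The factor values are hypothetical, the function names are illustrative, and comparing the absolute value of the coefficient against the threshold is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.7):
    """List the factor pairs whose |PCC| exceeds the threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) > threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical factor values at five sample points.
factors = {
    "Slope": [2.0, 5.0, 9.0, 14.0, 20.0],
    "TRI": [1.0, 2.4, 4.6, 7.1, 9.8],
    "TWI": [9.0, 4.0, 8.0, 3.0, 7.0],
}
pairs = correlated_pairs(factors)
print(pairs)
```

With these made-up values only the Slope and TRI pair exceeds the 0.7 threshold, mirroring the kind of relationship reported for those two factors.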
A variance inflation factor (VIF) was employed to calculate the degree of multicollinearity among multiple predictor factors. The result of the VIF for the Karun and Gorganrud basins is shown in Figure 7. VIF values equal to 1, between 1 and 5, and more than 5 indicate no correlation, moderate correlation, and high correlation between variables, respectively.
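For reference, the VIF of factor $j$ is conventionally defined from the coefficient of determination $R_j^2$ obtained by regressing that factor on all the other factors. The symbols $j$ and $R_j^2$ are introduced here for illustration:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
R_j^2 = 0 \;\Rightarrow\; \mathrm{VIF}_j = 1,
\qquad
R_j^2 = 0.8 \;\Rightarrow\; \mathrm{VIF}_j = \frac{1}{0.2} = 5 .
```

Under this definition, the threshold of 5 corresponds to a factor for which more than 80% of the variance is explained by the remaining factors.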
For the Karun basin, Table 4 presents the statistical indices used to compare the accuracy of the different models. The ensemble learning models (e.g., RF and the gradient boosting models) provided higher performance, while non-ensemble models, such as DNN and SVM, had relatively lower accuracy. Among the ensemble learning methods, the proposed CFM achieved the highest performance considering all accuracy indices. CFM resulted in an OA of 94.04%, which was higher than that of SVM, RF, DNN, DT, LightGBM, CatBoost, and XGBoost by 8.70%, 1.06%, 6.32%, 3.79%, 0.40%, 0.46%, and 0.46%, respectively. Furthermore, the proposed CFM improved the F1-Score by approximately 0.76% to 12% compared to the other models.
Figure 10 illustrates the ROC curves of the FSM models. The results indicated that the CFM achieved the highest AUC value (0.97). The CatBoost and XGBoost models ranked second and the LightGBM model third, all with AUC values of approximately 0.96. The AUC of the other models was lower than 0.96.

FSM of the Gorganrud Basin
The results of the FSM models for the Gorganrud basin are shown in Figure 11. Overall, the west side of the Gorganrud basin falls within the highly flood-prone zones, whereas the east side of the basin is classified as a region of low flood susceptibility.
The accuracy assessment of the FSM models in the Gorganrud basin is shown in Table 5. The proposed CFM had the best performance considering all indices. For instance, the OA of the CFM was 92.40%, which is 3.13%, 0.89%, and 2.01% higher than that of the XGBoost, CatBoost, and LightGBM models, respectively. CFM also improved on the remaining models: it outperformed the SVM, RF, DNN, and DT methods by more than 1.57%, 3.46%, 5.7%, and 4.8% in terms of the OA index, respectively.
The confusion matrices of the different FSM models over the Gorganrud basin are shown in Figure 12.
The FSM results over the Gorganrud basin were evaluated using the ROC curve for all models (Figure 13). Overall, the CFM (AUC = 0.954) and CatBoost (AUC = 0.959) models were the most efficient ones. The DT model provided the lowest performance among the FSM models, with AUC = 0.876. The remaining five FSM models (SVM, RF, DNN, XGBoost, and LightGBM) also had a lower AUC than CFM.

Accuracy of the Proposed CFM
This study has introduced an advanced machine learning model for FSM. The accuracy of the proposed CFM was compared with that of seven other FSM models over two different study areas. The results showed that CFM had better performance, with an OA of more than 92%. Furthermore, CFM showed higher accuracy in identifying both flood and non-flood classes. Thus, the proposed model could be more robust when unbalanced samples are employed for FSM. One reason for the accuracy improvement was that the proposed model is built from a set of random forests arranged in different layers. Another reason was that this model could adapt itself to all the included features. Since the current study utilized a high number of features, the accuracy of the model increased compared to other models.
Recently, many studies have focused on machine learning-based FSM. For instance, Arabameri et al. [58] evaluated the performance of XGBoost, SVM, RF, and logistic regression for FSM. Their results showed that XGBoost provided more promising results than the other models. Furthermore, Kaiser et al. [59] evaluated the performance of Gradient Boosting Decision Tree, RF, and CatBoost models. Their results showed that CatBoost had more reliable results. Similarly, in the current study, the FSM results for the two study areas showed that CatBoost and XGBoost were among the best-performing comparison models.
It is worth noting that there is a tradeoff between the flooded and non-flooded classifications when using machine learning models. Based on the confusion matrices presented in Figures 9 and 12, some differences between the flooded and non-flooded classes were observed. For instance, a comparison between Figure 12e,h showed that the LightGBM and CFM models did not perform equally across the two classes. The proposed method had a high level of effectiveness in both the flooded and non-flooded classes.
The complexity of the study area could negatively affect the FSM results produced by machine learning models. For example, most models had lower accuracies in the Gorganrud basin than in the Karun basin. However, the accuracy of CFM was high over both study areas. Thus, the proposed CFM could provide robust and better results in various study areas with different characteristics.

Flood Susceptible Areas
A flood susceptibility map can help decision makers identify flood-prone areas. Figure 14 shows some of the flood-susceptible areas in the Karun basin. As can be seen, the flood inventory points coincide with areas that have a high flood susceptibility index. In the Karun basin, flood-prone areas are mostly located in the southwestern part because the values of the important factors in this area are in line with higher flood susceptibility. For instance, the values of elevation (DEM), distance to stream (HOFD), and slope are low, and the values of the wetness index (TWI) and catchment area (MCA) are high. The same is true for the other regions shown in the figure, such as the northern and eastern parts.

Figure 15. Examples of highly susceptible areas in the Gorganrud basin.

Feature Selection
This study applied the HHO-based optimization framework for feature selection. In this regard, the HHO algorithm and CFM were used together for feature selection using the Karun sample dataset. The result of the feature selection process indicated that the highest accuracy can be obtained by using 12 optimal features. After removing three features, namely Cross Sectional Curvature, Curve Number (CN), and Land Use/Land Cover (LULC), from the input dataset, the OA (93.84%) remained almost unchanged: eliminating these features changed the OA by only 0.2%. Therefore, it can be concluded that these features do not play a key role in achieving promising results.

Model Generalization
Training samples are essential for machine learning-based FSM models but might not be available in all areas. Thus, utilizing a pre-trained model can be a reasonable solution in such conditions. In this study, we evaluated the generalization of the proposed CFM by employing a trained model in two scenarios: (1) training the model using the training samples of the Karun basin and evaluating the accuracy of the CFM using the test samples of the Gorganrud basin, and (2) training the model using the training samples of the Gorganrud basin and evaluating the accuracy of the CFM using the test samples of the Karun basin.
The quantitative results of the generalization of the different models for the first and second scenarios are provided in Tables 6 and 7, respectively. Overall, the performance of most models, including the proposed CFM, was better in the second scenario than in the first. The CFM provided acceptable FSM results, with an OA of more than 79%. It is worth noting that both study areas have similar topographic, geomorphologic, and LULC conditions. Moreover, most of the investigated factors related to topography and geomorphology have a similar range of values in the two areas (Table 2). Therefore, the proposed method provided efficient and robust FSM results.

Dimension Reduction Impact on FSM
Feature selection is a dimension reduction technique that selects the informative features based on different approaches. The filter-based feature selection approach evaluates the features using a similarity metric such as the Pearson Correlation Coefficient (PCC). This approach was used to detect highly correlated features with a threshold of 0.7. Consequently, some features, namely TRI, SPI, Curvature, VOFD, VD, and LSF, were eliminated from the dataset. Table 8 shows the performance of the different models over the Karun basin after the dimension reduction process.
After implementing the dimension reduction analysis and eliminating the correlated features, the accuracy of all the models increased. By comparing Tables 4 and 8, it is concluded that the SVM method experienced a noticeable accuracy improvement because feature dependency affects the performance of the SVM algorithm. In contrast, tree-based algorithms treat the features as independent variables; therefore, dimension reduction does not greatly increase the accuracy of such approaches. However, it reduces the computational cost of all the applied approaches by decreasing the number of features from 21 to 15.
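A filter of this kind can be sketched as a greedy pass over the features that keeps a feature only if it is not highly correlated with any feature already kept. The correlation values below are hypothetical, and the rule for deciding which member of a correlated pair is dropped (here, the one that comes later in the list) is an assumption, since the text does not specify it.

```python
def drop_correlated(names, corr, threshold=0.7):
    """Greedy filter: keep a feature only if |r| <= threshold with all kept features.

    corr maps a frozenset of two feature names to their correlation.
    """
    kept = []
    for name in names:
        if all(abs(corr[frozenset((name, k))]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical correlation values for three of the factors named in the text.
names = ["Slope", "TRI", "TWI"]
corr = {
    frozenset(("Slope", "TRI")): 0.95,
    frozenset(("Slope", "TWI")): -0.30,
    frozenset(("TRI", "TWI")): -0.25,
}
kept = drop_correlated(names, corr)
print(kept)
```

Because the pass is order-dependent, listing the features by expected importance first means that the less important member of each correlated pair is the one removed.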

Conclusions
FSM is one of the most important steps in preventing damage from flood events. In this study, the effectiveness of a new advanced machine learning method, CFM, was investigated for FSM. We employed two large-scale datasets to evaluate the performance of the CFM. Additionally, 21 flood conditioning factors were generated to obtain high accuracy. In the Karun basin, flood-prone areas were mostly located in the southwestern part. The flood-prone areas of the Gorganrud basin were located in the western zone as well as some parts of the central and eastern zones. For both study areas, the flood inventory points coincide with areas of high flood susceptibility. The accuracy of CFM was compared with that of seven conventional and advanced machine learning models. The proposed CFM outperformed the other models in both study areas, with an OA of more than 92%. Furthermore, the generalization assessment showed that the CFM had a higher generalization capability than the other models. In this study, we set the model hyperparameters by trial and error. Future studies should use automatic methods, such as the Genetic Algorithm (GA), HHO, and Particle Swarm Optimization (PSO), for tuning the hyperparameters of the models.