Flood Detection and Susceptibility Mapping Using Sentinel-1 Remote Sensing Data and a Machine Learning Approach: Hybrid Intelligence of Bagging Ensemble Based on K-Nearest Neighbor Classifier

Mapping flood-prone areas is a key activity in flood disaster management. In this paper, we propose a new flood susceptibility mapping technique. We employ new ensemble models based on bagging as a meta-classifier and K-Nearest Neighbor (KNN) coarse, cosine, cubic, and weighted base classifiers to spatially forecast flooding in the Haraz watershed in northern Iran. We identified flood-prone areas using data from the Sentinel-1 sensor. We then selected 10 conditioning factors to spatially predict floods and assessed their predictive power using the Relief Attribute Evaluation (RFAE) method. Model validation was performed using two statistical error indices and the area under the curve (AUC). Our results show that the Bagging–Cubic–KNN ensemble model outperformed other ensemble models. It decreased the overfitting and variance problems in the training dataset and enhanced the prediction accuracy of the Cubic–KNN model (AUC = 0.660). We therefore recommend that the Bagging–Cubic–KNN model be more widely applied for the sustainable management of flood-prone areas.
In the test step, the MSEs and RMSEs of the Cubic–KNN, Coarse–KNN, Cosine–KNN, Weighted–KNN, and Bagging-Tree models are, respectively, 0.0396 and 0.1989, 0.0682 and 0.2611, 0.0682 and 0.2611, 0.0568 and 0.2384, and 0.0454 and 0.2132. These results suggest that the Cubic–KNN model performed best in the test step (mean = −0.0324 and standard deviation = 0.1966).
… that have a relationship to flooding; we refer to these as conditioning factors. We used Sentinel-1 remote sensing radar data to identify and map flood locations in the Haraz watershed in northern Iran. We used 10 flood conditioning factors and 201 flood locations as our model inputs. Eight new hybrid models (Cubic–KNN, Bagging Tree–Cubic KNN, Coarse–KNN, Bagging Tree–Coarse–KNN, Cosine–KNN, Bagging Tree–Cosine–KNN, Weighted KNN, and Bagging Tree–Weighted KNN) were created to analyze and map flood susceptibility.
Results based on the relief attribute evaluation metric indicate that distance from the river and slope gradient are the two most important factors for flood occurrence in the Haraz watershed. Among the eight models, we found that the Bagging Tree–Cubic KNN model has the highest predictive power.


Introduction
Increases in global flood occurrences have been attributed to deforestation, land-use changes, poor watershed management, and climate change [1][2][3]. Floods happen when streams overflow their banks, often as a result of heavy rainfall, and inundate surrounding areas that are not typically covered by water [4]. Floods can damage roads, rail lines, agriculture, and ecosystems, claim lives, and pollute surface water through the transfer of biological and industrial waste [5][6][7][8]. More than 20,000 lives are lost to flooding annually [9], and between 1995 and 2015, approximately 109 million people were affected by flood damage, with direct costs of USD 75 billion per year [10].
Iran is an arid and semiarid country that is prone to damaging floods, especially in its northern provinces. Between 25 March and 8 April 2019, for example, a devastating flood impacted more than 25 of the 31 provinces in the country. Damage was exacerbated by heavy rainfall, poor watershed management, inadequate flood control structures, and a lack of a flood warning system. Maps of flood hazard and risk derived from physical models that only predict peak discharge may be subject to considerable uncertainty and error [11], and numerical models require large amounts and types of data that are difficult to acquire in a developing country like Iran. Fortunately, over the past several decades, remote sensing (RS) and Geographic Information Systems (GIS) have been shown to be effective in handling large hydrological datasets to create more accurate flood hazard maps.
Our study focuses on the Haraz catchment in northern Iran (Figure 1). This catchment has a wetter climate, more cloudy days, and denser vegetation than other parts of Iran, making flood susceptibility mapping based on optical remote sensing imagery more challenging than in regions with little vegetative cover and fewer cloudy days. In such areas, satellite-based synthetic aperture radar (SAR) and light detection and ranging (LiDAR) penetrate clouds and detect the ground surface and surface water; they are valuable tools for real-time flood forecasting [12,13]. SAR can collect data during day or night, either independently or together with other remote sensors [14]. In this study, we used imagery acquired by Sentinel-1, a SAR satellite known for its high spatial resolution and short repeat cycles, which makes it ideal for monitoring changes in flood inundation [15].
Several data-driven models have been developed and used for flood mapping, including bivariate models of frequency ratio [16,17], Shannon entropy [18], weight of evidence (WOE) [11], and the evidential belief function (EBF) [16]. In addition, a variety of multivariate methods have been used in flood hazard studies, notably logistic regression [19,20] and multicriteria decision-making (MCDM) methods such as the analytic hierarchy process (AHP) [21][22][23], the analytic network process (ANP) [24], VlseKriterijumska Optimizacija I Kompromisno Resenje (VIKOR), and the technique for order preference by similarity to ideal solution (TOPSIS) [25]. Unfortunately, many of these models have performance limitations in that they do not incorporate nonflood locations and generally consider only sum weights or class weights rather than weights for specific layers [26]. Additionally, MCDM models are based on expert opinion, which is a major source of bias and error [25,27]. Finally, flooding at a watershed scale is a complex phenomenon, involving nonlinear processes that cannot be predicted using these simple models.
Recently, artificial intelligence (AI) algorithms have been developed to overcome these weaknesses. The artificial neural network (ANN) is the most widely used algorithm in hydrology [28,29], but it has poorer predictive power when the range of the testing dataset is not within the range of the training dataset [30][31][32][33]. To improve its predictive power, researchers have integrated the ANN model with fuzzy logic (FL) and adaptive neuro-fuzzy inference system (ANFIS) models. Although ANFIS is a powerful algorithm and has higher predictive power than both ANN and FL, its membership function fails to adequately determine optimum weights [34,35]; hence, optimization algorithms have been applied to calculate optimum values automatically [8,9,36,37].
Further developments in hazard modelling have relied on hybrid algorithms. Within this group are machine learning ensemble models, which are more flexible and better suited for sophisticated flood modeling than the above-mentioned methods. Machine learning ensemble models have been shown to provide better hazard predictions for floods [8,9,16,25,38–42], wildfires [43,44], sinkholes [45], droughts [42,46], earthquakes [47,48], gully erosion [49,50], ground subsidence [51], groundwater [52][53][54][55][56], and landslides [15,55,…]. Nevertheless, there still is no universal model that has been shown to be superior in all study areas [35]. In this paper, we develop and test four new algorithms of K-Nearest Neighbor (KNN), a machine learning ensemble method that has not previously been used for flood ensemble modeling. The four algorithms are Cosine KNN, Coarse KNN, Cubic KNN, and Weighted KNN. We compare the performance of the four KNN algorithms with those of Bagging Tree models and a hybrid of KNN and bagging.

Description of Study Area
The Haraz watershed is located in Mazandaran Province in northern Iran (Figure 1). The 4015 km² watershed is mountainous, ranges in elevation from 328 m to 5595 m asl, and has cold winters and mild humid summers with mean annual rainfall of 430 mm [16]. Factors that contribute to flooding here include rainfall, deforestation, land-use changes, and inadequate flood management policies [53]. GIS data show that slopes in the watershed range up to 66°, with 5% flat terrain and 95% hilly and mountainous terrain [8]. Most of the study area (92%) is rangeland. The ground is rocky and predominantly developed on Jurassic formations [16]. Haraz has a long history of catastrophic flooding. In April 2019, floods in Mazandaran Province killed six people, damaged more than 200 villages, and caused USD 166.4 million in damage to agriculture [16]. Thus, there is a pressing need for more reliable flood hazard maps for this area.

Methodology
The flowchart for the methodology used in this study is shown in Figure 2. The workflow includes: (1) data collection and preparation, which involves determining appropriate conditioning factors (factor ranking and selection); (2) preparation of a flood inventory map; (3) modeling flood susceptibility with KNN functions and their ensembles using the Bagging Tree algorithm; (4) preparation of flood susceptibility maps; and (5) validation and comparison of the models and flood susceptibility maps using training (goodness-of-fit) and validation (prediction accuracy) datasets.

Flood Inventory Map
We mapped flooded areas using Sentinel-1 images, remote sensing data and field surveys. In this study, a flood inventory was assembled based on flood events in 2008, 2012, 2016, and 2017. We also used flood event data collected by the Mazandaran Regional Water Authority (MRWA), aerial photographs, Google Earth, and field surveys. To prepare our flood map, we chose 201 flood points and 201 nonflood points, of which we used 70% for training (141 points) and 30% for validation (60 points). Both flood and nonflood points are needed for flood susceptibility modelling [11,79].
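The 70/30 partition described above can be illustrated with a short sketch. This is not the procedure used in the study; it assumes the split is made separately within the flood and nonflood classes (which reproduces the 141/60 counts per class), and the function name `stratified_split`, the integer point IDs, and the fixed seed are hypothetical choices made for illustration.

```python
import random


def stratified_split(point_ids, labels, train_fraction=0.7, seed=0):
    """Split point IDs into training and validation lists, class by class."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, valid = [], []
    for cls in sorted(set(labels)):
        members = [p for p, y in zip(point_ids, labels) if y == cls]
        rng.shuffle(members)
        n_train = round(train_fraction * len(members))
        train += [(p, cls) for p in members[:n_train]]
        valid += [(p, cls) for p in members[n_train:]]
    return train, valid


# Hypothetical inventory: IDs 0-200 are flood points (1), 201-401 nonflood (0).
point_ids = list(range(402))
labels = [1] * 201 + [0] * 201
train, valid = stratified_split(point_ids, labels)
print(len(train), len(valid))
```

Splitting within each class keeps the flood/nonflood ratio identical in the training and validation sets, which a single random split over all 402 points would not guarantee.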

Flood Conditioning Factors
A variety of flood conditioning factors should be tested in flood susceptibility modelling [11]. We chose the following 10 conditioning factors (Table 1) for our study [80] and mapped them at 30-m spatial resolution [30]: distance to river, elevation, slope, lithology, curvature, rainfall, topographic wetness index (TWI), stream power index (SPI), land use/land cover, and river density. We quantified topographic and hydrological factors using an Advanced Space-borne Thermal Emission and Reflection Radiometer (ASTER) DEM. Relevant details for the 10 conditioning factors are described below and in Table 1:

Slope
Higher slope angles increase water velocity and surface runoff [81] and reduce infiltration. Lower slope angles are associated with greater flood depths [82]. We classified slope angle based on the manual classification method into five categories:

Curvature
Water flow is affected by slope curvature [84]. A zero curvature value generally has more potential for flooding than positive and negative curvature values. Most flood-prone areas in the Haraz watershed have zero curvature values associated with flat landforms. We classified curvature using the natural breaks classification method and defined three categories: convex (negative values), flat (zero value), and concave (positive values).

Stream Power Index
Stream power index (SPI), which is a measure of the erosive power of water flow, is defined by the following equation [85]:
$$\mathrm{SPI} = A_S \tan\beta \tag{1}$$
where $A_S$ is the specific catchment area in m²/m and $\beta$ is the slope angle in degrees. SPI is related to fluvial processes such as sediment transport and river channel erosion [86]. Fuller [87] found that a high SPI value in confined channels can lead to severe channel transformation. It is generally accepted that an increase in SPI corresponds to an increased likelihood of flooding. We classified SPI using the manual classification method with nine categories: 0-80, 80-400, 400-800, 800-2000, 2000-3000, and >3000.

Topographic Wetness Index
The topographic wetness index (TWI) is a measure of the tendency for water to accumulate at any location within a catchment under the influence of gravity and is an important attribute in flood susceptibility mapping [87][88][89][90][91]. It generally reflects spatial soil moisture patterns related to floodplains [90]. Moore et al. [87] proposed the following equation to calculate TWI:
$$\mathrm{TWI} = \ln\left(\frac{A_S}{\tan\beta}\right) \tag{2}$$
where $A_S$ and $\beta$ are the specific catchment area and slope angle defined above. We classified TWI using the natural breaks classification method with six categories: 1.9-3.94, 3.95-4.47, 4.48-5.03, 5.04-5.71, 5.72-6.96, and 6.97-11.53.
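As an illustration of the two terrain indices, the following sketch evaluates them for a single raster cell. It assumes the commonly used forms SPI = A_S tan β and TWI = ln(A_S / tan β); the function names and the minimum-slope floor (`min_slope_deg`, which avoids division by zero on perfectly flat cells) are assumptions made here and are not taken from the study.

```python
import math


def stream_power_index(specific_area, slope_deg):
    """SPI = A_s * tan(beta), with beta given in degrees."""
    return specific_area * math.tan(math.radians(slope_deg))


def topographic_wetness_index(specific_area, slope_deg, min_slope_deg=0.1):
    """TWI = ln(A_s / tan(beta)); slope is floored to avoid division by zero."""
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_area / math.tan(beta))


# Hypothetical cell: 250 m^2/m specific catchment area on a 5-degree slope.
print(stream_power_index(250.0, 5.0))
print(topographic_wetness_index(250.0, 5.0))
```

For a fixed contributing area, a steeper slope raises SPI and lowers TWI, which is consistent with flat, convergent cells being the wettest.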

Lithology
Lithology can affect flooding through the differences in permeability of rocks and sediments [17]. We obtained a geology layer in GIS shapefile format, which was originally prepared by the Iran Geological Survey Department, from the Mazandaran Regional Water Organization. We created three geologic units: Paleozoic rocks (4.7% of watershed), Mesozoic rocks (56.4%), and Cenozoic rocks and sediments (38.9%).

Rainfall
Rainfall has an obvious and direct effect on flood occurrence [9,16,17,37,92] and, for flood susceptibility mapping, is most commonly expressed as annual rainfall [93]. We quantified the rainfall factor based on 20 years of precipitation data (1991-2011) from 17 stations inside and outside the study area. We selected a simple kriging method to create the rainfall layer because it produced the lowest root mean square error (RMSE) and mean absolute error (MAE) [2]. We divided the rainfall layer into six classes: 183-333, 334-379, 380-409, 410-448, 449-535, and 536-741 mm [86].

Land Use/Land Cover
Land use/land cover has an important role in flooding. For example, runoff increases when vegetated land is converted to bare land [94]. We extracted land use/land cover from the operational land imager (OLI) of Landsat 8 scenes acquired in 2013 using the land-use unit classification method in ArcGIS 10.3 and supervised classification in Environment for Visualizing Images (ENVI 5.1) software. Our seven land use/land cover classes are: water bodies, residential areas, grassland, garden, farm land, forest land, and barren land.

River Density
River density is a measure of the number of streams and rivers in an area. If all other conditioning factors are constant, high river densities have a higher potential for flooding than low river densities [8]. We classified river density using the natural breaks classification method and defined six categories: 0-0.401, 0.401-1.17, 1.17-1.92, 1.92-2.67, 2.67-3.66, and 3.66-7.3 km/km².

Distance to River
Distance to river (i.e., distance of the measurement points from the river) plays a major role in the distribution and magnitude of floods in the study area [95]. The shorter the distance, the higher the probability of flooding, especially where the river has a low storage capacity [96,97]. To create the distance-to-river layer, we edited the digital watershed map using the multi-ring buffer command in ArcGIS 10.3. Generally, low infiltration rates in the Haraz watershed result in rapid runoff in the vicinity of rivers during high-intensity rainfall events, which in turn causes catastrophic flooding in areas with low topographic gradients [28]. We divided distances to river into eight classes: 0-50, 50-100, 100-150, 150-200, 200-400, 400-700, 700-1000, and >1000 m.
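The manual classification of a continuous factor into classes can be sketched as a lookup against the class breaks. The example below uses the eight distance-to-river classes listed above; the function name and the convention that a value lying exactly on a break falls into the higher class are assumptions, since the class limits as written share their boundary values.

```python
from bisect import bisect_right

# Upper limits (m) of classes 1-7; class 8 is everything beyond 1000 m.
DISTANCE_BREAKS_M = [50, 100, 150, 200, 400, 700, 1000]


def distance_class(distance_m):
    """Return the class index (1-8) for a distance to river in metres."""
    # Assumption: a distance exactly on a break is assigned to the higher class.
    return bisect_right(DISTANCE_BREAKS_M, distance_m) + 1


# Hypothetical distances for four measurement points.
print([distance_class(d) for d in (12.0, 175.0, 650.0, 2400.0)])
```

The same lookup applies to the other manually classified factors by swapping in their break values.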

Detection of Flood-Prone Area by Sentinel-1
Sentinel-1 is the first satellite constellation of the European Space Agency's Copernicus Programme and comprises two satellites that share the same orbital plane: Sentinel-1A and Sentinel-1B. They carry a C-band (approximately 5.5 cm wavelength) synthetic aperture radar instrument, which collects data in all weather, day or night. The radar has four different operational modes: strip map (SM), wave (WV), interferometric wide swath (IW), and extra wide swath (EW). Its main drawback is that radar waves cannot penetrate dense vegetation [98].
The backscatter signal from inundated areas is identifiable in Sentinel-1 SAR data products, which are freely available through the Sentinel Scientific Data Hub (scihub.copernicus.eu). Because of specular reflection, the C-band backscatter over flooded areas is significantly lower than over bare ground. In the present study, Sentinel-1 Level-1 Ground Range Detected (GRD) data were projected onto the ground using an Earth ellipsoid model (WGS84). Finally, we used Sentinel-1 SAR data to identify and map flooded areas using the InSAR method [99][100][101].

Data Preprocessing and Processing
The process of flood detection using Sentinel-1 data includes the following steps:
Step 1: Radar data acquisition. We used the Sentinel Application Platform (SNAP) to manipulate radar data, as well as threshold data acquired during the flood (Table 2).
Step 2: Radar data preprocessing. We coregistered radar images using the coherence between master and slave images [102]. We selected two images from 05/10/2016 and 23/11/2017 as the master images. We combined a split sub-swath and applied the orbit file technique to extract the boundary of the study area. We then overlaid the coregistered radar data. Next, we enhanced the spectral resolution of the radar images using a spectral diversity technique.
We produced an interferogram by multiplying the value of each pixel in the master image by the complex conjugate of the corresponding pixel in the slave image [102,103]. To detect flood-prone areas, we applied the interferogram formation technique to pre- and post-flood data.
We identified zones of terrain observation progressive scan (TOPS) data [104]. Data within these zones were considered to be invalid and thus were removed. Removal of the topographic phase provided an interferogram [102,104,105] that allowed us to specify nonflood-prone areas. Finally, we used phase filtering to detect flood-prone areas.
Step 3: Radar data processing. We used the output from step 2 as input for processing the digital images with SNAPHU and ENVI 5.1 software. We used ArcGIS 10.3 to analyze spatial data (Figure 2). We viewed the phase, unwrapped, and coherence bands in Google Earth to identify and record historical flood locations. We used a handheld GPS in the field to validate the extracted flood-prone locations, 40% of which were near the main rivers. Finally, we verified the accuracy of Google Earth images and the radar data, and vectorized points using ArcGIS 10.3 software. For georeferencing, we employed ground control points (GCPs), nearest neighbor resampling, and a first-order transformation (Figure 3).
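The interferogram formation in Step 2 (master pixel multiplied by the complex conjugate of the slave pixel) can be sketched on a handful of complex pixel values. This is only an illustration of the arithmetic, not of the SNAP processing chain; the coherence estimator shown is the standard normalized cross-correlation magnitude, which is assumed here because the text does not give a formula, and the pixel values are hypothetical.

```python
import cmath


def interferogram(master, slave):
    """Pixel-wise product of the master with the complex conjugate of the slave."""
    return [m * s.conjugate() for m, s in zip(master, slave)]


def coherence(master, slave):
    """Magnitude of the normalized complex cross-correlation over a window."""
    cross = sum(interferogram(master, slave))
    power = sum(abs(m) ** 2 for m in master) * sum(abs(s) ** 2 for s in slave)
    return abs(cross) / power ** 0.5 if power > 0 else 0.0


# Hypothetical 4-pixel window: the slave is the master with a 0.5 rad phase shift.
master = [complex(1, 0), complex(0, 2), complex(-1, 1), complex(3, -1)]
slave = [m * cmath.exp(0.5j) for m in master]
phases = [cmath.phase(z) for z in interferogram(master, slave)]
print(phases, coherence(master, slave))
```

A window whose phase difference is constant gives a coherence near 1, whereas a surface that changed between acquisitions, such as newly inundated ground, decorrelates and gives values near 0.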

K-Nearest Neighbor Classifier
K-Nearest Neighbor (KNN) is a common classification tool used in data mining applications [106]. It is a nonparametric, lazy learning algorithm that makes no assumptions about the primary dataset. This is important when modeling hydrological processes, such as floods and stream flow, for which there is little or no prior knowledge of the data distribution [107]. In addition, these processes are nonlinear and heterogeneous with noisy data that challenge common statistical assumptions such as those underpinning linear regression models [108]. In this context, KNN is a useful tool as it uses all contributing cases in the dataset and classifies new cases based on their similarity indices (also called 'distance functions'). Cases are classified by voting for neighbor classes. The optimal case is the one with the highest similarity indices [109].
In KNN, the optimal number of neighbors (K) depends on the metrics used for classification and regression purposes. In the case of continuous variables, the most common distance metric is Euclidean distance, also known as the straight-line distance. Conversely, for discrete variables, the overlap metric (or Hamming distance) is frequently used. Other metrics that have been used are correlation coefficients, such as the Pearson and Spearman correlation coefficients. The K value is sensitive to the chosen dataset and differs between datasets. Based on an empirical rule of thumb introduced by Duda [110,111], K is equal to the square root of the number of samples; this makes parameter tuning difficult for diverse applications.
There are other popular methods, such as K-fold Cross-Validation (CV), Leave-One-Out Cross-Validation (LOOCV), and bootstrapping. K-fold Cross-Validation can be used to evaluate the test error of a statistical learning method. This approach places randomly chosen sets of observations into K folds of equal size. In contrast, LOOCV does not use sets of equal size; rather, it employs a single observation for the validation set and the remaining observations for the training set. We use these two methods as well as bootstrapping to measure the accuracy of our statistical learning approach. However, the K-fold Cross-Validation method is preferred for the following reasons [112]:

2. The K-fold CV offers a greater computational advantage than other methods.
3. The K-fold CV yields more accurate estimates of the test error than bootstrapping and LOOCV.
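The difference between the two resampling schemes can be sketched by generating the index sets each one produces. The function names and the fixed seed are hypothetical, and when the number of samples is not divisible by K the folds here differ in size by at most one observation.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index lists for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment to folds
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def loocv_indices(n_samples):
    """Yield (train, validation) index lists with one held-out observation."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], [i]


# Hypothetical dataset of 10 observations split into 5 folds.
for train_idx, valid_idx in k_fold_indices(10, 5):
    print(sorted(valid_idx), len(train_idx))
```

K-fold CV fits the model K times, whereas LOOCV fits it once per observation, which is the source of the computational advantage noted above.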
With KNN, the training phase is short and fast. All training data are required during the testing phase to decide on the best subset of the entire training dataset. This method has been used in diverse applications such as big data classification, pattern recognition, ranking models, and computational geometry [106].
The KNN algorithm takes a vector as input and finds its K nearest neighbors in the training dataset. It then assigns the most common class among those K neighbors to the input. During the training phase, neighbors are defined based on their distances from the test dataset; the classes of the test dataset are determined in the testing phase [4]. The number of neighbors can be changed to determine the best performance of the KNN algorithm. Four KNN classifiers introduced by MATLAB [113] are considered here:

1. Coarse KNN: The number of neighbors is 100. The classifier is defined as the nearest neighbor among all classes.
2. Cosine KNN: The nearest neighbor classifier uses the cosine distance metric. It is generally used as a metric for distances when vector magnitudes are irrelevant. The following equation is used to measure the distance between two vectors, u and v [113]:
$$d(u, v) = 1 - \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert} \tag{3}$$
3. Cubic KNN: The number of neighbors is 10, and the nearest neighbor classifier uses the cubic distance metric [109]. The following equation is used to measure the distance between two n-dimensional vectors, u and v:
$$d(u, v) = \left(\sum_{i=1}^{n} \lvert u_i - v_i \rvert^{3}\right)^{1/3} \tag{4}$$
4. Weighted KNN: The number of neighbors is 10, and the nearest neighbor classifier uses the weighted Euclidean distance. The following equation is used to measure the weighted Euclidean distance between two n-dimensional vectors, u and v:
$$d(u, v) = \sqrt{\sum_{i=1}^{n} w_i \left(u_i - v_i\right)^{2}} \tag{5}$$
where $0 < w_i < 1$ and $\sum_{i=1}^{n} w_i = 1$.
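The cosine, cubic, and weighted Euclidean distances described in the list above can be sketched together with a plain majority-vote KNN classifier. This is not MATLAB's implementation; the function names, the toy two-factor data, the equal weights, and the small K used in the demonstration are all assumptions made to keep the example self-contained.

```python
import math
from collections import Counter


def cosine_distance(u, v):
    """One minus the cosine of the angle between u and v (nonzero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def cubic_distance(u, v):
    """Minkowski distance with exponent 3."""
    return sum(abs(a - b) ** 3 for a, b in zip(u, v)) ** (1.0 / 3.0)


def weighted_euclidean(u, v, w):
    """Euclidean distance with one weight per dimension."""
    return math.sqrt(sum(wi * (a - b) ** 2 for wi, a, b in zip(w, u, v)))


def knn_predict(train_x, train_y, query, k, distance):
    """Return the most common class among the k nearest training points."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: two conditioning factors, 0 = nonflood, 1 = flood.
train_x = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 1.0), (5.2, 1.1), (4.8, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]
equal_w = lambda u, v: weighted_euclidean(u, v, (0.5, 0.5))  # assumed weights
for name, dist in [("cosine", cosine_distance), ("cubic", cubic_distance), ("weighted", equal_w)]:
    print(name, knn_predict(train_x, train_y, (5.0, 1.05), k=3, distance=dist))
```

Only the distance function changes between the variants; the voting step is identical, which is why the presets can be compared on the same training data.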

Bagged Tree Ensemble Algorithm
Ensemble methods apply a variety of decision trees, instead of only one, to improve predictive performance. The two most common techniques used with ensemble models are bagging and boosting [114].
Bagging (Bootstrap Aggregation) improves the precision and consistency of machine learning algorithms used for regression and statistical classification. The purpose of bagging is to decrease variance while retaining the bias of a decision tree and preventing overfitting. The Bagging Tree randomly generates multiple sets of input data from the training samples by sampling with replacement [115]. The chosen subsets are used to train the assigned trees and generate models. Subsequently, the average of all predictions from these trees is used to make the final decision with a higher degree of robustness. The accuracy of a single tree is increased by using multiple copies of the trained subset of data.
Boosting is a useful ensemble method in high-bias situations. Predictors are trained sequentially with simple training models, and the data are then analyzed for errors. At every step, the net error is calculated from the prior decision tree [115]. In a high-bias dataset in which an input is not well classified by a hypothesis, its weight is amplified so that the next hypothesis will classify it properly.
For the present study, we used the Bagging Tree ensemble method on a well classified set of inputs with low bias. The method yields results with a lower variance than its components, which in turn makes the learning procedure more efficient. The best classifier type depends on the training dataset. In the current study, we employed a classifier that provides the optimum tradeoff in memory, speed, interpretability, and flexibility.
We subdivided the dataset into two probable classes and generated a set of continuous classifiers $H_m$, $m = 1, \dots, M$, with $H_m : D_m \to \mathbb{R}$, on a training set (flood collection) $D$. We then grouped the generated classifiers into a composite classifier with a resulting prediction weight as follows:
$$H(d_i) = \operatorname{sign}\left(\sum_{m=1}^{M} \alpha_m H_m(d_i)\right) \tag{6}$$
Equation (6) describes a voting procedure known as majority (plurality) voting for each classifier. Plurality voting efficiently attains the optimum tradeoff in error and rejection rate. An example $d_i$ is classified based on the majority of classifier votes [116][117][118]. The parameters $\alpha_m$, $m = 1, \dots, M$, indicate the impact of more accurate classifiers on the final result. The $H_m$ are termed 'weak classifiers' because their accuracy need only be higher than that of a random classifier [119].
We used the following bagging algorithm in our study [120]: where sign is : We note that to achieve a better performance and decrease the classification error, the H m values can be reformed, while α m values remain constant.

Proposed New Ensemble Machine Learning Models of Bagging with KNNs Functions
We used the Classification Learner application in MATLAB R2018a to automatically train a selection of different KNN classification models on a training dataset. Then we used the Bagging Tree ensemble together with the coarse, cosine, cubic, and weighted KNN base classifiers to spatially predict floods. For a given training set, we produced multiple different training sets ('bootstrap samples') by sampling with replacement from the original dataset. Then, we built a KNN model for each bootstrap sample. The result is an ensemble of models, where each model votes with equal weight. The goal of this procedure is to reduce the variance of the model of interest.
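The bootstrap-and-vote procedure can be sketched in a few lines. This is a simplified illustration rather than the MATLAB workflow used in the study: the base learner is a KNN with the cubic distance, and the function names, the number of models, K, the seed, and the toy data are assumptions.

```python
import random
from collections import Counter


def cubic_distance(u, v):
    """Minkowski distance with exponent 3."""
    return sum(abs(a - b) ** 3 for a, b in zip(u, v)) ** (1.0 / 3.0)


def knn_predict(samples, query, k):
    """Majority class among the k samples closest to the query."""
    ranked = sorted(samples, key=lambda s: cubic_distance(s[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def bagged_knn_predict(samples, query, k=3, n_models=25, seed=0):
    """Equal-weight vote of KNN models, each fitted to one bootstrap sample."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]  # with replacement
        votes[knn_predict(bootstrap, query, k)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical two-factor data: 10 nonflood (0) and 10 flood (1) points.
samples = [((0.1 * i, 0.1 * i), 0) for i in range(10)]
samples += [((5 + 0.1 * i, 5 + 0.1 * i), 1) for i in range(10)]
print(bagged_knn_predict(samples, (0.5, 0.5)), bagged_knn_predict(samples, (5.5, 5.5)))
```

Because each bootstrap sample omits some points and repeats others, the individual KNN models differ slightly, and averaging their votes is what reduces the variance of the final prediction.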

Flood Factor Selection Using the Relief Attribute Evaluation (RFAE) Technique
Supervised machine learning algorithms rely on the selection of the best factors or features to accurately classify sample data and enhance the efficiency of training [121]. The main aims of factor and feature selection are to enhance the learning efficiency of the modelling process and the robustness of predictive accuracy, and to reduce complexity, noise, and overfitting by eliminating irrelevant or low-performing factors [122]. Conditioning factors can be evaluated and categorized based on a variety of metrics, including distance, information, dependency, consistency, and classifier error rate [123]. In this study, we selected the Relief Attribute Evaluation (RFAE) technique to check the importance of conditioning factors on flood classification performance (Figure 3). RFAE is a distance-based attribute/factor ranking approach proposed by Kira and Rendell [124], and later improved by Kononenko [125] and Hall and Holmes [126]. It estimates the quality of each attribute based on the distance between a data point and its nearest neighbors (Figure 4).
First, it randomly selects an instance in the training dataset (Ri in line 3 of Figure 4). Then, it searches for K of its nearest neighbors from the same class, called nearest hits Hj, as well as K nearest neighbors from each of the different classes, called nearest misses Mj(C) (lines 4 and 6, respectively). Depending on the average values of Ri, Hj, and Mj(C) (lines 7, 8, and 9), RFAE updates the quality estimation W[A] for all attributes. W[A] is reduced when instances Ri and Hj have different values of attribute A, because attribute A then separates two instances with the same class value, which is not desirable. If Ri and Mj(C) have different values of attribute A, attribute A separates two instances with different class values, which is desirable, and W[A] is increased. The contribution of each class of misses is weighted by the prior probability of that class, P(C), which is calculated from the training dataset. The contributions of hits and misses should be symmetric and range between 0 and 1.
Because the class of the hits is missing from the sum, each probability weight is divided by the factor 1 − P(class(Ri)) so that the probability weights of the misses sum to 1. This process is repeated m times. The quality of a flood attribute is evaluated based on how well it distinguishes nearby instances. Weights for all attributes are assigned by the ReliefF algorithm through iterative estimation using the nearest hit-and-miss neighbors. Accordingly, an attribute is ranked highest if it takes the same value for instances of the same class and different values for instances of different classes [127,128].
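The weight update described above can be sketched as follows. This is a compact illustration of a ReliefF-style estimator for numeric attributes, not the implementation used in the study; the function name, the range normalization of attribute differences, the default parameters, and the synthetic data (one informative and one uninformative attribute) are assumptions.

```python
import random


def relieff(X, y, n_iterations=40, k=3, seed=0):
    """Return one ReliefF-style quality weight per attribute (column of X)."""
    rng = random.Random(seed)
    n_attr = len(X[0])
    span = []
    for a in range(n_attr):
        column = [row[a] for row in X]
        span.append((max(column) - min(column)) or 1.0)  # guard constant columns

    def diff(a, r1, r2):
        return abs(r1[a] - r2[a]) / span[a]  # normalized to [0, 1]

    def distance(r1, r2):
        return sum(diff(a, r1, r2) for a in range(n_attr))

    classes = sorted(set(y))
    prior = {c: y.count(c) / len(y) for c in classes}
    weights = [0.0] * n_attr
    for _ in range(n_iterations):
        i = rng.randrange(len(X))
        for c in classes:
            pool = [X[j] for j in range(len(X)) if y[j] == c and j != i]
            nearest = sorted(pool, key=lambda r: distance(X[i], r))[:k]
            if not nearest:
                continue
            # Hits lower the weight; misses raise it, scaled by class prior.
            scale = -1.0 if c == y[i] else prior[c] / (1.0 - prior[y[i]])
            for a in range(n_attr):
                mean_diff = sum(diff(a, X[i], r) for r in nearest) / len(nearest)
                weights[a] += scale * mean_diff / n_iterations
    return weights


# Synthetic data: attribute 0 follows the class, attribute 1 is unrelated to it.
labels = [i % 2 for i in range(40)]
X = [(labels[i] + 0.01 * (i % 5), ((7 * i) % 20) / 20) for i in range(40)]
print(relieff(X, labels))
```

In this toy case the attribute that tracks the class receives a weight close to 1 and the unrelated attribute a weight close to 0, which mirrors how the conditioning factors are ranked.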

Evaluation and Comparison
New models should be tested to verify their performance and evaluate their potential applicability in other regions. For the purpose of validation, an objective function ('forecasting error'), such as mean square error (MSE) and root mean square error (RMSE), can be used to find the difference between observed and predicted values. Although there are a variety of error indices that can be used to assess the predictive capability of the models, many studies advocate the use of RMSE as a standard metric for model errors in geosciences [129]. MSE and RMSE can be formulated as follows:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(F_{est,i} - F_{obs,i}\right)^{2}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(F_{est,i} - F_{obs,i}\right)^{2}}$$
where $F_{est}$, $F_{obs}$, and $n$ are, respectively, estimated floods, observed (actual) floods, and the number of floods for the modelling process. In addition to MSE and RMSE, we used accuracy, the receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC) to further evaluate the predictive capability of the models. The accuracy metrics are formulated based on true positive (TP), true negative (TN), false positive (FP), and false negative (FN) values. TP and TN are the numbers of flood and nonflood pixels, respectively, that are correctly classified [37,52]. FP and FN are the numbers of nonflood and flood pixels that are incorrectly classified as flood and nonflood pixels, respectively [16,17]. Accuracy can be formulated as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
The ROC curve has been used in some flood modeling studies to check the overall performance of models [8,39,40,130]. It is plotted using two statistical metrics: 1 − specificity on the x axis and sensitivity on the y axis [63]. Sensitivity is the proportion of flood pixels that are correctly classified, and specificity is the proportion of nonflood pixels that are correctly classified [92]. An AUC equal to 1 indicates that the model is perfect or ideal, whereas a value of 0 indicates an inaccurate model [3]. The AUC is calculated as follows:
$$\mathrm{AUC} = \frac{\sum TP + \sum TN}{M + N}$$
where M and N are the total numbers of flood and nonflood pixels [131].
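The evaluation metrics can be sketched on a small set of predictions. The MSE, RMSE, and accuracy functions follow the definitions given in the text; the AUC function uses the standard rank-based definition (the probability that a randomly chosen flood pixel receives a higher score than a randomly chosen nonflood pixel), which is an assumption and may differ from the formula used in the study. The scores and the 0.5 threshold are hypothetical.

```python
import math


def mse(estimated, observed):
    """Mean of squared differences between estimated and observed values."""
    return sum((e - o) ** 2 for e, o in zip(estimated, observed)) / len(observed)


def rmse(estimated, observed):
    return math.sqrt(mse(estimated, observed))


def accuracy(predicted, observed):
    """(TP + TN) / (TP + TN + FP + FN) for 1 = flood, 0 = nonflood."""
    correct = sum(1 for p, o in zip(predicted, observed) if p == o)
    return correct / len(observed)


def roc_auc(scores, observed):
    """Rank-based AUC: chance that a flood pixel outscores a nonflood pixel."""
    flood = [s for s, o in zip(scores, observed) if o == 1]
    nonflood = [s for s, o in zip(scores, observed) if o == 0]
    wins = sum(1.0 if f > n else 0.5 if f == n else 0.0
               for f in flood for n in nonflood)
    return wins / (len(flood) * len(nonflood))


# Hypothetical susceptibility scores for three flood and three nonflood pixels.
observed = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]
print(mse(scores, observed), rmse(scores, observed))
print(accuracy(predicted, observed), roc_auc(scores, observed))
```

Accuracy depends on the chosen threshold, whereas the AUC summarizes performance over all thresholds, which is why the two can rank models differently.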

Flood Detection Using AIRSAR and Optical Satellite Images
Using the InSAR technique and SNAP software, we generated coherence, unwrapped, and phase bands from Sentinel-1 satellite imagery dating to between 05/10/2016 and 23/11/2017. The highest and lowest values in the phase and unwrapped bands were mapped and depicted on maps in red and green colors. The coherence band provided the best results because white areas (high values) can be clearly distinguished from stable areas (low values). The InSAR-generated coherence, phase, and unwrapped bands were then transformed into KML format and draped on Google Earth (GE) images to digitize flood locations. Our Sentinel-1-derived flood polygons are in good agreement with our field survey observations (Figure 5).

The Most Important Factors for Flood Modelling
The results of factor selection by the RFAE technique are shown in Figure 6.

Flood Modelling Process
The Bagging Tree and Modified K-Nearest Neighbor classifiers (Cubic-KNN, Coarse-KNN, Cosine-KNN, and Weighted-KNN) were used in this study for flood modelling. We trained and tested the models with 70% and 30% of our dataset, respectively. We calculated the accuracy criteria of the models by comparing the training/test dataset with predicted flood pixels as output (Figure 7). We also evaluated the accuracy of the KNN classifier functions in the modelling process. Table 3 shows the optimum parameters for achieving the highest model accuracy. The Cubic-KNN model has the highest accuracy value (96.4%), followed by the Cosine-KNN (92.8%), Weighted-KNN (92.14%), and Coarse-KNN (92.1%) models. We also built hybrid models of Bagging Tree based on KNN classifiers and derived their optimum parameters based on the highest accuracy. Table 4 shows the optimum parameter values of the hybrid models. We obtained the highest accuracy for the hybrid model of Bagging Tree-Coarse KNN (98.6%), followed by Bagging Tree-Weighted KNN (97.1%), Bagging Tree-Cosine KNN (96.6%), and Bagging Tree-Cubic KNN (94.3%).

Development of Flood Susceptibility Maps
We used the hybrid methods to evaluate the flood susceptibility index (FSI) in all pixels in our study area. Each pixel was given a unique FSI, and the results were then exported into a readable ArcGIS 10.3 format for the task of flood mapping. We classified the calculated FSIs into flood and nonflood classes. Figure 8 shows flood susceptibility maps produced by the Bagging Tree ensemble and based on Modified K-Nearest Neighbor classifiers. The maps show that flood-prone areas in the watershed are located near rivers at lower elevations and on low-gradient slopes. Figure 8b,d,f,h shows that the Bagging Tree ensemble model can enhance and extend flood-prone areas adjacent to rivers such that most known flood locations are located in high and very high susceptibility classes. In addition, Figure 8 shows that areas near the outlet of the Haraz watershed, as well as areas in the northwest part of the catchment, are more prone to flooding than other parts of the study area. In comparison to the nearest neighbor models, the hybrid models predict that higher proportions of the study area are flood susceptible (Figure 8). Of the hybrid models, the Bagging Tree-Cubic KNN model (Figure 8b) has the largest flood-prone area.
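As a rough illustration of how per-pixel FSI values can be turned into susceptibility classes, and of how one can check the share of known flood locations falling in the high and very high classes, consider the sketch below. The equal-interval breaks, the assumption that FSI lies in [0, 1], the five class names, and the sample values are hypothetical; the classification scheme actually used for Figure 8 is not restated here.

```python
CLASS_NAMES = ("very low", "low", "moderate", "high", "very high")

def classify_fsi(fsi, names=CLASS_NAMES):
    """Map an FSI in [0, 1] to an equal-interval susceptibility class."""
    if not 0.0 <= fsi <= 1.0:
        raise ValueError("FSI is assumed to lie in [0, 1]")
    # Equal-width bins; an FSI of exactly 1.0 falls into the top class.
    index = min(int(fsi * len(names)), len(names) - 1)
    return names[index]

def share_in_high_classes(fsi_at_floods):
    """Fraction of known flood locations in the 'high' or 'very high' class."""
    hits = sum(classify_fsi(v) in ("high", "very high") for v in fsi_at_floods)
    return hits / len(fsi_at_floods)

# Hypothetical FSI values for a handful of pixels.
pixel_fsi = [0.05, 0.33, 0.55, 0.72, 0.91, 1.0]
classes = [classify_fsi(v) for v in pixel_fsi]

# Hypothetical FSI values sampled at four known flood locations.
flood_site_fsi = [0.72, 0.91, 0.55, 0.85]
share = share_in_high_classes(flood_site_fsi)
print(classes, share)
```

A higher share indicates that the map places more of the observed floods in its most susceptible classes.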

Evaluation and Comparison
We next compared the flood susceptibility performance of the new hybrid Bagging Tree-KNN models with that of the KNN models using the area under the receiver operating characteristic (ROC) curve (AUC). Figure 9 shows the ROC curves and AUC values that we produced for the training and testing steps of our flood susceptibility map datasets. The ROC curves show that the Coarse-KNN model performed best in the training and testing steps, with AUC values of 0.795 and 0.790, respectively. It is followed by the  (Figure 9c,d). The hybrid models outperformed the KNN classifier models. This result accords with the conclusion of Kantardzic [132] that the Bagging Tree-Cubic KNN model performs better than rival models (Figure 9). We therefore recommend that our highest performing model, the Bagging Tree-Cubic KNN model, be tested for flood susceptibility modelling in other areas.
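For readers unfamiliar with the metric, the AUC can be computed without drawing the ROC curve: it equals the probability that a randomly chosen flood sample receives a higher score than a randomly chosen non-flood sample, with ties counted as one half. The sketch below uses this pairwise definition; the function name `roc_auc` and the labels and scores are hypothetical and are not values from Figure 9.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (flood, non-flood) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # A correctly ordered pair counts 1, a tie counts 0.5.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = flood, 0 = non-flood) and predicted susceptibility scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.65]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.778 (7 of 9 pairs ranked correctly)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values should be read.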

Discussion
Flood susceptibility maps can be used by a variety of decision-makers and hazard managers to reduce injury and damage to built infrastructure from floods. We found that Sentinel-1 radar data are useful for mapping flood extent. In terms of flood susceptibility modelling, the task of choosing the best-performing machine learning algorithm can be difficult due to data complexity [102]; it commonly requires a trial-and-error approach. In our study area, the best performing model is a new intelligent hybrid model (Bagging Tree-Cubic KNN), which combines a bagging ensemble technique with the cubic function, one of the four KNN classifier functions that we tested. We applied the RFAE method to our ten flood conditioning factors and showed that, although all factors are significant in the model training, distance to a river stands out as the most important factor, followed by slope gradient and curvature. Our results are in agreement with those of Ahmadlou et al. [130], Bui et al. [39], Khosravi et al. [3], and Shafizadeh-Moghadam et al. [40]. As most floods in the Haraz watershed result from brief heavy rainfall and overbank river flow, it follows that areas adjacent to rivers and floodplains have the greatest flood susceptibility.
The KNN model is one of the most popular neighborhood classifiers; it is very simple to use and highly efficient in some fields of study [133]. Computer memory requirements and operation time are the main limitations of KNN classifier performance, because this classifier depends on every example in the entire training set [134]. To address these limitations and increase the performance of KNN, we used a bagging meta-classifier. The combination of the Bagging Tree ensemble method and the KNN classifier allowed us to overcome the above-mentioned limitations and develop a reliable flood model. The AUC value (0.800) of the proposed Bagging Tree-Cubic KNN model indicates that it performs best among the models tested. This hybrid model may significantly improve the prediction accuracy of Cubic KNN as a base classifier.
Chapi et al. [8] tested and evaluated the bagging ensemble method to improve the predictive power of the logistic model tree (LMT) classifier in a new model (Bagging-LMT) for flood mapping in the Haraz watershed. They concluded that bagging increases the predictive power of the LMT base classifier in flood modelling. The ensemble model outperforms the base classifier due to the synergy provided by the two classifiers when used together. We therefore recommend the proposed new model as an appropriate method for flood hazard management.
Flood modelling is a complex procedure with numerous uncertainties. Machine learning approaches efficiently handle these uncertainties as long as reliable historical flood inventory maps are available. The proposed machine learning model provides decision makers with a less expensive and less time-consuming way of evaluating flood hazards and risk than field surveys. It also provides authorities with guidance as to what additional data (e.g., rainfall and river discharge data) might be required to produce more accurate flood maps for mitigating further damage. The flood susceptibility maps are thus fundamental products for further analyses and for hazard and risk disaster management and mapping. Our model may be used in other areas aside from the Haraz watershed.

Conclusions
The best way to mitigate and control floods is to identify all factors that have a relationship to flooding; in this study, we refer to these as conditioning factors. We used Sentinel-1 remote sensing radar data to identify and map flood locations in the Haraz watershed in northern Iran. We used 10 flood conditioning factors and 201 flood locations as our model inputs. Eight models, comprising four KNN classifiers and their four Bagging Tree hybrids (Cubic-KNN, Bagging Tree-Cubic KNN, Coarse-KNN, Bagging Tree-Coarse-KNN, Cosine-KNN, Bagging Tree-Cosine-KNN, Weighted KNN, and Bagging Tree-Weighted KNN), were created to analyze and map flood susceptibility. Results based on the relief attribute evaluation metric indicate that distance from the river and slope gradient are the two most important factors for flood occurrence in the Haraz watershed. Among the eight models, we found that the Bagging Tree-Cubic KNN model has the highest predictive power.
Flood modelling is a complicated task with many uncertainties, but we have shown that machine learning algorithms can improve flood susceptibility mapping. Our proposed flood model is effective, simple, and intuitive. It reduces the variance and the noise of the training dataset, resulting in enhanced prediction accuracy. Our method of combining satellite radar data with the Bagging Tree-Cubic KNN model should be evaluated in other flood-prone regions, especially in large catchments where collecting data in the field is difficult and commonly expensive. This machine learning model can be used to improve the efficiency and accuracy of flood hazard mapping and thus assist in disaster management and land-use planning.

Conflicts of Interest:
The authors declare no conflict of interest.