Assessment Analysis of Flood Susceptibility in Tropical Desert Area: A Case Study of Yemen

Abstract: Flooding is one of the most catastrophic natural hazards worldwide and can cause devastating effects on human life and property. Remote sensing is becoming increasingly important in monitoring and assessing natural disaster susceptibility and hazards. This research presents an assessment of flood susceptibility in a tropical desert environment: a case study of Yemen. The base data were collected and organized from meteorological records, satellite images, remote sensing products, essential geographic data, and other sources and used as input to four machine learning (ML) algorithms. In this study, remote sensing data (Sentinel-1 images) were used to detect flooded areas in the study area, and the Sentinel Application Platform (SNAP 7.0) was used for Sentinel-1 image analysis and flood-zone detection. Flood spots were identified and verified using Google Earth images, Landsat images, and press sources to create a flood inventory map of the study area. Four ML algorithms were used to map flash flood susceptibility (FFS) in Tarim city (Yemen): K-nearest neighbor (KNN), Naïve Bayes (NB), random forest (RF), and eXtreme gradient boosting (XGBoost). Twelve flood conditioning factors were prepared, assessed for multicollinearity, and used with the flood inventory as input parameters to run each model. A total of 600 random flood and non-flood points were chosen, of which 75% and 25% were used as training and validation datasets, respectively. The confusion matrix and the area under the receiver operating characteristic curve (AUROC) were used to validate the susceptibility maps. The results reveal that all models had a high capacity to predict floods (AUC > 0.90).
Further, in terms of performance, the tree-based ensemble algorithms (RF, XGBoost) outperform the other ML algorithms, with RF providing the most robust performance (AUC = 0.982) for assessing flood-prone areas, with only a few adjustments required prior to training the model. The value of this research lies in the fact that the proposed models are tested for the first time in Yemen to assess flood susceptibility; the approach can also be applied to other hazards such as earthquakes and landslides. Furthermore, this work contributes to the worldwide effort to reduce the risk of natural disasters, particularly in Yemen, and will therefore help enhance environmental sustainability.


Study Area
The study area, Tarim city, is located in the Hadhramout province of Yemen (15°45′–16°15′ N, 48°45′–49°15′ E) (Figure 1). The geology of the study area is a series of thick, flat-lying sedimentary formations, including limestone, eroded into a complicated wadi pattern [53]. The soil classification of Wadi Hadramout is dominated by rock (73.09%), which directly increases runoff; gravel soil is the second most common type (13.18%). Both types reduce infiltration and promote surface water flow (runoff). Sand, clay, and silt account for the remaining 13.52% [54]. During summer, the mean temperature of the study area is around 35 °C, while the mean winter temperature is approximately 19.7 °C. The annual mean precipitation is 100 mm [55]. Changes in land-use patterns, rapid population growth, migration, unplanned urbanization, construction in flood-prone areas without adequate drainage capacity, environmental degradation, and global climate change are significant causes of unexpected flooding. In addition, hills surround the study area, and rainfall-runoff from this hilly terrain brings considerable water inflow to Tarim city during the monsoon season [55,56].
Between 1996 and 2008, several floods occurred in the study area, claiming human and animal lives and destroying hydraulic structures and fertile land [3,54]. On 2 May 2021, a flash flood affected the area, resulting in four confirmed deaths and several injuries; officials and partners reported that 167 households were impacted, with homes either partially or wholly destroyed [57]. The precipitation on 27 October 2008 reached nearly 91 mm, resulting in catastrophic floods in Hadramout [3]. The majority of the damaged structures in the study region were built of conventional mud bricks on stone foundations (Figure 1) [55]. That disaster had a tremendous impact on housing, with 561 dwellings demolished; the floods not only damaged buildings but also wreaked havoc on agricultural land [52]. When the region again received heavy precipitation in 2021, flooding in the neighborhood of "Al-Shabika" and Amid Aldan Hadrami's "Al Kef's palace" destroyed dozens of historic dwellings and caused large-scale damage [57].


Flood Susceptibility Mechanism and Conceptual Framework
Flood susceptibility is closely related to disaster-causing factors (hazard), the vulnerability environment, and management measures. The disaster-causing factors mainly refer to precipitation, such as heavy rain. The vulnerability environment mainly refers to the underlying surface characteristics: topography, vegetation status, soil factors, and land use. Management measures refer to disaster prevention and reduction factors, mainly the drainage pipe network and river drainage capacity (Figure 2). The cause and mechanism of flood susceptibility are the basis for selecting flood susceptibility indicators. This study estimates flood susceptibility based on this mechanism and proposes a corresponding conceptual framework (Figure 3).


Multicollinearity Assessment
In a dataset, multicollinearity is the presence of two or more variables that are linearly dependent on one another [58]. It is a type of data disorder, and if it exists, statistical inferences drawn from the data may not be accurate or trustworthy [59]. A multicollinearity test can aid in selecting appropriate factors for hazard mapping, which can improve a model's results [60]. The most common causes of multicollinearity are improper use of dummy variables, repetition of variables of the same kind, and a high degree of correlation among variables [61]. The multicollinearity test is performed using the tolerance (TOL) index; if TOL > 0.1, the variables are not considered multicollinear. The following equation was used to calculate tolerance [62]:

TOL_j = 1 − R²_j

where R²_j is the coefficient of determination obtained by regressing conditioning factor j on all the other conditioning factors.
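As an illustrative sketch (not the authors' code), the tolerance index can be computed with NumPy by regressing each conditioning factor on the others; `tolerance` is a hypothetical helper name.

```python
import numpy as np

def tolerance(X):
    """TOL_j = 1 - R^2_j, where R^2_j is the coefficient of determination
    from regressing conditioning factor j on all the other factors."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    tol = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # design matrix: intercept column plus all other factors
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        ss_res = np.sum((y - A @ beta) ** 2)
        ss_tot = np.sum((y - y.mean()) ** 2)
        tol[j] = ss_res / ss_tot  # equals 1 - R^2_j
    return tol

# Toy check: the third factor is almost a copy of the first,
# so its tolerance should fall below the 0.1 threshold.
rng = np.random.default_rng(0)
a, b = rng.normal(size=100), rng.normal(size=100)
tol = tolerance(np.column_stack([a, b, a + 0.01 * rng.normal(size=100)]))
```

Factors with TOL below the 0.1 threshold would be dropped before model training.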

Detection of Flood-Prone Area by Sentinel-1
Sentinel-1A and Sentinel-1B, launched in April 2014 and April 2016, respectively, are the first of a series of Earth-imaging satellite constellations operated under the Copernicus program of the European Space Agency. The Sentinel-1 satellites collect data in four separate imaging modes: interferometric wide-swath (IW), strip map (SM), extra wide-swath (EW), and wave (WV), each with its own acquisition configurations [63]. Their fundamental disadvantage is that radar beams cannot penetrate dense foliage [64]. Sentinel-1 SAR data packages, which are freely available through the Sentinel Scientific Data Hub, can be used to identify backscatter signals from inundated areas (scihub.copernicus.eu, accessed on 6 May 2021). Sentinel-1 Level-1 ground range detected (GRD) data were projected onto the land using an Earth ellipsoid model (WGS84) in the current work, since the specular reflection of C-band signals over flooded areas is substantially lower than over bare ground. Finally, using the SAR approach, Sentinel-1 SAR data were used to locate and map flooded areas [38,65].

Data Pre-Processing and Processing
Sentinel-1 (GRD and IW) data for 12 April 2021 (before the flood) and 6 May 2021 (after the flood) over Tarim city were acquired to identify and detect flood locations in the study area. The Sentinel Application Platform (SNAP 7.0) was used to manipulate the radar data [66], and the interferogram creation technique was applied to the pre- and post-flood data [38], in addition to threshold data collected during the flood (Table 1). First, the data were clipped to the study area using SNAP, and their orbit files were updated, followed by calibration to optimize the extracted data. Raw satellite data usually contain speckle, so the images were smoothed using the SNAP speckle filtering tool. The pixel values in SAR imaging relate to the scene's radar backscatter; calibration transforms the sensor's digital pixel values into calibrated backscatter coefficient values [66].
The purpose of speckle filtering is to reduce image noise and provide higher-quality imagery. All of the preprocessing steps are detailed below: (i) Apply orbit file: orbit state vectors, which are included in the metadata of SAR products, are frequently inaccurate. The precise orbits of satellites are computed over several days and become available days to weeks after the product is created. SNAP's apply-orbit-file operation automatically downloads and updates the orbit state vectors for each SAR scene in its product metadata, delivering exact satellite position and velocity information [67]. (ii) Calibration: the process of converting digital pixel values to radiometrically calibrated SAR backscatter. The calibration applies a constant offset and a range-dependent gain, including the absolute calibration constant, and reverses the scaling factor applied during level-1 product generation. (iii) Terrain correction: the use of a digital elevation model to correct the location of each pixel and rectify geometric distortions induced by topography, such as foreshortening and shadows [67]. Distances can be distorted in SAR images by topographic variation and the tilt of the satellite sensor; image data away from the sensor's nadir position will be distorted. Terrain adjustments compensate for these distortions, bringing the geometric representation of the image as close to the real environment as possible (SNAP Toolbox) [68]. SAR geometry effects such as foreshortening, layover, and shadow can all be corrected with terrain correction [69]. (iv) Creating dB bands and stacking both datasets: in this step, a logarithmic transformation converts the unitless backscatter coefficient to dB [68], as given by the following equation [69]:

β⁰_dB = 10 × log₁₀(β⁰)

where β⁰ is the digital number value of the image and β⁰_dB is the backscatter value in dB.
To extract the maximum amount of information, a dB band was generated for both images. The data were then layered for further processing in ArcGIS.
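The dB conversion above is a one-line operation per pixel; a hedged NumPy sketch follows (the `floor` guard against zero-valued pixels is an added assumption, not from the paper).

```python
import numpy as np

def to_db(beta0, floor=1e-6):
    """Convert unitless backscatter coefficient beta0 to decibels:
    beta0_dB = 10 * log10(beta0). `floor` avoids log(0) on dark pixels."""
    return 10.0 * np.log10(np.maximum(np.asarray(beta0, dtype=float), floor))
```

For example, a backscatter coefficient of 1.0 maps to 0 dB, and 0.1 maps to −10 dB.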
The free satellite data of Sentinel-1 were employed to detect the flooded areas. For the same place, two images from different dates were used, representing the area before and after the flood occurrence; this method depends on the unique nature of SAR interaction with water surfaces and flooded vegetation compared with other features. Geometric distortions due to terrain effects are not corrected in the GRD imagery provided by ESA. Therefore, the GRD scenes have to be terrain corrected to improve the geolocation accuracy of the imagery [70].
After performing all of the necessary processing steps and terrain corrections, we used an RGB combination with the before-flood image in the red (R) channel and the after-flood image in the green (G) and blue (B) channels. Figure 4 summarizes all the steps followed to process Sentinel-1 data in order to identify flooded areas in the study area.

Random Forest (RF)
RF is an ML classification algorithm that improves the classification tree's flexibility and accuracy [71]. It is a variant of bagged decision trees, built from many de-correlated trees, and only requires tuning a few parameters [72]. The number of split attributes (Mtry), which sets the number of parameters considered at each tree node, is the most critical parameter to tune in RF. RF is a highly efficient method for classification and regression. It can deal with multidimensional, categorical, and continuous data; it requires no assumptions about the data's statistical distribution; and it is resistant to changes in the dataset's composition. Among its advantages, RF trains quickly and can detect interactions among features. RF can balance errors in imbalanced datasets, and accuracy can be maintained even if a large portion of the features is missing [73,74]. These characteristics are useful when working with nonlinearly interrelated variables [74]. The model consists of K integrated decision trees composed of a set of unrelated regression decision trees {h(x, θ_k), k = 1, 2, …, K}, where x is the flood conditioning factor, k indexes the decision trees, and θ_k is an independent, identically distributed random variable. K is the total number of decision trees generated by the model.
where I_k is the importance of factor x_i in the k-th tree, and I is the importance of factor x across all trees in the random forest.

In contrast, RF is complicated and could overfit the training data when dealing with noisy regression or classification problems, and RF outcomes are influenced more strongly by attributes with more values. In general, however, RF is an effective integrated learning method [73].
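A minimal scikit-learn sketch of the RF setup described above, run on synthetic placeholder data; the hyperparameter values are illustrative assumptions, not the study's tuned settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder data: 600 points x 12 conditioning factors, binary flood label.
X, y = make_classification(n_samples=600, n_features=12, random_state=0)

rf = RandomForestClassifier(
    n_estimators=200,     # number of trees K (illustrative)
    max_features="sqrt",  # Mtry: factors considered at each split
    oob_score=True,       # out-of-bag estimate of accuracy
    random_state=42,
).fit(X, y)

importances = rf.feature_importances_  # per-factor importance, sums to 1
```

The out-of-bag score (`rf.oob_score_`) gives a quick internal accuracy estimate before the formal 25% hold-out validation.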

K-Nearest Neighbor (KNN)
KNN algorithms are supervised ML algorithms; however, they are also called lazy algorithms because they perform no explicit training phase [47]. KNN can handle both regression and classification problems. KNN computes the k nearest samples using the distance between samples and uses their values to predict the value of the target sample [75]. These k samples are the most similar to the sample examined. Once the method has selected the k nearest samples, it simply outputs a weighted sum of their values as the model's prediction for the target sample [75]. KNN's drawbacks include the need for extensive computation and large memory [76]. The distance formula in KNN is as follows:

d(x, y) = (Σ_{i=1}^{n} |x_i − y_i|^p)^{1/p}
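A small sketch of the Minkowski distance used by KNN; `minkowski` is a hypothetical helper name, and p = 2 recovers the Euclidean case.

```python
import numpy as np

def minkowski(x, y, p=2):
    """Minkowski distance between two feature vectors;
    p = 2 gives the Euclidean distance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))
```

For example, the Euclidean distance between (0, 0) and (3, 4) is 5.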
where p = 2 gives the Euclidean distance used in this study.

Naïve Bayes (NB)
NB is a simple and extensively used algorithm applied in various fields (computer science, earth sciences, text classification, and medicine). This approach is practical when a sample S can be characterized by conditionally independent attributes [77]. Using Bayesian learning, based on Bayesian probability theory, the posterior probability can be computed from the prior probabilities [78]. The primary advantage of the NB model is that it is relatively simple to implement and does not require extensive hyperparameter tuning [76]. NB also has a solid mathematical foundation and stable classification efficiency; it excels with small-scale data, can handle a variety of classification problems, and is well suited to incremental training [73]. Its disadvantage is that it is sensitive to how the input data are represented, and the prior probability must be computed [73]. The probability of class C_i given sample S is calculated as follows:

P(C_i | S) = P(S | C_i) P(C_i) / P(S), with prior P(C_i) = L_i / L

where S is a sample of unknown class, C_i is the class of the study object, L_i is the number of samples in class C_i, and L is the total number of samples.
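A hedged Gaussian Naïve Bayes sketch on toy data, where the fitted class priors P(C_i) = L_i / L come directly from the class counts; the values are illustrative only.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Toy two-class sample: three points per class, so both priors are 3/6 = 0.5.
X = np.array([[1.0], [1.2], [0.9], [5.0], [5.2], [4.8]])
y = np.array([0, 0, 0, 1, 1, 1])

nb = GaussianNB().fit(X, y)
```

A query near the first cluster (e.g. 1.1) is assigned class 0 via the posterior P(C_i | S).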

Extreme Gradient Boosting (XGBoost)
XGBoost is a sophisticated ensemble learning algorithm based on classification or regression trees [79]. Instead of averaging independent trees, this algorithm generates successive decision trees from the prediction errors (residuals) of the preceding trees, thereby focusing on samples with a higher level of uncertainty. The decision trees generated in previous steps are combined to arrive at the final output [79]. XGBoost aims to minimize computational complexity while optimizing computer resources [79]. In contrast to the above algorithms, XGBoost contains several tunable parameters that add complexity. XGBoost shares a few parameters with other tree-based algorithms, but it also requires hyperparameters that limit the risk of overfitting, reduce prediction variability, and increase accuracy [76]. XGBoost's key advantages are flexibility and speed; its outstanding performance in a growing number of Kaggle contests has established it as a uniquely inclusive algorithm [79]. However, XGBoost has so far been used in only a few studies to map geological hazard susceptibility. The target value (O_t) of the algorithm after t iterations is calculated using Equations (8)–(10) [79]:

O_t = −(1/2) Σ_{r=1}^{T} G_r² / (H_r + σ) + γT

where σ and γ are penalty factors and T is the number of leaf nodes. G_r and H_r are the sums, over the samples in leaf r, of the first and second derivatives of the loss l, which measures the difference between the predicted and true values:

G_r = Σ_{i∈I_r} ∂l(y_i, ŷ_{i,t−1}) / ∂ŷ_{i,t−1},   H_r = Σ_{i∈I_r} ∂²l(y_i, ŷ_{i,t−1}) / ∂ŷ²_{i,t−1}

where y_i is the actual value and ŷ_{i,t−1} is the predicted value after t − 1 iterations.
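Assuming the xgboost package may not be available, a scikit-learn GradientBoostingClassifier sketch illustrates the same sequential residual-fitting idea on synthetic data; the hyperparameters are illustrative, not the study's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Each tree is fit to the gradients (residuals) of the previous ensemble;
# xgboost additionally applies the sigma/gamma regularization terms of the
# objective above, which this sklearn stand-in omits.
X, y = make_classification(n_samples=600, n_features=12, random_state=1)

gbt = GradientBoostingClassifier(
    n_estimators=200,   # number of sequential trees t
    learning_rate=0.1,  # shrinkage applied to each tree's contribution
    max_depth=3,        # shallow trees, typical for boosting
    random_state=42,
).fit(X, y)
```

With the real xgboost package, the analogous knobs are `reg_lambda` and `gamma` for the penalty factors.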

Model Validation
Validation and accuracy assessment are critical in modeling. A machine learning model should preferably not be assessed on the same data on which it was trained, as this can cause the model to overfit, producing results that appear more robust than they truly are. Thus, to obtain an independent assessment, the model must be run on testing data it has not previously used [80]. Here, 25% (150 points) of the total inventoried flood points were used to validate each model. The receiver operating characteristic (ROC) curve and the confusion matrix were used to assess each model's performance; these are frequent and critical tools in most statistical or probabilistic applications, especially susceptibility mapping [80,81]. Moreover, the area under the curve (AUC), a summary of the ROC curve, was computed. The ROC curve is a graphical depiction of how well locations are classified as events or non-events [81]. AUC values range from 0 to 1, where a value of 0 indicates low predictive accuracy that does not correctly categorize the FFS, and a value of 1 indicates perfect predictive accuracy with absolute FFS pixel categorization [81]. The kappa index was also computed to assess model performance, with values ranging from 0 to 1, denoting a low to high level of agreement [82].
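A short sketch of computing the AUC and kappa index with scikit-learn on toy scores; the arrays are illustrative, not study data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, cohen_kappa_score

# Illustrative hold-out labels (1 = flood) and model susceptibility scores.
y_true = np.array([0, 0, 1, 1, 1, 0])
y_prob = np.array([0.1, 0.4, 0.8, 0.9, 0.7, 0.2])

auc = roc_auc_score(y_true, y_prob)  # area under the ROC curve
# kappa needs hard labels, so threshold the scores at 0.5 first
kappa = cohen_kappa_score(y_true, (y_prob >= 0.5).astype(int))
```

Here every flood point scores higher than every non-flood point, so both AUC and kappa reach their maximum of 1.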
Confusion matrices are the primary tool for evaluating classification errors (sorting items into classes, i.e., categories or kinds of items). Machine learning under supervision is a typical application of confusion matrices. They provide the complete specification of misclassifications: the number of misclassified items for each pair of original classes to which items should be classified and incorrect class to which items are classified incorrectly. From a trusted collection of pre-classified things, it is known that items belong to an original class (a ground truth) [83].
This study used five statistical evaluation measures to assess the trained FFS models' performance: accuracy, specificity, sensitivity, negative predictive value, and positive predictive value. Accuracy refers to the proportion of FFS and non-FFS pixels successfully classified by the resulting models. Sensitivity refers to the proportion of FFS pixels accurately detected as flood occurrences, and specificity refers to the proportion of non-FFS pixels correctly classified as non-FFS. The positive predictive value indicates the likelihood that pixels will be correctly identified as FFS, while the negative predictive value indicates the likelihood that pixels will be correctly classed as non-FFS.
Accuracy = (TP + TN) / (TP + TN + FP + FN)

Sensitivity = TP / (TP + FN)

Specificity = TN / (TN + FP)

Positive predictive value = TP / (FP + TP)

Negative predictive value = TN / (FN + TN)

where TP (true positive) and TN (true negative) represent the number of pixels appropriately identified, while FP (false positive) and FN (false negative) represent the number of pixels classified incorrectly.
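The five measures above can be computed directly from confusion-matrix counts; `classification_metrics` is a hypothetical helper, not the authors' code.

```python
def classification_metrics(tp, tn, fp, fn):
    """Five evaluation measures from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "ppv": tp / (fp + tp),          # positive predictive value
        "npv": tn / (fn + tn),          # negative predictive value
    }

# Example: 40 flood pixels found, 50 non-flood pixels correct,
# 10 false alarms, no missed floods.
m = classification_metrics(tp=40, tn=50, fp=10, fn=0)
```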

Factor System of Flood Susceptibility and Model Building

Flash Flood Conditioning Factors
A significant phase in FSM is the selection of dominant key variables for assessing flood risks [12]. In general, flooding occurs due to natural and human factors. When mapping sensitivity to floods or other natural disasters, the number of conditioning factors must be specified [84,85]. The flood conditioning factors were chosen based on the geoenvironmental condition of the study area and related studies from areas with similar climatic circumstances [86,87].
Twelve (12) independent variables were prepared as separate maps in R software, spatially registered (Table 2), and resampled to a pixel size matching that of the land use map (resolution 10 m), which we extracted in ArcGIS 10.3 from a map issued jointly by ESRI and the Impact Observatory Institute.

Rainfall: As far as floods are concerned, rainfall is the most significant factor [40]. Because no recent ground-station rainfall measurements exist for the study area, gridded data from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS), explored on Google Earth Engine (GEE), were used to derive the average maximum annual rainfall per year. CHIRPS data were also used to obtain the average longest period of consecutive rainfall days per year from 1996 to 2021 (Figure 5h,i). These data can equally be used in difficult-to-reach locations with scant or temporally incomplete observational data [88]. This study selects rain intensity and rain duration to measure the hazard.
Elevation: Previous research has established that elevation significantly influences flooding [12,22]. The elevation map of the research area was produced from an ALOS PALSAR sensor-derived digital elevation model (DEM) with a 12.5 m pixel size (Figure 5a).

Topographic wetness index (TWI):
The terrain-driven balance of catchment water supply and local drainage for each cell in a DEM is expressed by the topographic wetness index (TWI), which integrates water supply from the upslope catchment area and downslope water drainage [89]. TWI provides information about the spatial distribution and saturation sources contributing to runoff generation. As a result, the TWI has an indirect role in affecting runoff systems in a given area. TWI values were computed in this work from the DEM, as shown in the following equation [90] (Figure 5b):

TWI = ln(As / tan β)

where As is the specific contributing area and β is the gradient or slope.

Stream power index (SPI):
The power of the stream, shear stress, and velocity are all essential elements in the development of flood damage and the erosion of river channels. SPI is a statistic that measures the erosive strength of discharge relative to a specific area within a watershed; it is a measure of the erosive force of flowing water [91]. SPI can be computed using the following equation [92]:

SPI = As × tan(β)

The SPI map was derived from the DEM by applying the equation above in ArcGIS 10.3 (Figure 5c).
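The TWI = ln(As / tan β) and SPI = As × tan β formulas can be evaluated per cell from slope and contributing-area grids; this NumPy sketch (hypothetical helper names, with an added eps guard for flat cells) is an illustration, not the study's GIS workflow.

```python
import numpy as np

def twi(As, slope_deg, eps=1e-6):
    """TWI = ln(As / tan(beta)); eps keeps flat cells (beta = 0) finite."""
    beta = np.deg2rad(np.asarray(slope_deg, dtype=float))
    return np.log(np.asarray(As, dtype=float) / (np.tan(beta) + eps))

def spi(As, slope_deg):
    """SPI = As * tan(beta): erosive power of discharge over the cell."""
    beta = np.deg2rad(np.asarray(slope_deg, dtype=float))
    return np.asarray(As, dtype=float) * np.tan(beta)
```

For a cell with As = 100 and a 45° slope, tan β = 1, so TWI ≈ ln(100) and SPI ≈ 100.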

Slope: Slope controls surface runoff and infiltration and is thus critical in flood susceptibility mapping [93]. The slope is the ratio of a feature's steepness or degree of inclination relative to the horizontal plane [94]. The slope map was created in ArcGIS 10.3 from the DEM and divided into five categories, as shown in Figure 5d.
Aspect: Aspect is another factor that influences flood water flow directions, local climate, soil moisture, evapotranspiration, and infiltration [95]. The aspect factor influences natural occurrences on the earth's surface since it is tied to climatic elements such as precipitation direction and sunshine intensity [6]. Although this element has only a minor impact on flooding, most researchers include it as one of the factors to consider when mapping flood susceptibility [6,96]. Aspect was classified into nine groups, each corresponding to a cardinal direction. Flood pixels are spread fairly evenly throughout these nine types, as shown in Figure 5e.
Curvature: Curvature is another factor affecting flood events; a surface can be concave or convex, which influences the divergence and convergence of flow across the surface [12]. The curvature was extracted from the DEM in ArcGIS (Figure 5f).

Normalized difference vegetation index (NDVI):
The normalized difference vegetation index (NDVI), another critical factor for flood susceptibility mapping, is an essential indicator of vegetation cover and its impact on flooding in a catchment [33]. The NDVI value is calculated using the following equation [97]:

NDVI = (IR − R) / (IR + R)

where R is the red portion of the electromagnetic spectrum and IR is the infrared portion. For this study, the NDVI map was obtained from Sentinel-2 data by combining bands B8 and B6, as shown in the equation below (Figure 5j):

NDVI = (B8 − B6) / (B8 + B6)

Land use/land cover (LU/LC): LU/LC types play an influential role, directly or indirectly influencing hydrological components such as runoff generation, infiltration, and evapotranspiration [28]. Land-use/land-cover (LULC) maps can be obtained from high-resolution satellite sensors. We derived the research area's land use map from a global map issued jointly by ESRI and the Impact Observatory Institute. The entire global GeoTIFF file is available at https://livingatlas.arcgis.com/landcover (accessed and downloaded on 26 June 2021). We then categorized the study area into seven groups: water, trees, grass, crops, scrub/shrub, built area, and bare ground using ArcGIS 10.3 (Figure 5k).
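A per-pixel NDVI sketch in NumPy; `ndvi` is a hypothetical helper, and the eps guard against zero denominators is an added assumption.

```python
import numpy as np

def ndvi(ir, red, eps=1e-9):
    """NDVI = (IR - R) / (IR + R), computed element-wise over band arrays;
    eps avoids division by zero on dark pixels."""
    ir = np.asarray(ir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (ir - red) / (ir + red + eps)
```

For instance, reflectances IR = 0.6 and R = 0.2 give NDVI = 0.4 / 0.8 = 0.5, typical of moderate vegetation.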
Soil type: Soil type is an essential factor in flood susceptibility mapping, as primary water infiltration depends on soil characteristics [26]. The national soil map of Yemen was compiled in 2006 by the Renewable Natural Resources Research Center (RNRRC) at the Agricultural Research and Extension Authority (AREA), Dhamar, Yemen [98]. Using the conversion tools in ArcGIS 10.3, we converted the soil map polygons to a raster layer, extracted the study area, and categorized it into two groups, namely Etc (dry soil, dry sedimentary, soil dry, and limestone soil) and Rcc (dry limestone, soil dry, shallow calcareous soil, and shallow soil) (Figure 5l).
Drainage density (Dd): Drainage density characterizes a hydrological network as the total length of rivers in a basin relative to the basin's area; Dd is the ratio of drainage length (km) to basin area (km2) [99]. It is an essential factor in determining flood-prone areas and is used to inform management measures (Figure 5g).
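A sketch of the Dd calculation with hypothetical stream lengths and basin area (the study derives these quantities from the DEM in ArcGIS):

```python
# Drainage density Dd = total stream length (km) / basin area (km^2).
# Illustrative values only; real segments come from the extracted stream network.
stream_lengths_km = [12.4, 8.1, 5.5, 3.0]   # hypothetical stream segments in one basin
basin_area_km2 = 14.5                        # hypothetical basin area

dd = sum(stream_lengths_km) / basin_area_km2
print(f"Dd = {dd:.2f} km/km^2")              # higher Dd -> faster runoff concentration
```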

Flood Inventory Map
The key to accurately predicting flash-flood-prone areas is to survey the areas already affected by torrential events. It is necessary to investigate prior flooding incidents in the study area to estimate the likelihood of future flooding. Rapid surface runoff frequently induces flash floods that spread down the valley's slope [6]. To identify and detect flood areas, we gathered a set of Sentinel-1 images (level-1 ground range detected (GRD), interferometric wide swath (IW) mode) acquired between 12 April 2021 and 6 May 2021 and processed them with the Sentinel application platform (SNAP 7.0). The time-series interferogram construction technique was applied to pre- and post-flood Sentinel-1 data to map flooded areas [38,66]. Two images from separate dates were used to represent the area before and after the flood; this strategy exploits the distinctive interaction of SAR with water surfaces and flooded vegetation compared to other land cover. The flood inventory map was created from flood occurrences in 1996, 2008, and 2021, drawing on the Sentinel-1 images, interpretation of satellite and Google Earth imagery, and news reports [27]. For flood susceptibility mapping, both flood and non-flood points are required [100]. The training flood areas (300 points) were chosen based on past disaster reports and Sentinel data, and the same number of non-flooded points was randomly created [101]. Figure 6 shows the spatial distribution of the 600 points (flooded and non-flooded) used to prepare the flood susceptibility maps. A flood layer was prepared as the dependent variable; in this layer, points corresponding to flood and non-flood areas were assigned the values 1 and 0, respectively. Of the total points, 75% were used to train the models and the remaining 25% were used for validation of the trained models [102,103]. Remote Sens. 2022, 14, x FOR PEER REVIEW
Figure 6. Flood inventory map of the study area.
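The 75/25 split of the 600 labeled points can be sketched as follows (a Python/scikit-learn illustration; the study performed modeling in R, and the factor values below are random stand-ins for values sampled from the conditioning-factor rasters):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# 600 sample points: 300 flood (label 1) and 300 non-flood (label 0), each with
# 12 conditioning-factor values (random here; real values come from the rasters).
X = rng.random((600, 12))
y = np.array([1] * 300 + [0] * 300)

# 75% training / 25% validation, stratified to keep the 1:1 class balance.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
print(X_train.shape, X_val.shape)  # (450, 12) (150, 12)
```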

Applied ML Models for Flood Susceptibility Mapping
In this study, ArcGIS 10.3 and R 3.6.1 were used to analyze flood susceptibility. Figure 7 shows the methodology adopted for this research. The first step involved data collection and preprocessing; the second, multicollinearity testing of the flood causative factors; the third, splitting the data into 75% for model training and 25% for validation; the fourth, preparing flood susceptibility maps with RF, XGBoost, NB, and KNN; and the last step, validating and comparing the flood susceptibility maps.

Figure 7. The flowchart adopted for this study.

Multicollinearity Analysis
In this study, we proceeded with the assumption that there would be no linear dependency among conditioning factors that would have a detrimental impact on our susceptibility models. Table 3 lists the results of the multicollinearity analysis of the 12 flood conditioning factors. The TOL values of all variables used in this study are higher than 0.292, indicating no multicollinearity among them. Thus, we used them all in the modeling step.
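The TOL/VIF diagnostic can be sketched as follows (a Python illustration with a small synthetic factor matrix, not the study's 12 factors; the tolerance of factor j is 1 minus the R-squared of regressing it on the others, obtainable from the inverse correlation matrix):

```python
import numpy as np

def vif_tol(X):
    """Variance inflation factor and tolerance for each column of X.

    VIF_j is the j-th diagonal entry of the inverse correlation matrix;
    TOL_j = 1 / VIF_j. TOL below ~0.1-0.2 (VIF above 5-10) signals
    problematic multicollinearity.
    """
    corr = np.corrcoef(X, rowvar=False)
    vif = np.diag(np.linalg.inv(corr))
    return vif, 1.0 / vif

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))          # three nearly independent factors
X[:, 2] += 0.3 * X[:, 0]               # add mild correlation with factor 0
vif, tol = vif_tol(X)
print(np.round(tol, 3))                # tolerances near 1 -> no multicollinearity
```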


Flood Detection Results Using Sentinel-1 Data
In Figure 8, the flooded areas (in red) are differentiated from the other areas (pre-flood water bodies); the flooded areas are shown zoomed in to make them more visible.

Variable Importance
The relevance of each conditioning factor to the four models was assessed using R software. The results indicate that slope is the most critical factor for all algorithms used, followed by drainage density, TWI, and elevation (Figure 9). These findings support earlier research showing that these factors are essential in flood susceptibility [22,104]. The remaining factors show low and varied importance, with rain duration and aspect being the least essential variables. It is worth noting that flooding is associated with rainfall; however, in our study, rainfall duration was among the least important factors (after aspect) contributing to flood susceptibility, because flash floods are sudden events caused by heavy rainfall over a short period. Elevated areas typically receive more rainfall, which flows downward and inundates gently sloping, low-lying areas. Our findings are consistent with previous studies in which rainfall was the least important contributor to flood susceptibility [27,105-107].
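A minimal sketch of how such importance rankings are obtained from a tree-based model (Python/scikit-learn for illustration; the study used R, and the data and factor subset below are synthetic, constructed so the first two columns drive the label):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
factors = ["slope", "drainage_density", "TWI", "elevation", "rain_duration", "aspect"]

# Synthetic stand-in for the sampled factor table: flooding here depends only on
# the first two columns, so the model should rank them highest.
X = rng.random((600, len(factors)))
y = ((X[:, 0] + X[:, 1]) > 1.0).astype(int)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
for name, imp in sorted(zip(factors, rf.feature_importances_), key=lambda t: -t[1]):
    print(f"{name:17s} {imp:.3f}")
```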

In summary, the variable significance rankings of RF, KNN, NB, and XGBoost are similar (Figure 9), as are the factor distributions by flood occurrence (Figure 10). The variables with the highest importance in the RF model were slope and drainage density, while variables with medium to lesser importance included TWI, elevation, SPI, curvature, and NDVI. For soil, land use, rain intensity, rain duration, and aspect, no appreciable importance was found in the modeling of flood susceptibility.
The variables with the highest importance in the KNN model are slope, TWI, drainage density, elevation, SPI, and soil, while those with medium to low importance are land use, NDVI, and curvature; no appreciable importance was found for rain duration, rain intensity, or aspect.
Similarly, in the NB model the highest importance was found for slope, TWI, drainage density, elevation, SPI, and soil, while land use, NDVI, and curvature had medium to low importance; rain duration, rain intensity, and aspect showed no appreciable importance.
The only variable with the highest importance in the XGBoost model is slope, with drainage density and elevation of medium to lesser importance; curvature, NDVI, TWI, SPI, soil, land use, rain intensity, rain duration, and aspect showed no appreciable importance.
Flooding probability was investigated independently by examining the link between each explanatory variable and flash flood frequency. Using histogram analysis, a flood factor distribution (FD) was produced for each variable, with each variable separated into different classes. The distribution of flood incidence among the classes of each variable was then evaluated.
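The histogram-based FD analysis can be sketched as follows (a Python illustration with hypothetical elevation values at flood points, not the study's data):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical elevation values (m) at 300 flood points: floods cluster low.
elev_at_floods = rng.normal(loc=640, scale=25, size=300)

# Bin the factor into classes and count flood occurrences per class,
# mirroring the histogram-based factor distribution (FD) analysis.
bins = np.array([550, 600, 650, 700, 750])
counts, edges = np.histogram(elev_at_floods, bins=bins)
for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"{lo:.0f}-{hi:.0f} m: {n} flood points")
```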
The FD analysis was carried out (Figure 10). The majority of flood incidents occurred in low-lying areas, as well as in locations with high topographic wetness (TWI) and low SPI. Floods were also concentrated on slopes facing east, northeast, and southeast, in areas with nearly flat slopes and convex-to-flat surface curvature, and close to areas with high drainage density. Most floods occurred on gravelly, bare ground with sparse shrubs.

Flash Flood Susceptibility Mapping
The flood susceptibility maps were computed for each pixel in the basin using four machine learning models: RF, KNN, NB, and XGBoost. Based on the experimental findings reported below, the RF model proved to be the highest-performing prediction model across all benchmark models for geospatial datasets. In ArcGIS 10.3, there are several approaches for reclassifying flood susceptibility models, including natural breaks, equal interval, quantile, regular interval, standard deviation, and manual methods. Of these, the quantile and natural breaks methods are the two most frequently described in the flood susceptibility literature [96,108].
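As an illustration of the quantile reclassification idea (a Python sketch on a synthetic probability raster; the study performed this step in ArcGIS, and Jenks natural breaks would place the class limits differently):

```python
import numpy as np

rng = np.random.default_rng(3)
prob = rng.random((100, 100))  # stand-in for a per-pixel susceptibility raster

# Quantile method: break the probability surface at the 20/40/60/80th
# percentiles so each of the five classes covers ~20% of the pixels.
breaks = np.quantile(prob, [0.2, 0.4, 0.6, 0.8])
classes = np.digitize(prob, breaks)  # 0 = very low ... 4 = very high

labels = ["very low", "low", "moderate", "high", "very high"]
for k, name in enumerate(labels):
    share = np.mean(classes == k) * 100
    print(f"{name:9s} {share:5.1f}% of pixels")
```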

The FFS maps are divided into five classes using Jenks' natural breaks method in ArcMap: very high, high, moderate, low, and very low [41,51]. The flood-prone areas are located along the main wadi and its tributary streams (Figure 11). The flooded areas appear to be strongly influenced by distance from streams: the areas nearest to streams are more flood-prone than those farther away. The FFS maps also show that drainage density and elevation make relatively significant contributions to flood modeling, where low-elevation zones tend to gather more water (high drainage density) and thus have a higher probability of flooding. In a similar vein, slope has clear significance for flooding, as low-slope areas typically have the potential to collect water. The RF susceptibility map reveals that the very low, low, moderate, high, and very high classes cover 42.58%, 23.24%, 13.52%, 11.59%, and 9.05% of the total study area, respectively. For the KNN model, the surface area is computed as 26.97% for the very low class, 22.57% for the low class, 20.30% for the moderate class, 16.79% for the high class, and 13.34% for the very high class. For the NB model, the obtained susceptibility map indicates that 91.65%, 1.18%, 0.89%, 1.13%, and 5.13% of the total surface area correspond to the very low, low, moderate, high, and very high classes, respectively. Last, the XGBoost susceptibility map indicates that 75.14%, 8.72%, 7.05%, 4.53%, and 4.53% of the total surface area correspond to the very low, low, moderate, high, and very high classes, respectively (Figure 12).
On the other hand, the RF model has the largest negative predictive value (0.943), indicating a high probability of correctly categorizing non-flood areas; for the KNN, NB, and XGBoost models, this probability is 0.901, 0.934, and 0.930, respectively. The RF and XGBoost models have high sensitivity (0.949), indicating that 94.9% of the flood pixels were classified correctly; the proportions of pixels correctly classified as flood are 92.4% and 96.2% for the KNN and NB models, respectively. Furthermore, the RF model also scored the best specificity (0.943), indicating that 94.3% of the non-flood areas were correctly classified as non-flood; the corresponding values for the KNN, NB, and XGBoost models are 0.774, 0.605, and 0.943, respectively.
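The reported metrics follow directly from the confusion-matrix counts; a short sketch with hypothetical counts (not the paper's values):

```python
def confusion_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and negative predictive value from counts."""
    sensitivity = tp / (tp + fn)   # fraction of true flood pixels found
    specificity = tn / (tn + fp)   # fraction of true non-flood pixels found
    npv = tn / (tn + fn)           # P(truly non-flood | predicted non-flood)
    return sensitivity, specificity, npv

# Hypothetical counts for a 150-point validation set (illustrative only).
sens, spec, npv = confusion_metrics(tp=70, fp=5, tn=70, fn=5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} NPV={npv:.3f}")
```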
These results indicate that, in the case of the KNN model, certain zones of the study area would be classified as moderately or highly susceptible to flooding, whereas under the RF, NB, and XGBoost models these zones would be less prone to flooding. The findings indicate that the models used in this study adequately capture the positive relationships between the susceptibility maps and the flood inventory points.

Discussion
The first and most critical stage in flood risk assessment is determining how sensitive an area is to floods through flood susceptibility mapping. Flood-prone areas can thus be identified, and the necessary support solutions put in place to reduce flood-related losses. In this study, we used various geospatial datasets integrated with machine learning and geographic information systems to investigate and analyze flood susceptibility in a data-scarce region.
Due to the annual monsoonal rains, Yemen is prone to flooding. Flooding is a natural occurrence that cannot be avoided entirely, causing significant economic losses and infrastructure and natural ecosystem damage [1]. Commonly, climate change has been found to impact flood occurrences substantially. It is still unclear how climate change may affect floods in the future, notably the seasonal effects of climatic factors, which need more investigation. However, a poor understanding of flood management might emerge from a lack of information about the spatial variability of floods. Flood impact reduction can be achieved by determining the primary factors influencing flood events and producing a flood susceptibility map. Nevertheless, numerous other hydrological, geological, topographical, and morphological factors influence floods [108]. Furthermore, only some of these factors are included in flood susceptibility models; hence, choosing appropriate flood-affecting factors is a critical step in flood susceptibility modeling.
Therefore, for reliable flood susceptibility maps, the factors affecting the research area must be well understood. No systematic research has been done on flood events in Tarim city (Yemen). This study tries to fill this knowledge gap by comparing four ML algorithms to find the most effective one for predicting FFS in a semi-arid area. Similar research on flood susceptibility mapping has been conducted in America, Europe, and Asia, and the performance of susceptibility modeling using different ML algorithms has been the focus of several studies in this field. Commonly, techniques based on ML and artificial intelligence (AI) save time and money and can provide a high degree of accuracy.
Based on the results of variable importance, the most important factor that may cause flooding in this study is slope, followed by drainage density, TWI, and elevation. These findings agree with results from other recent studies [109,110], which stated that slope is the most important factor in flood occurrence [111,112]. Drainage density is a fundamental feature of river basins that, from a hydrological standpoint, reflects relief, flood peak, and geology [113]. Floodwaters usually inundate areas with flat slopes at low elevations. Higher-elevation zones are less prone to flooding even with increased rainfall, because the water flows from high-elevation zones down to low-lying areas [27]. Additionally, areas with high TWI have saturated soils, so flood potential increases with TWI as the soil cannot absorb more water, resulting in flooding [39,114].
The AUC values of all four models were greater than 0.90 in terms of ROC results, indicating that all four models performed well in predicting flash flood susceptibility. In addition, other statistical metrics, such as the kappa index, sensitivity, specificity, and accuracy, revealed that all models produced good and reasonable results. In terms of performance, the tree-based ensemble algorithms RF and XGBoost outperform the other ML algorithms, with the RF algorithm providing robust performance (AUC = 0.982) for assessing flood-prone areas with only a few adjustments required prior to training. These findings are consistent with previous research, which has shown that tree-based ensemble algorithms perform better than other algorithms [33,51]. In a study [51] of flood susceptibility assessment in the Musi River, Hyderabad, India, using ML models, RF and XGBoost outperformed logistic regression, support vector machine, K-nearest neighbor, and adaptive boosting (AdaBoost), which is compatible with the results of this research.
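A sketch of how the AUC and kappa index are computed from validation labels and predicted probabilities (Python/scikit-learn for illustration; the labels and scores below are synthetic, not the study's validation data):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, cohen_kappa_score

rng = np.random.default_rng(5)
y_true = np.array([1] * 75 + [0] * 75)          # illustrative validation labels

# Scores drawn so flood points tend to receive higher probabilities.
y_score = np.clip(np.where(y_true == 1,
                           rng.normal(0.8, 0.15, 150),
                           rng.normal(0.2, 0.15, 150)), 0, 1)

auc = roc_auc_score(y_true, y_score)             # area under the ROC curve
kappa = cohen_kappa_score(y_true, y_score > 0.5) # agreement beyond chance
print(f"AUC={auc:.3f} kappa={kappa:.3f}")
```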
Zhao et al. [115], mapping flood susceptibility in mountainous areas on a national scale in China using the RF model, stated that RF could identify flood susceptibility with satisfactory accuracy. Chen et al. [116], modeling flood susceptibility using the data-driven approaches of naïve Bayes tree, alternating decision tree, and random forest, stated that the RF method is an efficient and reliable model for flood susceptibility assessment. Abedi et al. [33] assessed flood susceptibility using RF, XGBoost, and boosted regression trees and found RF to be the most accurate in predicting flash flood susceptibility, in line with what we found in our research. Although it had the lowest prediction accuracy of the methods evaluated, the NB method was still beneficial. Based on their performance and ease of interpretation, this study shows that the chosen models are genuinely viable. On the other hand, one drawback of the study was the lack of critical hydrological data, such as flood depth, velocity, and discharge, which made developing a robust model difficult.
Flood modeling is a complicated operation fraught with uncertainties. As long as credible historical flood inventory maps are available, machine learning algorithms can efficiently address these uncertainties [117]. To limit these uncertainties, we made a flood inventory map of the study area using flood damage reports, Google Earth Pro, and field visits for the historical floods of 1996, 2008, and 2021, and verified the inventory with Sentinel-1 SAR data for the 2021 flood episode [38]. All the spatial data were resampled to 12.5 m resolution to avoid uncertainties arising from inconsistent spatial resolution. The proposed models could be a valuable and novel strategy for managing flood threats in arid and semi-arid regions like Tarim (Yemen).
However, as with any other study, the results of the current study are subject to error and uncertainty due to factors such as the subjective classification of flood-influencing factors, the selection of performance indicators, and the choice of training and testing datasets. Each of these factors necessitates more research to demonstrate how these uncertainties influence the final flood susceptibility maps and subsequent decisions. Future research should consider the effect of these uncertainties by choosing other flood factors, such as daily or sub-daily rainfall, classifying the flood factors in collaboration with stakeholders [117,118], conducting a sensitivity analysis of the partitioning of the observed dataset (other than 75% and 25% for training and testing), and evaluating the efficacy of the four methods using alternative goodness-of-fit measures [117,119].

Conclusions
Flooding is a natural disaster that threatens people's lives and the structural integrity of buildings in flood-affected communities, and it can never be avoided entirely. Because of this, it is very important to improve flood forecasting and prevention methods to reduce fatalities and the negative social and economic effects of floods.
RS data (Sentinel-1 images) were used to detect flooded areas in the study area, and the Sentinel application platform (SNAP 7.0) was used for Sentinel-1 image analysis and flood zone detection at the study locations. Flood spots were discovered and verified using Google Earth images, Landsat images, and press sources to create a flood inventory map of the study area. This study used four ML algorithms (RF, KNN, NB, and XGBoost). The models were built using a spatial database comprising 12 topographic and geo-environmental flood conditioning factors and data from 300 previous flooding occurrences. The tests revealed no evidence of multicollinearity between the identified conditioning factors. The validation findings revealed that all of the models performed admirably, with the RF and XGBoost models outperforming the others. However, RF is more computationally efficient than XGBoost, since training the RF model requires less execution time. Thus, the RF model could be used to create flood susceptibility maps and offers a potential method for flash flood prediction in the era of big data, owing to its capacity to handle multiple types of variables and represent complex non-linear interactions.
In addition, we found that the KNN model produced false alarms in some locations: floods were predicted by the model even though no flood appeared in the actual observations. These locations were on higher ground, where flooding is not possible.
The results show that approximately 4.53% to 13.34% of the overall area is highly vulnerable to floods. The resulting maps may serve as a basis for establishing plans to minimize flood susceptibility and for developing adaptation measures. The difficulty of obtaining relevant intense precipitation records and combining them with the results of flood simulation models is a limitation of this study. Moreover, several factors pertinent to flood occurrence, such as flood depth and velocity, could not be acquired; these data could be used in future work to make the models more robust.