Novel Ensembles of Deep Learning Neural Network and Statistical Learning for Flash-Flood Susceptibility Mapping

: This study aimed to assess ﬂash-ﬂood susceptibility using a new hybridization approach of Deep Neural Network (DNN), Analytical Hierarchy Process (AHP), and Frequency Ratio (FR). A catchment area in south-eastern Romania was selected for this proposed approach. In this regard, a geospatial database of the ﬂood with 178 ﬂood locations and with 10 ﬂash-ﬂood predictors was prepared and used for this proposed approach. AHP and FR were used for processing and coding the predictors into a numeric format, whereas DNN, which is a powerful and state-of-the-art probabilistic machine leaning, was employed to build an inference ﬂash-ﬂood model. The reliability of the models was veriﬁed with the help of Receiver Operating Characteristic (ROC) Curve, Area Under Curve (AUC), and several statistical measures. The result shows that the two proposed ensemble models, DNN-AHP and DNN-FR, are capable of predicting future ﬂash-ﬂood areas with accuracy higher than 92%; therefore, they are a new tool for ﬂash-ﬂood studies.


Introduction
As a result of the climatic changes that occurred during in recent years as well as of the massive changes in land use, a significant growth in the number of extreme meteorological and hydrological phenomena can be observed [1]. Thus, the increase in the global average temperature, which determines a surplus of energy in the atmosphere, and the conversion of more and more natural surfaces into built areas, are the leading causes of multiplying the number of floods and flash-floods throughout the world [2]. Popular floods and flash-floods are considered among the most devastating hazards [2,3]. Worldwide, these natural hazards cause multiple economic losses annually, affecting over 200 million people [4,5]. Flash-floods are defined as the rapid onset, usually up to 6 h, of the river level and flow, with high flooding potential in the areas where the slope angle allows water accumulation and the exceedance of the river banks [6]. Among the flash-floods triggering causes, heavy rainfalls rank the first, followed by the snow melting process and the breaking of dams [7]. The mitigation of flash-floods' adverse effects requires the appropriate structural and non-structural measures.
The delimitation of the areas susceptible to flash-floods triggering is one of the essential non-structural measures that can be adopted [8]. Precise detection of surfaces susceptible to flash-floods generation is the basis for issuing the forecasts and warnings, which can further help to decrease

Deep Neural Network
Unlike simple neural networks whose architecture contains a single hidden layer of neurons, the DNN has a feed-forward structure, which, along with input and output layers, contains two or more hidden layers [33]. Because the presence of multiple hidden layers is intended to solve complex classification problems, DNN models are considered to be more powerful and more efficient than simple neural networks [34]. In the present case, in Figure 1, the DNN architecture used in the present research is schematically represented to determine susceptibility to flash-floods. The input layer will contain information regarding the flash-flood predictors, which will be forwarded to hidden layers where this information will be analyzed and processed. The signals processed by hidden layers will be further sent to the output layer containing the negative class (non-flash-flood locations) and positive class (flash-flood locations) [35]. Further, using the back-propagation algorithm, the output error will be transmitted back from the output layer to the hidden layers. In the classification problems, the backpropagation algorithm is frequently used to train the feed-forward neural networks. In terms of the DNN, this algorithm calculates the gradient of the loss function with respect to each weight by the chain rule and avoids the redundant computations within the intermediate factors of this chain rule [36]. To process the received information, each neuron contained by the hidden layers will be applied to an activation function. In the present situation in which the deep neural network will use, for training, the vanishing gradient with the back-propagation algorithm, the activation function represented by the Rectified Linear Unit (ReLU) [35] will be employed. This function facilitates the determination of an optimal trade-off between the complexity, which is measured in terms of the number of nonzero weights of the network, and the approximation fidelity of neural networks when piecewise constant functions is approximated [37]. It should also be mentioned that there is a general agreement in the literature according to which ReLU can approximate more efficiently the smooth function than shallow networks [38]. ReLU function, which is expressed in Equation (1) will significantly reduce the vanishing gradient.
where x highlights the input signal received by a neuron and r denotes the ReLU function. The use of the back-propagation algorithm requires the derivative of the ReLU function, which can be obtained as follows: The difference between the flood inventories and the estimated floods is to reduce in the training procedure of DNN, by using the weights of the connections between the layers. In this regard, this difference, which is reduced through the back-propagation algorithm, will be highlighted by the cross-entropy function (E). In the present study, the cross-entropy function, which is usually used as a loss function, will be involved in the frame-leveling training. Remarkably, the cross-entropy function has a high contribution to the success of DNN [39]. The cross-entropy function has the following form: To process the received information, each neuron contained by the hidden layers will be applied to an activation function. In the present situation in which the deep neural network will use, for training, the vanishing gradient with the back-propagation algorithm, the activation function represented by the Rectified Linear Unit (ReLU) [35] will be employed. This function facilitates the determination of an optimal trade-off between the complexity, which is measured in terms of the number of nonzero weights of the network, and the approximation fidelity of neural networks when piecewise constant functions is approximated [37]. It should also be mentioned that there is a general agreement in the literature according to which ReLU can approximate more efficiently the smooth function than shallow networks [38]. ReLU function, which is expressed in Equation (1) will significantly reduce the vanishing gradient.
where x highlights the input signal received by a neuron and r denotes the ReLU function. The use of the back-propagation algorithm requires the derivative of the ReLU function, which can be obtained as follows: The difference between the flood inventories and the estimated floods is to reduce in the training procedure of DNN, by using the weights of the connections between the layers. In this regard, this difference, which is reduced through the back-propagation algorithm, will be highlighted by the cross-entropy function (E). In the present study, the cross-entropy function, which is usually used as a loss function, will be involved in the frame-leveling training. Remarkably, the cross-entropy function has a high contribution to the success of DNN [39]. The cross-entropy function has the following form: Water 2020, 12, 1549 4 of 24 where N is the total flash-flood points in the training sample; M is the flash-flood values, while P is the predicted flash-flood values. Further, the training of the deep neural network in this research was done through the adaptive moment (Adam) estimation method, which combines the advantages of gradient descend with the momentum model and RMSProp model [40]. Adam is an algorithm used in the stochastic optimization process. It is considered that the Adam method is more performant than other stochastic optimization models and can achieve outstanding results [40]. The following formula can express the gradient used by the back-propagation procedure: where m is a training sample x (1) , x (2) , . . . , x (m) ; T (i) (i = 1, 2, . . . , m) are the flash-flood; and L represents the cost function; C = 2 is the output classes, non-flash-flood and flash-flood; w highlights the network weights. Using Adam, the first and second moment are estimated through the exponential moving averages, which are calculated with the following formulas [36]: where m and v represent the moving averages, g is the gradient on the current mini-batch, and β represents new hyper-parameters introduced within the algorithm. β 1 has a default value of 0.9 while β 2 has a default value of 0.999. The bias correction for the first and second moment will be done with the help of the following formulas:m Further, using the moving averages, the network weights can be updated as follows [36]: where w t is the model weight, η is the learning rate and is equal to 0.001, and is used for the numerical stability and is equal to 10 −8 . Equation (10) ensures the update of the DNN parameters:

Analytical Hierarchy Process
Developed and proposed by Saaty [41], AHP allows us to analyze and resolve the problems in a flexible and easy-to-understand way. A characteristic of this multicriteria decision-making approach is represented by the participation of experts within the methodological process [42]. The AHP is a popular algorithm used in studies related to the evaluation of propensity to risk phenomena [43][44][45][46]. This algorithm is based on dividing an unstructured problem into many parts. Within the application of AHP for computing the input information into DNN algorithm, six main steps can be depicted as follows [47]: (i) Describe the main goal and divide the central problem into several parts.
(ii) Determine the detailed parameters.
(iii) Construct the comparison matrix by applying a pair-wise comparison between class/category inside each factor [31]. These matrices were generated according to the expert opinion regarding the manner in which the classes/categories of factors influence the rapid runoff on the slopes. A class/category which has a higher importance than another which it is compared with, will be assigned a relative value from 1 to 9 on horizontal cells of the matrix, while in the case when a class/category has smaller importance will be assigned a value between 1/9 and 1/2. The vertical cells of the matrix will have inverse values of those from horizontal cells [48].
(iv) Estimating weights for the factor classes based on the eigenvalues.
(v) Compute the consistency ratio (CR), reflecting the quality of the comparison between each component [41]. In the present case, the CR values were estimated according to the following equations: where CI represents the consistency index, λ max is equal to the largest matrix eigenvalue that is derived from the comparison matrix, and n is the sum of classes/categories.
where RI is the random consistency index whose calculation procedure is described by Costache and Tien Bui [48].
(vi) Using the weights of factor class/category as input data in the training process of Deep Neural Network (DNN) algorithm.

Frequency Ratio
Frequency Ratio represents one of the most known and easy to implement a statistical algorithm [25]. This model analyzes the spatial correlation between the area with the presence of the analyzed phenomenon and the spatial variability of the values of the flash-flood predictors. In this case, the frequency ratio values will be established following the analysis of the spatial relationships between the pixels representing the presence of the flash-flood phenomena and the values of the classes/categories related to the flash-flood predictors. According to Equation (13), we will analyze the ratio between the sum of pixels with flash-flood phenomena and the sum of pixels within each class/category of predictors but also within the study area [27].
where: FR represents the value of frequency ratio associated to class i with predictor j, Np(LXi) is equal to the sum of points with flash-floods in class i of predictor X, Np(Xj) is the sum of pixels in factor variable Xj, m represents the sum of classes in the factor variable Xi, and n is the sum of factors inside the research area.
Therefore, it can be concluded that the FR value of 1 is an average value, whereas values below 1 show a low correlation between flash-flood phenomena and flood predictors, and values above 1 show a high correlation between them [49]. The FR values, derived from the present research, will be used as input data within the DNN model.

Description of the Study Area
The study area is a catchment of the Prahova river in southeastern Romania and has the following geographical limits: 45 • 32 21.85" and 45 • 00 23.33" North latitude, and 25 • 27 32.00" and 26 • 27 10.19". East longitude (Figure 2). The research area has a total surface of 2600 sq. km.
rocks. Most of the hard rocks are specific to areas with slope values over 15°, where the surface runoff is intensified. Overall, at the level of the research zone, there is an altitude difference of almost 2400 m, the highest peaks being located at over 2500 m.
The enormous amount of precipitation, which exceeds 1000 mm annually, falls into the Carpathian region, where the heavy rainfalls leading to flash-floods are frequent. These phenomena occur primarily during the summer, once with the development of convective cloud systems. Land use is another essential factor that controls surface runoff . The built-up surfaces that highly favor the flash-flood potential cover around 10% of the study zone, while the natural grasslands, which also determine a high value of flash-flood potential, have a percentage equal to 15% of the total. The forest surfaces greatly influence the water balance at the river catchment level. From this point of view, a percentage of 52% of the research surface is occupied by forests. It should be mentioned that within the research area are located the most important touristic cities of Romania that are internationally recognized: Predeal, Bușteni, Azuga, and Sinaia.  The relief of the study area that belongs to both the mountain and hilly regions of Carpathian Curvature is developed on a complex lithological structure consisting of loess, gravels, and hard rocks. Most of the hard rocks are specific to areas with slope values over 15 • , where the surface runoff is intensified. Overall, at the level of the research zone, there is an altitude difference of almost 2400 m, the highest peaks being located at over 2500 m.
The enormous amount of precipitation, which exceeds 1000 mm annually, falls into the Carpathian region, where the heavy rainfalls leading to flash-floods are frequent. These phenomena occur primarily during the summer, once with the development of convective cloud systems. Land use is another essential factor that controls surface runoff. The built-up surfaces that highly favor the flash-flood potential cover around 10% of the study zone, while the natural grasslands, which also determine a high value of flash-flood potential, have a percentage equal to 15% of the total. The forest surfaces greatly influence the water balance at the river catchment level. From this point of view, a percentage of 52% of the research surface is occupied by forests. It should be mentioned that within the research area are located the most important touristic cities of Romania that are internationally recognized: Predeal, Bus , teni, Azuga, and Sinaia.

Flash-Flood Inventory
For a more accurate prediction of areas that may be affected in the future by flash-flood, it is essential to correctly evaluate the spatial relationships between the predictors and this phenomenon [50]. In this respect, the areas previously affected by torrential phenomena were identified and mapped. According to Costache and Zaharia (2017) [21], the surfaces characterized by unitary distribution guiles and ravines, which are considered torrential microforms, were considered into the analysis. The remote sensing imagery is essential in order to detect the phenomena from the earth's surface [51]. It should be noted that these areas were extracted, in a first stage, from Google Earth aerial imagery. It is remarkable, also, that the Google Earth uses aerial imagery, in natural colors, provided by Airbus Defense and Space under contract with French Centre National d'Études Spatiales (CNES) at a spatial resolution of 0.5 m [52]. The surfaces were then validated by field measurements using a Trimble GeoXH 2008 Series GPS device. Finally, a total area of 260 sq. Km, containing about 288,120 pixels, was delineated ( Figure 3). In order to be involved in the training process, the geo-environmental values of the torrential areas were generated into a sample, including 178 points. In the GIS environment, these points selected from torrential areas were converted in torrential pixels with the cell size equal to that of the raster of the flash-flood conditioning factors.
Water 2020, 12, x FOR PEER REVIEW 7 of 26

Flash-Flood Inventory
For a more accurate prediction of areas that may be affected in the future by flash-flood, it is essential to correctly evaluate the spatial relationships between the predictors and this phenomenon [50]. In this respect, the areas previously affected by torrential phenomena were identified and mapped. According to Costache and Zaharia (2017) [21], the surfaces characterized by unitary distribution guiles and ravines, which are considered torrential microforms, were considered into the analysis. The remote sensing imagery is essential in order to detect the phenomena from the earth's surface [51]. It should be noted that these areas were extracted, in a first stage, from Google Earth aerial imagery. It is remarkable, also, that the Google Earth uses aerial imagery, in natural colors, provided by Airbus Defense and Space under contract with French Centre National d'Études Spatiales (CNES) at a spatial resolution of 0.5 m [52]. The surfaces were then validated by field measurements using a Trimble GeoXH 2008 Series GPS device. Finally, a total area of 260 sq. Km, containing about 288,120 pixels, was delineated ( Figure 3). In order to be involved in the training process, the geo-environmental values of the torrential areas were generated into a sample, including 178 points. In the GIS environment, these points selected from torrential areas were converted in torrential pixels with the cell size equal to that of the raster of the flash-flood conditioning factors.

Flash-Flood Predictors
Similarly to any other natural risk phenomena, the occurrence process of flash-floods is controlled by the simultaneous action of several geographic factors. The genesis of these natural hazards is related to massive rainfall events. Regarding the study area, the heavy rainfall with the same intensity could occur in any part of its, so it is impossible to differentiate the areas according to

Flash-Flood Predictors
Similarly to any other natural risk phenomena, the occurrence process of flash-floods is controlled by the simultaneous action of several geographic factors. The genesis of these natural hazards is related to massive rainfall events. Regarding the study area, the heavy rainfall with the same intensity could occur in any part of its, so it is impossible to differentiate the areas according to this climatic parameter. For this reason, the rainfall factor will be considered constant across the study area, so its values will not be considered within the analysis. Instead, the within the analysis will be involved a number of 10 geographic factors that, with a variable influence, on surface runoff manifestation.
Lithology is a crucial flash-flood predictor because it controls the water infiltration process during the massive rainfall events. Thus, the hard rocks with a high impermeability degree will favor the surface runoff manifestation, unlike sedimentary rocks, that favor the infiltration process. From the Geological Map of Romania, 1:200,000, a number of 12 lithological groups were extracted across the study area. Obtained initially in vector format, they were transformed to grid dataset with a size of cell equal to 30 m (Figure 4d).
Water 2020, 12, x FOR PEER REVIEW 8 of 26 this climatic parameter. For this reason, the rainfall factor will be considered constant across the study area, so its values will not be considered within the analysis. Instead, the within the analysis will be involved a number of 10 geographic factors that, with a variable influence, on surface runoff manifestation.
Lithology is a crucial flash-flood predictor because it controls the water infiltration process during the massive rainfall events. Thus, the hard rocks with a high impermeability degree will favor the surface runoff manifestation, unlike sedimentary rocks, that favor the infiltration process. From the Geological Map of Romania, 1:200,000, a number of 12 lithological groups were extracted across the study area. Obtained initially in vector format, they were transformed to grid dataset with a size of cell equal to 30 m (Figure 4d). Hydrological Soil Group also influences the quantity of water infiltrated into the soil, the hydraulic conductivity ranging from 1 µ m/s (group D) to more than 40 µ m/s (group A). The Hydrological Soil Group also influences the quantity of water infiltrated into the soil, the hydraulic conductivity ranging from 1 µm/s (group D) to more than 40 µm/s (group A). The hydrological soil group A occurs on most of the surface area (801 sq. km), followed by group B (727 sq. km), group C (213 sq. km) and group D (682 sq. km) (Figure 5d). Initially extracted in vector format, by using the Digital Soils Map of Romania, 1:200,000, this layer was transformed into a grid dataset with a size of cell equal to 30 m.
Water 2020, 12, x FOR PEER REVIEW 9 of 26 hydrological soil group A occurs on most of the surface area (801 sq. km), followed by group B (727 sq. km), group C (213 sq. km) and group D (682 sq. km) (Figure 5d). Initially extracted in vector format, by using the Digital Soils Map of Romania, 1:200000, this layer was transformed into a grid dataset with a size of cell equal to 30 m. Land use, together with the slope angle, controls the surface runoff velocity by influencing the Manning Roughness coefficients and by influencing the rainfall interception process [53][54][55][56]. Forest areas are characterized by high values for both rainfall interception and Manning Roughness coefficients. Therefore, these land-use types are the most restrictive for surface runoff while the builtup areas and pastures favor the flash-flood occurrence. Within the study were identified a number of 10 land use categories (Figure 4c), the most extensive surfaces being covered by forests (50%), while the most torrential areas are located within the natural grasslands category (55%).
Slope angle has been confirmed as a key factor for flash-flood studies because it greatly controls the speed of the floodwaters [57][58][59][60]. This was derived from the Digital Elevation Model (DEM) with a cell size of 30 m. The DEM was extracted from Shuttle Radar Topography Mission (SRTM) database at a spatial resolution of 30 m. It should be noted that SRTM is an international project coordinated by the U.S. National Geospatial-Intelligence Agency (NGA) and the U.S. Aeronautics and Space Administration (NASA) [61]. According to Zaharia et al. [62] the following slope classes have been established: 0°-3°, 3°-7°, 7°-15°, 15°-25° and > 25° (Figure 5a). It is remarkable the fact that the most torrential areas are included in the class ranging from 15° to 25°, while the maximum concentration of torrential pixels is present within the class characterized by slopes above 25°. This spatial distribution of torrential areas shows that the flash-flood potential increases with the increase of slope value.
Convergence index (Ci) is a morphometric factor with high importance on the flash-flood potential [18,21]. Its negative values highlight the river valleys perimeters, meanwhile the values Land use, together with the slope angle, controls the surface runoff velocity by influencing the Manning Roughness coefficients and by influencing the rainfall interception process [53][54][55][56]. Forest areas are characterized by high values for both rainfall interception and Manning Roughness coefficients. Therefore, these land-use types are the most restrictive for surface runoff while the built-up areas and pastures favor the flash-flood occurrence. Within the study were identified a number of 10 land use categories (Figure 4c), the most extensive surfaces being covered by forests (50%), while the most torrential areas are located within the natural grasslands category (55%).
Slope angle has been confirmed as a key factor for flash-flood studies because it greatly controls the speed of the floodwaters [57][58][59][60]. This was derived from the Digital Elevation Model (DEM) with a cell size of 30 m. The DEM was extracted from Shuttle Radar Topography Mission (SRTM) database at a spatial resolution of 30 m. It should be noted that SRTM is an international project coordinated by the U.S. National Geospatial-Intelligence Agency (NGA) and the U.S. Aeronautics and Space Administration (NASA) [61]. According to Zaharia et al. [62] the following slope classes have been (Figure 5a). It is remarkable the fact that the most torrential areas are included in the class ranging from 15 • to 25 • , while the maximum concentration of torrential pixels is present within the class characterized by slopes above 25 • . This spatial distribution of torrential areas shows that the flash-flood potential increases with the increase of slope value.
Convergence index (Ci) is a morphometric factor with high importance on the flash-flood potential [18,21]. Its negative values highlight the river valleys perimeters, meanwhile the values higher than 0 highlight the presence of the interfluvial areas. For Ci computation, the Digital Elevation Model was imported into SAGA GIS 2. Profile curvature was taken into account because it is useful to delimit the areas characterized by high runoff and the areas with low runoff [17]. The surfaces with accelerated runoff (−8.2-0) occupy approximately 50% of the total (Figure 5f), is also characterized by the presence of 50% of torrential pixels ( Figure 6).
Plan curvature allows us to distinguish between the areas on which the convergent runoff occurs and the areas on which the divergent runoff occurs. Therefore, we can state that this morphometric factor controls the water flow (Pham et al., 2017). Values between 0.1 and 0.5 are spread on almost 58% of the research area and incorporate 48% of the torrential pixels (Figure 5e).
Topographic Position Index (TPI) is another morphometric factor which was obtained in SAGA GIS software. It is defined as the difference between each cell of the Digital Elevation Model and the mean elevation of a defined neighborhood around that cell [63]. The TPI can be calculated according to the following formulas: where z 0 is the elevation of the central cell for which the TPI is calculated, z is the average elevation around the central cell within a specific radius (R) [64]. The Natural Breaks method was used to divide the TPI value into 5 groups. These groups were finally used to generate the TPI map. Thus, the largest surface is covered with the third TPI values class (45%), ranging from (−0.83) to 0.14 ( Figure 5b). The same class is characterized by the largest surface of torrential areas, which counts around 34% of the total number of torrential pixels.
Topographic Wetness Index (TWI) is defined as a quantitative index that indicates the balance between the flow accumulation and the drainage conditions existent at the local scale [65]. TWI is a widely used conditioning factor in the delineation of flood-prone areas [66]. This another essential runoff factor can be calculated through the following relation [67]: where α represents the upslope area that drains through a pixel and tan β represents the slope angle in that pixel.
Similar to TPI, the TWI values, ranging between −9.7 and 25 (Figure 5c), were classified by the mean of the Natural Breaks algorithm. Therefore, five classes were used in order to map the TWI values. The medium TWI values (8.5-12) cover the highest percentage of the study area (29%) (Figure 6).
Aspect is a predictor obtained from the Digital Elevation Model. This morphometric indicator was split into 10 groups, which influence in a different manner the soil humidity, solar radiation, or the amount and regime of rainfall. With a weight of 15% of the study area, eastern and south-western surfaces rank the first in terms of area occupied by each aspect direction (Figure 4a). Eastern surfaces contain around 18.3% of the total torrential pixels ( Figure 6).

Proposed Ensemble Approach Based on DNN, AHP, and FR for Flash-Flood Susceptibility Mapping
Because the main objective of this research is the mapping of surfaces prone flash-floods, only the geospatial data will be used in the analysis. The entire workflow of the analysis is synthetically presented in Figure 7.

Flood Database
The flood databases, containing 10 flash-flood predictors and 178 flash-flood locations, were processed and created in ArcGIS 10.5 software (ESRI, Redlands, CA, USA). According to the literature, the selection and the inclusion into the analysis of another sample containing the locations where the phenomenon is absent are mandatory in order to increase the performance of the models [24,49,68]. Therefore, another 178 non-torrential points were located over the areas having a slope around 0° where flash-floods are almost impossible. According to the vast majority of researchers, 70% of the torrential and non-torrential pixels were considered as training data set, while the rest of 30% will be employed to validate the results of applied models [2,47,50,69]. All the data involved in the methodological process is transformed into a raster data set with a cell dimension of 30 × 30 m.

Data Diagnosis and Checking
Using the Information Gain Ratio (IGR), the predictors were selected according to their predictive power. The IGR is one of the most popular methods intended to check the predictive ability of a predictor in terms of a given phenomenon [70]. This algorithm, proposed by Chapi et al. [10] uses the following relation:

Flood Database
The flood databases, containing 10 flash-flood predictors and 178 flash-flood locations, were processed and created in ArcGIS 10.5 software (ESRI, Redlands, CA, USA). According to the literature, the selection and the inclusion into the analysis of another sample containing the locations where the phenomenon is absent are mandatory in order to increase the performance of the models [24,49,68]. Therefore, another 178 non-torrential points were located over the areas having a slope around 0 • where flash-floods are almost impossible. According to the vast majority of researchers, 70% of the torrential and nontorrential pixels were considered as training data set, while the rest of 30% will be employed to validate the results of applied models [2,47,50,69]. All the data involved in the methodological process is transformed into a raster data set with a cell dimension of 30 × 30 m.

Data Diagnosis and Checking
Using the Information Gain Ratio (IGR), the predictors were selected according to their predictive power. The IGR is one of the most popular methods intended to check the predictive ability of a predictor in terms of a given phenomenon [70]. This algorithm, proposed by Chapi et al. [10] uses the following relation: where D is the training dataset composed of n input samples, n (Y i , D) is the number of samples in the training data D belonging to the class label Y i (Torrential, non-Torrential).

Model Setup and Training
A number of 10 pair-wise comparison matrices were constructed in Excel 2016 software (Microsoft, Redmond, WA, USA)for the derivation of AHP weights corresponding to each factor class/category. The consistency of each matrix was evaluated by calculating the Consistency Ratio according to 2.2.
The computation of FR coefficients required the use of both ArcGIS 10.5 and Excel 2016 software. In the first stage, in a GIS environment, the information regarding the values of predictors was extracted to each flash-flood location. Further, this information was used in Excel, where the FR coefficients were calculated using formula (13). The FR coefficients were normalized in the range 0.1-0.9 through the following equation [21]: where y is the standardized value of x; x is the current value of the variable; d is the range value limits and n is is standardization range limits. The values of AHP weights and FR coefficients were further used to compute the ensemble models DNN-AHP and DNN-FR. Using these ensembles, the FFPI values were determined across the study area. It should be mentioned that the DNN-AHP and DNN-FR hybrid models were trained by using the Adam method, which classified the raster pixels into the following two categories flash-flood and non-flash-flood. According to Figure 1, the architecture of DNN ensembles is characterized by a number of 3 hidden layers. Each of them contains 100 neurons. A trial-error procedure was employed to establish the numbers of hidden neurons and hidden layers.

Quality Evaluation
In a first step, the results quality was checked by computing the next statistical metrics: Negative Predicted Values (NPV), Positive Predictive Value (PPV), Specificity, Sensitivity, Accuracy, Kappa Index. In order to determine the values of these statistical indices, the true positive (TP), the false positive (FP), the false negative (FN), and the true negative (TN) were involved in the following formulas: The results of DNN ensembles were also evaluated by using the Receiver Operating Characteristic (ROC) Curve. This algorithm is frequently used to evaluate the reliability of results provided by a model [10,23,71,72]. To draw the ROC Curve, the acquired results and existing flash-flood locations were compared. Therefore, the FFPI results were tested using the success rate, obtained with the training sample, and also the prediction rate, constructed with the validation sample. The Area Under Curve (AUC) associated with the success rate shows the model's performance in terms of results classification, while the prediction rate highlights the accuracy of the results. An AUC value near 1 indicates a high performant model, while a non-informative model is revealed by an AUC near 0 [8,21]. The AUC can be estimated using Equation (27) below [70]: where TP (true positive) is the sum of flash-flood pixels accurately classified, TN (true negative) is the sum of non-flash-flood pixels accurately classified, P and N are the sum of flash-flood pixels and non-flash-flood pixels, respectively.

Compiling Flash-Flood Susceptibility Map
The derivation of the importance assigned to each flash-flood predictor was possible by applying the DNN-AHP and DNN-FR. The last step of flash-flood potential mapping was performed in ArcGIS 10.5, where the weights of predictors were multiplied with the FR and AHP coefficients. Figure 8 shows the results regarding the predictive capability of flash-flood predictors which were estimated through the Information Gain Ratio method. The highest predictive capability was attributed to slope angle (0.92), followed by profile curvature (0.83), hydrological soil group (0.71), lithology (0.64), plan curvature (0.59), TWI (0.54), convergence index (0.43), TPI (0.41) and aspect (0.32). As can be seen, the weak predictive capability, which is equal to 0.32, is higher than 0 and, therefore, all the predictors initially considered were used in the present analysis. Water 2020, 12, x FOR PEER REVIEW 16 of 26

AHP and FR Weights
All the possible comparison, involving the factor classes/categories, are included in Table 1. In this regard, by attributing the relative dominant value, each class/category was rated against every other. Table 1 also includes the final coefficients resulted from the application of the AHP algorithm. As was mentioned in Section 4.3, in order to evaluate the consistency of judgments, the CR value for each matrix was computed. Because all CR values are below 0, we can state that all the comparisons included in the matrices are consistent. Moreover, the other parameters used in the AHP algorithm workflow are shown in Table 2.
Together with AHP normalized values, Table 1 also includes the normalized value of FR coefficients. Within each flash-flood predictor, the highest FR normalized values (0.9) were obtained by the following classes/categories: slope angles above 25°, TPI ranging from −20 to −3.8, TWI ranging from −9 to 4.5, grassland land use category, sandy flysch lithological group, profile curvature ranging from 0.9 to 9.7, plan curvature ranging from 0.6 to 8.5, eastern slopes, hydrological soil group B and convergence index ranging from −2 to −1.

AHP and FR Weights
All the possible comparison, involving the factor classes/categories, are included in Table 1. In this regard, by attributing the relative dominant value, each class/category was rated against every other. Table 1 also includes the final coefficients resulted from the application of the AHP algorithm. As was mentioned in Section 4.3, in order to evaluate the consistency of judgments, the CR value for each matrix was computed. Because all CR values are below 0, we can state that all the comparisons included in the matrices are consistent. Moreover, the other parameters used in the AHP algorithm workflow are shown in Table 2.
Together with AHP normalized values, Table 1 also includes the normalized value of FR coefficients. Within each flash-flood predictor, the highest FR normalized values (0.9) were obtained by the following classes/categories: slope angles above 25 • , TPI ranging from −20 to −3.8, TWI ranging from −9 to 4.5, grassland land use category, sandy flysch lithological group, profile curvature ranging from 0.9 to 9.7, plan curvature ranging from 0.6 to 8.5, eastern slopes, hydrological soil group B and convergence index ranging from −2 to −1.

Flash-Flood Modelling with DNN-AHP and DNN-FR
The training process of DNN-AHP and DNN-FR was ensured by the use 249 flash-flood and non-flash-flood locations, which account around 70% of the entire dataset. Together with the information regarding the presence or the absence of flash-flood past phenomena, to these points were attributed the AHP and FR weights. In terms of DNN-AHP ensemble, its training performance is highlighted by the values of the following metrics: PPV = 95.2%, NPV = 100%, Sensitivity = 100%, Specificity = 95.42%, Accuracy = 97.6%, Kappa = 0.952 (Table 3   Given the high performances of the two ensembles presented above, the deep learning models were involved in the mapping procedure of FFPI across the study area. This operation was done in ArcGIS 10.5 by using the FFPI values in raster format. According to Figure 10, the values of both indices, FFPIDNN-AHP and FFPIDNN-FR, were partitioned into 5 classes by using the Natural Breaks algorithm. In terms of FFPIDNN-AHP, the very low values ranging from 0 to 0.13 cover around 37.9% of the study area, while the low values between 0.14 and 0.34 quantify approximately 9.48% of the research zone. The medium FFPIDNN-AHP class is present on 8.69% of the research zone, while the high and very high classes span on a total of 43.9% of the study area (Figure 10a). In terms of FFPIDNN-FR, it can be observed that the first and the second classes, Given the high performances of the two ensembles presented above, the deep learning models were involved in the mapping procedure of FFPI across the study area. This operation was done in ArcGIS 10.5 by using the FFPI values in raster format. According to Figure 10, the values of both indices, FFPI DNN-AHP and FFPI DNN-FR , were partitioned into 5 classes by using the Natural Breaks algorithm. In terms of FFPI DNN-AHP , the very low values ranging from 0 to 0.13 cover around 37.9% of the study area, while the low values between 0.14 and 0.34 quantify approximately 9.48% of the research zone. The medium FFPI DNN-AHP class is present on 8.69% of the research zone, while the high and very high classes span on a total of 43.9% of the study area (Figure 10a). In terms of FFPI DNN-FR , it can be observed that the first and the second classes, between 0 and 0.4, quantify approximately 41% of the entire study zone, while the medium class accounts for 13.58% of the total surface. The regions with critical FFPI DNN-FR values are distributed over 45% of the research territory (Figure 10b).

Discussions
In Romania, a significant increase in the intensity of flash-floods can be observed in recent years [24]. This fact is mainly associated with global climate change affecting the area in which Romania is located but also is due to the transformations undergone within the land-use, especially the excessive deforestation. [21]. The most affected area by these natural risk phenomena is the mountain and hilly

Discussions
In Romania, a significant increase in the intensity of flash-floods can be observed in recent years [24]. This fact is mainly associated with global climate change affecting the area in which Romania is located but also is due to the transformations undergone within the land-use, especially the excessive deforestation. [21]. The most affected area by these natural risk phenomena is the mountain and hilly area of Romania, where the slope of the relief, and the lack of forest across certain regions, create the premises for the production of devastating fast-floods. [18]. This is also the case of the Prahova river basin area on which the present research work has focused. Thus, at the level of the study area, there is a need to identify as accurately as possible the perimeters exposed to the formation of the torrential flow which further causes the production of severe flash-floods. Testing of many modern methods for detecting these surfaces is mandatory in order to identify the techniques that provide the best results and which can be recommended to be used in other areas exposed to flash-floods. The potential for rapid surface runoff at the level of the Prahova River basin has been determined in the past through bivariate statistical techniques combined with machine learning algorithms such as Logistic Regression, Rotation Forest, Multilayer Perceptron Neural Network, Support Vector Machine, Naive Bayes, k Nearest Neighbor (k-NN) or K-Star [25,26,28,73]. All these models have achieved high performances and are generally characterized by accuracies that exceeded 85%. The AHP model has been applied, within the same study area, in combination with k-NN and K-Star, obtaining also high accuracy values (90.1%) [73]. In this study, instead, it was decided to use DNN which is considered one of the state of the art techniques used to determine the susceptibility to this type of hazard [74]. In order to increase the performance of the model, the input data were optimized by determining the AHP and FR weights for each class/category of factors. The high performance of DNN-FR and DNN-AHP algorithms was also demonstrated in the present study in which the accuracy of the models was very close to perfection (higher than 94%). Thus, their accuracy exceeded the accuracy of the best model applied in previous research, in the same study area, namely Rotation Forest (RF) (91%). The RF algorithm was applied by Costache [26]. The application of the AHP method in other studies regarding the evaluation of the susceptibility to floods and flash-floods in other regions is also noted. One such example is the study carried out by Das [66], which evaluated the Ulhas basin (India) from the flood susceptibility point of view using the AHP method. It should be mentioned that the accuracy of stand-alone AHP method (84%), applied by Das [75], was lower than the accuracy of DNN-AHP (97.9%) obtained in the present research.
According to Figure 8, the factor with the highest influence on flash-flood susceptibility is the slope angle. Land use and profile curvature also have an essential influence in the flash-flood triggering. The high slopes and the absence of forests in the northern and north-western part of the study area is the main explanation for the very high FFPI values recorded within these regions. As can be observed in Figure 10 the most susceptible areas are especially located in the aforementioned areas are represented with red color, because this color is usually used to highlight the high intensity of a phenomenon.

Conclusions
Identifying, as accurately as possible, the areas with a high surface runoff potential is a necessary action based on which the non-structural measure for defense against the flash-flood adverse effects can be adopted. Given this consideration, the present study proposes a novel methodology used to identify the areas exposed to flash-flood occurrence across the Prahova river basin. The main goal of the developed workflow was of the computation of the Flash-Flood Potential Index through two ensembles of DeepNeural Networks (DNN-AHP and DNN-FR). A number of 10 predictors, selected through the Information Gain Ration method, along with 70% of flash-flood and non-flash-flood samples, were employed in the training procedure of the models, while the validation of the results was carried out by using the other 30% of the data sample. The assessment of the performance of the models and the validation of the result was made in two steps: (i) through ROC Curve and (ii) by calculating several statistical metrics. ROC Curve method revealed that the most reliable ensemble model was DNN-AHP taking into account both Success Rate (AUC = 0.979) and Prediction Rate (AUC = 0.953). Areas identified as having critical values (high and very high) of flash-flood potential cover between 43.9% (FFPI DNN-AHP ) and 45% (FFPI AHP ) of the study area. These areas are distributed mainly in the North-Western part where several touristic cities are located. In these conditions, these localities are highly vulnerable to flash-flood phenomena. Many flash-flood events occurred during the next years: 2005, 2006, 2010, 2014, 2017, and 2018.
The principal novelty that characterizes the present study consists of the use for the first time in the literature of the DNN-AHP and DNN-FR ensembles models to assess the susceptibility to natural risk phenomena, and therefore for estimating the Flash-Flood Potential Index across the study area. The outcomes of the present research can be successfully used for effective territorial planning as well as for the improvement of flash-flood forecasts and warning accuracy.