Flood Hazard Mapping Using the Flood and Flash-Flood Potential Index in the Buzău River Catchment, Romania

Identifying the areas vulnerable to both floods and flash-floods is an important component of risk management, and the assessment of such areas remains a major scientific challenge. The aim of this study is to provide a methodology for identifying the areas vulnerable to floods and flash-floods in the Buzău river catchment by computing two indices: the Flash-Flood Potential Index (FFPI) for the mountainous and Sub-Carpathian areas, and the Flood Potential Index (FPI) for the low-altitude areas. Both indices were computed using the frequency ratio (FR), a bivariate statistical model; the Multilayer Perceptron Neural Networks (MLP); and the ensemble model MLP–FR. A database containing 168 historical flood locations and 172 locations with torrentiality was created and used to train and test the models. The model outputs were processed using GIS techniques to produce the flood and flash-flood vulnerability maps. The results show that the MLP–FR hybrid model had the best performance. The use of the two indices represents a preliminary step in creating flood vulnerability maps, which could represent an important tool for local authorities and a support for flood risk management policies.


Introduction
Floods are natural risk phenomena which vary in intensity and cause significant economic and human losses. They result from the interaction between several anthropogenic and natural variables which are specific to an area and have different influences on the generation of these events. In the context of global climate change driven by ever-increasing anthropogenic activities, the intensity and frequency of these events have increased in recent years and continue to intensify [1,2].
The Intergovernmental Panel on Climate Change (IPCC) recommends measures to be adopted and actions to be taken in the context of the current climatic impact on the environment [3].
European rivers are specifically analyzed in accordance with the European Flood Directive 2007/60 and the Directive 2008/94/EC of the European Parliament and of the Council, published in the Official Journal of the European Union, whilst statistical, hydraulic and GIS techniques are used for hazard and flood mapping [4–6]. Adaptation and mitigation have generally been treated as two separate issues, both in public policy and in practice: mitigation is seen as the attenuation of the cause, whereas studies of adaptation look into dealing with the consequences of climate change [7]. Studies on the impact of climate change on flood risk are mostly conducted at the river basin or regional scale [8,9].
The integration of strategies for the goals of mitigation and adaptation also includes the technology and information available for decision-makers [10].
Flood hazard management procedures consist of control measures and command, including spatial planning and engineered flood defense systems, financial aid issued by national governments in order to facilitate cooperative approaches between local communities and the authorities for a safe development, and support for local flood planning [11].
In general, two types of flood analysis approaches can be distinguished, deterministic modelling and parametric approaches, both of which aim to use the available information in order to build an image of an area prone to this natural hazard [12]. If the information required to run the model is not available, the method used may produce significant anomalies. In this context, it becomes important to assess vulnerability to flooding by using the parametric approach.
The parametric approach aims to estimate the vulnerability of a system by using freely available databases, and to design a methodology that allows vulnerability to this natural hazard to be evaluated.
The present study proposes the use of three models to detect areas prone to floods and flash floods: the frequency ratio (FR), a bivariate statistical method; the Multilayer Perceptron Neural Networks (MLP), a machine learning solution; and the hybrid integration of the FR and the MLP models. The FR method has been used in previous studies due to its easy applicability [26–30]. The MLP is a supervised machine learning solution which uses the backpropagation algorithm and is commonly used in landslide and flood hazard assessment studies. The results of the abovementioned analyses were transferred into a GIS environment, thus resulting in flood and flash-flood hazard maps. These methods are widely used in studies which aim to provide landslide or flood mapping [24,31–34]. The performance of the proposed models was validated using receiver operating characteristic (ROC) curves. Processing data through machine learning techniques requires considerable computing resources; therefore, high-performance computing systems are required [27].

General Characteristics of the Study Area
The present research was conducted in the Buzău river catchment, one of the regions most affected by floods and flash-floods in Romania, and was done by integrating bivariate statistical methods, machine learning techniques and GIS (Figure 1). According to the map of areas subject to a significant risk of flooding developed by the National Administration "Romanian Waters", the Buzău river is one of the rivers with the highest flood risk at a national scale. Romania uses a flood and flash-flood forecasting system based on information obtained from several sources, such as radars (1, 3, 6 h) with a resolution of 1 km², data from weather stations (160 stations), and data from the 1000 hydrometric stations placed on its major rivers. According to the National Administration of "Romanian Waters" and the National Meteorological Administration, the South East European Flash Flood Guidance System (SEEFFG), the European Flood Awareness System (EFAS), and the Romanian Flash Flood Guidance System (ROFFG) are used to forecast floods and flash-floods for 8851 small river catchments. Previous researchers, who have used these indices to determine the areas prone to this type of natural risk phenomena, have shown the importance of knowing how to implement these indices to help local authorities manage their interventions and minimize economic and human losses [35–42].
The Buzău river catchment is located in the south-eastern part of Romania, at the junction of the historical regions of Muntenia and Moldova, and has a total surface of 5264 km². The Buzău river is a left tributary of the Siret river. The catchment overlaps five counties (Brașov, Covasna, Prahova, Buzău and Brăila) and 116 territorial-administrative units. The Buzău river springs from the Ciucaș Mountains, located in the Curvature Carpathians, a southern group of the Eastern Carpathians, and has a total length of 302 km. The river has 32 left tributaries and 24 right tributaries (Table 1). The Buzău river drops 1242 m in elevation from its source, located at 1250 m, to its mouth, where it joins the Siret river in the village of Voinești (Brăila county) at an altitude of 8 m. The Buzău river catchment has a circularity ratio of 0.24, a value which indicates that it has an elongated shape [43].

Inventory of the Historical Flood Locations and Areas Affected by Torrentiality
In order to compute the proposed models, we created a database which contains the historical flood locations from 1970 to 2012 and the locations of the areas affected by torrentiality. The areas affected by torrentiality were identified based on satellite imagery and on the RUSLE (Revised Universal Soil Loss Equation) model, which contains the areas where soil is affected by water erosion [44]. The database contains the locations of 168 historical floods (Figure 2a), which were obtained from the National Administration "Romanian Waters", and 172 locations affected by torrentiality (Figure 2b). Following previous studies which use machine learning techniques [45–50], the data were split into 70% training samples and 30% testing samples. This step is important as the proposed models are trained on the training samples, while the testing samples are used as a final evaluation/confirmation of the model created based on the training samples.
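The 70/30 split described above can be illustrated with a minimal Python sketch. The shuffling seed, the rounding rule and the `flood_ids` identifiers are assumptions made for illustration; the source does not state how individual locations were assigned to each subset.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=42):
    """Shuffle the samples reproducibly and split them into train/test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)  # rounding rule is an assumption
    return shuffled[:cut], shuffled[cut:]


# Hypothetical identifiers standing in for the 168 historical flood locations.
flood_ids = [f"flood_{i}" for i in range(168)]
train, test = split_samples(flood_ids)
print(len(train), len(test))
```

The same function could be applied separately to the 172 torrential locations so that each index keeps its own training and testing sets.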

Flood and Flash-Flood Conditioning Variables
The selection of the flood and flash-flood conditioning variables represents a key step in running the proposed models. The present study proposes the use of 14 flood conditioning variables, used to compute the Flood Potential Index (FPI), and 13 flash-flood conditioning variables, used to compute the Flash-Flood Potential Index (FFPI). The 14 variables used for the FPI are as follows: slope, elevation, hydrological soil groups (HSG), slope aspect, elevation above channel (EaC), distance from rivers (DfR), saturated hydraulic conductivity (SHC), land-use, drainage density (DD), plan curvature (PLC), Topographic Position Index (TPI), Topographic Wetness Index (TWI), multi-annual precipitations (MaP) and the Convergence Index (CI). The 13 variables used for the FFPI are as follows: slope, profile curvature (PC), HSG, slope aspect, slope length and steepness factor (L-S factor), Curve number (CN), CI, land-use, soil erodibility by water (SEW), DD, TPI, TWI, and MaP.
Most of the flood and flash-flood conditioning variables were derived from the Digital Elevation Model (DEM), which was extracted from the EU-DEM (European Digital Elevation Model) dataset with a spatial resolution of 25 m, available from the European Environment Agency (EEA). The following variables were extracted from the DEM or from other outputs derived from it: slope, elevation, slope aspect, elevation above channel, distance from rivers, plan curvature, TPI, TWI, Convergence Index, profile curvature and L-S factor. The slope (Figure 3a) represents a very important flood or flash-flood conditioning variable, which is widely used in similar research, as its values influence the runoff process on steep slopes and the water accumulation process in areas with low slopes [40,41]. For both the FPI and FFPI, the slope was classified into five classes, as shown in Tables 2 and 3. Most of the flood locations overlap the areas with slopes between 0 and 5°, representing 73.2% of the total flood pixels, while most of the areas affected by torrentiality overlap the slopes between 25-55°, representing 44.7% of the total torrential pixels.
The elevation (Figure 3b) was used for the analysis of the FPI, and was classified into five elevation classes, ranging from 1.29-1925 m. Most of the flood locations are in the elevation class between 1.29-55 m, representing 39.2% of the flood pixels. The hydrological soil groups (Figure 3c) were used in both indices, representing a very important flood conditioning variable with an impact on the water infiltration process.
The HSGs were created based on the curve number and the soil properties of the areas in which they overlap and were classified into four groups according to the National Engineering Handbook [51]. The four groups are as follows: A, B, C and D. The most predominant hydrological soil group was group C, covering 58.9% of the study area, followed by group B, with 19% coverage. The four groups have different hydraulic conductivity properties, and based on these, one can say that group A has a high infiltration rate and a low runoff whilst group D has a low infiltration rate and a high runoff potential. Group A soils have a hydraulic conductivity of over 40 µm/s and have the lowest runoff potential of the four groups. Also, when thoroughly wet, they have a high infiltration rate. Group B soils have a hydraulic conductivity between 10-40 µm/s when thoroughly wet and have a low runoff potential. Group C soils have a slow water infiltration rate with hydraulic conductivity values between 1-10 µm/s and have a moderately high runoff potential. Group D soils have the highest runoff potential and a low water infiltration rate when thoroughly wet, with a hydraulic conductivity of below 1 µm/s.
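The hydraulic conductivity thresholds quoted above can be written as a small lookup, sketched here in Python. The function name is hypothetical, and the handling of values lying exactly on a boundary (10 or 1 µm/s) is an assumption, since the text only gives the ranges.

```python
def hydrologic_soil_group(conductivity_um_per_s):
    """Map a saturated hydraulic conductivity (µm/s) to a soil group A-D."""
    if conductivity_um_per_s > 40:
        return "A"  # high infiltration rate, lowest runoff potential
    if conductivity_um_per_s >= 10:
        return "B"  # low runoff potential
    if conductivity_um_per_s >= 1:
        return "C"  # moderately high runoff potential
    return "D"      # low infiltration rate, highest runoff potential


for value in (55, 25, 5, 0.5):
    print(value, hydrologic_soil_group(value))
```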
The slope aspect (Figure 3d) was classified into five slope orientation classes, with flat areas covering the largest share of the catchment (34.7%). Most of the flood locations are on flat areas, representing 52.9% of the flood pixels, whereas the areas affected by torrentiality are located on slopes with a northwesterly and easterly orientation, representing 29% of the torrential pixels. This variable is important because the slope orientation has a great influence on the humidity of the soil.
The elevation above channel variable (Figure 3e) was used to analyze the FPI, representing a suitable variable in the generation of floods, as areas with low values are more prone to flooding than areas with high values; it was classified into four elevation classes. The distance from rivers (Figure 3f), as with the elevation above channel, was used in the analysis of the FPI, and was classified into eight distance classes from 0 to over 900 m. Over 46.4% of the flood locations overlap the distance class between 0 and 50 m, whilst the class ranging from 50 to 150 m overlaps 31.5% of the flood pixels.
The saturated hydraulic conductivity (Figure 3g) was derived from the dataset of the soil hydraulic properties of Europe, which is available to download from the European Soil Data Centre (ESDAC), and represents a variable which shows the amount of water that would infiltrate vertically through a saturated soil unit [52–54]. This factor is widely used in soil and water research. The SHC was classified into four classes, ranging from 56-3386 cm/day. Most of the flood locations overlap the class between 2668-3386 cm/day, representing 40.4% of the flood pixels, while the class between 1283-1949 cm/day overlaps 30.9% of the flood pixels.
The land use (Figure 3h) represents a variable used to compute both indices, which was derived from the CORINE Land Cover 2018 dataset and classified into five land-use classes. The forest land-use class covers most of the study area, representing 40.7% of the catchment's surface, followed by the lands used for agricultural purposes, which cover 33.3% of the area. Over 25.5% of the flood pixels overlap the areas occupied by agricultural lands, while the areas most affected by the torrentiality phenomena overlap the forest areas where the slopes are the steepest, comprising 45.9% of the study area, followed by the areas covered by scrub, at 22.6%.
The land use and the deforested areas were correlated in order to get a better representation of the study area. The deforested areas were extracted from the Global Forest Change dataset, which is available from the Department of Geographical Sciences of the University of Maryland. The dataset provides the forest changes from 2000 to 2018 and is available in a raster format. The forest change data are encoded with 0 for areas with no forest change and 1 for areas with forest change [55,56]. The deforested areas have been assigned the value of "open spaces with little or no vegetation". The forest changes have been validated and partially corrected using Sentinel-2 satellite imagery, which is available from the European Space Agency (ESA). Around 14.9% of the torrential pixels overlap the areas covered by pastures, natural grasslands and open spaces with little or no vegetation.
The drainage density (Figure 3i) shows the drainage degree of the river network and represents a factor with a direct impact on the generation of floods [57,58]; it was classified into four classes from 0 to 27 km/km². Around 55.9% of the flood pixels are in the class between 0-4.7 km/km², whilst 42.2% of the flood pixels overlap the class between 4.7-9.7 km/km². Over 98.8% of the torrential pixels overlap the class between 0 and 9.7 km/km².
The plan curvature (Figure 4a) represents a flood conditioning variable used to compute the FPI, as it shows the areas with a divergent or a convergent flow. The variable was classified into four classes, ranging from (−4.03)-4.48. Over 71.4% of the flood pixels overlap the class between (−0.22)-0.07.
Values close to 0 indicate that the surface is linear, negative values indicate that the surface is concave (convergent), and positive values indicate that it is convex (divergent) [59].
The Topographic Position Index (Figure 4b) indicates the altitude difference between the focused cell and its neighboring cells in a DEM. The TPI was classified into four classes from (−122.8)-153.8. Positive values indicate that the focused cell is higher than the neighboring cells, whilst negative values indicate that the focused cell has a lower elevation than the neighboring cells, represented in general by valleys [60].
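The sign convention of the TPI can be checked on a toy grid. The sketch below is an illustration only: it assumes a 3 × 3 moving window and a plain mean of the neighbouring cells, whereas the neighbourhood size and GIS tool actually used for the study are not stated in the text.

```python
def tpi(dem, row, col):
    """Elevation of the focal cell minus the mean elevation of its neighbours."""
    neighbours = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the focal cell itself
            r, c = row + d_row, col + d_col
            if 0 <= r < len(dem) and 0 <= c < len(dem[0]):
                neighbours.append(dem[r][c])
    return dem[row][col] - sum(neighbours) / len(neighbours)


# Hypothetical elevations (m): a pit and a peak surrounded by flat terrain.
valley_dem = [[10, 10, 10], [10, 4, 10], [10, 10, 10]]
ridge_dem = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]
print(tpi(valley_dem, 1, 1), tpi(ridge_dem, 1, 1))
```

The valley cell yields a negative value and the ridge cell a positive one, matching the interpretation given above.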
The Topographic Wetness Index (Figure 4c) was used in the computation of both indices. It represents a morphometric factor which indicates the moisture of the soil and shows the tendency of water distribution on soil [61]. The TWI depends on the topography of the area. This flood conditioning variable was classified into four classes ranging from (−0.36)-19.8. Over 44% of the flood pixels overlap the class between 3.2-5.4, whereas the areas affected by torrentiality are mostly present in the class between (−0.36)-3.2 and represent 69.7% of the study area.
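For readers unfamiliar with the index, the commonly used definition is TWI = ln(a / tan β), where a is the specific catchment area and β the local slope. The Python sketch below assumes this standard form; the exact tool and parameters used to derive Figure 4c are not given in the text, and the `min_slope_deg` floor is a hypothetical safeguard against division by zero on flat cells.

```python
import math


def twi(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Topographic Wetness Index ln(a / tan(beta)) for a single cell."""
    slope_rad = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area / math.tan(slope_rad))


# Same hypothetical contributing area, gentle versus steep slope.
gentle = twi(1000.0, 2.0)
steep = twi(1000.0, 30.0)
print(round(gentle, 2), round(steep, 2))
```

Gentle slopes with the same contributing area give higher TWI values, which is consistent with flood pixels falling in higher TWI classes than torrential pixels.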
The precipitation (Figure 4d) represents a flood and flash-flood conditioning variable which is widely used in flood research. The multi-annual precipitation was derived from the Global Climate dataset and classified into five classes ranging from 460-1162 mm/year. The MaP was used in the computation of both indices. The class 600-750 mm/year overlaps most of the flood pixels, at around 35.1%, followed by the class 750-900 mm/year, comprising 33.3% of the flood locations. Most of the torrential pixels (42.4%) overlap the class 900-1050 mm/year.
The Convergence Index (Figure 4e) was derived from the DEM and was used for both the FPI and FFPI. This flood conditioning variable was classified into four classes ranging from (−100)-100. The negative values show that the structure of the surface is divergent and the values close to 0 indicate that the surface is planar, while the values close to 100 show that the surface is convergent. The convergent areas are usually represented by channels, while the areas close to (−100) are represented by peaks or ridges. Around 83.9% of the flood pixels overlap the areas with values between (−32)-20.4, whilst 84.8% of the torrential pixels overlap the class between (−32)-(−6.2).
The profile curvature (Figure 4f) was generated from the DEM and indicates the direction of the maximum slope of an area. The negative values indicate that the surface is upwardly convex, the values close to 0 show that the surface is linear, and the areas with positive values are upwardly concave. The PC was classified into four curvature classes ranging from (−0.04) to 0.08. The areas affected by torrentiality have an almost equal distribution in all classes, but the class between (−3.15)-(−0.04) overlaps 29% of the torrential pixels, making it the class with the highest percentage of torrential areas.
The L-S factor (Figure 4g), or the slope length and steepness factor, was extracted from the soil threat dataset from ESDAC [62]. The L-S factor results from the combination of the length of the slope and the slope angle. This flash-flood conditioning variable shows the effects of the slope steepness and is used to determine the areas prone to soil erosion. This factor was classified into four classes ranging between 0.03 and >11.2. Most of the torrential pixels (around 58.1%) overlap the class between 4.93-11.2.
The curve number (CN; Figure 4h) represents a variable widely used in flash-flood research as it indicates the areas with high runoff values. The CN was classified into five classes and was used in the computation of the FFPI. The class 49-69 overlaps most of the torrential pixels (37.7%), followed by the class 69-83 with 10.4% of the areas affected by torrentiality.
The soil erodibility by water factor (SEW) (Figure 4i) indicates areas where soil is prone to erosion caused by water. This factor uses as inputs the topography of the area, soil types, rainfall, and land-use [44]. The SEW was classified into six erosion classes ranging from 0.0006-132.5 tons per ha/year and was derived from the RUSLE model. Over 63.7% of the torrential pixels overlap the class 0.0006-2.5 tons per ha/year, while the class 2.5-7.7 tons per ha/year overlaps 12.7% of the areas affected by torrentiality.

Training and Testing the Models
A very important step in the present study is training the models and testing them based on the locations in which flood or torrential phenomena are either present or not. These locations are called flood/torrential and non-flood/non-torrential areas. As mentioned before, the training and testing samples are split in a 70-30% ratio. The flood and torrential areas take the value of 1, whilst the non-flood and non-torrential areas are encoded with the value of 0. The training and testing samples also hold the values of the factors which overlap the locations with 1 and 0; these values were extracted using the Extract Multi Values to Points tool in ArcGIS. The analysis of the samples was carried out in Microsoft Excel and in Weka 3.9 (open-source machine learning software). The resulting values were computed using ArcGIS, thus resulting in the hazard maps.

Frequency Ratio Model (FR)
The frequency ratio (FR) model represents a bivariate statistical method which is widely used in research for landslide and flood prediction mapping [24,29,31,32,47,63]. The present paper proposes the use of the FR model to map the areas prone to floods and flash-floods in the Buzău river catchment. The frequency ratio is a probabilistic model. It is simple, easy to understand and apply, and it aims to determine the ratio of the area in which a phenomenon occurs to the total study area, as well as the ratio of the probability of an occurrence to that of a non-occurrence for given attributes [64]. The FR method is based on the association of the flood and flash-flood conditioning variables and the locations of the historical floods or areas affected by torrentiality. The ratio is determined based on the analysis of the relation between the used factors and the flood and torrential locations and is shown in Tables 2 and 3, alongside the prediction ratio (PR). High PR values show that the factor holds a high influence on the generation of floods or on the surface runoff. The PR was determined based on the spatial association of each variable within the training dataset for both indices and was calculated using Equation (1) [65]:

$$PR = \frac{SA_{max} - SA_{min}}{(SA_{max} - SA_{min})_{min}} \quad (1)$$

where $PR$ represents the prediction ratio of each variable and $SA$ represents the maximum and the minimum spatial association between the variables and the flood or torrential locations. After determining the PR values of each factor, the FPI-FR and FFPI-FR models were computed in ArcGIS. The resulting weights for each index were compared and validated through a pairwise comparison matrix. Each raster of the flood conditioning variables was reclassified based on the relative frequency values (RF). The RF values were determined as follows:

$$RF = \frac{R^{+}}{R_{tot}} \quad (2)$$

where $RF$ is the relative frequency, $R^{+}$ is the ratio (+) or the positive ratio, and $R_{tot}$ represents the sum of each positive ratio of a certain factor.
The FR model for the FPI and FFPI was determined using Equation (3) [29,63]:

$$FR = \sum_{j=1}^{n} W_{ij} \quad (3)$$

where $FR$ represents the frequency ratio model applied for the FPI or FFPI, $n$ is the number of flood conditioning variables, and $W_{ij}$ represents the weight of the class $i$ of the parameter $j$.
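The chain from class ratios to a per-pixel index value can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' workflow (which was carried out in Microsoft Excel and ArcGIS): the positive ratio is taken to be the share of event pixels in a class divided by the share of all pixels in that class, the PR is taken to be the spread of a variable's RF values divided by the smallest spread among all variables, and the class weight is taken to be PR times RF. All pixel counts are hypothetical.

```python
def frequency_ratios(event_pixels, class_pixels):
    """Share of event pixels in each class divided by the class's area share."""
    total_events = sum(event_pixels)
    total_pixels = sum(class_pixels)
    return [
        (events / total_events) / (pixels / total_pixels)
        for events, pixels in zip(event_pixels, class_pixels)
    ]


def relative_frequencies(ratios):
    """RF = R+ / R_tot, so the class values of one variable sum to 1."""
    total = sum(ratios)
    return [ratio / total for ratio in ratios]


def prediction_ratios(rf_by_variable):
    """ASSUMED PR: RF spread of a variable over the smallest spread of any variable."""
    spreads = {name: max(rf) - min(rf) for name, rf in rf_by_variable.items()}
    smallest = min(spreads.values())
    return {name: spread / smallest for name, spread in spreads.items()}


def fr_index(pixel_classes, rf_by_variable, pr_by_variable):
    """Sum over variables of PR times the RF of the class the pixel falls in."""
    return sum(
        pr_by_variable[name] * rf_by_variable[name][class_index]
        for name, class_index in pixel_classes.items()
    )


# Hypothetical pixel counts per class (not taken from Tables 2 and 3).
counts = {
    "slope": {"events": [80, 15, 5], "pixels": [300, 400, 300]},
    "aspect": {"events": [40, 30, 30], "pixels": [330, 340, 330]},
}
rf = {
    name: relative_frequencies(frequency_ratios(c["events"], c["pixels"]))
    for name, c in counts.items()
}
pr = prediction_ratios(rf)
flat_pixel = fr_index({"slope": 0, "aspect": 0}, rf, pr)
steep_pixel = fr_index({"slope": 2, "aspect": 1}, rf, pr)
print(pr, flat_pixel, steep_pixel)
```

In this toy example the slope variable separates event pixels far more sharply than aspect, so it receives the larger PR and dominates the index.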

Multilayer Perceptron Neural Networks (MLP)
The Multilayer Perceptron is an artificial neural network (ANN) used in function approximation and pattern recognition and is made up of three components (Figure 5) [66]. Artificial neural networks represent a simple way to mimic the neural system of the human brain: after learning from samples (in this case, the training samples), the network can recognize data which were previously unseen, make decisions, and solve problems regarding the spatial relationship/association between input variables and the presence or absence of a certain phenomenon [34,67,68]. An MLP is based on the backpropagation algorithm, a supervised learning technique [66,69]. The neurons, represented by the variables/factors used in the analysis, are known as "input layers" and are connected to the "hidden layers" through a neural connection which holds the weights of the hidden layers. The connection of the input and output layers with each neuron of the hidden and output layers is represented by Equation (4) [70]:

$$a_j = \sum_{i} \omega_{ij}^{(1)} x_i + \omega_{o}^{(0)}, \quad j = 1, \ldots, N_h \quad (4)$$

where $N_h$ represents the neurons in the hidden layer, $\omega_{ij}^{(1)}$ represents the weight of the connection between the neuron $x_i$ of the input layer and the neuron $j$ of the second layer, and $\omega_{o}^{(0)}$ is the bias variable which prevents the parameter $a_j$ from becoming zero.
The hidden layers are also connected to the output layers through a neural connection which holds the output weights [33,71–75]. Initially, the weights of the connections hold random values until they intersect another connection, a phase in which they are multiplied by the associated weights at that intersection [34]. The interconnected neurons show the complex relationship between the input layers and the output layers (in this case, the flood/torrential or non-flood/non-torrential areas) and are also encoded with the values of 1 and 0 [34,72]. In order to avoid the overfitting of the neural network, the selection of the number of hidden neurons represents an important step as it controls the accuracy level of the network, proportionate to the noise level [76,77]. The number of neurons is determined based on Equation (5):

$$N_h = 2V + 1 \quad (5)$$

where $N_h$ represents the number of hidden neurons and $V$ represents the number of flood or flash-flood conditioning variables. Thus, the FPI-MLP model has 29 hidden neurons, whilst the FFPI-MLP has 27 hidden neurons. Based on the weight of each connection, the output layers generate an output decision with the values of 1 and 0. The output decisions are determined as follows:

$$O_d = \sum_{j=1}^{N_h} \omega_j h_j + \omega_0 \quad (6)$$

where $O_d$ is the output decision, $\omega_j$ and $\omega_0$ represent the connection weights, and $h_j$ is the output of the $j$-th neuron of the hidden layer.
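The structure of the network can be illustrated with a small forward pass in plain Python. This is a sketch and not the Weka configuration used in the study: the rule 2V + 1 for the hidden layer is inferred from the reported sizes (29 neurons for 14 variables, 27 for 13), while the sigmoid activation, the 0.5 decision threshold and the random, untrained weights are assumptions made for illustration.

```python
import math
import random


def hidden_neuron_count(n_variables):
    """Hidden-layer size consistent with the reported 29 (V = 14) and 27 (V = 13)."""
    return 2 * n_variables + 1


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: inputs -> hidden activations -> output decision (1 or 0)."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    score = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)
    return (1 if score >= 0.5 else 0), score


rng = random.Random(0)  # fixed seed so the untrained weights are reproducible
n_inputs = 14           # number of FPI conditioning variables
n_hidden = hidden_neuron_count(n_inputs)
hidden_weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
hidden_biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
output_weights = [rng.uniform(-1, 1) for _ in range(n_hidden)]
output_bias = rng.uniform(-1, 1)
sample = [rng.random() for _ in range(n_inputs)]  # hypothetical normalised inputs
decision, score = forward(sample, hidden_weights, hidden_biases, output_weights, output_bias)
print(n_hidden, decision, round(score, 3))
```

Training with backpropagation would adjust these weights against the 1/0 labels of the training samples; that step is omitted here.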

Flood and Flash-Flood Hazard Mapping Using the Frequency Ratio (FR) Model
The first step in computing the FR model for both the FPI and FFPI was to determine the positive ratio (Ratio +) and the prediction ratio (PR). These tasks were carried out in Microsoft Excel. After obtaining the values, the next step consisted of reclassifying each flood or flash-flood conditioning variable based on the Ratio + values and, using the raster calculator in ArcGIS, multiplying the reclassified variables by the PR values. The resulting FPI-FR and FFPI-FR hazard maps are shown in Figures 6 and 7.
The FPI-FR and FFPI-FR models were classified into four hazard classes by using the Natural Breaks classification method in ArcGIS, as follows: low, average, high and very high. The resulting hazard map for the FPI-FR model shows that the low hazard class covers 50% of the catchment's surface, the average class covers 30%, the high hazard class covers 16% of the area, and the very high hazard class covers 4%. The high and very high hazard classes are present in the lower catchment, where the slopes and altitudes are low, which is favorable for floods.
The FFPI-FR model shows that 45% of the study area overlaps the low hazard class, 36% is in the average hazard class, 15% is in the high hazard class, and the remaining 4% is characterized by areas with a very high hazard for flash-floods. The high and very high hazard classes are characteristic of areas with steep slopes in the upper basin, which are covered by pastures or areas with little or no vegetation.
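The FR workflow described above can be illustrated with a minimal sketch. It assumes the definitions commonly used in the frequency ratio literature, which are not restated in this section: the FR of a class is the share of flood points falling in that class divided by the share of the area covered by that class, and the PR of a variable is the spread of its FR values divided by the smallest spread among all variables. The function names and the toy data are hypothetical, and the study itself carried out these steps in Microsoft Excel and ArcGIS rather than in code.

```python
from collections import Counter


def frequency_ratio(cell_classes, flood_classes):
    """FR per class: share of flood points in the class / share of area in the class."""
    area = Counter(cell_classes)      # number of raster cells per class
    floods = Counter(flood_classes)   # number of flood points per class
    total_area = sum(area.values())
    total_floods = sum(floods.values())
    return {
        cls: (floods.get(cls, 0) / total_floods) / (n_cells / total_area)
        for cls, n_cells in area.items()
    }


def prediction_ratios(fr_by_variable):
    """PR per variable: FR spread of the variable / smallest spread among variables.

    Assumed definition; a variable whose classes all share one FR has no spread.
    """
    spreads = {name: max(fr.values()) - min(fr.values())
               for name, fr in fr_by_variable.items()}
    smallest = min(spreads.values())
    if smallest == 0:
        raise ValueError("at least one variable has identical FR in every class")
    return {name: spread / smallest for name, spread in spreads.items()}


def hazard_index(cell, fr_by_variable, pr):
    """Index of one cell: sum over variables of PR times the FR of the cell's class."""
    return sum(pr[name] * fr_by_variable[name][cell[name]]
               for name in fr_by_variable)


# Toy data (hypothetical): two variables, ten cells and four flood points each.
fr = {
    "slope": frequency_ratio(["low"] * 6 + ["high"] * 4,
                             ["low"] * 3 + ["high"]),
    "land_use": frequency_ratio(["urban"] * 5 + ["forest"] * 5,
                                ["urban"] * 3 + ["forest"]),
}
pr = prediction_ratios(fr)
print(hazard_index({"slope": "low", "land_use": "urban"}, fr, pr))
```

In a raster setting the same sum is evaluated cell by cell, which corresponds to the raster calculator step mentioned in the text.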

Flood and Flash-Flood Hazard Mapping Using the Multilayer Perceptron Neural Networks Model
The computation of the MLP model consisted of training the neural network with the flood locations for the FPI and with the torrential areas for the FFPI. This task was performed in Weka. The MLP model for each index was trained using a maximum of 1000 training epochs and a validation threshold of 30. The validation threshold allows the algorithm to check for a decrease in error; if the model does not show a decrease in error, the training stops. Each model used a multi-start approach, which consists of running multiple models in parallel. The MLP model used in the present study relies on a gradient descent optimization algorithm and a sigmoid activation function. The backpropagation algorithm uses gradient descent to look for the minimum of the error function in weight space [78]. The sigmoid activation function used for the backpropagation algorithm is defined as follows, where c is an arbitrarily selected constant and its reciprocal 1/c is known as the temperature parameter [78]:
$$s_c(x) = \frac{1}{1 + e^{-cx}}$$
The resulting weights of each variable (Figure 8) were used to compute both indices (Figures 9 and 10). The variable importance for the Multilayer Perceptron was determined using the sensitivity analysis, which computes the importance of the factors used in the neural network. The sensitivity analysis is generated automatically by the software after each run. The indices were computed in the same manner as for the FR model. The overall classification accuracy for the FPI-MLP is 90.91%, whilst for the FFPI-MLP, it is 86.03%. The percentages of correct observations per predicted value (0: non-flood/non-torrential areas; 1: flood/torrential areas) for the FPI-MLP are 86.41% for the non-flood areas and 90.52% for the flood areas, whilst for the FFPI-MLP, the percentages are 82.35% for the non-torrential areas and 89.72% for the torrential areas.
As with the FR model, the FPI-MLP and FFPI-MLP models were classified into four hazard classes by using the Natural Breaks classification method. The FPI-MLP model shows that 37% of the study area is characterized by areas with a low hazard of floods, 41% by areas with an average hazard, 15% by areas with a high hazard, and 7% by areas in the very high hazard class. As in the case of the FR model, the areas in the average, high and very high hazard classes are located in the lower catchment. The slope, land use and distance from rivers were the variables with the highest importance values in the computation of the FPI-MLP model.
The FFPI-MLP model shows that 41% of the study area is under the average flash-flood hazard class, which is predominant in the upper and middle part of the catchment. The low hazard class occupies 33% of the area, the high hazard class occupies 23%, and the very high hazard class occupies 3% of the study area. The slope, multi-annual precipitation and land use were the variables with the highest importance in the computation of the FFPI-MLP model.
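The two training ingredients named above, the sigmoid activation with constant c and the stopping rule tied to the validation threshold, can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of Weka's internals: the stopping rule is assumed to count consecutive epochs in which the validation error fails to improve on the best value seen so far, and all function names are hypothetical.

```python
import math


def sigmoid(x, c=1.0):
    """Sigmoid with constant c (1/c is the temperature parameter)."""
    z = c * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form that avoids overflow for large negative z.
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_derivative(x, c=1.0):
    """Derivative c * s * (1 - s), the term backpropagation multiplies errors by."""
    s = sigmoid(x, c)
    return c * s * (1.0 - s)


def epochs_run(validation_errors, max_epochs=1000, threshold=30):
    """Number of epochs trained before stopping.

    Assumed rule: stop once the validation error has failed to improve on the
    best value seen for `threshold` consecutive epochs, or at `max_epochs`.
    """
    best = math.inf
    worse_in_row = 0
    epoch = 0
    for epoch, err in enumerate(validation_errors[:max_epochs], start=1):
        if err < best:
            best = err
            worse_in_row = 0
        else:
            worse_in_row += 1
            if worse_in_row >= threshold:
                return epoch
    return epoch


# Hypothetical error trace: three improving epochs, then a plateau.
trace = [0.5, 0.4, 0.3] + [0.35] * 50
print(sigmoid(0.0), sigmoid_derivative(0.0, c=2.0), epochs_run(trace))
```

A larger c makes the sigmoid steeper around zero, which is why its reciprocal is read as a temperature.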
The hybrid integration between the frequency ratio and the Multilayer Perceptron consisted of integrating the FR positive ratio values with the variable importance resulting from the MLP model and computing them in ArcGIS. After the FR of each flood or flash-flood conditioning variable was computed and normalized, each factor was reclassified based on the Ratio + values. The values were normalized using Equation (8):
$$nv = \frac{(v - \min(r)) \times (\max(l) - \min(l))}{\max(r) - \min(r)}$$
where nv is the standardized value, v represents the used variable, r is the limit of the range value, and l is the limit of the standardization range. The role of hybrid models is to develop more accurate methods and reduce the potential disadvantages of the more traditional methods.
The MLP-FR hybrid model was classified, as with the previous models, into four hazard classes by using the Natural Breaks method. The resulting hazard maps are shown in Figures 11 and 12. The results indicate that for the FPI-MLP-FR model (Figure 11), 38% of the area experiences a low hazard of floods, while 37% falls in the average hazard class. The high and very high hazard classes represent 19% and 6% of the study area, respectively, and are located predominantly in the lower catchment in the proximity of the main water courses. The FFPI-MLP-FR model (Figure 12) shows that 55% of the catchment, especially in the lower catchment, is represented by areas with a low hazard of flash-floods. The average hazard class covers 36% of the study area, the high hazard class represents 12%, and the very high hazard class covers 4%. The very high values are mostly located in the middle part of the catchment. For both indices, the hybrid integration of both models proved to be the best-performing model. The flood and flash-flood hazard map (Figure 13) shows the areas which overlap the high and highest values of both indices.
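A minimal sketch of the normalization in Equation (8) and of one way the hybrid index could be assembled is given below. The combination rule, a sum of normalized FR values weighted by the MLP variable importance, is an assumption made for illustration, since the text only states that the two were integrated in ArcGIS. The function names and sample values are hypothetical.

```python
def normalize(values, target=(0.0, 1.0)):
    """Apply Equation (8) to a list of values.

    r is taken as the observed range of `values` and l as the `target` range.
    The formula is applied as printed, without adding min(l), so it lands
    exactly on the target range only when the target lower limit is 0.
    """
    r_min, r_max = min(values), max(values)
    l_min, l_max = target
    if r_max == r_min:
        raise ValueError("values must span a non-zero range")
    return [(v - r_min) * (l_max - l_min) / (r_max - r_min) for v in values]


def hybrid_index(normalized_fr, importance):
    """Assumed combination: sum of MLP importance times normalized FR per variable."""
    return sum(importance[name] * normalized_fr[name] for name in importance)


# Hypothetical values for one raster cell and two conditioning variables.
print(normalize([2.0, 4.0, 6.0]))
print(hybrid_index({"slope": 1.0, "land_use": 0.5},
                   {"slope": 0.6, "land_use": 0.4}))
```

Normalizing before weighting keeps variables with large FR spreads from dominating the index purely because of their scale.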

Flood and Flash-Flood Model Performance Evaluation with ROC Curves
The performance evaluation of a model using the ROC (receiver operating characteristic) curve is a widely used method in research. The ROC curve is a 2D plot which indicates the performance of a classifying system as the value of the discrimination cut-off is changed with respect to the predictor variable. The AUC (area under the curve) is a way to evaluate the ability of the test to discriminate the true values. The ROC curve and the AUC are built on the sensitivity and specificity axes [79]. The success rate (Figure 14a,b) shows that the MLP-FR hybrid model was the best-performing model for both indices, with a value of 0.986 for the FPI and 0.952 for the FFPI, whilst the prediction rate (Figure 14c,d) shows the same trend. For both indices, the frequency ratio model proved to be the worst-performing model in terms of the success and prediction rate.
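How an ROC curve and its AUC follow from index values can be sketched as below. The sketch sweeps the cut-off over the observed scores, records the false positive rate (1 − specificity) and the true positive rate (sensitivity) at each step, and integrates with the trapezoidal rule. The function names and sample data are hypothetical and do not reproduce the software used in the study.

```python
def roc_points(scores, labels):
    """ROC points (false positive rate, true positive rate), cut-off swept high to low."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes (1 and 0) must be present")
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = i = 0
    while i < len(pairs):
        cutoff = pairs[i][0]
        # Process tied scores together so each cut-off yields one point.
        while i < len(pairs) and pairs[i][0] == cutoff:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical index values at three flood (1) and three non-flood (0) locations.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
print(auc(roc_points(scores, labels)))
```

Running the same functions on the training locations and on the held-out testing locations would give two separate curves, in the spirit of the success rate and prediction rate reported above.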


Discussion
Previous studies [80–82] have analyzed and presented flood forecasts at a resolution of 100 m; however, in order to determine and validate the areas prone to this natural hazard, it is crucial to have data at a high resolution [47]. Similar approaches have been tested using satellite imagery at different spatial resolutions [83], alongside various image processing techniques. Such approaches show great potential in areas where ground observations are rare or lacking. We consider that using a 25 m resolution does not affect the relevance of the obtained results.
The present study has shown that it is possible to identify the areas prone to floods and flash-floods through Machine Learning techniques and statistical methods. Obtaining both indices through the spatial correlation of flood and flash-flood conditioning variables represents a useful tool for local authorities to assess the zones prone to these types of natural hazards. The methodology uses open-source technologies and data, which is relevant for researchers, as obtaining data represents an important obstacle in the development of relevant methodologies and studies in the analysis of various natural hazards. The present study proposes the computation of both indices, which were highlighted in numerous studies aiming at determining the areas prone to these natural hazards. The outputs of the analysis make this study relevant, as other studies [32,35,36,38,40,47,72] propose the computation of only one index for creating hazard maps. The optimal split of the training and testing sets used in the present paper was determined through successive testing using different split ratios, namely 80-20 and 90-10. The 70-30 split ratio used in this study can only be generalized for similar datasets [84]. Thus, the developed models support decisions regarding the management and the expansion of public policies which aim at mitigating natural risks.
The obtained results show the need to complete similar approaches [47,72,85] with new variables, which will increase the relevance of the advanced modelling techniques.
The study shows the importance of developing methodologies for assessing areas vulnerable to floods and flash-floods: as climatic events tend to intensify and land use keeps changing, it becomes imperative to develop new methodologies with new approaches and, more importantly, to obtain outputs.

Conclusions
The methodology developed in this study has been applied to the Buzău river catchment, known in Romania as one of the catchments most affected by these types of natural hazards. The methods used can be applied at a national level or to different river catchments, considering the increase in intensity of the climatic events and anthropic activities which have a direct impact on the generation of floods and flash-floods. The flood hazard assessment, made by correlating the parameters of various factors which have a direct impact on the generation of floods and flash-floods with forest changes, historical flood locations and areas affected by torrentiality, represents a vital tool in the management of river catchments and for local authorities in order to implement public policies to prevent these types of natural hazards and avoid human or economic losses [86–88].
A distributed hydrological model has the advantage of performing spatially refined simulations of hydrological components over a large area; thus, increasing the accuracy of data gives us the possibility to run the models on a large scale, with low costs and maximum benefits.

Water 2019, 11, 2116
samples are used as a final evaluation/confirmation of the model created based on the training samples.
[51]. The four groups are as follows: A, B, C and D. The most predominant hydrological soil group was group C, covering 58.9% of the study area, followed by group B, with 19% coverage. The four groups have different hydraulic conductivity properties, and based on these, one can say that group A has

Figure 8. Variable importance for the FPI (a) and FFPI (b) in the Multilayer Perceptron Neural Network (MLP) model.


4.3. Flood and Flash-Flood Mapping Using the Hybrid Integration between the Frequency Ratio and the Multilayer Perceptron Neural Networks
The hybrid integration between the frequency ratio and the Multilayer Perceptron consisted of integrating the FR positive ratio values with the variable importance resulting from the MLP model and computing them in ArcGIS. After the FR of each flood or flash-flood conditioning variable was computed and normalized, each factor was reclassified based on the Ratio + values. The values were normalized using Equation (8):
$$nv = \frac{(v - \min(r)) \times (\max(l) - \min(l))}{\max(r) - \min(r)}$$

Figure 14. Receiver operating characteristic (ROC) curve and area under the curve (AUC) for the FPI (a,c) and FFPI (b,d) models.

Table 1. General characteristics of the Buzău river catchment and its main sub-catchments.

Table 2. Frequency and prediction ratio and the distribution of the flood locations and classes of each flood conditioning variable. EaC: elevation above channel; DfR: distance from rivers; SHC: saturated hydraulic conductivity; DD: drainage density; MaP: multi-annual precipitation.

Table 3. Frequency and prediction ratio and the distribution of the flood locations and classes of each flash-flood conditioning variable. PC: plan curvature; CN: curve number; CI: Convergence Index; SEW: soil erodibility by water.