Flood Hazard and Risk Mapping by Applying an Explainable Machine Learning Framework Using Satellite Imagery and GIS Data

Antzoulatos, Gerasimos; Kouloglou, Ioannis-Omiros; Bakratsas, Marios; Moumtzidou, Anastasia; Gialampoukidis, Ilias; Karakostas, Anastasios; Lombardo, Francesca; Fiorin, Roberto; Norbiato, Daniele; Ferri, Michele; Symeonidis, Andreas; Vrochidis, Stefanos; Kompatsiaris, Ioannis

doi:10.3390/su14063251

by Gerasimos Antzoulatos ¹,*, Ioannis-Omiros Kouloglou ¹, Marios Bakratsas ¹, Anastasia Moumtzidou ¹, Ilias Gialampoukidis ¹, Anastasios Karakostas ¹, Francesca Lombardo ², Roberto Fiorin ², Daniele Norbiato ², Michele Ferri ², Andreas Symeonidis ³, Stefanos Vrochidis ¹ and Ioannis Kompatsiaris ¹

¹ Information Technologies Institute (ITI), Centre for Research and Technology Hellas (CERTH), 57001 Thessaloniki, Greece

² Eastern Alps River Basin District Authority (AAWA), Cannaregio 4314, 30121 Venice, Italy

³ School of Electrical and Computer Engineering, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece

* Author to whom correspondence should be addressed.

Sustainability 2022, 14(6), 3251; https://doi.org/10.3390/su14063251

Submission received: 27 December 2021 / Revised: 1 March 2022 / Accepted: 3 March 2022 / Published: 10 March 2022

(This article belongs to the Special Issue Environmental Water Monitoring for Sustainable Development in Urban and Rural Areas)

Abstract: Flooding is one of the most destructive natural phenomena worldwide, damaging property and infrastructure and even causing loss of life. The escalation in the intensity and number of flooding events, resulting from the combination of climate change and anthropogenic factors, motivates the need to adopt real-time solutions for mapping flood hazards and risks. In this study, a methodological framework is proposed that enables the dynamic assessment of flood hazard and risk severity levels by fusing satellite remote sensing (Sentinel-1 SAR) and GIS-based data from the region of the Trieste, Monfalcone and Muggia Municipalities. Explainable machine learning techniques were utilised, aiming to interpret the results for the assessment of flood hazard. The flood inventory was randomly divided into 70%, used for training, and 30%, employed for testing. Various combinations of the models were evaluated for the assessment of flood hazard. The results revealed that, among the models utilised for generating flood hazard maps, the Random Forest model achieved the highest F1-score (approx. 0.99). Furthermore, the flood risk was estimated by combining a rule-based approach for exposure and vulnerability with the dynamic assessment of flood hazard.

Keywords: flood hazard; flood risk maps; flood susceptibility; satellite imagery analysis; crisis maps; machine learning

1. Introduction

Over the past couple of decades, flood disasters have become more frequent, more intense and more destructive, especially in developing countries such as those in Latin America and the Caribbean [1], causing loss of human lives and property worldwide. According to CRED’s Emergency Events Database (EM-DAT, https://www.emdat.be/, accessed on 6 September 2021), 44% of all disaster events from 2000 to 2019 were floods, which affected 1.6 billion people worldwide, the highest figure for any disaster type. Furthermore, floods are the most common type of event, with an average of 163 events per year [2]. Climate change, along with anthropogenic factors, plays a significant role in escalating the severe impacts of flood disasters in terms of economic loss, social disruption and damage to the urban environment. Therefore, proper monitoring to identify flood-prone areas and effective mitigation countermeasures are considered very important for risk reduction [3,4,5,6,7].

The deployment of real-time solutions for mapping flood hazards and estimating the potential consequences of flood events can be extremely valuable for supporting emergency response and mitigating the impact of those events [8]. Recognising the need for effective flood management, the European Union adopted European Directive 2007/60/EC on flood risk assessment and management, which entered into force on 26 November 2007. This Directive treats flood mapping as a crucial element of flood risk management and requires EU Member States to prepare two types of crisis maps, namely, flood hazard maps and flood risk maps, by 2013 (art. 6) and to update them every six years [9,10].

Flood mapping is a process that describes the expected extent of water inundation into dry land as a result of intense precipitation or river water level rise driven by natural or anthropogenic factors [11]. Although flood mapping basically comprises flood hazard maps and flood risk maps, its processes vary considerably from project to project and/or country to country, depending on the specific project requirements and country-specific guidelines, legislation, etc. [9,10,12,13]. Flood mapping provides the baseline for a good understanding of historical flood trends, future expectations and the identification of vulnerable or susceptible locations likely to be impacted by flooding. Hence, flood hazard and risk maps are considered important tools for communicating flood risk to various target groups [12]. They convey the compiled information on flooding events to relevant public bodies, such as civil protection and water management authorities, municipalities, local states, and disaster/crisis managers and control staff, and also raise the awareness of the broader public [14].

Recently, the hazard, exposure and vulnerability from natural disasters have been assessed by utilising machine learning methods in a descriptive and/or predictive manner. Descriptive Machine Learning methods focus on the Response and Recovery phases of the Disaster Management Cycle, while the Predictive Machine Learning methods concentrate on providing forecasting assessments of a natural disaster, enhancing the preparedness and mitigation processes of the Disaster Management Cycle [5,6,15,16].

Specifically, flood hazard assessments employing descriptive machine learning methodologies focus primarily on the response phase by estimating current inundation extents and depths. The aim is to provide assistance at various levels: to emergency responders and those affected directly, as well as to public and government authorities assessing the impact of the event. The increasing volume of data obtained through Earth Observation technologies, such as Synthetic Aperture Radar (SAR, e.g., Sentinel-1) and optical data (e.g., Sentinel-2), as well as social media, provides opportunities for machine learning methods to improve the efficiency of existing flood detection approaches [5,6,15,17,18]. Satellite remote sensing has been utilised for timely, near-real-time detection of flood disasters. Specifically, SAR technology overcomes the limitations of remotely sensed optical data, which are not usable under cloud cover or at night, and as a result, it enhances the total temporal resolution [6,7,15,17,18,19]. Advanced machine learning classification methods can be used to improve the assessment of the flood extent and, consequently, of the severity level of a flood hazard. However, the creation of these models requires annotated datasets to be used as training sets.

As stated in [5], one of the main key research challenges in this domain is the lack of large-scale annotation datasets related to social media and satellite sensing data for training and evaluating machine learning models to detect and analyse disasters generated by extreme natural events. Moreover, Said et al. [5] pointed out that another open issue in the application of the Remote Sensing Disaster Management cycle concerns the satellite imagery’s low temporal frequency. On the other hand, time is vital during a disaster event in order to enable authorities to respond effectively to minimise the socio-economic, ecologic and cultural impact of the event; to evacuate vulnerable people at risk; and in general for the recovery processes [20].

Motivated by the above limitations, the main contribution of this work is the adoption of a methodological framework for the creation of flood hazard and risk maps in near-real-time that relies on the fusion of satellite imagery outcomes and GIS-based data. Explainable Machine Learning techniques are employed to analyse and aggregate the information in a pixel-based approach, aiming to estimate the flood hazard in terms of severity levels, namely, moderate, medium and high hazard. A thorough pixel-level analysis of the specific local characteristics enhances the reliability of the proposed framework regarding the classification of these small areas in terms of their severity level. The datasets needed for the modeling phase are annotated automatically by applying a rule that relies on the experts’ knowledge. Furthermore, the exposure, the vulnerability and the flood risk are assessed with a rule-based approach, producing the corresponding crisis maps. Hence, the proposed framework enables authorities and other crisis managers to reliably map and monitor flooding events by generating crisis maps almost dynamically, thus strengthening situational awareness and providing an adequate picture of the crisis.

2. Relevant Literature

Recently, numerous studies have been proposed to create flood susceptibility maps as a tool for efficient flood risk management [21,22,23,24,25,26,27,28,29,30]. Flood susceptibility indicates the propensity of an area, given by its physical-geographical characteristics, to be affected by flooding. Additionally, flood susceptibility mapping can be determined as a quantitative and qualitative assessment of an area with likely flood occurrence, simultaneously providing the spatial distribution of the particular natural event [22,26]. The analysis and the mapping of flood susceptibility identify the most vulnerable areas and therefore can be considered as one of the most important aspects of early warning systems or strategies for the prevention and mitigation of future flood situations [28,31]. It should be mentioned that, apart from flood hazard, the vulnerability and exposure can also be visualised as maps; therefore, they are spatially explicit and are integrated into a GIS context. For instance, in a grid cell of GIS maps of a certain size, we can explicitly exhibit the expected depth of a flood, the presence of buildings and people and the likelihood of them being damaged or harmed.

With technological advances in Remote Sensing, Geographic Information Systems and Machine Learning, multidisciplinary approaches have been proposed that aim to efficiently map, monitor and manage floods. Hence, satellite-based flood mapping and monitoring from multiple sources can be considered an essential process in flood risk assessment. By leveraging the increasing availability of free-of-charge or low-cost satellite data with global coverage (e.g., Sentinel-1 and -2 from ESA and Landsat and MODIS satellites from NASA) [32], new possibilities have emerged for near-real-time mapping and modeling of flood risk and its impact assessments [33]. As a result, authorities and stakeholders can be assisted in carrying out appropriate disaster response and relief activities, achieving disaster risk reduction and mitigation in the early stages [34]. Another low-cost Remote Sensing solution that has gained considerable interest in recent decades is the Unmanned Aerial Vehicle (UAV) [35,36]. Equipped with high-resolution camera sensors, UAVs can capture high-quality topographical data and facilitate the monitoring and mapping of a hazardous natural event [37].

Advanced machine learning methods coupled with multi-criteria analysis methods and remote sensing technologies have been developed and applied effectively in flood susceptibility mapping. To name a few, in [22], the performance of four machine learning methods, namely, Kernel Logistic Regression, Radial Basis Function Classifier, Multinomial Naïve Bayes and Logistic Model Tree, was compared in terms of their efficiency in creating reliable flash flood susceptibility maps. Similarly, in [23], novel hybrid machine learning approaches, namely, AdaBoostM1-based Credal Decision Tree, Bagging-based Credal Decision Tree, Dagging-based Credal Decision Tree, MultiBoostAB-based Credal Decision Tree and single Credal Decision Tree, were compared for flash flood susceptibility assessment. In [24], the authors focused on Support Vector Machines (SVMs) and applied various kernels to investigate their capability to accurately assess flood susceptibility and produce the corresponding maps. Logistic Regression (LR) was employed in [25], aiming to determine the significance of flood conditioning factors for flood susceptibility. The researchers in [21] identified the areas susceptible to flash flooding by computing the Flash-Flood Potential Index (FFPI) and using two machine learning models (k-Nearest Neighbor and K-Star) along with their novel ensemble with an Analytical Hierarchy Process (AHP). Furthermore, in [26], an approach was proposed to derive an integrated model from the best performing combinations of four models: Artificial Neural Network (ANN), AHP, LR and Frequency Ratio (FR). The goal was to develop a unique flood hazard map of Bangladesh by increasing the precision of flood susceptibility assessments. In [38], a hybrid model comprising Principal Component Analysis, LR and Frequency Distribution analyses was presented, while in [39], an ensemble modeling approach was proposed that combines the SVM with Multivariate Discriminant Analysis (MDA) and Classification and Regression Trees (CART) to create a flood susceptibility map. Another ensemble method, which combines an SVM using a radial basis function kernel with the FR approach to estimate flood probability, has recently been proposed [40]. The ultimate goal was to assess the flood risk. In [41], two machine learning techniques, namely, Convolutional Neural Network (CNN) and SVM, were fused to develop the most reliable flood susceptibility maps using GIS data. In [42], the authors proposed a Deep Neural Network (DNN) model that employed Sentinel-1 satellite data by fusing the SAR backscatter coefficients and the Digital Elevation Model (DEM) data, so as to generate water-body masks.

Generally, in the majority of the above studies, the satellite imagery and GIS-related data are provided in near-real-time in order to assess the risk of an extreme flood event which is in progress.

3. Materials and Methods

3.1. Study Area

The study domain is located in the northeast of Italy, and specifically in the eastern part of the Friuli Venezia Giulia Region and of the Eastern Alps River Basin District, close to the boundary between Italy and Slovenia. In particular, this work focuses on three distinct areas, each of them located in a different Municipality, namely, Trieste, Muggia and Monfalcone, as it is illustrated in Figure 1.

The area of Trieste and Muggia is unique in Italy from a hydrogeological perspective: it has karst features and thus lacks surface hydrography and well-defined watersheds. As regards the topography, these two municipalities are characterized by steep hillsides close to the shoreline, as can be seen from the elevation plotted in Figure 2. However, the urban centers of the two municipalities, on which this work focuses, have a low elevation, close to sea level. The Municipality of Monfalcone is mostly located in the plain called ‘Pianura Isontina’ in Italian, at the mouth of the Isonzo River. The elevation of the area is very close to, if not below, sea level, and the terrain is mostly flat with a very low slope (Figure 2).

Because all three study areas are characterized by low ground elevation above sea level, they are particularly prone to floods caused by high tides of the Adriatic Sea triggered by meteorological conditions. In fact, flood hazard in the coastal area often manifests through storm surges coinciding with specific weather conditions (rainfall, high tide, southern winds). Flooding in the urban areas of Trieste and Muggia is caused, in addition to the topography, by the excessive imperviousness of the soil and by the difficult discharge of surface runoff when high tide coincides with flow in the surface drainage network [43]. In addition, in the area of Muggia, even though the karst geology largely explains the lack of surface water bodies, there are two streams: Rosandra and Ospo. These two streams present some hydraulically critical points due to insufficient maintenance and to the increasing pluvial runoff caused by intensive urbanization.

Regarding the Monfalcone area, the territory, located on the east side of the Isonzo River, is well known to be humid (swampland). In particular, the drainage network often fails during flood events that coincide with high tides. As can be seen from Figure 2, part of the territory also has an elevation lower than the mean sea level. In addition, the area has significant underground hydrography (e.g., the karst river Timavo). Thus, in this area, high tide can cause flooding due to the insufficiency of the marine levees, as well as to the overflowing of the drainage network [43]. Finally, for the Monfalcone area, the flood risk is also due to the presence of the Isonzo River, one of the most important rivers of the Eastern Alps River Basin District, as well as its most relevant transboundary water body. The Isonzo River originates in the Trenta valley, with springs at an altitude of 935 m, and flows into the Adriatic Sea near Monfalcone, where it forms a delta that tends, over time, to move from west to east. The Isonzo catchment basin covers a total area of approximately 3400 km², of which about 1150 km², that is, about one-third, is in Italian territory. The Isonzo River, being purely torrential in character, collects and discharges the waters of the southern side of the Julian Alps, which separate this basin from that of the Sava. The main right tributaries are the Coritenza, in Slovenian territory, and the Torre, which flows almost entirely in the Italian part. On the left, the Isonzo is fed by the Idria and the Vipacco, whose basins lie totally and almost totally, respectively, in Slovenian territory [44].

Digital Elevation Model in the Study Area

The Digital Elevation Model (DEM) was provided by the Eastern Alps River Basin District Authority (AAWA), which performed some GIS processing on the official DEM of the Friuli Venezia Giulia Region. The DEM is provided in the reference system UTM 33N (EPSG 3045). It was obtained using the Laser Imaging, Detection and Ranging (LIDAR) technique from a set of aerial flights performed in 2019. The raw data obtained from the flights (a point cloud) were gradually processed to provide the final product. This, in turn, consists of a representation of the terrain points, devoid of all elements above the ground (such as buildings, vegetation, cables, etc.), on a regular grid with a pixel resolution of 0.5 m × 0.5 m, divided into many different tiles. The DEM has a planimetric accuracy of 0.15 m and an altimetric accuracy that ranges from 0.15 m (in open field) to 0.3 m (under vegetation cover), both estimated through a set of reference points all over the region. It should be noted that for the city of Trieste, which is particularly vulnerable to floods caused by the tide, identifying flat areas near the sea is very important. We used three areas with a DEM resolution of 0.5 m, as shown in Figure 2.

3.2. Flood Conditioning Factors

Floods are natural phenomena caused by many different factors, including climatology, hydrology, geomorphology, topography and land use. For the purpose of this work, topography and land use are considered, extracting some of the most relevant conditioning factors from DEM analysis, well-known as Flood Conditioning Factors. The application of accurate Remote Sensing techniques is essential for obtaining reliable DEM and, consequently, more accurate factors. Furthermore, equivalent spatial resolution should be employed to calculate these factors. Below, a brief description of the factors that we utilised in this work is exhibited.

Elevation: the elevation of the terrain has a great influence on floods. Firstly, at a large scale, the dynamics of the event are usually completely different in high-elevation areas (mountains) than in low-elevation ones (i.e., plains), which are usually more vulnerable to flooding caused by river overtopping, drainage system failure and/or the rising water level of seas or other water bodies. Secondly, at a smaller scale, the terrain elevation determines the presence of preferential pathways, which channel the surface runoff, and of accumulation areas, which are usually local depressions of the terrain.

Slope is an essential factor for studying flash flood susceptibility because it affects the speed of water. The slope of a line can be positive, negative, nil, etc. [27].

Aspect is related to the directions of water flow affecting flash flood occurrence. Flat areas are more vulnerable to water accumulation and/or spreading of water over a large surface, in particular when large volumes of water are involved. Therefore, by using this parameter, the flat regions can easily be identified [23,27].

Topographic Wetness Index (TWI) is a topo-hydrological factor and reflects the wetness potential of each pixel. It can be calculated as the logarithm of the ratio between the flow accumulation, $A_{s}$, and the tangent of the slope, $α$ (in degrees), at the pixel:

$TWI = \ln \frac{A_{s}}{\tan α}$

(1)

An increase in the TWI, indicating wetter conditions, means that high flow accumulation occurs on low-slope surfaces and therefore potentially indicates locations that are exposed to greater flood hazard [21,23,24,25,45].
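As a rough illustration of Eq. (1), the following sketch evaluates the TWI for a single pixel. The function name `twi` and the `min_slope_deg` clamp are assumptions introduced here (the source does not say how perfectly flat pixels, where the tangent is zero, are handled), and the flow accumulation is assumed to be strictly positive.

```python
import math


def twi(flow_accumulation: float, slope_deg: float, min_slope_deg: float = 0.1) -> float:
    """Topographic Wetness Index of Eq. (1): ln(A_s / tan(alpha))."""
    # tan(0) = 0 would cause a division by zero on flat pixels, so the slope is
    # clamped to a small positive value (min_slope_deg is an illustrative assumption).
    alpha = math.radians(max(slope_deg, min_slope_deg))
    return math.log(flow_accumulation / math.tan(alpha))


# Same flow accumulation on a gentle and on a steep pixel (made-up values).
gentle = twi(flow_accumulation=500.0, slope_deg=1.0)
steep = twi(flow_accumulation=500.0, slope_deg=20.0)
print(gentle > steep)  # the gentler slope yields the higher wetness index
```

The comparison mirrors the statement above: for equal flow accumulation, the pixel on the gentler slope receives the higher TWI.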

Topographic Position Index (TPI) is the ratio of the elevation of a pixel (grid cell) to the mean elevation of its neighboring pixels (cells) [21,45]:

$TPI = \frac{E_{pixel}}{E_{surrounding}}$

(2)
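A minimal sketch of Eq. (2) on a small elevation grid is given below. The helper name `tpi_ratio`, the 3 × 3 neighbourhood and the handling of border pixels (using only the neighbours that exist) are assumptions of this example. The ratio is undefined when the mean neighbouring elevation is zero, which would need separate handling in areas at sea level.

```python
def tpi_ratio(dem, row, col):
    """Eq. (2): elevation of a pixel divided by the mean elevation of its neighbours."""
    rows, cols = len(dem), len(dem[0])
    # Collect the up-to-eight neighbours inside the grid, excluding the pixel itself.
    neighbours = [
        dem[r][c]
        for r in range(max(row - 1, 0), min(row + 2, rows))
        for c in range(max(col - 1, 0), min(col + 2, cols))
        if (r, c) != (row, col)
    ]
    return dem[row][col] / (sum(neighbours) / len(neighbours))


# Illustrative 3 x 3 elevation grid (in m) with a local depression in the centre.
dem = [
    [2.0, 2.0, 2.0],
    [2.0, 1.0, 2.0],
    [2.0, 2.0, 2.0],
]
print(tpi_ratio(dem, 1, 1))  # 0.5: the centre lies below its surroundings
```

Values below 1 mark pixels lower than their surroundings, which are the candidate accumulation areas mentioned under Elevation.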

Terrain Ruggedness Index (TRI) is in contrast to the TWI and is responsible for quantifying ruggedness of the terrain by portraying the local variance in surface gradients or curvatures. TRI is considered as a morphometric measure that describes the heterogeneous condition of a land surface and facilitates characterizing it as smooth or rugged [27]. TRI, which is defined as the mean difference between a central pixel and its surrounding cells, can be calculated as follows [45]:

$TRI = \sqrt{|x| (max^{2} - min^{2})}$

(3)

where $x$ is the elevation (in m) of each neighbor cell to cell (0, 0), and min and max are the smallest and largest elevation values among the nine neighbor pixels, respectively.

Land Use Land Cover (LULC) is considered an important factor associated with flooding [24,25,26,28], because runoff conditions vary under different LULC patterns. Natural types of land cover differ in terms of infiltration capacity, while anthropogenic environments such as built-up areas, plantations, agricultural fields or deforested areas are also diverse. In vegetated areas, the runoff is minor due to the greater infiltration capacity of the soil, which helps to mitigate the effect of a flood. Urban areas, by contrast, are typically composed of impermeable surfaces, so the infiltration rate is very low and the surface runoff increases [24,25,26,28]. In this work, we employ the Corine Land Cover (CLC) map to estimate the Manning Roughness coefficient, as well as the presence of exposed assets for risk evaluation. CLC is a consistent classification system of long-term land cover data in Europe. The dataset gives detailed information about land cover for 44 classes, some of which are defined as mixed land cover and land use classes, with a thematic accuracy of more than 85%.

Water Velocity is another factor that, along with water depth, directly affects the flood occurrence. It is determined by combining the Water Depth (h), Slope (S), Manning Roughness (n) coefficient and pixel Resolution (L), based on the following formula:

$v_{i} = \frac{1}{n_{i}} \sqrt{S_{i}} \left(\frac{h_{i} L}{2 h_{i} + L}\right)^{2/3}$

(4)

where:

$v_{i}$: denotes the Water Velocity (in m/s) at the ith pixel;
$h_{i}$: denotes the Water Depth (in m) at the ith pixel;
$S_{i}$: denotes the slope (in decimals) per pixel;
$L$: denotes the resolution (in m) of each pixel;
$n_{i}$: denotes the Manning Roughness (Gauckler–Manning–Strickler) coefficient (in s/m$^{1/3}$), which also depends on the land use and thus can be related to the Corine Land Cover index, indicating the surface roughness per pixel.
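Eq. (4) can be sketched per pixel as follows. The function name `water_velocity`, the zero velocity returned for dry pixels and the sample values (including the roughness of 0.05 s/m$^{1/3}$) are illustrative assumptions, not values taken from the study. The slope is assumed to be non-negative.

```python
import math


def water_velocity(depth_m, slope, manning_n, resolution_m):
    """Eq. (4): v = (1 / n) * sqrt(S) * (h * L / (2 * h + L)) ** (2 / 3)."""
    if depth_m <= 0.0:
        return 0.0  # dry pixel: no flow (assumption of this sketch)
    # The bracketed term of Eq. (4): flow area h * L over wetted perimeter 2 * h + L.
    hydraulic_radius = depth_m * resolution_m / (2.0 * depth_m + resolution_m)
    return math.sqrt(slope) * hydraulic_radius ** (2.0 / 3.0) / manning_n


# Illustrative pixel: 0.5 m of water, 1% slope, 0.5 m DEM resolution.
v = water_velocity(depth_m=0.5, slope=0.01, manning_n=0.05, resolution_m=0.5)
rougher = water_velocity(depth_m=0.5, slope=0.01, manning_n=0.10, resolution_m=0.5)
print(round(v, 3), round(rougher, 3))
```

Doubling the Manning coefficient halves the velocity, which is how the land cover class enters the hazard estimate through $n_{i}$.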

3.3. Satellite Imagery Analysis

For flood detection, we processed the Sentinel-1 GRD-IW products of the flooded day and the time-series images using ESA’s Sentinel Application Platform (SNAP) (https://step.esa.int/main/toolboxes/snap/, accessed on 20 May 2021). The following preprocessing steps were applied [46]:

Apply Orbit File: The operation of applying a precise orbit available in SNAP allows the automatic download and update of the orbit state vectors for each SAR scene in its product metadata, providing an accurate satellite position and velocity information.
Thermal Noise Removal: Reduces noise effects in the inter-sub-swath texture, in particular normalizing the backscatter signal within the entire Sentinel-1 scene and resulting in reduced discontinuities between sub-swaths for scenes in multi-swath acquisition modes.
Subset: The initial product is cropped so that it contains only the area we want to observe. Some balance between the inundated and non-inundated areas is desired.
Radiometric calibration: Fixes the uncertainty in the radiometric resolution of the satellite sensor, so that the pixel values can be directly related to the radar backscatter of the scene. The information required to apply the calibration equation is included within the Sentinel-1 GRD product.
Speckle noise removal: Removes the salt-and-pepper-like pattern noise that is caused by the interference of electromagnetic waves. The “Lee Sigma” filter of Lee (1981) [47] with a 5 × 5 filter size is used to filter the intensity data. As noted by Jong-Sen Lee et al. (2009) [48], this step is essential in almost any analysis of radar images because speckle noise complicates the interpretation process.
Terrain correction: Corrects the distortions caused by the terrain. It projects the pixels onto a map system (WGS84 was selected) and re-samples them to a 10 m spatial resolution. In addition, topographic corrections with a Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) are performed.
Linear to Decibel (dB): The dynamic range of the backscatter intensity values usually spans a few orders of magnitude. Thus, these values are converted from linear scale to logarithmic scale, leading to a histogram that is easier to manipulate and making water and dry areas more distinct.
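The last step of the list can be illustrated outside SNAP with the usual decibel conversion, $10 \log_{10}$ of the linear backscatter. The sketch below is not the SNAP operator itself; the function name `linear_to_db`, the clamping `floor` for non-positive values and the sample intensities are assumptions made for illustration.

```python
import math


def linear_to_db(sigma0, floor=1e-10):
    """Convert a linear backscatter intensity to decibels: 10 * log10(sigma0)."""
    # Non-positive values have no logarithm, so they are clamped to a small
    # positive floor (an assumption of this sketch).
    return 10.0 * math.log10(max(sigma0, floor))


# Illustrative intensities spanning three orders of magnitude.
samples = [0.001, 0.01, 0.1, 1.0]
print([round(linear_to_db(s), 1) for s in samples])
```

Three orders of magnitude in linear scale collapse to a span of about 30 dB, which is what makes the histogram of the scene easier to threshold.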

The Sentinel-1 images obtained from the Copernicus Open Access Hub (previously known as Sentinels Scientific Data Hub) were analysed in order to estimate the Water-bodies Masks (water delineation maps). In particular, we perform histogram thresholding on the processed VH band of the Area of Interest (AoI). The deep valley of the histogram separates the inundated from the non-inundated areas. This thresholding technique works better when there are enough inundated areas to distinguish them from the dry ones; otherwise, the threshold extraction may fail. In the satellite images of the areas that we study, it is quite common that water and land areas are not in balance. Thus, in order to increase the chance of estimating a valid threshold, we split the image into nine (9) tiles and perform the thresholding on each of them, eventually calculating the average threshold, which is used over the whole image to separate the inundated from the non-inundated areas. This pixel-based classification of the region of interest is then fused with the information from the DEM to estimate the Water Depth. For each separate water body (sub-area) of a water mask, the maximum elevation is detected using the DEM. Then, for this sub-area, the Water Depth is estimated by subtracting the DEM value of each pixel from the maximum elevation. It should be noted that flood depth, along with flood duration, directly contributes to flood occurrence [26].
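The tiling and depth steps described above might be sketched as follows. Several choices here are assumptions rather than the authors' implementation: Otsu's method stands in for locating the deep valley of the histogram, single-valued tiles are skipped instead of contributing a threshold, the water bodies are assumed to be already labelled (0 for dry pixels, a positive id per water body), and all function names and the synthetic 6 × 6 scene are hypothetical.

```python
def otsu_threshold(values, bins=64):
    """Threshold maximising the between-class variance; None for a uniform tile."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return None  # a single-valued tile has no valley to find
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    weighted_sum = sum(i * h for i, h in enumerate(hist))
    best, first, last = -1.0, 0, 0
    below = below_sum = 0
    for i in range(bins - 1):
        below += hist[i]
        below_sum += i * hist[i]
        above = total - below
        gap = below_sum / below - (weighted_sum - below_sum) / above
        score = below * above * gap * gap
        if score > best:
            best, first, last = score, i, i
        elif score == best:
            last = i  # flat valley: remember where it ends
    return lo + ((first + last) / 2 + 1) * width  # middle of the valley


def tiled_threshold(image, tiles_per_side=3):
    """Average of the per-tile thresholds over a 3 x 3 split (nine tiles)."""
    rows, cols = len(image), len(image[0])
    found = []
    for ti in range(tiles_per_side):
        for tj in range(tiles_per_side):
            r0, r1 = ti * rows // tiles_per_side, (ti + 1) * rows // tiles_per_side
            c0, c1 = tj * cols // tiles_per_side, (tj + 1) * cols // tiles_per_side
            tile = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            t = otsu_threshold(tile)
            if t is not None:
                found.append(t)
    if not found:
        raise ValueError("no tile contains both water and land")
    return sum(found) / len(found)


def water_depth(dem, labels):
    """Depth = maximum elevation inside each labelled water body minus pixel elevation."""
    top = {}
    for dem_row, label_row in zip(dem, labels):
        for z, k in zip(dem_row, label_row):
            if k:
                top[k] = max(top.get(k, z), z)
    return [[top[k] - z if k else 0.0 for z, k in zip(dem_row, label_row)]
            for dem_row, label_row in zip(dem, labels)]


# Synthetic scene: VH backscatter in dB, water (-22) on the left, land (-10) on the right.
vh_db = [[-22.0 if c < 3 else -10.0 for c in range(6)] for _ in range(6)]
dem = [[0.5 * c for c in range(6)] for _ in range(6)]  # elevation in m

threshold = tiled_threshold(vh_db)  # -16.0 for this scene
labels = [[1 if v < threshold else 0 for v in row] for row in vh_db]  # one water body
depth = water_depth(dem, labels)
print(threshold, depth[0])
```

In this toy scene only the three tiles that straddle the shoreline contribute a threshold, which illustrates why an unbalanced tile cannot yield a valid valley on its own.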

3.4. Machine Learning Techniques

In this work, we utilised well-known machine learning techniques for classification, namely, Support Vector Machines (SVMs), Naive Bayes (NB), an ensemble learning method called Random Forest (RF) and a feed-forward Neural Network (NN). The following is a brief description of these techniques:

Support Vector Machine (SVM): The SVM classifier [49] is a supervised machine learning technique that classifies features by means of hyperplanes, mapping data that are not linearly separable into a space where a linear separation is possible. A hyperplane is a decision boundary that separates a set of objects and labels them into different classes.
Naive Bayes (NB): The Naïve Bayes classifier is a statistical classification technique based on Bayes' Theorem. It belongs to the group of supervised learning algorithms and is one of the simplest, combining high accuracy with speed, especially on large datasets. NB assigns class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from a finite set.
Random Forest (RF): The Random Forest [50] is a well-known ensemble machine learning method for either classification or regression. It builds many decision trees on random subsets of the data and of the variables and combines their votes, which also makes it possible to weight the importance of each factor. In our case study, the RF model exploits decision trees to estimate the relationship between the Flood Hazard Index labels and the values of the flood conditioning factors, in order to classify each vector of values into a predicted label. RF is simple, fast, able to handle large datasets, generally achieves high performance through randomization and is applicable to multiclass problems.
Neural Network (NN): Neural Networks can be portrayed as hierarchical, multilevel relationships between neurons in a network, loosely similar to the function of the brain. In a feed-forward network, the neurons of each level transmit signals to the next level based on the input received from the previous level, until one or more final results are reached.

3.5. Model Evaluation Metrics

Confusion matrix: A confusion matrix is a table (Table 1) that presents the results of a classifier using specific terms: “True positives (TP)”, results that are predicted positive and are actually positive; “False positives (FP)”, results that are predicted positive but are actually negative; “True negatives (TN)”, results that are predicted negative and are actually negative; and “False negatives (FN)”, results that are predicted negative but are actually positive.
Accuracy: Accuracy is the most commonly used percentage metric for machine learning models. It gives the share of all results that were classified correctly and can be calculated using the confusion matrix terms:

$Accuracy = \frac{TP + TN}{TP + FP + TN + FN}$

(5)
Precision: Precision answers the question of what proportion of the positive predictions was in fact correct and can be calculated using:

$Precision = \frac{TP}{TP + FP}$

(6)
Recall: Recall, on the other hand, answers the question of what proportion of the actual positives was identified correctly and can be calculated using:

$Recall = \frac{TP}{TP + FN}$

(7)
F1-score: F1-Score is a measure to evaluate classification systems and is a way to combine the precision and recall results. It can be described as the harmonic mean of precision and recall and can be calculated using:

$F1\text{-}Score = \frac{2 * Precision * Recall}{Precision + Recall}$

(8)
Cross-Validation k-fold: Cross-validation is a statistical method for evaluating machine learning models. It randomly divides the dataset into k segments (folds); each fold is used in turn for validation while the remaining folds are used for training, and the resulting scores are compared in order to select the best model. The process has a single parameter, k, which is the number of segments into which the dataset is randomly divided. In our case, k is equal to 10, and we choose the best model using the result averaged over the training runs.
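As an illustration of Equations (5)–(8), the following minimal Python sketch computes the four metrics from the confusion matrix counts. The function name and the example counts are hypothetical and are not taken from the experiments of this work; for a multiclass problem such as the three hazard levels, one common choice is to obtain the counts per class in a one-vs-rest fashion.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1-score (Equations (5)-(8))
    from the four confusion matrix counts. Empty denominators yield 0.0."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only
metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```

With these illustrative counts, the F1-score is lower than the accuracy because the false negatives weigh on the recall, which is why accuracy alone can be misleading on imbalanced data.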

4. Methodology

In the case of extreme natural events, such as floods, the hazard, exposure and vulnerability can be identified when interactions between these events and human societies are assessed. Flood Hazard can be estimated from the physical characteristics of the flood event such as the extent, water depth, persistence, and flow velocity. The hazard outcome is a map of flood intensity, provided by the hydrological analysis and modelling, i.e., flood frequency analysis, geomorphological characteristics of the region under assessment (pathway) and manufactured barriers against the hazard (attenuation) elements of the assessed area. Conventionally, approaches consider different return times and measures of intensity, producing multiple hazard maps [13,14].

Furthermore, the exposure refers to the characteristics of the people and assets that can be affected by flooding, focusing mainly on their social, environmental and economic value. Vulnerability is the human dimension of flood disasters and is the result of the range of economic, social, cultural, institutional, political and psychological factors. The physical component is captured by the likelihood that receptors located in the area considered could potentially be harmed (susceptibility of receptors). The social one is the ex ante preparedness of society given their risk perception of awareness to combat hazard and reduce its adverse impact or their ex post skills to overcome the hazard damages and return to the initial state (represented by adaptive and coping capacities). These can increase the susceptibility of an individual, a community, assets or systems to the impacts of flood hazards [51,52,53].

The proposed framework tailors the definition for the disaster risk, which was defined in 2017 by the UN Office for Disaster Risk Reduction (UNISDR) and includes the Sendai Framework for Disaster Risk Reduction 2015–2030 [53,54]. Therefore, Disaster Risk (R) is defined as the potential loss of life, injury or destroyed or damaged assets which could occur to a system, society or a community in a specific period of time, determined probabilistically as a function of hazard, exposure, vulnerability and capacity. Based on the above term, in the field of natural hazards, the disaster risk results from the coupling between hazard (H), vulnerability (V) and exposure (E):

$Disaster\ Risk = f(Hazard, Vulnerability, Exposure)$

(9)

In our approach, the severity level of the flood hazard is dynamically assessed by employing machine learning techniques that are able to fuse multimodal data generated by the analysis of Sentinel-1 images and GIS-based data. Then, a rule-based approach is utilised in order to estimate in near-real-time the vulnerability and the exposure in the region of interest. Specifically, the proposed framework consists of 10 successive steps, as illustrated in Figure 3.

The first two steps concern the specification of the area of interest and the choice of dates on which flood events occurred. The essential condition is the existence of satellite images of the study area. Steps 3–7 concern the processes for the creation of flood hazard maps in near-real-time, when new satellite images become available for the particular area. The water mask, water depth and velocity of the water body, along with other flood conditioning factors, which are derived from the analysis of satellite imagery or extracted from GIS tools, are fused by employing machine learning techniques. The result is the generation in near-real-time of flood hazard maps that highlight the areas that are affected by or are vulnerable to a potential flood hazard.

The remaining steps concern the assessment of vulnerabilities and exposure across three main categories: people; economic activities; and environment, cultural-archaeological assets and protected areas. A rule-based approach has been utilised for this purpose. In the last step, the combination of the assessments of the hazard, vulnerabilities and exposure generates the hydraulic risk. In the following sections, the steps of the proposed methodological framework are described in more detail.

4.1. Dynamic Flood Hazard Assessment Algorithm

The proposed approach for Dynamic Flood Hazard Assessment consists of seven (7) steps, as illustrated in Figure 3. Specifically, a study of the area of interest should be carried out, including the gathering of appropriate information from past extreme flood events. Then, the data acquisition phase should take place, and the appropriate features should be extracted from the data in order to create a dataset for the application of machine learning methods. The obtained data should be homogenised and pre-processed so as to deal with missing values or outliers, data impurity issues, different ranges over the features, etc. Hence, a flood inventory will be created that contains data suitable for Machine Learning modelling. In the training/testing phase, machine learning models will be fitted to the data and their performance will be evaluated in terms of their accuracy. The best machine learning model is chosen and utilised in the Validation phase to create the flood hazard maps.

4.1.1. Study Area and Historical Flood Events

As mentioned in Section 3.1, the area of interest is located in the municipality of Trieste. For this particular region, past flood events were chosen on dates for which satellite imagery that captured the events is available.

4.1.2. Data Acquisition and Feature Extraction

This step comprises the data collection and feature extraction processes that create the feature space used in the modelling phase. The data are gathered from two diverse sources (Figure 3), namely, the analysis of satellite images and the DEM.

The Sentinel-1 (SAR) images were analysed by employing the preprocessing steps described in Section 3.3. Their spatial resolution was 10 m, and their temporal resolution was approximately 6 days or less. The outcome of these steps undergoes a histogram thresholding analysis that generates the appropriate water masks.

The Flood Conditioning Factors that are employed in this work derive from the DEM, as described in Section 3.2. Each one of these factors can be considered as an independent feature in the feature space. As they are provided as maps, they can be converted to raster images with a pixel size equal to the pixel size of the DEM. In this way, all the images obtain the same resolution. Then, a feature space of nine (9) attributes (features) is formulated, in which each feature corresponds to one raster image. The number of entries in the dataset depends on the total number of pixels in each image (width × height).
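The construction of the feature space from co-registered rasters can be sketched as follows. This is a minimal illustration in plain Python, not the implementation used in this work: the feature identifiers are hypothetical names for the nine features discussed in Section 5, and each raster is assumed to be available as a nested list (rows × columns) of identical shape.

```python
# Hypothetical identifiers for the nine features (one raster per feature)
FEATURE_NAMES = [
    "water_mask", "water_depth", "water_velocity",
    "elevation", "slope", "aspect", "roughness", "tri", "tpi",
]


def build_feature_table(rasters):
    """Flatten same-shaped rasters into one row of nine values per pixel.

    rasters: dict mapping each name in FEATURE_NAMES to a 2D list.
    Returns a list with height * width rows, in row-major pixel order.
    """
    shapes = {(len(r), len(r[0])) for r in rasters.values()}
    if len(shapes) != 1:
        raise ValueError("all rasters must share the same width and height")
    height, width = shapes.pop()
    table = []
    for i in range(height):
        for j in range(width):
            table.append([rasters[name][i][j] for name in FEATURE_NAMES])
    return table


# Toy 2 x 3 rasters: raster number k is filled with the constant k
toy = {name: [[k] * 3 for _ in range(2)]
       for k, name in enumerate(FEATURE_NAMES)}
table = build_feature_table(toy)
```

For the toy input, the table has 2 × 3 = 6 rows of nine values each, which mirrors the statement that the number of entries equals width × height.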

4.1.3. Data Preprocessing

The dataset that has been generated after the fusion of all the features, as described in the above section, should undergo preprocessing procedures, including the following:

Create annotated dataset: Upgrade the dataset by adding a target variable so that Machine Learning techniques can be applied. Our goal is to create machine learning models that are able to assess the flood hazard level and that rely on the flood conditioning factors and the real-time analysis of satellite imagery. Hence, the target variable should be the “Flood Hazard”, which takes three potential values, namely, Moderate (Low) Hazard, Medium Hazard and High Hazard. To annotate the dataset, the following rule will be applied [44,55]:

If $WaterVelocity < 1\ m/s$ and $0\ m < WaterDepth < 1\ m$, Then Moderate Hazard

Else If $WaterVelocity < 1\ m/s$ and $WaterDepth \geq 1\ m$, Then Medium Hazard

Else If $WaterVelocity \geq 1\ m/s$ and $WaterDepth > 0\ m$, Then High Hazard

It should be mentioned here that the above rule is based on a hypothesis of medium probability of the flood, which has a 100-year return period in the study area.

Handle Imbalanced dataset: Because inundated areas are usually a quite small portion of the whole region of interest and, furthermore, floods are quite rare extreme events, it is expected that the majority of entries in the “Flood Hazard” variable will belong to the Moderate Hazard class, resulting in an imbalanced dataset. Hence, the machine learning models would be biased towards the majority class. To tackle this issue, random sampling is performed, and a portion of the majority class is selected that is equal to the amount of data belonging to the other two classes.
Handle missing or extreme values: Pixels with missing values or extreme values that indicate areas that are out of the interest, e.g., inside the sea, should be detected and removed from the analysis.
Data Normalisation: The aim is to eliminate the numerical differences between the features and transform them to the same range. Machine learning models require the input data to be normalised to the same range, since features with a larger magnitude in the untransformed data may otherwise bias the results. Hence, the min–max scaler is utilised, which transforms each one of the input features (predictors) to the min–max scale (i.e., the [0, 1] scale). The formula is given as follows:

$X = \frac{x - x_{min}}{x_{max} - x_{min}}$

(10)

where X is the normalized data, x is the raw data, $x_{min}$ is the minimum value of each feature vector and $x_{max}$ is the maximum value of each feature vector.
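The preprocessing procedures above can be sketched in a few lines of plain Python. The sketch is illustrative only and rests on several assumptions that are not stated in the text: dry pixels (Water Depth of 0 m), which the annotation rule does not cover, receive no label; the majority class is undersampled to the combined size of the other two classes; and a constant feature is mapped to 0 by the scaler. All function names are hypothetical.

```python
import random


def annotate(velocity, depth):
    """Flood Hazard label from Water Velocity (m/s) and Water Depth (m)."""
    if depth <= 0:
        return None  # assumption: the rule does not cover dry pixels
    if velocity >= 1.0:
        return "High"
    return "Moderate" if depth < 1.0 else "Medium"


def undersample_majority(rows, labels, majority="Moderate", seed=0):
    """Keep a random subset of the majority class whose size equals the
    number of entries in all other classes combined (assumed reading)."""
    rng = random.Random(seed)
    major = [i for i, y in enumerate(labels) if y == majority]
    minor = [i for i, y in enumerate(labels) if y != majority]
    keep = sorted(minor + rng.sample(major, min(len(major), len(minor))))
    return [rows[i] for i in keep], [labels[i] for i in keep]


def min_max_scale(column):
    """Scale one feature column to [0, 1] following Equation (10)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # assumption: constant feature -> 0
    return [(x - lo) / (hi - lo) for x in column]


# Toy data, for illustration only
labels = ["Moderate"] * 10 + ["Medium"] * 2 + ["High"]
rows = list(range(len(labels)))
balanced_rows, balanced_labels = undersample_majority(rows, labels)
scaled = min_max_scale([2, 4, 6])
```

In the toy example, the ten Moderate entries are reduced to three, matching the three entries of the other two classes, and the column [2, 4, 6] is mapped to [0.0, 0.5, 1.0].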

It should be mentioned that the above two steps, namely, the data acquisition and feature extraction, as well as the preprocessing, could be performed iteratively, taking into consideration historical flood events in a specific region. As a result, a Flood Inventory would be created that can be exploited to fit Machine Learning models capable of assessing the flood hazard.

4.1.4. Training, Testing and Validation

In this phase, various Machine Learning methodologies are applied with the aim of assessing the flood hazard, relying on the information from the Flood Inventory. The goal is to select the best machine learning model in terms of precision in the estimation of flood hazards. To achieve this, the dataset is divided randomly into two subsets: 70% of the data are utilised for training and the remaining 30% for testing, so as to evaluate the capability of each model for generalisation. In this work, we use four different machine learning approaches, namely, Naïve Bayes (NB), Random Forest (RF), Support Vector Machines (SVM) and Neural Networks (NN). The performance of each model is estimated in terms of statistical validation measures, such as Accuracy, Precision, Recall and F1-score, as well as the corresponding Confusion Matrix. The outcome (target) of the Machine Learning model is the Flood Hazard Index (H), which is estimated for every pixel in the area of interest and takes values between 0 and 1. The Flood Hazard Index represents the probability of flood occurrence in an area of interest and is classified into three (3) categories, namely, Moderate, Medium and High.
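The random 70%/30% division can be illustrated with the following plain-Python sketch. The function name, the seed and the toy data are hypothetical; in practice, a library routine such as scikit-learn's `train_test_split` could serve the same purpose.

```python
import random


def split_train_test(rows, labels, train_fraction=0.7, seed=42):
    """Randomly divide a dataset into a training and a testing subset."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # reproducible random order
    cut = int(round(train_fraction * len(indices)))
    train_idx, test_idx = indices[:cut], indices[cut:]
    x_train = [rows[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_test = [rows[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Toy dataset of ten entries, for illustration only
rows = [[i, i * 0.1] for i in range(10)]
labels = ["Moderate", "Medium", "High", "Moderate", "Medium",
          "High", "Moderate", "Medium", "High", "Moderate"]
x_train, x_test, y_train, y_test = split_train_test(rows, labels)
```

Fixing the seed keeps the split reproducible, so that the four models are compared on exactly the same training and testing entries.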

4.1.5. Flood Hazard Assessment and Mapping

The above process results in the classification of each pixel in terms of the level of severity of a potential flooding event, which is expressed by the Flood Hazard Index. To colour the labels of the Flood Hazard categories, we followed the colouring suggestions of the end-users (AAWA). The outcome of this process is a flood hazard map.

4.2. Dynamic Flood Risk Assessment Algorithm

To estimate the Hydraulic Flood Risk, it is necessary to calculate three basic parameters, namely, the Flood Hazard, the Vulnerability and the Exposure, as mentioned above. The first parameter relates to the Flood Hazard Index, which is estimated by adopting the process proposed in Section 4.1, fusing information from the analysis of satellite images and GIS-related data.

The other two parameters are the Vulnerability and Exposure of socioeconomic elements in the impacted area. The flood risk assessment algorithm presented in this work has been developed in collaboration with AAWA, as an adaptation of the procedure presented in AAWA’s Flood Risk Management Plan (FRMP) of the Eastern Alps River Basin District. The FRMP has been drafted by AAWA in compliance with Directive 2007/60/EC, which also prescribes a periodic update of the contents of the plan every six years. The first iteration of the plan was finalized in 2015 and approved in 2016 [44], while the second iteration (referring to the period 2022–2028) is being finalized [55]. From the first to the second cycle, some of the criteria have been updated. The methodology presented in this work is consistent with the newest criteria.

According to the Flood Risk Management Plan (FRMP), knowledge of the land use and land cover of the area of interest is necessary for the estimation of the Vulnerability and Exposure. Therefore, in this work we employ geospatial data files, such as Corine Land Cover [56]. Then, each land use type from the FRMP is matched to a Corine Land Cover (CLC) code, and the Manning roughness coefficient is estimated [44].

4.2.1. Vulnerability Estimation

To mitigate the consequences of flood disasters, suitable Disaster Risk Reduction (DRR) measures need to be carried out. In addition to flood hazard awareness and knowledge, information on Elements at Risk (EaR), i.e., people, infrastructure and assets that may suffer damage when exposed to a flood hazard, also needs to be considered [57]. Assessing the vulnerability of EaR to the specific flood hazard at different event magnitudes, together with the resulting risk, allows for effective monitoring and for early warnings to be given in the case of an impending hazardous situation.

In this work, the Flood Risk Assessment algorithm defines three different parameters of vulnerability: vulnerability of people (Vp), vulnerability of economic activities (Ve) and vulnerability of environments and cultural-archaeological assets and protected areas (Va). All these parameters are estimated for every pixel, and their values are between 0 and 1. These values depend both on the intrinsic characteristics of the different exposed assets and on the hydraulic conditions (water level and water depth) that are established during the flood, and they can affect the capacity of the response. In other words, Vulnerability depends on the specific nature of the element, which can be related to land use, and simultaneously on the flood hazard. In the FRMP, a detailed description behind the definition of these rules is provided [44].

Vulnerability of people (Vp): The physical vulnerability associated with people considers the values of flow velocity (Water Velocity—v) and Water Depth (h) that produce “instability” with respect to remaining in an upright position [58]. FRMP proposes a semi-quantitative equation that links a flood hazard index, referred to as the Flood Hazard Rating (FHR), to h, v and a factor related to the amount of transported debris, i.e., the Debris Factor (DF). According to this algorithm, the land use type classes are grouped in order to calculate the Debris Factor (DF) concerning the possibility of floating materials which can harm the population.
After the calculation of DF, the estimation of the Flood Hazard Rating (FHR) is carried out by utilising the Water Depth and Water Velocity according to the following formula:

$FHR = h * (v + 0.5) + DF$

(11)

where h is the Water Depth, v is the Water Velocity and DF is the Debris Factor. Vp is estimated according to FHR (Table 2).
Vulnerability of economic activities (Ve): The vulnerability associated with economic activities considers buildings, network infrastructure and agricultural areas [58]. It is a pixel-by-pixel function of the Water Depth (height) and Water Velocity (flow velocity). The vulnerability function depends on the specific nature of the assets and thus different functions are applied to land use types.
Vulnerability of environments and cultural-archaeological assets and protected areas (Va): Environmental flood susceptibility is described using contamination/pollution and erosion as indicators. Contamination is caused by industry, animal/human waste and stagnant flooded waters. Erosion can produce disturbance to the land surface and to vegetation but can also damage infrastructure [58]. Following AAWA’s FRMP [44,55], Va is set to 1 for certain land use types, while a residual Va value is assumed for all others.
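Equation (11) can be evaluated per pixel as in the short Python sketch below. The function name and the input values are hypothetical; in particular, the Debris Factor used here is an illustrative number and not a value taken from the FRMP, and the mapping from FHR to Vp (Table 2) is deliberately left out.

```python
def flood_hazard_rating(depth_m, velocity_ms, debris_factor):
    """Flood Hazard Rating of Equation (11): FHR = h * (v + 0.5) + DF."""
    return depth_m * (velocity_ms + 0.5) + debris_factor


# Illustrative pixel: h = 0.5 m, v = 1.0 m/s, DF = 0.5 (hypothetical value)
fhr = flood_hazard_rating(depth_m=0.5, velocity_ms=1.0, debris_factor=0.5)
# 0.5 * (1.0 + 0.5) + 0.5 = 1.25
```

The constant 0.5 added to the velocity means that standing water (v = 0) still contributes to the rating in proportion to its depth.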

4.2.2. Exposure Estimation

Exposure depends on the spatial collocation of the assets, which is strictly related to the land use, and on the evaluation of the potential negative consequence for each category of the exposed element. The flood risk algorithm sets three different exposure parameters: exposure of people (Ep), exposure of economic activity (Ee) and exposure of environment and cultural elements (Ea). All these parameters are estimated for every pixel, and their values are between 0 and 1. For more detailed information about the literature behind the definition of these rules, we refer to the FRMP [44,55].

Exposure of people (Ep): The first step in calculating Ep is to estimate the population per pixel in the area of interest, which is divided into census areas by the Italian National Institute of Statistics (ISTAT). The population dataset is provided as shapefiles, a geospatial vector format, so the population can be calculated per pixel according to the geolocation data. Ep is calculated by:

$Ep = F_{d} * F_{t}$

(12)

where $F_{d}$ is a factor characterizing the density of the population in relation to the number of people present. For the population estimations in specific areas, census data have been employed. $F_{t}$ is the proportion of time spent in different locations (e.g., houses and schools) using the land use classes.
Exposure of economic activity (Ee): The Ee calculation depends solely on the land use of the area of interest.
Exposure of environment and cultural elements (Ea): As with Ee, Ea is estimated solely from the land use.

4.2.3. Hydraulic Flood Risk Assessment

Considering we have all the estimations (Hazard, Vulnerability, Exposure) per pixel, we can calculate the Hydraulic Flood Risk [44,55,58] using the following formula:

$R = \frac{p_{p} * H * Ep * Vp + p_{e} * H * Ee * Ve + p_{a} * H * Ea * Va}{p_{p} + p_{e} + p_{a}}$

(13)

where H is the Flood Hazard, E is the Exposure, V is the Vulnerability and $p_{p}$, $p_{e}$ and $p_{a}$ are the weight parameters derived from FRMP [44,55]:

$p_{p}$ = 10, if there are inhabitants;
$p_{e}$ = 1, if there are economic activities;
$p_{a}$ = 1, if there are environments and cultural-archaeological assets and protected areas.
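The weighted combination in Equation (13) can be sketched per pixel as follows. This is a minimal illustration under two assumptions that the text does not state: a weight is taken as 0 when the corresponding category is absent from the pixel, and the risk is taken as 0 when all three weights are 0. The function and argument names are hypothetical, and the input values are illustrative only.

```python
def hydraulic_flood_risk(H, Ep, Vp, Ee, Ve, Ea, Va,
                         has_people=True, has_economy=True, has_assets=True):
    """Hydraulic Flood Risk R of Equation (13) for a single pixel."""
    # Weights from the list above; 0 for an absent category is an assumption
    p_p = 10 if has_people else 0
    p_e = 1 if has_economy else 0
    p_a = 1 if has_assets else 0
    total = p_p + p_e + p_a
    if total == 0:
        return 0.0  # assumption: nothing exposed in this pixel
    weighted = p_p * H * Ep * Vp + p_e * H * Ee * Ve + p_a * H * Ea * Va
    return weighted / total


# Illustrative pixel values (not taken from the study area)
risk = hydraulic_flood_risk(H=0.8, Ep=0.5, Vp=1.0,
                            Ee=1.0, Ve=0.5, Ea=0.2, Va=0.1)
# (10*0.4 + 1*0.4 + 1*0.016) / 12 = 0.368
```

Because the weight for people is ten times the other two, the term for the population dominates the result wherever inhabitants are present.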

The Hydraulic Flood Risk categorization is performed using Table 3 [44,55].

In order to create the corresponding Flood Risk Map for the area of interest, the assessments of the Hydraulic Flood Risk correspond to specific colors in RGB scale.

5. Results and Discussion

In order to evaluate the performance of the Dynamic Flood Hazard Assessment algorithm in terms of its accuracy, the machine learning models first need to be created. This takes place in the Training/Testing phase (Section 4.1.4) of the proposed methodological framework. Then, in the evaluation phase, the trained models are validated in terms of their precision, namely, their ability to estimate the class of the Flood Hazard Index over “unknown” data.

For this purpose, a series of experiments was carried out in order to find the best set of parameters during the training of the machine learning models, which results in the choice of the best model. The dataset that we used in this phase was formed from satellite images and DEM data for specific dates on which floods occurred in the municipality of Trieste due to extremely high sea tides and heavy rains.

As mentioned above, the dataset was divided into two sets: 70% of the entries were used for training purposes and the remaining 30% for testing the accuracy of the models. We used k-fold cross-validation in order to evaluate the machine learning models. In our case, the parameter k is set equal to 10, and the best model is chosen on the basis of the average results. The set of parameters for each one of the machine learning models that have been employed and evaluated is presented in Table 4.
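The 10-fold procedure can be sketched in plain Python as shown below. The sketch only illustrates how the folds are formed and how the scores are averaged; `score_fn` is a hypothetical callback that would train a model on the training indices and return its score on the validation indices, and it is not part of the framework described in this work.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal random folds
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, validation


def cross_validate(n_samples, score_fn, k=10, seed=0):
    """Average the score returned by score_fn over the k folds."""
    scores = [score_fn(train, val)
              for train, val in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


# Toy run: 25 samples, and a dummy score equal to the validation share
splits = list(k_fold_indices(25, k=10))
mean_score = cross_validate(25, lambda train, val: len(val) / 25)
```

Each sample appears in exactly one validation fold, so every entry of the training set contributes once to the averaged score used for model selection.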

Table 5 presents the experimental results for the evaluation metrics Precision, Recall and F1-Score achieved during the training of the machine learning models. Based on these metrics, the best model was selected using the best_estimator functionality of the sklearn library. Random Forest was selected as the best model, with the following hyperparameters: (Criterion: Gini; Max features: Auto; n_Estimator: 50), as these achieved the best performance, with an average precision of approximately 0.9999995. The evaluation of the model with the best-performing hyperparameters on the 30% of the data held out as a test set is shown in Figure 4, which depicts the Confusion Matrix.

Furthermore, the relative importance of the features, namely, the significance of each one of the attributes that participated in the training of a machine learning model, was examined, and the results are illustrated in the following figure (Figure 5).

The estimation of the features’ relative importance was carried out by employing the best ML model, namely, the Random Forest method. The features Water Velocity and Water Depth play a significant role in the training and the inference of the ML model as, in total, their relative scores amount to approximately 66% (Table 6). The Slope and Roughness of the terrain also show quite high importance for the trained model's ability to classify the input patterns in terms of severity levels. The other geomorphological factors, such as Elevation (DEM), TRI, TPI and Aspect, as well as the Water Mask, do not appear to be as important in the training process. An explanation for this could be the fact that the study area is coastal, smooth and without significant differences in elevation. Moreover, the lack of variability in the values of the Water Mask is another reason for the low relative importance of this feature. The Water Mask indicates whether or not water exists in a pixel, and the inundated pixels are significantly fewer than the dry ones in the dataset.

5.1. Evaluation of Dynamic Flood Hazard/Risk Algorithm

The goal of these experiments is to evaluate the performance of Dynamic Flood Hazard algorithm concerning its capability to produce accurate flood hazard maps when the flood hazard assessment is carried out using the best trained RF model.

For this purpose, the dataset that we employed was generated from satellite images and GIS data in the areas of Trieste, Muggia and Monfalcone, following a process similar to the one presented above. The satellite images refer to historical flood events caused by high sea tides that were “unknown” to the trained RF model.

Similarly, to evaluate the performance of the Dynamic Flood Risk Algorithm, we extend the former analysis over the evaluation datasets that have been created by utilising the satellite imagery in the areas of interest for various dates. The goal is to estimate the Hydraulic Flood Risk (R) for each entry in the dataset, assign its value to a corresponding risk level and create the corresponding Flood Risk Map.

5.1.1. Trieste, 23 September 2019

The confusion matrix (Figure 6) indicates the efficacy of the proposed approach, as the algorithm manages to correctly assign the entries of the validation dataset to the corresponding flood hazard labels (Predicted labels). Figure 7 and Figure 8 show the flood hazard and risk maps, respectively, for the Trieste area on 23 September 2019.

5.1.2. Muggia, 29 October 2018

Similarly, the results of the application of the proposed approach are also examined in the Muggia area on 29 October 2018. The confusion matrix (Figure 9) indicates the efficiency of the proposed approach. The flood hazard and risk map in the specific area and date are illustrated in the following figures (Figure 10 and Figure 11, respectively).

5.1.3. Monfalcone, 24 September 2019

The proposed approach managed to correctly classify the pixels that make up the evaluation set in the Monfalcone area on 24 September 2019. The results are depicted in the corresponding confusion matrix (Figure 12). Figure 13 and Figure 14 illustrate the flood hazard and risk maps, respectively, for the Monfalcone area on 24 September 2019.

5.2. Discussion

In this work, the proposed framework aims to provide the relevant authorities with a methodology for evaluating and mapping the level of risk of a specific flood event using free data from widely available sources, namely, satellite (Sentinel-1) data and GIS-related data. Initially, four well-known machine learning approaches, namely, Naïve Bayes (NB), Random Forest (RF), Support Vector Machines (SVM) and Neural Networks (NN), have been employed to fuse the available information and estimate the flood hazard levels in near-real-time. In the experimental evaluation process, Random Forest exhibited slightly better performance in terms of the F1-score compared with the others. Therefore, we used this approach as a predictor in order to create flood hazard maps in the region of the three Municipalities (Trieste, Muggia and Monfalcone) during the evaluation process. The high-precision scores achieved during the training and evaluation process by the machine learning algorithms are mainly due to the pixel-based approach that we followed, instead of analysing a sample of pixels. Hence, the trained machine learning algorithms are able to correctly classify areas in terms of their flood hazard levels. Going a step further, a rule-based approach has been applied, based on AAWA’s FRMP, which combines the flood hazard assessments with flood exposure and vulnerability estimations for the region of interest. The final goal was to produce a near-real-time flood risk map.

Concerning the flood conditioning factors, it should be mentioned that their importance depends on the geomorphological characteristics of the area of interest, as well as on the historical flood events that were examined [22,59]. In this work, the Water Velocity, Water Depth, Slope and Roughness have a dominant role (approx. 91.5%) in the training and evaluation of the machine learning approaches that were applied. This is a rational conclusion, because these factors affect the propagation of the flood and are the most important hydrodynamic parameters. Slope and roughness affect the flow velocity and the water depth. The smoother and steeper an area is, the higher the velocity of the flood. On the other hand, high roughness slows the water flow but increases the water level. Moreover, as described in Section 3.1, the study areas are characterized by the low slope and elevation of the ground above sea level (coastal areas), which are factors that favour floods due to high tides.

Furthermore, water depth and water velocity, as described in Section 4, are the basis for both hazard and vulnerability estimations. These two factors participate in the annotation process in order to classify each pixel in one of the severity level categories (Section 4.1.3). The lack of annotated datasets to train machine learning models that will enable the assessment of the flood hazard levels is considered a crucial issue for the development of a robust system [5,16]. In this work, to overcome this limitation, an automated rule-based approach has been adopted, inspired by the AAWA’s FRMP.

In general, the proposed framework enables authorities to evaluate the flood risk in near-real-time by utilising low-cost or free-of-charge satellite data, and thus it can be used to overcome the gap of information in the areas with an irregular diffusion of hydrometeorological sensors. Additionally, even in the presence of legacy Decision Support Systems such as monitoring water distribution networks or forecasting systems, the proposed framework can provide useful complementary information.

For example, hydrometers record a point measurement of the water level inside a fluvial section. Thus, in the case of river overtopping, they cannot offer any useful information about the extent of the flood outside the river or about its impact on the exposed assets. A similar consideration applies to flood forecasting systems based on 1D hydraulic models. Even when 2D hydraulic models are available, the information provided is limited to a hazard estimation, while the concept of risk is crucial for responding effectively to an emergency situation and mitigating the consequences. Flood Risk, in fact, links together not only the intensity of the event itself (hazard) but also the potential impacts on the communities, economic assets, environment and cultural heritage.

For this reason, the Flood Directive (2007/60/EC) highlights the importance of drafting flood risk maps as part of flood management plans. However, flood risk maps refer to a set of predefined hydraulic and hydrological scenarios (floods of certain return times), which may be different from the ones that occur during a real extreme event. From this perspective, this work aims to provide the authorities, as a complement to the ‘static’ flood risk maps, with a ‘dynamic’ tool for obtaining a quick and reliable estimation of the level of risk of a specific flood event when it occurs. Moreover, the proposed methodology can be used to assess the risk caused by different flooding mechanisms, including the ones that are currently not dealt with by the Flood Directive (e.g., urban flood).

Finally, the proposed approach can be used to help the calibration of 2D hydraulic models, which is a challenging and time-consuming process. Calibration means that the operators have to simulate a flood event based on past events for which hydrometer recordings/measurements are available. Then, they should confirm whether the results of the model are consistent with those measurements. However, the measurements are point values (a hydrometer measures the water level in a specific place, called a river section), whereas the 2D model covers a broader area. Hence, the calibration of a 2D model that covers a vast area by using only sparse point values is not an easy task. Moreover, although it is very important to calibrate a 2D model in the areas surrounding the river, the hydrometers are located inside the river, and as a result, water level measurements in the flooded areas (areas outside the river that are inundated due to overtopping) are not available.

6. Conclusions

In flood management studies, the creation of accurate flood hazard and risk maps is essential for preparedness against, and mitigation of, extreme flood incidents. In the last decade, numerous studies have been published aiming to assess flood hazards and create more reliable hazard maps. State-of-the-art methodologies utilise advanced remote sensing techniques, including satellite imagery analysis tools and GIS-related data, along with machine learning techniques to estimate flood susceptibility and develop the corresponding maps. In this work, a flood hazard assessment algorithm is proposed that addresses the problem of flood monitoring and mapping. It develops a machine learning model that is able to assess the severity levels of flood hazard. The utilisation of satellite imagery along with the flood conditioning factors generated by GIS provides the opportunity to create an extensive flood inventory. The proposed approach attempts to resolve two main challenges, which are as follows:

The lack of annotated datasets in the domain for training and evaluating machine learning techniques that detect and monitor flood events by using remote sensing;
The low temporal frequency of satellite imagery acquisition, which hinders the real-time monitoring of an evolving flood.

Furthermore, in this paper, the Dynamic Flood Hazard algorithm was extended in order to estimate the hydraulic flood risk by combining vulnerability and exposure information from the impacted areas. Both approaches were evaluated in terms of their accuracy and their capability to create accurate flood hazard and flood risk maps. The results are promising. However, further improvements should be made, in particular by integrating social media information into the Flood Risk algorithm.

Another aspect that should be addressed is the reduction of processing time and computational effort. These are mainly affected by the resolution of the satellite imagery, the DEM and the other derived flood-conditioning factors. Because the analysis follows a pixel-based approach, higher image resolutions generate larger datasets, which are demanding in terms of resources. On the other hand, a poor image resolution degrades the quality of the flood hazard and risk assessments and of the generated maps. Hence, a trade-off should be found between image quality and framework robustness. A potential way to improve the quality of the DEM, or to compensate for its unavailability, is the adoption of low-cost UAV applications.
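The scaling behind this trade-off can be sketched with simple arithmetic: in a pixel-based analysis, the number of samples grows with the inverse square of the pixel size. The following minimal example assumes square pixels and a hypothetical area of interest of 100 km²; neither the area nor the listed resolutions are taken from this study.

```python
def pixel_count(area_km2: float, resolution_m: float) -> int:
    """Approximate number of square pixels needed to cover an area."""
    area_m2 = area_km2 * 1_000_000
    return round(area_m2 / (resolution_m ** 2))


AREA_KM2 = 100.0  # hypothetical area of interest, not taken from the paper

for resolution_m in (30.0, 10.0, 1.0):
    count = pixel_count(AREA_KM2, resolution_m)
    print(f"{resolution_m:>5.1f} m pixels -> {count:,} samples")
```

Halving the pixel size quadruples the number of samples, and every sample carries one value per flood-conditioning factor, so memory and training time grow accordingly.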

Author Contributions

Conceptualization, G.A. and I.-O.K.; methodology, G.A.; software, G.A., M.B. and I.-O.K.; validation, G.A., I.-O.K., M.B., A.M., F.L. and R.F.; formal analysis, G.A., A.M. and I.G.; resources, M.B., F.L., R.F., D.N. and M.F.; data curation, G.A., I.-O.K., M.B. and A.M.; writing—original draft preparation, G.A., I.-O.K., M.B., F.L. and R.F.; writing—review and editing, G.A., I.-O.K., A.M., F.L. and I.G.; visualization, G.A., I.-O.K., F.L. and R.F.; supervision, I.G., A.K., M.F., A.S., S.V. and I.K.; project administration, I.G. and S.V.; funding acquisition, S.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the European Union’s Horizon 2020 Research and Innovation Programme through the projects aqua3S, under Grant Agreement No 832876, and WQeMS, under Grant Agreement No 101004157.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AAWA	Alto Adriatico Water Authority
AHP	Analytical Hierarchy Process
AoI	Area of Interest
ANNs	Artificial Neural Networks
CART	Classification and Regression Trees
CRCL	Crisis Classification
CLC	Corine Landcover Codex
CNN	Convolutional Neural Network
DEM	Digital Elevation Model
DNN	Deep Neural Network
DRR	Disaster Risk Reduction
EaR	Elements at Risk
Ep	Exposure of people
Ee	Exposure of economic activity
Ea	Exposure of environment and cultural elements
FFPI	Flash-Flood Potential Index
FHR	Flood Hazard Rating
FR	Frequency Ratio
FRMP	Flood Risk Management Plan
GIS	Geographical Information System
LIDAR	Laser Imaging, Detection And Ranging
LR	Logistic Regression
LULC	Land Use Land Cover
MDA	Multivariate Discriminant Analysis
NNs	Neural Networks
RF	Random Forest
SAR	Synthetic Aperture Radar
SNAP	Sentinel Application Platform
SVMs	Support Vector Machines
TRI	Terrain Ruggedness Index
TWI	Topographic Wetness Index
UAVs	Unmanned Aerial Vehicles
Vp	Vulnerability of people
Ve	Vulnerability of economic activities
Va	Vulnerability of environments and cultural-archaeological assets and protected areas

Figure 1. Location of the case study areas (the square boxes). The coordinates are expressed in the reference system WGS84 (EPSG 4326).

Figure 2. Elevation of the case study area in meters above sea level, referred to the vertical datum EPSG 32632 (WGS84/UTM Zone 32); the horizontal coordinates are expressed in the geographic reference system WGS84 (EPSG 4326). Source of data: INGV, http://tinitaly.pi.ingv.it/ (accessed on 15 December 2021), elaborated by AAWA.

Figure 3. Flowchart of the Dynamic Flood Hazard Assessment Algorithm.

Figure 4. Confusion Matrix of the best Random Forest model.

Figure 5. Features’ Relative Importance of the best Random Forest model.

Figure 6. Confusion Matrix for best trained Random Forest model over Trieste, 23 September 2019 dataset in Validation Phase.

Figure 7. Flood Hazard map for Trieste on 23 September 2019.

Figure 8. Flood Risk map for Trieste on 23 September 2019.

Figure 9. Confusion Matrix for best trained Random Forest model over Muggia, 29 October 2018 dataset in Validation Phase.

Figure 10. Flood Hazard map for Muggia on 29 October 2018.

Figure 11. Flood Risk map for Muggia on 29 October 2018.

Figure 12. Confusion Matrix for best trained Random Forest model over Monfalcone, 24 September 2019 dataset in Validation Phase.

Figure 13. Flood Hazard map for Monfalcone on 24 September 2019.

Figure 14. Flood Risk map for Monfalcone on 24 September 2019.

Table 1. Confusion matrix representation.

	Actually Positive	Actually Negative
Predicted Positive	True Positives (TPs)	False Positives (FPs)
Predicted Negative	False Negatives (FNs)	True Negatives (TNs)
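The cells of Table 1 feed the per-class scores reported in Table 5. The sketch below uses the standard definitions of precision, recall and F1-score; the function names and the example counts are hypothetical, and returning 0.0 when a denominator is zero is one common convention rather than something stated in this article.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are actually positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for one hazard class; illustrative values only.
tp, fp, fn = 90, 10, 30
p, r = precision(tp, fp), recall(tp, fn)
print(f"precision={p:.2f} recall={r:.2f} f1={f1_score(p, r):.2f}")
```

Under this convention, a class that a model never predicts obtains 0.00 for all three scores.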

Table 2. Estimation of Vulnerability of people according to FHR.

FHR	Vp (0 ≤ Vp ≤ 1)
FHR < 0.75	0.25
0.75 ≤ FHR < 1.25	0.75
FHR ≥ 1.25	1
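The thresholds of Table 2 amount to a simple step function. A minimal sketch follows; the function name `vulnerability_of_people` is hypothetical, and the code does nothing beyond applying the three intervals listed in the table.

```python
def vulnerability_of_people(fhr: float) -> float:
    """Map a Flood Hazard Rating (FHR) to Vp using the intervals of Table 2."""
    if fhr < 0.75:
        return 0.25
    if fhr < 1.25:
        return 0.75
    return 1.0


for fhr in (0.5, 0.75, 1.0, 1.25, 2.0):
    print(f"FHR={fhr:<4} -> Vp={vulnerability_of_people(fhr)}")
```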

Table 3. Classification of Hydraulic Risk into four classes.

Risk R	Level of Risk	Color
$0 \leq R < 0.2$	Moderate	Very light lime green
$0.2 \leq R < 0.5$	Medium	Soft yellow
$0.5 \leq R < 0.9$	High	Soft orange
$0.9 \leq R \leq 1.0$	Very High	Very light red
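The classification in Table 3 can be expressed in a few lines. The sketch below assumes that the risk value R has already been computed and lies in [0, 1]; the function name `risk_level` and the choice to raise an error for out-of-range values are illustrative assumptions.

```python
def risk_level(r: float) -> str:
    """Return the risk class of Table 3 for a risk value R in [0, 1]."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("R must lie in the interval [0, 1]")
    if r < 0.2:
        return "Moderate"
    if r < 0.5:
        return "Medium"
    if r < 0.9:
        return "High"
    return "Very High"


for r in (0.0, 0.2, 0.5, 0.9, 1.0):
    print(f"R={r:.1f} -> {risk_level(r)}")
```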

Table 4. Set of parameters per machine learning model.

Model	Set of Parameters
Random Forest	Criterion: {Gini, Entropy}; Max features: {Auto, Log2, Sqrt, None}; n_Estimator: {50, 100, 200, 500}
Naïve Bayes	$\alpha$: {0.01, 0.1, 1}
SVM	Kernel function: {rbf, poly, sigmoid}
Neural Network	Activation function: {ReLU, Sigmoid}; #Neurons: {1, 2, 4, 6, 8}; Epochs: {100, 300, 500}
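To give a sense of the size of the search space in Table 4, the sketch below enumerates every parameter combination per model using only the Python standard library. The dictionary keys mirror the labels of the table and are assumptions, not confirmed parameter names of any particular library. The article does not state here how the search was implemented, so this is only an illustration of an exhaustive grid.

```python
from itertools import product

# Parameter sets copied from Table 4; the key names are illustrative assumptions.
PARAM_GRIDS = {
    "Random Forest": {
        "criterion": ["gini", "entropy"],
        "max_features": ["auto", "log2", "sqrt", None],
        "n_estimators": [50, 100, 200, 500],
    },
    "Naive Bayes": {"alpha": [0.01, 0.1, 1]},
    "SVM": {"kernel": ["rbf", "poly", "sigmoid"]},
    "Neural Network": {
        "activation": ["relu", "sigmoid"],
        "neurons": [1, 2, 4, 6, 8],
        "epochs": [100, 300, 500],
    },
}


def expand_grid(grid):
    """Return one dict per combination of the listed parameter values."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


for model, grid in PARAM_GRIDS.items():
    print(f"{model}: {len(expand_grid(grid))} candidate configurations")
```

An exhaustive search over these sets would train 32 Random Forest, 3 Naïve Bayes, 3 SVM and 30 Neural Network candidates.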

Table 5. Summary table of results of the best-trained machine learning models over the test set.

Model	Parameters	Category	Precision	Recall	F1-Score
Random Forest	Criterion: Gini; Max features: Auto; n_Estimator: 50	High Hazard	0.99	0.99	0.99
		Medium Hazard	0.99	0.99	0.99
		Moderate Hazard	0.99	0.99	0.99
Naïve Bayes	$\alpha = 0.01$	High Hazard	0.93	0.91	0.92
		Medium Hazard	0.91	0.97	0.94
		Moderate Hazard	0.00	0.00	0.00
SVM	Kernel function: poly	High Hazard	0.96	0.98	0.97
		Medium Hazard	0.96	0.99	0.98
		Moderate Hazard	0.98	0.97	0.98
Neural Network	Activation function: ReLU; #Neurons: 8; Epochs: 500	High Hazard	0.99	0.99	0.99
		Medium Hazard	0.99	0.99	0.99
		Moderate Hazard	0.99	0.99	0.99

Table 6. Relative Importance scores of the features.

Feature	Relative Importance Score (%)
Water Velocity	43.75939
Water Depth	22.99143
Slope	13.60606
Roughness	11.90979
DEM	3.87064
TRI	2.73504
TPI	0.23988
Water Mask	0.84437
Aspect	0.04339
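The scores in Table 6 sum to approximately 100, so they can be read as shares of the total importance. The short sketch below ranks the features and accumulates the shares of the top-ranked ones; the helper names `ranked` and `cumulative_share` are hypothetical, and the values are copied from the table.

```python
# Relative importance scores copied from Table 6.
IMPORTANCE = {
    "Water Velocity": 43.75939,
    "Water Depth": 22.99143,
    "Slope": 13.60606,
    "Roughness": 11.90979,
    "DEM": 3.87064,
    "TRI": 2.73504,
    "TPI": 0.23988,
    "Water Mask": 0.84437,
    "Aspect": 0.04339,
}


def ranked(importance):
    """Features sorted from most to least important."""
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


def cumulative_share(importance, top_n):
    """Sum of the scores of the top_n most important features."""
    return sum(score for _, score in ranked(importance)[:top_n])


for n in (2, 4):
    print(f"Top {n} features: {cumulative_share(IMPORTANCE, n):.2f} of the total")
```

Read this way, the two hydraulic features (water velocity and water depth) account for roughly two thirds of the total importance, and the four highest-ranked features for more than 90.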

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
