Large Scale Flood Risk Mapping in Data Scarce Environments: An Application for Romania

: Large-scale ﬂood risk assessment is essential in supporting national and global policies, emergency operations and land-use management. The present study proposes a cost-e ﬃ cient method for the large-scale mapping of direct economic ﬂood damage in data-scarce environments. The proposed framework consists of three main stages: (i) deriving a water depth map through a geomorphic method based on a supervised linear binary classiﬁcation; (ii) generating an exposure land-use map developed from multi-spectral Landsat 8 satellite images using a machine-learning classiﬁcation algorithm; and (iii) performing a ﬂood damage assessment using a GIS tool, based on the vulnerability (depth–damage) curves method. The proposed integrated method was applied over the entire country of Romania (including minor order basins) for a 100-year return time at 30-m resolution. The results showed how the description of ﬂood risk may especially beneﬁt from the ability of the proposed cost-e ﬃ cient model to carry out large-scale analyses in data-scarce environments. This approach may help in performing and updating risk assessments and management, taking into account the temporal and spatial changes in hazard, exposure, and vulnerability.


Introduction
Flood damage assessment and analysis is a key component of any strategy for flood risk mitigation and management [1][2][3], especially considering the potential consequences of climate change [4], increasing human activities and high-value assets in vulnerable areas [5]. Methods and tools for estimating and mapping economic damage [6][7][8] are essential for comparing the efficiency and sustainability of a portfolio of flood mitigation measures to support decision-makers in delineating flood risk management plans as required by the Flood Directive [9]. Therefore, a comprehensive approach that appropriately quantifies the three components of risk, hazard, exposure and vulnerability, is essential to identify exposed areas, design the most appropriate strategies for flood management, aid the decision-making process to ensure preparedness, response and recovery [10][11][12], and, thus, improve the sustainability and resilience of risk-based flood management practices (e.g., [13,14]).
In this context, the need for national and transboundary flood risk assessment is gaining attention at all levels [15]. Indeed, large-scale flood risk assessments can offer support for national and global policies. For example, governments can use risk information to prioritize investments (e.g., cost-benefit analysis), and take measures for flood risk reduction, for emergency operations and for land-use planning, while insurance companies can improve their estimations of flood risk-based insurance premiums [16].
Large-scale analyses often face a range of practical difficulties due to: (i) the large amount of data and parameters needed for the calibration and validation of traditional models; (ii) the moderate/coarse resolution of data accessible at a global scale and the limited availability of high-resolution data, which may affect the accuracy of results and (iii) the computational demand of hydraulic models. Therefore, in most countries across the globe, existing hazard and risk maps are incomplete, and there is a wide variety of models used and level of detail adopted [17][18][19]. There is a need to identify efficient and inexpensive ways to develop flood risk maps for large areas.
In this context, the present study proposes a simple and cost-effective model for large scale flood economic damage quantification and mapping in data-scarce environments that aims to improve the completeness of existing flood risk maps.

Background of the Study
Recent decades have seen a growing availability of data from new technologies of earth observation (EO) and environmental monitoring [20]. At the same time, the high-computational power and number of newly developed algorithms to analyse Big Data have increased significantly (e.g., machine learning techniques). Their aim is to provide cost-effective solutions for large scale flood hazard estimation that are critical to economies and organizations with limited resources. The use of free and open source Geographic Information System (GIS) software has also increased significantly and has become a consolidated technique for the analysis, visualisation and transparent communication of flood risk worldwide [21]. Such advances have increased the range of possibilities for geo-scientists, updating and re-inventing the way highly resource-and data-intensive processes, such as risk mapping and management, are carried out [22].
Several fast-processing methods have been proposed for a preliminary delineation of flood-prone areas using EO information that is readily available. Generally, these approaches are based on indicators of the geomorphologic, climatic, hydrologic, geologic and land-use characteristics of the basins. Among these, the approaches that appear the most suitable for the purpose of this study are those based on the use of Digital Elevation Models (DEMs) for simplicity of application and reliability of results [23][24][25][26][27][28]. These methods represent a valid alternative to traditional hydrological/hydraulic modelling where detailed analyses are not possible due to limited data and/or resource availability.
Several studies have been carried out (for example [29][30][31]) to identify the most influential DEM-retrieved geomorphological features for the delineation of areas with high flood susceptibility and to understand how to use these descriptors for large-scale applications. These investigations highlighted the strong performance of a linear binary classifier based on a morphological index named the Geomorphic Flood Index (GFI). GFI was recognized as the most suitable morphological indicator among those examined for preliminary mapping over large areas and under limited data conditions [31]. The intrinsic properties of the GFI and the analysis of the DEM were further investigated [32] to obtain an immediate estimate of the spatial distribution of the water surface elevation. Importantly, water surface elevations can be coupled with the vulnerability of exposed assets to obtain an assessment of the direct economic damage in a simple and effective way [33].
A wide range of damage models, e.g., empirical (data-driven approaches) or synthetic (i.e., expert-based/what-if-analysis approaches), for different sectors (agriculture, industry, residential, etc.) and levels of detail (micro, meso and macroscale), is available in the literature [1,3,[34][35][36][37]. These methods are based on damage functions and vulnerability curves (e.g., depth-damage curves based on flood water level) in order to describe component-by-component analysis of physical damage to buildings, which considers available information of damage mechanisms. To support the European (EU) policies on flood risk management, the European Commission-Joint Research Centre (JRC) collected and harmonized information of damage models developed specifically for some EU countries (e.g., the Rhine Atlas and FLEMO (Flood Loss Estimation MOdel) [36] in Germany, the Standard Method [38] and the Damage Scanner model [39] in the Netherlands), in order to develop vulnerability curves and maximum damage values for all EU member countries [40]. These curves can be effectively used since the maximum damage values have been collected for all European states. For each damage class, an average value has been calculated, which can be scaled based on per capita GDP and applied for each nation or region [41]. Recently, all the above-mentioned curves were integrated into an open-source and user-friendly QGIS tool called FloodRisk [42] that is able to automatize the damage assessment and mapping routine. These damage curves and their relative economic values vary according to the land-use type. Exposure analysis, which consists of recognizing potentially vulnerable elements and assigning them an economic value on the basis of land-use or considering individual units, is important to explore prior to the application of the damage curves.
It is widely recognized that satellite remote sensing is able to provide useful information quickly and economically regarding the classification of elements potentially exposed to flood risk based on land-use analysis. In particular, medium-resolution Landsat-8 satellite sensors show high potential for timely analyses and regional scale studies with reduced costs [43,44]. In analysing the built environment using Landsat images, the different land-use classes are generally determined by the alignment of buildings, roads and open spaces, thus they cannot be effectively described only by the spectral values of a single pixel. Recently, object-based image analysis has shown better results compared to pixel-based approaches; in addition, object-based image analysis can be usefully combined with machine learning techniques to improve the description of urban models [45].
Based on the scientific advances described above, we propose a simplified and practical framework for the quantification of direct economic flood damage through the integration of geomorphic methods, land-use classification with machine learning techniques, and the damage curves method (Section 4).
The proposed integrated framework was tested in Romania for a 100-year return time (country attributes are described in Section 3.) The resulting map, with a resolution of 30 m, covers the entire country, including minor streams which are often not considered in large-scale analyses. The demonstration application (Section 5) reveals how the flood risk description may particularly benefit from the integrated use of geomorphic and geospatial methods, object-based image analysis combined with machine learning algorithms, and EO freely available monitoring data. A discussion and conclusion are provided in Section 6.

Case Study
According to CRED EM-DAT [46], Romania is one of the European countries most affected by floods. Flooding was responsible for 86% of the economic damage caused by natural hazards in the last 30 years in Romania [47].
The total area of Romania is about 238,000 km 2 . The Danube basin covers 97.8% of Romania and 30% of the entire Danube basin falls in the administrative area of Romania [48] (Figure 1).
In the last two decades, agricultural areas in Romania have decreased in favor of built-up areas, which increased by 176 km 2 from 1990 to 2006 [49]. Furthermore, the built-up areas increased by 112 km 2 from 2006 to 2012, with an average annual growth of 19 km 2 (0.15% increase from 2006 to 2012) [50]. As land uses shift, flooding is causing additional serious damage, due to the greater concentration of people and economic assets in areas crossed by the waterways.

Methodology
The proposed methodological framework for flood damage mapping (for a single scenario event) is based on the three main steps described below and shown in Figure 2.
The first step (see 4.1 of Figure 2) was the application of a supervised linear binary classification model based on the Geomorphic Flood Index (GFI) [30][31][32]. This step aimed to derive the maps of flood extent and water depth for Romania (including minor order basins) using Shuttle Radar Topography Mission (SRTM) [51] elevation data. The SRTM is one of the most widely used altimetric data sources in the world, due to its quality, accessibility and global coverage. Since 2000, a number of SRTM DEMs have been created and made available to the public with different ground sampling. In this study, we adopted the highest resolution dataset SRTM-1 Arc-Second Global elevation data (1 arc-second/30 m). The second step (see 4.2 of Figure 2) aimed to realize an exposure map obtained by supervised land-use classification, by using a machine learning technique combined with an objectbased EO image analysis. In particular, the information obtained from Landsat-8 remotely sensed optical images [52] was used together with the discontinuous (i.e., available for a few large cities in Europe) existing high resolution land-use map Urban-Atlas [53], to obtain a harmonized and consistent land-use map with a resolution of 30 m for the areas estimated as flooded in the previous step.
Finally, the flood economic damage (see 4.3 of Figure 2) mapping was carried out for a given return time by using vulnerability curves and maximum damage values proposed by [41] contained in the free and open-source GIS FloodRisk tool [42], by combining the water depth map (estimated in step 1) with information on exposure and vulnerability (evaluated in step 2).

Methodology
The proposed methodological framework for flood damage mapping (for a single scenario event) is based on the three main steps described below and shown in Figure 2.
The first step (see 4.1 of Figure 2) was the application of a supervised linear binary classification model based on the Geomorphic Flood Index (GFI) [30][31][32]. This step aimed to derive the maps of flood extent and water depth for Romania (including minor order basins) using Shuttle Radar Topography Mission (SRTM) [51] elevation data. The SRTM is one of the most widely used altimetric data sources in the world, due to its quality, accessibility and global coverage. Since 2000, a number of SRTM DEMs have been created and made available to the public with different ground sampling. In this study, we adopted the highest resolution dataset SRTM-1 Arc-Second Global elevation data (1 arc-second/30 m). The second step (see 4.2 of Figure 2) aimed to realize an exposure map obtained by supervised land-use classification, by using a machine learning technique combined with an object-based EO image analysis. In particular, the information obtained from Landsat-8 remotely sensed optical images [52] was used together with the discontinuous (i.e., available for a few large cities in Europe) existing high resolution land-use map Urban-Atlas [53], to obtain a harmonized and consistent land-use map with a resolution of 30 m for the areas estimated as flooded in the previous step.
Finally, the flood economic damage (see 4.3 of Figure 2) mapping was carried out for a given return time by using vulnerability curves and maximum damage values proposed by [41] contained in the free and open-source GIS FloodRisk tool [42], by combining the water depth map (estimated in step 1) with information on exposure and vulnerability (evaluated in step 2). Water 2020, 12, x FOR PEER REVIEW 5 of 20

Flood Hazard Analysis
A hydrogeomorphic method based on the Geomorphic Flood Index (GFI) [31] was adopted to derive flooding characteristics in terms of inundation extent and expected flood water levels. The GFI is defined as the natural logarithm of the ratio between a variable water depth hr and the elevation difference H: For each basin location, hr represents the river stage at the point of the river network closest to the one under examination, which is considered as the most probable source of flood hazard. The

Flood Hazard Analysis
A hydrogeomorphic method based on the Geomorphic Flood Index (GFI) [31] was adopted to derive flooding characteristics in terms of inundation extent and expected flood water levels. The GFI is defined as the natural logarithm of the ratio between a variable water depth h r and the elevation difference H: For each basin location, h r represents the river stage at the point of the river network closest to the one under examination, which is considered as the most probable source of flood hazard. The parameter H represents the difference in elevation between the two above-mentioned points. The estimate of the river stage h r is derived as a function of the upslope contributing area A r using a hydraulic scaling relationship [54]: A simplified method to derive flood-susceptible areas based on the GFI has been recently proposed and implemented in a QGIS plugin called the Geomorphic Flood Area tool (GFA tool) [48]. This tool performs a linear binary classification based on the GFI to identify flood-prone areas by combining morphological information extracted from digital descriptions of the earth's surface (DEMs), along with flood hazard information derived by existing inundation maps (e.g., official flood hazard maps derived by hydraulic simulations). These maps are usually available for limited sections of a basin, and are used as references to train the classifier by identifying the optimal discrimination threshold of the GFI, τ. It represents the linear boundary of decision between the two classes (flood-prone areas and flood-free areas) that best approximates the reference flood map. After the classifier has been trained, it is able to detect flood-prone areas over unstudied portions of the basin.
Recently, Ref. [32] further explored the GFI approach to obtain a rapid estimation of floodwater level. Specifically, since paired values of h r and A r are generally not available to calibrate the hydraulic scaling relationship of Equation (2), the authors suggest the GFI can be computed by estimating the exponent n from literature values (see, e.g., [47]) and assuming the coefficient a using the threshold calibrated with the linear binary classification. In particular, the coefficient a can be obtained as shown in Equation (3): Consequently, the scaling relationship of Equation (2) can be used for a more reliable estimate of the floodwater level at each point of the river network. From the geomorphic analysis previously performed, the difference in elevation H that separates all basin locations from the closest river is known. Therefore, the h r values can be used to estimate the water depth (WD) within the delineated flood-prone areas as follows: The main advantages of this approach are its limited requirements in terms of computational time and input data (a DEM and a flood hazard map of at least 2% of the basin area). This makes it particularly useful in poor data environments and for large scale analyses, where hydraulic simulations are prevented by data unavailability, computational complexity and excessive costs.

Exposure Analysis
A central issue in exposure analysis is the availability of up-to-date information on the extent, type and composition of the land-use map [55]. Nowadays, satellite imagery can provide, in a time and cost-efficient manner, the information needed to derive land-cover information over large areas [56]. Thus, in this study, medium-resolution satellite images with a spatial resolution comparable to the hazard information estimated above were used to derive proxies (land use/land cover) to support an indirect characterization of exposure over large areas. The latest Landsat-8 free images were adopted in this study for land cover classification due to their constant improvement in terms of richness in spectral, spatial, radiometric and temporal resolution thanks to new generations of satellites being launched with new and improved sensors [57,58].
This study used an object-based approach for image analysis and combined it with a machine learning technique in order to derive a land-use pattern covering the flood susceptible areas. Object-based image analysis (OBIA) uses geographic objects as basic units for land cover classification that allows for the incorporation of various sources of information, such as spectral and texture, as the basis for classification, reducing the within class variation and generally removing salt-and-pepper effects, which result from isolated pixels mainly due to misclassification [44,45]. To partition the multi-dimensional feature space into homogenous areas and label each image object/segment with respect to the desired land-use classes, an Artificial Neural Network (ANN) was used. The ANN is able to estimate the non-linear relationship between the input data and desired outputs, has a fast generalization capability and no a-priori knowledge requirements of the distribution model of the input data, as it is a non-parametric classifier [59][60][61]. To train and test the adopted supervised classifier, a reference dataset was compiled for the study area using the high-resolution Urban Atlas land-use data, resampled to a 30 m raster to be consistent with the hazard map and Landsat dataset, available for large cities (i.e., all urban areas with a population greater than 50K in Europe) as part of the Copernicus Land Monitoring Service [53].
The proposed image processing and analysis chain consisted of three main modules, described in the following sections: (i) image segmentation, (ii) feature-based description, and (iii) classification.

Image Segmentation
This study adopted an object-based approach to produce high classification accuracies based on Landsat images [44,62,63]. However, OBIA has limitations, such as choosing an appropriate segmentation scale: under-segmentation means that the image-objects are larger than the objects on the ground, thus two or more land covers will be included in one large image object, while over-segmentation results in more subdivision, in particular with low spatial resolution images [64].
Image pixels were clustered into segments in an unsupervised manner using image statistics through an algorithm [65], called Efficient Graph-based Image Segmentation. This algorithm delineates segment boundaries by minimizing intra-segment variability and maximizing inter-segment separability. It does this through a comparison of the feature space, defined on the basis of the brightness values of the pixels in the Landsat spectral bands, with that between neighboring pixels within each segment. The segments were produced with a parameter combination [45], in an application on Landsat-8 images, where the runtime parameter k was equal to 25 and the segment size m was equal to 75 to effectively set a scale of analysis. The segmentation parameters were tuned using a spatial autocorrelation metric, Global Moran's I [45]. This measured, on average, how similar a region was to its neighbors [66]. It quantified inter-segment heterogeneity and the standard deviation of the brightness values of the input image bands, weighted by each segment's size, and summed over all the segments in the image scene, to assess intra-segment homogeneity.

Feature-Based Description
A recursive feature selection algorithm [45,67] was used to identify the most significant image features using all spectral bands of Landsat-8. A pre-selected set of the 18 spectral features, 6 spectral band indices and 20 textural features derived from the grey-level co-occurrence matrix (GLCM), were calculated at the computational unit of segments for the most important spectral bands of Landsat-8. A summary list of the features is presented in Table 1.

Classification
The multi-layer perceptron (MLP) neural network [68], was used to recognize land cover patterns. The MLP consisted of an input layer that corresponded to individual data sources, such as the 44 image features (listed in Table 1), a hidden layer used for computation and an output layer that included a set of codes to represent the classes to be recognized. The inputs were introduced into the ANN in a feed forward manner, which propagated through the hidden layer and the output layer. The values associated with each node were estimated from the sum of the multiplications between input node values and weights of the links connected to that node [69]. The initial weights were selected randomly and then the back propagation (BP) training algorithm compared the calculated output for a given observation with the expected output for that observation. The differences between the expected and calculated output values across all observations were summarized using the mean squared error. This process of feeding forward signals and back-propagating the errors was repeated iteratively until the total error was minimized and distributed among the various nodes in the network [60]. Table 1. List of image features derived from Landsat-8 [45]. Note: All image features were computed per segment with i being the row number and j being the column number of the grey-level co-occurrence matrix (GLCM). P i, j was defined as: where V is the value of GLCM.

Image Feature Features Class
Mean spectral value in image bands 1, 2, 4, 5, 6, 7; Spectral Standard deviation in image bands 1, 3; Spectral Weighted brightness, with I being the number of image bands, J being the number of pixels per segment and p being the brightness value of the pixels; Mean derived from the GLCM in bands 1, 3, 4, 5, 6, 7, 9, 10; Variance derived from the GLCM in band 10; In this study, several ANNs in sequences, each with a binary output layer, were used to solve and decompose the multi-class problems, i.e., the clusterization of several land-use classes, as a flexible classification scheme that consisted of different hierarchical levels [45]. For example, first, the extraction of built-up areas and non-built-up areas was performed and, in order to get a more differentiated picture of the built environment, the class "built-up area" was further refined and split into more detailed classes (e.g., "residential" and "industrial/commercial").

Damage Analysis
The free and open source FloodRisk GIS tool [42] was used to calculate the possible direct tangible damage at the national level. This tool uses the powerful QGIS functions and python libraries to calculate and display the expected economic damage in the form of maps.
Specifically, the spatial distribution of the water depth estimated by the geomorphic method GFA for a 100-year return period (described in Section 4.1), and the land-use map, derived by the Landsat-8 image analysis that combined object-based image analysis with an ANN (Section 4.2), were overlapped and combined in the FloodRisk tool. Moreover, the asset values and vulnerability (i.e., the propensity of the asset exposed to suffer damage) were associated with each land-use class in the damage model. In particular, FloodRisk uses specific vulnerability functions, called depth-damage curves, to calculate the range of damage (0 = no damage to 1 = complete destruction) for the different type of land-use classes on the basis of the water depth in each cell of the land-use map. Considering that these curves are context-specific, (i.e., very sensitive to local characteristics [35] and that functions developed for the specific context of Romania were not available [33]), a set of average functions [40] based on data and functions collected and harmonized for all of Europe by the JRC, were used in this application. These can provide uniform and comparable results not only at the national level but also at a pan-European scale. It is notable that the use of depth-damage functions is well-suited to large scale analysis because the uncertainty of damage estimates decreases over larger sample areas due to the averaging effect [32]. The degree of damage in each land-use map pixel, obtained on the basis of these depth-damage curves, was multiplied for the asset value associated with each land-use class [41], and adjusted on the basis of inflation to estimate the total economic damage.

Results of the Application in Romania
The proposed integrated framework was tested in Romania for the 100-year return period. The demonstration application over the entire country (including minor order rivers) and resulting maps with a resolution of 30 m are presented in detail in the following sub-sections.

Flood Hazard Mapping in Romania
By applying the GFI method, a water depth map at 30 m spatial resolution was developed. To derive the geomorphic features of the study area, the SRTM-1 Arc-Second Global elevation data (1 arc-second/30 m) were adopted. Its horizontal accuracy, as estimated by the NASA Jet Propulsion Laboratory (JPL), ranges from 7.2 m (Australia) to 12.6 m (North America), while the absolute vertical accuracy ranges from 5.6 m (Africa) to 9 m (North America) [70]. Indeed, the performance of this hydrogeomorphic method, like almost all applications related to hydraulic and hydrological modelling, is influenced by the quality of elevation data. For example, finer-resolution bare-ground Digital Terrain Model (DTM) derived by airborne LiDAR (Light Detection and Ranging) would improve the terrain analysis, and consequently the accuracy of the results [71]. Although the availability of these kinds of high-resolution data is increasing, they are still limited in terms of extent, and in many cases cover only river courses and not full hydrographic basins. It was therefore not feasible for our study that has a country wide coverage. Thus, the SRTM dataset was chosen since, to the best of our knowledge, it represents the highest quality elevation dataset freely available at the scale of our interest. It also matches the spatial resolution of the Landsat-8 optical images used in the exposure analysis.
To train the classification, the 100-year pan-European flood hazard map derived by the JRC [18] was used. This flood hazard dataset, to the best of our knowledge, is the most complete and state-of-the-art dataset for the study area and, in addition, is open-access (https://data.jrc.ec.europa.eu/dataset/jrcfloods-floodmapeu_rp100y-tif).
In a preliminary study, the extent of the 100-year flood-prone areas had been derived by using the GFA tool over Romania, which sub-divided the country into five major basins (for more details please see [48]). Starting from these maps, the procedure described in Section 4.1 was carried out for the estimation of flood water levels assuming the exponent n was the average of literature values, with n = 0.3544 [48], and the values for the coefficient a listed in Table 2. Table 2. The calibrated optimal threshold for the linear binary classification based on the geomorphic flood index (GFI), and the relative estimated value of the scale factor for each of the five major basins identified in Romania (see [48] The obtained hazard map (Figure 3) includes minor order streams that are usually neglected in large scale analyses. Therefore, it offered a more extended identification of areas exposed to flood hazard, filling the gaps of the JRC pan-European map that essentially covers the major rivers. Additionally, the use of a resolution of 30 m represented a further improvement in the detail of the results.
Water 2020, 12, x FOR PEER REVIEW 11 of 20 the image features derived by Landsat-8 images, see, e.g., [44]). However, a validation test was performed to verify if the above-described methodologies could be effectively used in the proposed case study. The reference dataset, the Urban Atlas land-use map [53], was used to produce a large dataset for training (90%) and a small dataset for testing (10%). After the ANN training, the classification performance was evaluated on the testing dataset through an error matrix where the individual accuracy of each category was plainly described with the classification errors of both inclusion (commission errors) and exclusion (omission errors) [74]. Overall accuracy was defined as the total Unfortunately, we cannot provide a validation of these hydrogeomorphic flood maps, since real recorded flow depths are rarely available for validation purposes. However, the performance of the GFI method adopted here was validated against hydraulic flood hazard maps in previous studies [31,32,72].

Exposure Mapping in the Case Study
The exposure maps (Figure 4) at 30 m spatial resolution were derived by using a supervised land-use classification through the combination of a machine learning approach and an object-based image analysis on Landsat-8 optical images [52].
Water 2020, 12, x FOR PEER REVIEW 12 of 20 positives, which were correctly classified as positives. Specificity, also defined as the true negative rate or user's accuracy, was assessed as the proportion of actual negatives, which were correctly classified as negatives. For the testing dataset, the overall accuracy of the performed classification was around 85%. The user's and producer's accuracies of individual classes were consistently good except for the land-use type infrastructure (around 40%), due possibly to the resolution adopted in the study (e.g., 30 m resolution could be too coarse to effectively identify roads or railway lines). The user's accuracy of other land-use classes ranged from almost 71% for industrial uses to 90% for agricultural uses. The The use of a consistent and complete dataset for all of Romania with a finer resolution would enable rapid assessments of flood risk for local policymakers in regions where few data are available and would also allow for consistent assessment across the country. Landsat-8 has 10 bands with a resolution of 30 m for the visible spectrum, near infrared spectrum and shortwave infrared spectrum, 100 m for thermal infrared spectrum, plus an additional panchromatic band of 15 m resolution [52]. The images are available in processing Level 1 T-Terrain Corrected, which means they are orthorectified, radiometrically corrected and the TIRS bands are resampled to 30 m by using cubic convolution [45]. Moreover, these images were pre-processed by conversion to top-of-atmosphere reflectance, mosaicked and transformed into the ETRS89/LAEA Europe coordinate reference system. The use of global EO datasets, such as Landsat-8 images, for information on exposure enabled a rapid (due to the use of machine learning) flood damage assessment with a controlled degree of consistency in terms of accuracy and coverage. In addition, the analysis was done using a resolution of 30 m, which further improved accuracy.
To train and test the adopted supervised classifier, a reference dataset was compiled for the study area using the high-resolution Urban Atlas land-use data, available for large cities (i.e., all urban areas with a population greater than 50K in Europe) as part of the Copernicus Land Monitoring Service [53], resampled to a 30 m raster to be consistent with the hazard map and Landsat dataset. In particular, data available for the 35 main cities in Romania were acquired from the website of the European Union's Earth Observation Programme Copernicus [73]. The three main steps described in the methodological section for the land-use classification (i.e., image segmentation, feature-based description, and classification) were based on commonly used ANN algorithms (e.g., [60,68]) and extensively tested methodologies (e.g., for the choice of the segmentation method and parameters or the image features derived by Landsat-8 images, see, e.g., [44]). However, a validation test was performed to verify if the above-described methodologies could be effectively used in the proposed case study.
The reference dataset, the Urban Atlas land-use map [53], was used to produce a large dataset for training (90%) and a small dataset for testing (10%). After the ANN training, the classification performance was evaluated on the testing dataset through an error matrix where the individual accuracy of each category was plainly described with the classification errors of both inclusion (commission errors) and exclusion (omission errors) [74]. Overall accuracy was defined as the total number of correctly classified segments divided by the total number of test segments. Sensitivity, defined as the true positive rate or producer's accuracy, was estimated as the proportion of actual positives, which were correctly classified as positives. Specificity, also defined as the true negative rate or user's accuracy, was assessed as the proportion of actual negatives, which were correctly classified as negatives.
For the testing dataset, the overall accuracy of the performed classification was around 85%. The user's and producer's accuracies of individual classes were consistently good except for the land-use type infrastructure (around 40%), due possibly to the resolution adopted in the study (e.g., 30 m resolution could be too coarse to effectively identify roads or railway lines). The user's accuracy of other land-use classes ranged from almost 71% for industrial uses to 90% for agricultural uses. The producer's accuracy was reasonable for all other land-use classes (>62%). All results are shown in Table 3.

Economic Damage Mapping of Romania
The direct economic damage map ( Figure 5) for the 100-year return time was derived in the free and open-source GIS FloodRisk tool [42] by using vulnerability curves and maximum damage values [40], and adjusted on the basis of inflation (see Table 4).    Along with the results obtained using the hazard and exposure data described above, additional damage maps were produced for comparison. For the description of the flood hazard, the pan-European flood hazard map (available in the JRC data catalogue) produced by applying a combination of distributed hydrologic and hydraulic models for a return period of 100 years at a resolution of 100 m [18] (here resampled at 30 m), was adopted.
Regarding exposure, open data available for all of Europe, were adopted (Corine Land-Cover (CLC)). The CLC database is part of the Copernicus Land Monitoring Service and provides a consistent, comparable, pan-European land cover product [75]. The first CLC product was developed for the reference year 1990, with updates in 2000, 2006 and 2012. The products, generated in raster format at resolutions of 100 m and freely available for downloading from the EEA website [75], were resampled at 30 m in order to be consistent with the results proposed in this study.
The comparison analysis covered only major rivers since the maps developed by [18] do not cover minor order basins. The result of the combination of the different data and methods in the FloodRisk tool are presented below: Figure 6 and Table 5 show the results of economic flood damage analysis using (i) the GFA method and Landsat-8 data (as proposed in this study), (ii) the GFA method and the CLC, (iii) the JRC flood hazard map [18] and Landsat-8 image analysis, (iv) the JRC flood hazard map [18] and CLC.
Water 2020, 12, x FOR PEER REVIEW 15 of 20 eventually be optimized to reach the minimum error with respect to the reference hydraulic map, but our aim was to derive the water depth using only flood extent (for more detail please see [32]).
Considering that a proper validation of flood risk assessments was not possible at this scale due to data limitations in the study area, it was difficult to validate the performance of the combination of different methods and data. However, the results showed that the major advantages of the proposed method were its use of commonly available input data, the possibility of performing rapid analyses at large scales with a medium spatial resolution (30 m) that also considered minor order basins, and allowing for downscaling between spatial scales. This resulted in a rapid assessment of flood risk in a data and resource-limited environment, and a more consistent comparison of flood risks in cities across Romania.
Finally, the results of the methodology proposed in this study could support stakeholders in, for example, developing national insurance programs and communication and awareness campaigns, to help control local development and planning, and in prioritizing investments. While the central approach is quite general, there are notable novel aspects in this study in terms of the integration of different innovative methodologies and the applicability of the framework at different scales due to its simplicity and cost-efficiency in using parsimonious and commonly available data.  Table  5. Table 5. Results of economic flood damage analysis using (i) the GFA method and Landsat-8 data (as proposed in this study), (ii) the GFA method and the CLC, (iii) the JRC flood hazard map [18] and Landsat-8 image analysis, (iv) the JRC flood hazard map and CLC.

Industrial
Urban Roads  Table 5. The approach proposed in this study (i.e., (i) GFA method combined with Landsat-8 data) presents the highest total value of economic damage due to the higher level of detail of the land-use maps but also due to the greater water depths estimated by the GFA method. The Urban Atlas maps (used to train the adopted ANN for producing, on the basis of the Landsat-8 images, the Romanian exposure map), adopted a large number of classes to describe land-use with detailed classification. This could result in a refined identification of the urban and industrial areas, and an efficient separation from different classes with respect to CLC, where aggregated results could be affected by errors at this scale. Considering that parts of the urban and industrial areas were located close to the river, the increase in water depth values could strongly affect the final damage results.
Absolute damage values may not matter for prioritization of locations for investment, where the risk ranking may be the same irrespective of the absolute values. However, it is important to highlight that the difference was amplified by the very high maximum damage values proposed by [41] for the industrial and urban classes, and that tended to produce a large increase in total damage with a small increase in flooded areas or water depths.
Moreover, because the GFA approach neglected diffusion and transients of the flooding process, it presented structural differences with respect to the method proposed by [18], which could influence the final results. It should also be mentioned that the linear binary classifier was calibrated using only the extent of the reference inundation map, while the flow depths obtained from the JRC maps were not used to calibrate the geomorphic water depth. In fact, the spatial distribution of water depth could eventually be optimized to reach the minimum error with respect to the reference hydraulic map, but our aim was to derive the water depth using only flood extent (for more detail please see [32]).
Considering that a proper validation of flood risk assessments was not possible at this scale due to data limitations in the study area, it was difficult to validate the performance of the combination of different methods and data. However, the results showed that the major advantages of the proposed method were its use of commonly available input data, the possibility of performing rapid analyses at large scales with a medium spatial resolution (30 m) that also considered minor order basins, and allowing for downscaling between spatial scales. This resulted in a rapid assessment of flood risk in a data and resource-limited environment, and a more consistent comparison of flood risks in cities across Romania.
Finally, the results of the methodology proposed in this study could support stakeholders in, for example, developing national insurance programs and communication and awareness campaigns, to help control local development and planning, and in prioritizing investments. While the central approach is quite general, there are notable novel aspects in this study in terms of the integration of different innovative methodologies and the applicability of the framework at different scales due to its simplicity and cost-efficiency in using parsimonious and commonly available data.

Conclusions
The present study proposes a cost-efficient method for a large-scale analysis and mapping of direct economic flood damage at medium resolution in data-scarce environments. The proposed methodological framework consists of three main stages: (i) deriving a water depth map through a DEM-based geomorphic method using a linear binary classification; (ii) generating an exposure land-use map developed from multi-spectral Landsat 8 satellite images using a machine-learning classification algorithm; and (iii) performing a flood damage assessment using a GIS tool based on the vulnerability (depth-damage) curves method. The cost-effective model was based on commonly available datasets and innovative methodologies that took advantage of increased computing power.
The proposed integrated framework was tested in Romania for a 100-year return period. The resulting map, with a resolution of 30 m, covered all of Romania including minor order streams, which are often not considered in large-scale analyses. The proposed method can therefore enhance the completeness and spatial details of existing flood hazard and damage maps, allowing for a methodologically consistent assessment for Romania. The use of free and open data allows for more consistent comparisons on flood risks in cities across the EU and within individual countries. However, it is important to stress that these datasets may be subject to error that may certainly affect the final quality of the results. In the case of higher quality input data availability (for example, increased accuracy and spatial resolution), model results might be improved by including a validation against real event data at varying return periods. In the present case, the use of global/continental datasets enable rapid assessments of flood risk for policymakers in regions of the country where there are few data available and, in particular, it allows for quantitative comparisons of the damage in different regions/communities. On one hand, this is particularly important to identify the total risk facing the country in order to determine investment priorities, and the scale of government grants for flood risk management measures undertaken on the basis of consistent assessments. It is also necessary to demarcate the limits of the national insurance program. This is particularly important to identify the flood risk at the country level in order to determine investment priorities and flood risk management measures. On the other hand, this responds to the need of local decision-makers for controlling development where little data is available but a significantly higher level of detail is required-not least because decisions made based on that information are likely to be contested.
Although detailed and accurate flood inundation maps can be obtained by means of hydrological and hydraulic models, very few continental-scale studies that employ detailed hydraulic models are presented in the literature (e.g., [76,77]) due to the difficulties of data availability and computational expense. However, significant research advances and increasing computational and data resources are extending the range of possibilities for practical applications, filling the gap that exists between simplified large-scale approaches and detailed reach scale hydraulic models. The low complexity of the proposed approach provides a computationally inexpensive flood hazard assessment that does not account for the spatio-temporal dynamic of inundations especially in territories where numerous hydraulic infrastructures exist. However, hydrogeomorphic approaches benefit from the increasing availability of high-resolution DEMs, and our study has shown the potential of fast-processing DEM-based algorithms for consistent flood hazard characterization. Additionally, the results of the proposed integrated framework, due to its simplicity and cost-efficiency in using parsimonious and commonly available data, could be used to explore future scenarios at large scale to study temporal and spatial changes related to, for example, impacts of climate change, socio-economic growth or both. Taking into account these changes, the proposed model could perform future risk assessments in terms of hazard, exposure and vulnerability.