Integrating Remote Sensing Indices and Ensemble Machine Learning Model with Independent HEC-RAS 2D Model for Enhanced Flood Prediction and Risk Assessment in the Ottawa River Watershed

Oluwadare, Temitope Seun; Chen, Dongmei; McGrath, Heather

doi:10.3390/app16010070


by Temitope Seun Oluwadare ¹,*, Dongmei Chen ¹,* and Heather McGrath ²

¹ Geographic Information and Spatial Analysis Laboratory (GisaLab), Department of Geography and Planning, Queen’s University, Kingston, ON K7L 3N6, Canada

² Canada Centre for Mapping and Earth Observation, Natural Resources Canada, Ottawa, ON K1A 0Y7, Canada

* Authors to whom correspondence should be addressed.

Appl. Sci. 2026, 16(1), 70; https://doi.org/10.3390/app16010070

Submission received: 30 October 2025 / Revised: 15 December 2025 / Accepted: 15 December 2025 / Published: 20 December 2025

(This article belongs to the Special Issue Spatial Data and Technology Applications)


Abstract

Floods rank among the most destructive natural hazards worldwide. In Canada’s capital region—Ottawa and its surrounding areas—flood prediction is crucial, especially in flood-prone zones, to improve flood mitigation strategies, given its historical record-breaking events in 2017 and 2019, which resulted in substantial damage to homes and infrastructure in the region. Previous studies in these regions typically did not use remote sensing techniques or advanced methods to enhance flood susceptibility prediction and extent mapping. This study addressed the gap by incorporating 18 flood conditioning factors and integrating high-performance machine learning algorithms such as Random Forest, Support Vector Machines and XGBoost to develop ensemble flood susceptibility models. The HEC-RAS 2D model was used to simulate hydrodynamic variables based on a 100-year flood scenario. The developed ensemble model for flood susceptibility prediction achieved strong performance (Kappa, F1-score, and AUC all above 0.979) and demonstrated model transferability, maintaining high accuracy (Kappa > 0.850, F1-score > 0.920, AUC > 0.990) when applied to other sub-regions. The hydraulic model reveals that flood velocity and depth differ across sub-regions, reaching maximums of 15 m/s and 15 m, respectively. SHAP analysis indicates Elevation, Handmodel, MNDWI, NDWI, and Aspect are key factors influencing floods. These findings and methods help Natural Resources Canada develop tools and policies for effective flood risk reduction in the Ottawa River watershed and similar regions.

Keywords:

flood spatial modeling; HEC-RAS 2D; ensemble machine learning models; flood susceptibility mapping; natural hazards; flood prediction

1. Introduction

Floods are one of the most severe and frequent natural disasters on the planet, with grave consequences for human life, infrastructure, and ecosystems. Their frequency and intensity have increased, predominantly through impacts of climate change and land-use changes, making the early warning and management of them increasingly urgent to reduce their cataclysmic impacts [1]. In Canada, flooding represents the most frequent and costly natural hazard, causing damages estimated at billions of dollars annually, particularly in densely populated and ecologically sensitive regions [2], with severe events recorded in 2017 and 2019, as documented by Public Safety Canada and Environment and Climate Change Canada [3,4]. Flood susceptibility mapping (FSM) plays a crucial role in identifying regions at risk of flooding, providing essential insights to guide decision-making, reduce vulnerabilities, and strengthen disaster preparedness. Within this national context, the Ottawa River watershed is of particular importance due to its vast drainage basin, socio-economic relevance, and history of severe hydrological events, most notably in 2017 and 2019 [3,5,6]. These events have underscored the vulnerability of flood-prone areas in the National Capital Region and exposed gaps in prediction and mitigation strategies, emphasizing the urgent need for improved flood prediction models to support effective planning and targeted mitigation efforts.

FSM addresses this need by integrating topographic, hydrological, and land-use factors to identify areas that are most prone to flooding. By providing a robust, data-driven basis for delineating high-risk zones, FSM improves predictive accuracy and equips decision-makers with actionable insights for proactive management and long-term resilience planning. Despite considerable earlier efforts, predicting flood susceptibility and mapping flood extent in the Ottawa River watershed still face key challenges, including data uncertainties, limits on hydrodynamic estimation, climate variability, hydrological complexity, and the integration of remote sensing (RS) with suitable modeling techniques. Conventional hydrological and hydraulic models could help address these challenges, but they often provide limited predictive capacity, lack up-to-date topographic and bathymetric data, do not integrate diverse spatially distributed conditioning factors, and rely on computationally expensive simulations [7,8].

Remote sensing (RS) has revolutionized flood research by providing consistent, high-resolution, and spatially extensive datasets. Studies such as Rahmati et al. [9] and Khosravi et al. [10] have demonstrated that integrating hydrological models with RS and geographic information systems (GIS), incorporating topographical and climatic variables [11], is an effective approach for identifying, assessing, mapping, and predicting flood hazards. This strategy could help address flood challenges in the Ottawa River watershed. However, despite its potential, the approach often encounters critical limitations, including low processing efficiency and the complexity of result interpretation, both of which hinder timely and effective decision-making in real-world applications [12]. Further compounding these challenges is the underutilization of RS-derived indices and advanced machine learning (ML) techniques, which restricts the development of more accurate and reliable flood hazard assessments, as studies of the major 2017 and 2019 floods in the Ottawa–Gatineau region indicate [13,14,15]. RS indices, such as the Normalized Difference Water Index (NDWI) and Modified Normalized Difference Water Index (MNDWI), among others, enhance the spectral distinction between water and non-water surfaces, allowing more precise flood delineation [16,17]. Incorporating RS indices into flood susceptibility mapping captures hydrological and land surface conditions that strongly influence inundation dynamics, thereby improving model accuracy [18].

In recent years, ML has shown significant promise in flood susceptibility modeling because it can handle high-dimensional datasets, nonlinear relationships, and complex interactions among flood conditioning factors. Algorithms such as Random Forest (RF) [13,16], Support Vector Machines (SVM) [19], and Extreme Gradient Boosting (XGBoost) [20], among many others, have achieved high predictive accuracy in various geographic contexts [21,22,23]. Nevertheless, the performance of a single ML algorithm is often limited by the biases inherent in its structure, which has prompted the development of ensemble modeling. Ensemble models leverage the complementary strengths of multiple algorithms and can yield more stable and accurate predictions than a single model by reducing model-specific biases and uncertainty. Applied together with RS indices in Canadian watersheds, they can make susceptibility mapping more reliable, improve predictive accuracy, and support flood risk assessment and informed management decisions in the Ottawa River basin. Alongside data-driven models, process-based hydraulic models such as the Hydrologic Engineering Center’s River Analysis System (HEC-RAS) provide a critical lens for simulating floodplain hydraulics.

The HEC-RAS 2D model supports the simulation of key flood hydrodynamic variables, such as floodwater depth and flow velocity, thereby providing valuable insights into flood propagation and hazard assessment [24,25]. Nonetheless, hydraulic models are often constrained by high computational demands, reliance on boundary condition assumptions, and sensitivity to uncertainties in input data [26]. HEC-RAS 2D simulations were incorporated as a complementary approach to provide process-based validation, simulate flow depth and velocity, and enable scenario testing. This simulation technique intends to enhance the quality of flood susceptibility mapping and analysis to be achieved through RS and ML integration, while also capturing the variability of hydrodynamic parameters and generating actionable outputs to support flood risk assessment and informed management decisions. Despite advances in flood modeling globally, significant gaps persist in the Canadian context. Many existing studies have relied predominantly on historical discharge records and statistical approaches [27,28,29], neglecting the integration of RS indices and ensemble ML models with hydraulic simulations. Furthermore, few studies have examined the transferability of ensemble models across sub-watersheds, a key consideration for broad-scale flood risk management. Addressing these gaps is relevant for the watershed of the Ottawa River, where effective flood prediction tools are critical in protecting urban centers, cultural assets, and vulnerable populations.

This study responds to these gaps by developing an integrated framework that combines RS techniques and ensemble ML models with independent usage of HEC-RAS 2D hydraulic simulations as an augmented approach to enhance flood analysis, prediction, and estimation of flood hydrodynamic variables (depth and velocity) in the Ottawa River watershed. Shapley Additive Explanations (SHAP) is applied to improve interpretability and quantify the contribution of conditioning factors, supporting more accurate spatial predictions alongside HEC-RAS modeling. This approach aligns with Natural Resources Canada’s commitment to evidence-based resilience planning. The study makes five major contributions: (1) demonstrating the superiority of ensemble ML models over single algorithms in Canadian watersheds; (2) presenting a novel framework integrating ML and RS and independent hydraulic modeling; (3) showing spatial model transferability; (4) quantifying flood drivers and identifying the topmost influential factors; and (5) providing actionable insights to guide flood mitigation strategies across diverse sub-regions, supporting evidence-based decision-making and advancing Canada’s broader climate adaptation objectives.

2. Materials and Methods

2.1. Study Area

The study area is centered on the National Capital Region, encompassing Ottawa, the capital city of Canada, located in Ontario, and its neighboring city, Gatineau, in Quebec (Figure 1a). The Ottawa River, one of the largest rivers in eastern Canada, stretches 1271 km and forms the boundary between the provinces of Ontario and Quebec [13,30]. Its watershed covers over 140,000 km², with 65% of the area located in Quebec and 35% in Ontario [30,31]. To optimize classification performance and ensure robust model development, two additional training sites were incorporated upstream and downstream of the primary site: Training Site II (covering an 8.6 km stretch of the Ottawa River and 72.18 km² west of downtown Ottawa–Gatineau) and Training Site III (covering a 4.3 km stretch of the Ottawa River and 22.49 km² east of downtown Ottawa–Gatineau) [13]. Together with the primary site, the three training areas encompass 28.3 km of the Ottawa River and span 292.23 km² (see Figure 1b).

The 2017 and 2019 flood disasters that happened in the region were caused by a combination of heavy rainfall, rapid snowmelt, and saturated soils in the watershed area. In 2017, total rainfall during April and May reached 174% of normal levels, generating exceptionally high runoff volumes and peak river flows that surpassed historical records, particularly within the unregulated sub-basins of the Ottawa River watershed [32]. The 2019 (mid-April to mid-May) flood surpassed the 2017 peak by approximately 25 cm, resulting in the inundation of over 6000 homes and extensive infrastructure damage [33]. The floods underscore the vulnerability of the Ottawa River watershed hydrodynamics and the critical need for continued monitoring and adaptive flood management in the face of changing climate patterns. Due to the climatic and hydrometeorological events mentioned above, the Ottawa–Gatineau region has become highly susceptible to flooding during heavy downpours.

2.2. Geospatial Data

2.2.1. Flood Inventory

Creating a flood inventory map is the initial step in developing an ensemble ML-based flood susceptibility model. Figure 1a displays a topographic map that highlights major roads and urban areas, sourced from the World Topo Map available in the QGIS Pro-Esri basemap. The inset red bounding box indicates the area of interest.

The area of interest is shown as a watershed map on a white background, which includes the ML training areas and other significant locations indicated in the legend of Figure 1b. The polygon shapefiles for sub-regions I, II, and III, designated as ML training sites for this study, were provided by the Canada Centre for Mapping and Earth Observation (CCMEO) at Natural Resources Canada (NRCan). In Figure 1b, the polygons in red bounding boxes (c, d, and e) represent external validation areas (EVA) obtained from the Flood Hazard Identification and Mapping Program (FHIMP), whereas polygons (a and b) are newly delineated EVA obtained from GisaLab (Department of Geography and Planning, Queen’s University, Kingston, Canada) to provide a more comprehensive spatial representation of the study results. The primary purpose of these polygons was to delineate the specific boundaries of interest within which the predicted FSM results are presented. In this study, we utilized Sentinel-2 satellite imagery to derive selected RS indices (e.g., NDVI (Normalized Difference Vegetation Index), NDWI (Normalized Difference Water Index), MNDWI (Modified Normalized Difference Water Index), TWI (Topographic Wetness Index), Moisture, Water Index values). These indices were derived under pre-flood conditions for the Ottawa floods, on randomly selected days in 2017 (15 April) and 2019 (5 April). The purpose of deriving these indices was to capture baseline environmental controls that influence flood susceptibility, rather than transient flood signals. This approach ensures that the predictors represent vegetation and stable land surface or hydrological characteristics while avoiding the meteorological or spectral biases introduced by floodwater, turbidity, debris load, or short-term surface reflectance anomalies.

A total of 40,000, 15,000, and 4000 flood inventory sample points were acquired for sub-regions I, II, and III, respectively (Figure 1b), to establish the ensemble models for the Ottawa–Gatineau region; in each sub-region, the samples were split evenly between flood (50%) and non-flood (50%) locations. The non-flood samples were generated randomly and then filtered to the required number using the RS water-sensitivity index, retaining only points with MNDWI < 0 and removing those with MNDWI > 0. The synthetic minority oversampling technique (SMOTE) was applied to reduce class imbalance and improve ML model performance. Flood and non-flood inventory points were labeled as 1 and 0, respectively. Flooded and non-flooded locations within the model training area were randomly split into a training set (70%) and a testing set (30%) [13,16]; note that SMOTE was applied after the split to avoid data leakage. Flooded and non-flooded sample points outside the training area were regarded as the external validation area (EVA), which was used to test ensemble ML model transferability. The EVA sub-regions, labeled (a–e), are indicated using different colors in Figure 1c.
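The sampling order described above (filter non-flood candidates by MNDWI, split 70/30, then oversample the training set only) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the sample data are synthetic, the function names are hypothetical, and the oversampler is a simplified SMOTE-style stand-in that interpolates between random pairs of minority points instead of using nearest neighbours.

```python
import random

def filter_non_flood(candidates):
    """Keep candidate non-flood points whose MNDWI is below zero (not water)."""
    return [p for p in candidates if p["mndwi"] < 0]

def split_train_test(samples, train_fraction=0.7, seed=42):
    """Shuffle the samples and split them into training and testing lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def oversample_minority(train, seed=42):
    """Balance the TRAINING set only by interpolating between random pairs
    of minority-class points (simplified stand-in for SMOTE)."""
    rng = random.Random(seed)
    groups = {0: [], 1: []}
    for s in train:
        groups[s["label"]].append(s)
    minority = min(groups, key=lambda k: len(groups[k]))
    deficit = len(groups[1 - minority]) - len(groups[minority])
    synthetic = []
    for _ in range(deficit):
        a, b = rng.sample(groups[minority], 2)
        t = rng.random()  # position along the segment between a and b
        features = [x + t * (y - x) for x, y in zip(a["features"], b["features"])]
        synthetic.append({"features": features, "label": minority})
    return train + synthetic

# Synthetic demonstration data (hypothetical values, two features per point).
data_rng = random.Random(0)
flood = [{"features": [data_rng.random(), data_rng.random()], "label": 1}
         for _ in range(60)]
candidates = [{"features": [data_rng.random(), data_rng.random()],
               "label": 0, "mndwi": (i - 40) / 100} for i in range(80)]

non_flood = filter_non_flood(candidates)            # MNDWI < 0 only
train, test = split_train_test(flood + non_flood)   # 70% / 30%
balanced_train = oversample_minority(train)         # test set is left untouched
```

Oversampling after the split keeps synthetic points out of the testing set, which is the leakage concern the text raises.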

2.2.2. Flood Conditioning Factors

The spatial variability of flood susceptibility across regions is largely governed by a complex interplay of environmental and geospatial factors known as flood conditioning factors (FCF). To rigorously identify the areas most prone to flooding, a comprehensive set of 18 FCF was assembled, drawing on the principal drivers reported in the contemporary literature. The FCF fall into four categories: topographic variables (Elevation, Slope, Aspect, Curvature, Surface roughness (SR), Topographic Roughness Index (TRI), Topographic Position Index (TPI)); remote sensing variables (NDVI, NDWI, MNDWI, Moisture, Water Index); anthropogenic variables (Land cover (LC), Distance to road (DTRoad), Distance to river (DTRiver)); and hydrological variables (Stream Power Index (SPI), Height above nearest drainage (HandModel), Topographic Wetness Index (TWI)). The selection and integration of these variables are essential for constructing robust flood susceptibility models and ensuring reliable spatial prediction. The topographic factors were derived from a LiDAR digital elevation model (DEM) at a spatial resolution of 1 m obtained from NRCan. The RS indices and LC were obtained from Level-2 Sentinel-2 cloud-free optical satellite imagery with a spatial resolution of 10 m, downloaded from https://apps.sentinel-hub.com/eo-browser (accessed on 6 April 2025). The road shapefiles, which are vectorized line networks of all roads, were obtained from the City of Gatineau Road Network and Ottawa Road Centerlines, respectively [34,35]. The river shapefile was generated by delineating the major rivers and tributaries within the Ottawa–Gatineau region from the high-resolution LiDAR-derived DEM. This process extracted hydrologically significant flow paths and stream networks from the elevation surface, using DEM-derived flow direction and accumulation algorithms, to ensure an accurate representation of the regional drainage system; it has become a standard approach in hydrological and flood susceptibility studies [36]. The distance between sample points and river channels, quantified as DTRiver, was then calculated in QGIS. This metric enhances flood susceptibility assessment by capturing proximity to hydrological features, which significantly influences local flood risk [36].

Elevation and slope strongly influence flood susceptibility by controlling how rainfall is redistributed across the landscape. Steep, high-elevation areas generate rapid runoff, whereas low-lying and gently sloped terrains retain water and are therefore highly vulnerable during extreme rainfall and soil saturation, as seen in the 2017 and 2019 Ottawa floods [37,38]. Terrain characteristics can modulate how rainfall and antecedent moisture contribute to flooding. Aspect influences soil wetness through differential solar exposure [39], while curvature governs flow routing—runoff converges on concave surfaces, increasing flood susceptibility, and diverges on convex surfaces, reducing flood risk [16,40]. Indices such as TWI and SPI identify zones prone to water accumulation or high runoff energy [41,42]. The HAND model refines this understanding by mapping each location’s elevation relative to the nearest drainage channel, making it particularly suitable for predicting the river overbanking and flood inundation extents [43], which was the case observed during the Ottawa 2017 and 2019 events.
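As a brief illustration of how TWI and SPI separate accumulation zones from high-energy runoff zones, the sketch below uses the commonly cited definitions TWI = ln(a / tan β) and SPI = a · tan β, where a is the specific catchment area and β is the local slope. The paper does not state which formulation or software was used, so the function names, the small-slope guard, and the cell values here are assumptions for illustration only.

```python
import math

def twi(specific_catchment_area, slope_deg, eps=1e-6):
    """Topographic Wetness Index: ln(a / tan(beta)).
    eps guards against division by zero on perfectly flat cells."""
    tan_beta = max(math.tan(math.radians(slope_deg)), eps)
    return math.log(specific_catchment_area / tan_beta)

def spi(specific_catchment_area, slope_deg):
    """Stream Power Index: a * tan(beta)."""
    return specific_catchment_area * math.tan(math.radians(slope_deg))

# Hypothetical cells with the same upslope area but different slopes.
AREA = 500.0  # specific catchment area in m (illustrative)
flat_cell = {"twi": twi(AREA, 1.0), "spi": spi(AREA, 1.0)}
steep_cell = {"twi": twi(AREA, 20.0), "spi": spi(AREA, 20.0)}

# Gentle slopes score higher on TWI (water accumulates);
# steep slopes score higher on SPI (more erosive runoff energy).
print(flat_cell, steep_cell)
```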

SR and TRI influence how excess rainfall is routed across the landscape by slowing overland flow and temporarily storing water in uneven terrain—conditions that intensified runoff during the 2017 and 2019 Ottawa floods [44,45]. The TPI further distinguishes valleys, which accumulate runoff under saturated soil conditions, from ridges that shed water more easily [46]. LC regulates infiltration and surface retention; impervious urban areas accelerate runoff during these events, while vegetated and wetland areas promote infiltration and moderate flood peaks [47]. Proximity to rivers (DTRiver) is also critical, as areas closer to the Ottawa River were most affected when prolonged rainfall led to overbanking and elevated discharges in both 2017 and 2019 [48]. Similarly, DTRoad captures how road networks disrupt natural drainage, concentrate stormwater, and create localized flood pathways that become evident during periods of extreme rainfall and soil saturation [49].

Integrating RS indices into geospatial analysis strengthens the identification of flood-prone zones, especially under conditions like those that triggered the 2017 and 2019 Ottawa floods, where intense rainfall and saturated soils dominated the hydrologic response. Vegetation-related indices such as NDVI help indicate the landscape’s capacity to absorb rainfall: positive NDVI values (~+0.3 to +0.8) reflect dense vegetation that slows runoff, whereas negative values (bare soil or water) correspond to surfaces that readily generate overland flow under heavy rainfall [50]. Thus, NDVI is widely used to infer vegetation cover and its moderating role during flood events [51]. Water-sensitive indices such as NDWI and MNDWI further highlight active inundation zones; values above zero typically signify surface water, which expanded significantly during the prolonged rainfall episodes in 2017 and 2019. MNDWI improves water detection in built-up areas by reducing spectral interference from vegetation and soil, making it particularly valuable across Ottawa’s mixed urban–rural landscape [52]. Because NDWI and MNDWI respond differently to spectral properties, using both reduces misclassification—an essential step when storm-induced turbidity, shadowing, or dynamic water levels occur during extreme events [53]. Additionally, soil moisture indices derived from SWIR and thermal bands reveal areas with limited infiltration capacity; high index values denote saturated surfaces that contributed to rapid runoff generation during both Ottawa flood events [54].
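The three spectral indices discussed above share the same normalized-difference form, which the following sketch makes explicit. It uses the standard band pairings (NDVI from near-infrared and red, NDWI from green and near-infrared, MNDWI from green and shortwave infrared); the reflectance values are hypothetical and are not taken from the study's Sentinel-2 scenes.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b); returns 0 if a + b is 0."""
    total = a + b
    return 0.0 if total == 0 else (a - b) / total

def ndvi(nir, red):
    """Vegetation index: high for dense, healthy vegetation."""
    return normalized_difference(nir, red)

def ndwi(green, nir):
    """Water index: positive values typically indicate open water."""
    return normalized_difference(green, nir)

def mndwi(green, swir):
    """Modified water index: replaces NIR with SWIR to suppress built-up noise."""
    return normalized_difference(green, swir)

# Hypothetical surface reflectances for two pixels (illustrative only).
water = {"green": 0.06, "red": 0.04, "nir": 0.02, "swir": 0.01}
forest = {"green": 0.08, "red": 0.05, "nir": 0.40, "swir": 0.20}

water_ndwi = ndwi(water["green"], water["nir"])
water_mndwi = mndwi(water["green"], water["swir"])
forest_ndvi = ndvi(forest["nir"], forest["red"])
forest_mndwi = mndwi(forest["green"], forest["swir"])
```

With these values the water pixel is positive on both water indices, while the forest pixel has a high NDVI and a negative MNDWI, matching the thresholds described in the text.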

3. Model Feature Importance (MFI) and ML Ensemble Models

MFI assesses how important each input feature is in predicting the target variable, which helps identify the features that matter most when building ML predictive models. Recursive Feature Elimination (RFE) was adopted because of its efficiency in model-driven feature selection and its ability to handle nonlinear relationships among features [16].

3.1. Recursive Feature Elimination Algorithm

Algorithm 1 describes the RFE algorithm in pseudocode.

Algorithm 1. RFE algorithm in pseudocode.

Input:
   I: feature matrix (q_samples x q_features)
   J: target vector
   model: ML model that exposes feature importances
   n: desired number of features to select
Initialize:
   feature_set ← all feature indices
   I_remaining ← I // start with the full dataset

while len(feature_set) > n: // continue until n features remain
   Train phase:
      model.fit(I_remaining, J)
   Ranking phase:
      importance_scores ← model.feature_importances
      ranked_features ← argsort(importance_scores) // ascending; positions within I_remaining
   Elimination phase:
      least_important ← feature_set[ranked_features[0]]
      feature_set ← feature_set \ {least_important}
      I_remaining ← I[:, feature_set]

Output:
   I_remaining: feature matrix with the top n features

RFE addresses multicollinearity challenges in model development. It helps to eliminate correlated features and ensures stability and interpretability of coefficients, enhancing the model’s generalizability.
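A runnable sketch of the elimination loop in Algorithm 1 is given below. To stay dependency-free it replaces the model's feature importances with a hypothetical stand-in score, the absolute Pearson correlation between each feature and the target; with a real model the importances would be recomputed by refitting after every removal. The data values and feature names are illustrative assumptions.

```python
from statistics import mean

def abs_correlation(column, target):
    """Stand-in importance score: |Pearson r| between one feature and the target."""
    mx, my = mean(column), mean(target)
    cov = sum((x - mx) * (y - my) for x, y in zip(column, target))
    var_x = sum((x - mx) ** 2 for x in column)
    var_y = sum((y - my) ** 2 for y in target)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5

def recursive_feature_elimination(I, J, names, n, importance=abs_correlation):
    """Drop the least important feature until only n remain."""
    remaining = list(range(len(names)))
    while len(remaining) > n:
        # Score every feature that is still in play.
        scores = {f: importance([row[f] for row in I], J) for f in remaining}
        least_important = min(remaining, key=scores.get)
        remaining.remove(least_important)
    return [names[f] for f in remaining]

# Hypothetical samples: columns are elevation (m), slope (deg), and a noise feature.
names = ["elevation", "slope", "noise"]
I = [[50, 2, 1], [52, 3, 5], [51, 2, 3], [80, 6, 3], [85, 5, 5], [90, 7, 1]]
J = [1, 1, 1, 0, 0, 0]  # 1 = flood, 0 = non-flood

selected = recursive_feature_elimination(I, J, names, n=2)
```

Here the noise column is uncorrelated with the labels by construction, so it is the feature removed.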

3.2. Methodology Flowchart and ML Models

This subsection describes the models used to build the ensemble models for the FSM methodology. The workflow begins with the extraction of the 18 variables from the data sources described in Section 2.2.2, followed by the preprocessing procedures shown in Figure 2a. All RS indices and terrain derivatives were normalized to a 0–1 range to enhance model generalization across varying terrains. A balanced dataset with equal numbers of flooded (1) and non-flooded (0) samples was created to reduce class imbalance and ensure robust training. Flood probability was then generated using the “predict_proba” function within the ensemble ML framework, providing pixel-level likelihoods that support a probabilistic, confidence-based flood susceptibility assessment. Three ML algorithms were used to develop the ensemble ML models: random forest (RF), support vector machines (SVM), and eXtreme Gradient Boosting (XGBoost, XGB).
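The 0–1 normalization step mentioned above is commonly implemented as min-max scaling. The sketch below assumes that form; the paper does not specify the exact scaling formula, and the elevation values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly to the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant layer carries no contrast; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical elevation samples in metres.
elevation_m = [40.0, 60.0, 90.0]
elevation_scaled = min_max_normalize(elevation_m)
```

After scaling, the lowest sample maps to 0 and the highest to 1, so layers with very different units (metres, index values, distances) become comparable inputs.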

3.2.1. Random Forest (RF) Model

The RF algorithm is a robust ML algorithm used for classification, regression, and predictive modeling. It is well suited to complex, nonlinear relationships and has been widely applied, including in FSM [13,16,55]. RF builds multiple decision trees on different data subsets and combines their outputs, improving accuracy and reducing overfitting. Let N denote an input sample, which might include features such as HandModel, slope, and NDVI. Each tree T_i provides a prediction y_i for N, and in the regression case the final predicted value Y is obtained by averaging the predictions from all the trees:

y_{i} = T_{i}(N)

(1)

Y = \frac{1}{x} \sum_{i = 1}^{x} y_{i}

(2)

where y_i is the predicted value from the i-th individual decision tree, T_i is the i-th decision tree, Y is the final predicted value from the RF model, and x is the total number of decision trees in the forest. For classification, the RF output Y is instead determined by majority voting:

Y = \mathrm{mode}\{T_{1}(N), T_{2}(N), T_{3}(N), \dots, T_{x}(N)\}

(3)

where “mode” signifies the most frequently predicted class among the trees.
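The two aggregation rules in Equations (2) and (3) can be compared on a small example. The per-tree outputs below are hypothetical numbers chosen for illustration, and the 0.5 threshold used to turn a tree's probability into a class vote is an assumption.

```python
from statistics import mean, mode

# Hypothetical outputs of five trees for one input sample N.
tree_probabilities = [0.9, 0.7, 0.4, 0.8, 0.6]

# Equation (2): average the individual tree predictions.
Y_average = mean(tree_probabilities)

# Equation (3): each tree casts a class vote and the most frequent class wins.
tree_votes = [1 if p >= 0.5 else 0 for p in tree_probabilities]
Y_vote = mode(tree_votes)
```

Four of the five trees vote for the flood class, so the majority vote is 1, while the averaged output retains the graded likelihood that a susceptibility map needs.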

3.2.2. Support Vector Machines (SVM) Model

SVM is a supervised ML technique commonly used in flood susceptibility mapping because it can handle high-dimensional and nonlinear data [16,19]. SVM separates flood and non-flood areas by constructing an optimal hyperplane that maximizes the margin between the two classes. Kernel functions extend SVM to classes that are not linearly separable, which makes it applicable in areas with complicated hydrology and geomorphology. Another significant advantage of SVM in hydrology and flood risk assessment is its robustness against overfitting [16,56]. The SVM decision function is:

J(I) = \mathrm{sign}\left(\sum_{i = 1}^{n} a_{i} j_{i} X(I, I_{i}) + b\right)

(4)

where I, I_i, and j_i are the input features, support vectors (critical data points), and labels of support vectors (e.g., flood/no-flood), respectively, while X(I, I_i) is the kernel function used in mapping the inputs into a higher-dimensional space. Lastly, a_i are Lagrange multipliers, depicting the support vectors’ importance, while b is the bias term.
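To make Equation (4) concrete, the sketch below evaluates the decision function with a radial basis function (RBF) kernel. The kernel choice, the value of gamma, the two support vectors, and their multipliers are all hypothetical; the labels use the usual SVM convention of +1 (flood) and -1 (non-flood) instead of the 1/0 coding used for the inventory points.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """RBF kernel X(u, v) = exp(-gamma * ||u - v||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

def svm_decision(I, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Equation (4): sign of the kernel-weighted sum over support vectors plus bias."""
    score = sum(a * j * kernel(I, sv)
                for a, j, sv in zip(alphas, labels, support_vectors)) + b
    return 1 if score >= 0 else -1

# Hypothetical trained parameters (two normalized features per point).
support_vectors = [(0.1, 0.1), (0.9, 0.9)]
labels = [1, -1]       # +1 = flood, -1 = non-flood
alphas = [1.0, 1.0]    # Lagrange multipliers
bias = 0.0

near_flood_sv = svm_decision((0.2, 0.1), support_vectors, labels, alphas, bias)
near_dry_sv = svm_decision((0.8, 0.95), support_vectors, labels, alphas, bias)
```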

3.2.3. Extreme Gradient Boosting (XGBoost) Model

XGBoost, referred to as XGB in this study, was introduced by Chen et al. in 2016 [57]. XGB utilizes the gradient boosting technique to combine multiple classification and regression trees for modeling purposes. It is an ensemble method that boosts decision trees iteratively and consists of three key components: shrinkage and column subsampling to prevent overfitting, gradient tree boosting for additive training, and a regularized objective function aimed at enhancing generalization. XGB is often noted for its strong performance and efficiency, especially in applications related to flood hazards. More information about XGB can be found in [57,58].

3.3. Ensemble Modeling

Ensemble ML models integrate two or more algorithms to leverage their complementary strengths, thereby improving prediction accuracy and generalization in flood susceptibility mapping [59]. In this study, independent models (i.e., RF_model, SVM_model, and XGB_model) were trained separately, and their predictions were combined using a meta-learning model to produce the final prediction. The formulation is given below, where Equation (5) is the feature matrix X supplied to the base learners.

X \in \mathbb{R}^{n \times d} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1d} \\ x_{21} & x_{22} & \cdots & x_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nd} \end{bmatrix}

(5)

where n and d are the number of samples and features, respectively. Let f_1, f_2, and f_3 denote the independent base-learner models RF, SVM, and XGB, respectively, and let p_1, p_2, and p_3 be the column vectors of predictions made by the respective base models on the dataset X. Equation (6) gives the prediction column vector of each base model, Equation (7) stacks these vectors into the prediction matrix P, and Equation (8) is the meta-learning weight vector W.

p_{k} = f_{k}(X) \in \mathbb{R}^{n} = \begin{bmatrix} p_{k,1} \\ p_{k,2} \\ \vdots \\ p_{k,n} \end{bmatrix}, \quad k = 1, 2, 3

(6)

P = [p_{1} \mid p_{2} \mid p_{3}] \in \mathbb{R}^{n \times 3}

(7)

W \in \mathbb{R}^{3} = \begin{bmatrix} w_{1} \\ w_{2} \\ w_{3} \end{bmatrix}

(8)

The final ensemble model prediction vector is \tilde{y} = P W + b 1_{n}, where w_1, w_2, and w_3 are the weights for RF, SVM, and XGB, respectively; b is the bias (or intercept) term, a scalar added to every predicted value that shifts the entire prediction vector up or down; and 1_n is the n-vector of ones. For each sample i, this gives Equation (9):

\tilde{y}_{i} = w_{1} p_{1,i} + w_{2} p_{2,i} + w_{3} p_{3,i} + b

(9)
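Equation (9) amounts to a weighted sum of the three base-model predictions for each sample. The following sketch evaluates it for two samples; the base-model probabilities, weights, and bias are hypothetical values, since the fitted meta-learner parameters are not reported in this section.

```python
def ensemble_predict(P, W, b):
    """Compute y~ = P W + b 1_n, one weighted sum per row (sample) of P."""
    return [sum(w * p for w, p in zip(W, row)) + b for row in P]

# Each row holds the RF, SVM, and XGB predictions for one sample (hypothetical).
P = [
    [0.90, 0.80, 0.95],  # sample 1: all base models indicate high susceptibility
    [0.10, 0.30, 0.20],  # sample 2: all base models indicate low susceptibility
]
W = [0.4, 0.2, 0.4]  # hypothetical meta-learner weights w1, w2, w3
b = 0.0              # hypothetical bias term

y_tilde = ensemble_predict(P, W, b)
```

Because the weights are learned by the meta-model, a base learner that generalizes poorly on a given site can be down-weighted without being removed from the ensemble.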

Three ensemble models (Esm-1, Esm-2, and Esm-3) were developed, one for each of the training geographical areas (I, II, and III) shown in Figure 1b. Separate models were developed because each site was treated as an independent training domain. Assigning one model per site allowed us to systematically evaluate how well each model performed not only on its corresponding local site but also on geographically distant test sites. This design enabled a rigorous assessment of model transferability and prediction accuracy across varying spatial contexts.

3.4. Numerical Modeling

Numerical models offer valuable insights into flood dynamics and enhance understanding of the potential impacts of flooding. HEC-RAS Two-Dimensional (2D), version 6.6, was used to generate a rectangular computational mesh comprising 634,942 cells to represent the study area, without input from either the RS or ML outputs to aid in flood risk assessment, obtain spatial information on flood inundation, create a watershed geometry, and simulate hydraulic behavior [24,25]. Hydraulic data are essential inputs for 2-D flow simulation; therefore, flow hydrographs, upstream boundary conditions, and Manning’s roughness coefficient (n) were incorporated to represent the river–floodplain hydraulics. The upstream boundary was defined using a 100-year return period flood hydrograph with a peak discharge of 5980 m³/s obtained from Ottawa River hydrometric station (see Section Computation Procedures). The Manning’s coefficient, which characterizes flow resistance in natural channels, typically ranges from 0.03 to 0.05 in settings with moderate bed irregularities or sparse vegetation [60]. In the absence of spatially calibrated roughness data or detailed information, a uniform value of n = 0.035 was adopted across the model domain—consistent with documented values for moderately rough natural channels [60,61].

Although spatially variable or site-specific roughness could enhance accuracy, the use of a uniform n supports numerical stability and provides a robust baseline for preliminary flood extent modeling in morphologically similar environments. This approach aligns with previous HEC-RAS 2D studies that achieved reliable hydraulic predictions without field calibration [24,25,62,63]. Future work will incorporate observed data and test alternative Manning’s n values to assess sensitivity and improve accuracy. In this study, a numerical model was developed using HEC-RAS 2D to simulate the 2019 flood event and derive the spatial distribution of water depth and flow velocity. The model is governed by the two-dimensional Saint-Venant equations, which describe the conservation of mass and momentum in unsteady open-channel flow [63,64].

\frac{\partial h_{ws}}{\partial t} + \frac{\partial (h u)}{\partial x} + \frac{\partial (h v)}{\partial y} + Q = 0

(10)

\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + v \frac{\partial u}{\partial y} = - g \frac{\partial h_{ws}}{\partial x} + v_{t} \left( \frac{\partial^{2} u}{\partial x^{2}} + \frac{\partial^{2} u}{\partial y^{2}} \right) - c_{f} u + f v

(11)

\frac{\partial v}{\partial t} + u \frac{\partial v}{\partial x} + v \frac{\partial v}{\partial y} = - g \frac{\partial h_{ws}}{\partial y} + v_{t} \left( \frac{\partial^{2} v}{\partial x^{2}} + \frac{\partial^{2} v}{\partial y^{2}} \right) - c_{f} v + f u

(12)

Equation (10) is the continuity equation, and Equations (11) and (12) are the momentum equations in the x and y directions. Here, h_{ws} is the water surface elevation, h is the water depth, u and v are the flow velocity components in the x and y directions, respectively, g is the gravitational acceleration, Q is the source/sink flux term, v_{t} is the horizontal eddy viscosity coefficient, c_{f} is the bottom friction coefficient, and f is the Coriolis parameter.

3.5. Estimating Flood Risk at 100-Year Return Periods

This section estimates the flood magnitude corresponding to a 100-year return period to support the flood risk assessment. Such estimates are critical because they quantify flood risk by expressing the statistical likelihood of floods of varying magnitudes over time. A 100-year return period corresponds to a 1% (0.01) probability that a flood of that magnitude is equaled or exceeded in any given year; here it is applied to the Ottawa–Gatineau region.
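The link between a return period and its annual probability can be made concrete with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the multi-year figure assumes that annual floods are statistically independent, which is a common simplification and not a result of this study.

```python
def annual_exceedance_probability(return_period_years: float) -> float:
    """Annual probability that the T-year flood is equaled or exceeded (1 / T)."""
    if return_period_years <= 1.0:
        raise ValueError("return period must be greater than 1 year")
    return 1.0 / return_period_years


def probability_within_horizon(return_period_years: float, horizon_years: int) -> float:
    """Probability of at least one exceedance in `horizon_years`.

    Assumes independent years: 1 - (1 - 1/T) ** horizon.
    """
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


p_annual = annual_exceedance_probability(100)       # 0.01 for the 100-year flood
p_30_years = probability_within_horizon(100, 30)    # roughly 0.26 over 30 years
```

Under this assumption, a flood with a 1% annual probability has roughly a one-in-four chance of occurring at least once over a 30-year horizon, which is why the 100-year event remains relevant for planning.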

Computation Procedures

Annual peak discharge (m³/s) data covering 1960 to 2024 were obtained from the hydrometric station nearest to the study area (Ottawa River, station ID 02KF005; Lat. 45°21′04″ N, Long. 75°49′35″ W; https://wateroffice.ec.gc.ca/map/index_e.html?type=historical, accessed on 15 May 2025). The data were ranked in descending order, assigning Rank 1 to the highest flow and Rank n to the lowest. The return period for each rank was then calculated using the Weibull formula:

T = \frac{n + 1}{m}

(13)

where T is the return period (years), n is the total number of years of record, and m is the rank of the event. A Gumbel distribution line (GDL) was fitted to the ranked data to estimate the discharge corresponding to a given return period. The GDL equations are:

x_{T} = \bar{x} + K_{T} \cdot S

(14)

K_{T} = - \frac{\sqrt{6}}{\pi} \left[ \ln \left( \ln \left( \frac{T}{T - 1} \right) \right) \right]

(15)

Substituting Equation (15) into Equation (14) with T = 100 gives Equation (16):

x_{100} = \bar{x} + \left( - \frac{\sqrt{6}}{\pi} \left[ \ln \left( \ln \left( \frac{100}{100 - 1} \right) \right) \right] \right) \cdot S

(16)

where x_{T} is the estimated discharge for return period T, \bar{x} is the mean of the annual maximum flows, and S is the standard deviation of the series; Equation (16) gives the 100-year flood discharge.
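The computation procedure in Equations (13)–(16) can be sketched as follows. This is a minimal illustration, not the authors' script: the function names are hypothetical, the sample standard deviation is an assumption (the text does not say whether the sample or population form was used), and the flow values are invented for demonstration. Note that the common textbook form of the Gumbel frequency factor adds the Euler–Mascheroni constant (about 0.5772) inside the bracket; the sketch follows Equation (15) as printed by default and exposes the textbook form as an option.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def weibull_return_periods(annual_peaks: Sequence[float]) -> List[Tuple[float, float]]:
    """Rank flows in descending order and apply T = (n + 1) / m (Equation (13))."""
    ranked = sorted(annual_peaks, reverse=True)
    n = len(ranked)
    return [(flow, (n + 1) / rank) for rank, flow in enumerate(ranked, start=1)]


def gumbel_frequency_factor(T: float, include_euler_constant: bool = False) -> float:
    """Frequency factor K_T as printed in Equation (15).

    With include_euler_constant=True, 0.5772 is added inside the bracket,
    which is the form found in many hydrology textbooks.
    """
    reduced = math.log(math.log(T / (T - 1.0)))
    if include_euler_constant:
        reduced += 0.5772
    return -math.sqrt(6.0) / math.pi * reduced


def gumbel_discharge(
    annual_peaks: Sequence[float], T: float, include_euler_constant: bool = False
) -> float:
    """Discharge for return period T: x_T = mean + K_T * S (Equation (14))."""
    mean = statistics.mean(annual_peaks)
    std = statistics.stdev(annual_peaks)  # sample standard deviation (assumption)
    return mean + gumbel_frequency_factor(T, include_euler_constant) * std


# Invented annual peak flows (m3/s), for illustration only.
example_peaks = [3200.0, 4100.0, 2900.0, 5100.0, 3800.0, 4500.0]
ranked_periods = weibull_return_periods(example_peaks)
x_100 = gumbel_discharge(example_peaks, T=100)
```

Because the two forms of K_T differ, the choice should be stated explicitly whenever the resulting design discharges are compared across studies.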

3.6. Model Performance Metrics

In flood susceptibility and geohazard studies, the F1-score and AUC are standard performance metrics [16,65] because they provide balanced evaluations of predictive accuracy in the presence of uneven class distributions, while Cohen’s kappa coefficient (Kappa) often provides stronger evidence that a model yields reliable spatial flood predictions [55,65,66]. The F1-score and AUC take values between 0 and 1, where 1 denotes a perfectly accurate model; for AUC, 0.5 corresponds to performance equivalent to random prediction [16]. Kappa has an upper bound of 1 (perfect agreement), and a value of 0 indicates agreement no better than chance.

AUC = \frac{\sum TP + \sum TN}{P + N}, \quad F1\text{-}score = \frac{2 TP}{2 TP + FP + FN}, \quad Kappa = \frac{P_{o} - P_{e}}{1 - P_{e}}

(17)

where TP, TN, FP, and FN are true positives, true negatives, false positives, and false negatives, respectively. P denotes the number of flood (presence) samples, while N denotes the number of non-flood (absence) samples. P_o and P_e are the observed agreement and the hypothetical probability of chance agreement, respectively.
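The quantities in Equation (17) can be computed directly from the four confusion-matrix counts. The sketch below uses hypothetical function names and invented counts. It evaluates the first expression exactly as printed, which coincides with the overall proportion of correctly classified samples; a threshold-free ROC-based AUC would instead require the predicted probabilities.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1-score = 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


def ratio_as_printed(tp: int, tn: int, fp: int, fn: int) -> float:
    """First expression of Equation (17): (TP + TN) / (P + N)."""
    positives = tp + fn  # P: observed flood samples
    negatives = tn + fp  # N: observed non-flood samples
    return (tp + tn) / (positives + negatives)


def cohen_kappa(tp: int, tn: int, fp: int, fn: int) -> float:
    """Kappa = (P_o - P_e) / (1 - P_e) for a binary confusion matrix."""
    n = tp + tn + fp + fn
    p_o = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and observations.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Invented counts for illustration only.
tp, tn, fp, fn = 45, 40, 10, 5
metrics = {
    "f1": f1_score(tp, fp, fn),
    "ratio": ratio_as_printed(tp, tn, fp, fn),
    "kappa": cohen_kappa(tp, tn, fp, fn),
}
```

With these counts, Kappa is noticeably lower than the other two values, which illustrates why it is treated here as the stricter indicator.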

4. Results and Discussion

4.1. Model Performance Comparison and Validation

Model performance was assessed using three key metrics (F1-score, AUC, and Kappa) to evaluate predictive effectiveness. The RF, SVM, XGB, and ensemble models (Esm-1, Esm-2, Esm-3) showed varying metric values, influenced by differences in data structure, RFE input features, responses to class imbalance and feature sensitivity, and hyperparameter configurations.

Listing every model-specific hyperparameter would require reproducing large sections of Python code (version 3.10.16), since each classifier relies on a unique set of optimized parameters. For brevity, only the hyperparameter-tuning strategies common to all models are reported: grid search (GridSearchCV), Bayesian optimization, and cross-validated tuning [16]. These approaches search the hyperparameter space systematically, supporting robust model performance and unbiased model selection. The models were compared on the same test dataset, which helps to identify the best-performing model for deployment (see Figure 3).

The results indicate that the models are highly effective in distinguishing between flooded and non-flooded areas based on feature importance, with all model predictions yielding metric values over 0.972. Notably, all ensemble models achieved higher metric values than XGB and RF, while SVM exhibited the lowest performance. The F1-score and AUC indicated effective classification of flooded and non-flooded areas; however, the Kappa statistic provided stronger evidence of model reliability in spatial flood prediction [55]. The ensemble approach produced accuracy metrics comparable to, and in some cases exceeding, those reported in flood susceptibility studies across Asia, South America, and Europe, where typical AUC values range from 0.85 to 0.95 [55,65,67,68]. The superior performance observed is likely due to (i) the inclusion of 18 FCFs representing terrain, hydrological, and RS-based factors; (ii) RFE-based feature optimization; and (iii) training across three geographically distinct sub-regions, enhancing model transferability. Given the consistent outperformance of ensemble models, subsequent analyses focused on ensemble methods applied to the dataset derived from the randomly selected EVA categories (A, B, D, and E), as shown in Figure 1c.

Figure 4 presents a bar chart summarizing the EVA prediction performance metrics. All ensemble models achieved metric values exceeding 0.850, indicating strong predictive performance, with AUC scores ranging from 0.970 to 0.999, F1-scores between 0.930 and 0.988, and Kappa scores from 0.860 to 0.999.

Visual observation indicates that the ensemble models achieved slightly higher predictive accuracy in EVAs located closer to the main study area (B, D, and E). The exception is area A, where higher-elevation terrain may have influenced model behavior. Across all EVAs, AUC and F1-scores remained consistently close to 1.0, reflecting strong overall performance. Some variability in Kappa values was observed, but this is expected given the metric’s sensitivity in flood susceptibility mapping; these fluctuations reflect natural differences across independent test samples rather than model overfitting.

Similar patterns of high AUC or F1 and slightly lower Kappa values have been reported in recent flood susceptibility studies, supporting the reliability of our results [55,65]. All metrics indicate strong predictive performance beyond the training regions. Notably, Esm-3 outperformed the other models despite its smaller training dataset, likely due to higher data quality and stronger feature relevance in Sub-region III. Key predictors—Elevation, Slope, and HandModel—showed clearer variability and lower noise, resulting in larger SHAP contributions than in Sub-region II (Figure 5 and Figure 6). This stronger feature signal improved Esm-3’s accuracy, demonstrating that data quality and feature informativeness can outweigh dataset size in flood susceptibility modeling.

4.2. Models’ Uncertainties and Limitations

While RF, XGBoost, and SVM are widely recognized for their strong predictive performance in flood susceptibility modeling, each model carries inherent uncertainties and limitations [7,10]. RF and XGBoost can be sensitive to noisy or highly correlated predictors, which may introduce instability in feature importance estimates and lead to overfitting if hyperparameters are not properly tuned [10,59]. XGBoost in particular can overfit small or imbalanced datasets without careful regularization [56,57]. SVM, while effective for nonlinear classification, depends heavily on kernel choice and parameter settings, and its performance can degrade when classes are highly imbalanced or when feature scaling is sub-optimal [7]. All three models are, to varying degrees, “black-box” learners with complex internal interactions that introduce uncertainty, particularly in spatially heterogeneous flood environments [16,59]. These limitations underscore the need for strong model validation, interpretability tools like SHAP, and ensemble methods to reduce individual biases and improve overall predictive reliability [19,22].

4.3. Model Feature Importance Assessment

SHAP was applied as a post-modeling tool to evaluate the contribution of each FCF to flood susceptibility, thereby addressing the common “black box” challenge of ML-based modeling [16,55]. SHAP plots provide a rich overview of how the input variables affect the model’s predictions across the dataset. Figure 5 and Figure 6 present the SHAP-based analyses for the Esm-1, Esm-2, and Esm-3 models.

Figure 5 and Figure 6 provide an integrated synthesis of all FCFs and their combined impact on flood susceptibility. Figure 5 shows the SHAP summary (beeswarm) plots, illustrating each factor’s contribution, variability, and direction of influence across all samples, with red and blue indicating higher and lower feature values, respectively. Because the beeswarm plots are visually dense, Figure 6 complements them with SHAP bar plots that rank the FCFs by their maximum absolute SHAP values, offering a clearer comparison and quantification of each factor’s influence.

The x-axis in both plots represents the SHAP values, while the y-axis displays the input variables (the FCFs). Across the SHAP plots, Elevation, HandModel, MNDWI, NDWI, and Aspect consistently emerged as the most influential predictors in all ensemble models (Esm-1, Esm-2, Esm-3), with LC and Water Index remaining important in Esm-1 and Esm-2, and NDVI exerting consistent influence in Esm-2 and Esm-3. Several studies have applied similar FCFs and reported comparable feature rankings, particularly the dominant roles of elevation, proximity to rivers, and land cover (LC) [9,16,36,55]. The strong contribution of HAND shown in the SHAP plots is consistent with Hu et al. [43] and others who emphasized its effectiveness in representing floodplain connectivity, while the importance of water-related spectral indices (NDWI, MNDWI) aligns with findings by Amarnath et al. [17], Sajjad et al. [18], and Kafi et al. [19], who showed that these indices markedly improve water–non-water discrimination in urban and mixed landscapes. Collectively, eight FCFs appear to be the key determinants of flood susceptibility across the Ottawa–Gatineau watershed, while additional variables such as TRI, Roughness, Curvature, and Moisture also contributed meaningfully, though in more model-specific contexts. Overall, twelve FCFs showed varying importance across the three models, while others had minimal or insignificant influence (Figure 6; see Section 2.2.2 for further discussion of the geographic context). Nine FCFs made major contributions, whereas the “sum of 9 other features” reflects the aggregated impact of predictors with very low SHAP values. Because these values fell below the visualization threshold, the SHAP function automatically grouped them to summarize their combined but limited contribution.

4.4. Flood Susceptibility for Ottawa–Gatineau Sub-Region

Although the three ensemble models each achieved high accuracy with only minimal prediction differences across sub-regions, they were subjected to an additional round of ensemble model development, as described in Section 3.3, to produce a single integrated ensemble model (ESM) capable of deployment across all geographical areas.

This approach leverages the complementary strengths of individual models to produce a generalized Ensemble Susceptibility Model (ESM), whose prediction results are presented in Figure 5 and Figure 6. The ESM reduces bias and provides a more robust, generalized framework for achieving consistent flood susceptibility predictions across diverse regions. The flood susceptibility maps shown in Figure 7 were generated using the single ESM. Panels (a–d) present the spatial distribution of flood susceptibility, classified into five quantile-based categories: very low (0–0.2), low (0.2–0.4), moderate (0.4–0.6), high (0.6–0.8), and very high (0.8–1.0) [12,57]. This study highlights the critical importance of model transferability (Figure 7), demonstrating the ability to predict flood susceptibility in regions beyond the original training area. The final integrated flood susceptibility map (Figure 8) was produced by merging sub-regional outputs, providing a comprehensive overview of flood-prone zones across the watershed. The map legend in Figure 7 also applies to Figure 8a. The flood inundation boundary, delineating the spatial extent of flooding (Figure 8a), was derived using HEC-RAS 2D to address the limitations of the ESM-ML approach in the FSM study. Visual inspection of Figure 8a indicates that the southern and southwestern areas exhibit the highest flood vulnerability, while Figure 8b presents a quantitative, percentage-based analysis of flood susceptibility classes for the ESM across the entire region of interest.
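The five susceptibility categories and the percentage summary of the kind reported in Figure 8b can be sketched as below. The function names and sample probabilities are hypothetical, and the handling of values that fall exactly on a class boundary (assigned here to the higher class) is an assumption, since the text does not specify it.

```python
from bisect import bisect_right
from collections import Counter
from typing import Dict, Sequence

CLASS_NAMES = ("very low", "low", "moderate", "high", "very high")
UPPER_BOUNDS = (0.2, 0.4, 0.6, 0.8)  # class breaks stated in the text


def susceptibility_class(probability: float) -> str:
    """Map a susceptibility value in [0, 1] to one of the five categories."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("susceptibility must lie in [0, 1]")
    # bisect_right places a boundary value (e.g., exactly 0.2) in the higher class.
    return CLASS_NAMES[bisect_right(UPPER_BOUNDS, probability)]


def class_percentages(probabilities: Sequence[float]) -> Dict[str, float]:
    """Share of cells (in %) falling in each susceptibility class."""
    counts = Counter(susceptibility_class(p) for p in probabilities)
    total = len(probabilities)
    return {name: 100.0 * counts.get(name, 0) / total for name in CLASS_NAMES}


# Invented cell values for illustration only.
shares = class_percentages([0.1, 0.3, 0.9, 0.95])
```

Applied to every raster cell of a susceptibility map, the same two steps yield the per-class area percentages for a region of interest.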

Figure 8b shows that approximately 13–23% of the Ottawa–Gatineau sub-region of interest may be at very high risk of flooding. The SHAP analysis indicates that the distribution of these high-risk zones is largely driven by the key factors highlighted in Section 4.3, whose contributions to flood susceptibility were discussed in Section 2.2.2. Furthermore, this study demonstrates the potential to support policymakers and decision-makers in developing targeted flood mitigation strategies, guiding land-use planning, and enhancing infrastructure resilience.

4.5. Flood Frequency Analysis and 100-Year Return Period in the Ottawa–Gatineau Sub-Region

Flood frequency analysis (FFA), which applies statistical models to historical data, is used to estimate flood occurrence probabilities and the likelihood of different flood magnitudes over time. The FFA results served as an independent hydrological validation, providing physically based insights into flood magnitude and recurrence (e.g., 100-year return periods). Its role here is complementary: it supplements the FSM rather than replacing it. Figure 9a,b show the flood frequency curves for the 100-year flood return period across two different time spans.

This has critical policy implications, as infrastructure such as bridges, dams, and residential areas built under outdated design standards along the region of interest may no longer be adequately protected. Given the influence of climate change and evolving river dynamics on flood frequency and magnitude, regularly updating flood risk models is essential for effective planning and resilience. Table 1 shows the flood flow estimation result for different annual exceedance probabilities for 1960 to 2024 using the Gumbel distribution model in Section Computation Procedures. As the return period increases, the estimated discharge increases, reflecting the greater severity of infrequent flood events. This trend is expected, as low-frequency events represent the upper tail of the flood distribution that drives design and hazard assessments.

4.6. Ottawa–Gatineau Sub-Region Flood Simulation

The HEC-RAS 2D simulation offers detailed and realistic hydraulic outputs, including water velocity and depth, although it requires greater processing time and computational resources than the ESM-based ML model.

The 2019 annual maximum and minimum daily discharge (m³/s) recorded at the Ottawa River hydrometric station were used in estimating the 100-year return period flood. The 2019 data were selected over 2017 because the higher peak discharge represented a more extreme hydrological event and provided a stronger basis for calibrating flood extents.

Using the most severe discharge enhances model robustness and supports conservative, risk-averse flood hazard assessment. Figure 10 presents the flood depth map visualized in HEC-RAS 2D. To enhance clarity and interpretability, the flood depth and other hydrodynamic outputs were further re-visualized in QGIS (see Figure 11 and Figure 12), which offers more advanced cartographic, symbology, and layout tools, including improved axis labeling, spatial analysis capabilities, and high-quality visualization.

While Ottawa experienced severe flooding, additional flood-affected zones are observed approximately within the longitude ranges 76.32°–76.05° W and 75.78°–75.65° W. According to Table 2, a flood scenario with a 1% (0.01) annual probability poses a high risk of severe impacts, including potential loss of life and widespread property damage.

HEC-RAS 2D simulations revealed spatial patterns of flood depth and velocity that align with expected hydraulic behavior, with higher velocities occurring in upstream or steeper, confined channels and deeper inundation forming across low-relief floodplains. These trends are consistent with observations reported for the Krishna River Basin [24] and with comparative hydrodynamic evaluations by Djafri et al. [25] and Merwade et al. [44]. The modeled flood depths (Figure 11), which range from approximately 9.02 m to 15 m under longer return-period inflows, indicate substantial water accumulation in low-lying or poorly drained areas and possible overtopping of the Ottawa River during the 2019 extreme rainfall event. Depth–velocity interactions follow expected dynamics: water spreading across broad floodplains exhibits lower velocities (Figure 12), whereas confined channels or steeper terrain can produce depths of ~7.61–9.02 m accompanied by increased hydraulic head and faster flows. Peak velocities between 6 and 15 m/s (Figure 12) reflect the strong influence of topography and the high-energy runoff generated during the 2017 and 2019 floods [5,69,70]. Velocities below ~1.34 m/s signify slow-moving water with greater potential for infiltration or temporary storage, while the extreme velocities observed in some reaches pose risks of erosion, sediment transport, and infrastructure damage. Overall, the varying intensities of flood depth and velocity are likely due to differences in topography and hydraulic conditions. Both flood events (2017 and 2019) approached 50- to 100-year recurrence intervals, reinforcing their capacity to induce severe geomorphic impacts and suggesting the value of mitigation measures such as widening narrow channel sections to reduce localized erosion hazards.

Flood Vulnerability Assessment

Flood vulnerability was assessed using post-processed HEC-RAS 2D outputs in Figure 11, with floodwater depth as the key variable. Floodwater depths were classified into five hazard categories following the Japan Ministry of Land, Infrastructure, and Transport (MLIT) [71,72]. While Japan differs geographically from Canada (e.g., in climate and urban density), both countries experience major riverine flooding driven by snowmelt and intense rainfall. In Canada, for instance, the Ottawa–Gatineau floods of 2017 and 2019 were triggered by combined snowmelt and rainfall [5,70], mirroring flood-generating mechanisms observed in many Japanese basins [73,74,75]. Japan’s MLIT developed its flood depth hazard classification from decades of integrated hydrological modeling, long-term observations, and post-disaster analyses that capture snowmelt- and rainfall-driven flood dynamics [71,72,76]. Table 2 presents the flood depth hazard (H_z) categories and their corresponding remarks following the MLIT framework.

To corroborate the MLIT flood depth hazard classification, several independent studies have established comparable depth–velocity thresholds for flood hazard assessment under diverse hydrological settings. For instance, Alvarez et al. [77] conducted a two-dimensional dam-break flood analysis in Mozambique using twenty real-time kinematic global navigation satellite system (RTK-GNSS) ground control points, a turbulent unsteady-flow numerical model, and SRTM DEM data. They reported low hazard conditions when h < 0.5 m (or vel < 0.5 m/s), moderate when 0.5 m ≤ h ≤ 1 m (or 0.5 m/s ≤ vel ≤ 1 m/s), and high when h > 1 m (or vel > 1 m/s). Similarly, Khaing et al. [78] applied a coupled two-dimensional hydrology–inundation model using field observations and satellite remote sensing to assess flood hazards in Nyaungdon, Myanmar, reporting low hazard at h ≤ 0.5 m, moderate at 0.5 m < h ≤ 1 m, high at 1 m < h ≤ 2 m, and very high at 2 m < h ≤ 3 m. Etan et al. [79], using the FLO-2D hydrodynamic model in Dire Dawa, Ethiopia, identified analogous thresholds: low hazard when h < 0.5 m, moderate when 0.5 m ≤ h ≤ 1.5 m, and high when h > 1.5 m. The consistency across these selected studies around the globe reinforces the applicability of the MLIT-defined ranges for classifying flood hazard severity.
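Threshold-based hazard rules of this kind are straightforward to apply to model outputs. The sketch below encodes only the depth and velocity limits quoted above for Alvarez et al. [77]; it is not the MLIT classification in Table 2. The function names are hypothetical, and taking the more severe of the depth-based and velocity-based classes is an assumption about how the "or" in the quoted rule is combined.

```python
HAZARD_LEVELS = ("low", "moderate", "high")


def _level(value: float, lower: float = 0.5, upper: float = 1.0) -> int:
    """Return 0, 1, or 2 for value < lower, lower <= value <= upper, value > upper."""
    if value < lower:
        return 0
    if value <= upper:
        return 1
    return 2


def hazard_class(depth_m: float, velocity_ms: float = 0.0) -> str:
    """Classify a cell from depth (m) and velocity (m/s).

    Thresholds follow the values quoted for Alvarez et al. [77]; the more
    severe of the two single-variable classes is returned (assumption).
    """
    if depth_m < 0 or velocity_ms < 0:
        raise ValueError("depth and velocity must be non-negative")
    return HAZARD_LEVELS[max(_level(depth_m), _level(velocity_ms))]


# Invented (depth, velocity) pairs for illustration only.
examples = [hazard_class(0.3, 0.2), hazard_class(0.7, 0.2), hazard_class(0.3, 1.5)]
```

Swapping in a different set of thresholds, such as a region-specific scheme, only requires changing the `lower` and `upper` limits or extending the list of levels.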

The findings suggest that the MLIT flood depth hazard classification aligns with empirically derived thresholds from different hydrodynamic investigations around the world. However, it is important to note that, at the time of this study, no standardized Canadian flood depth–based hazard classification framework applicable across the study area was identified. Consequently, the MLIT system was adopted as a standardized and empirically grounded proxy for interpreting floodwater depth severity, complementing rather than replacing existing Canadian hydrological practices. Nonetheless, applying MLIT thresholds within the Canadian context may introduce some inherent uncertainties. Future research should explore region-specific flood depth hazard classifications to strengthen contextual relevance.

5. Conclusions

FSM is essential for proactive disaster planning and timely response. This study advances the understanding and prediction of flood hazards over the Ottawa–Gatineau area by developing a generalized and transferable ensemble model (ESM) that integrates remote sensing data and machine learning algorithms, together with an independent hydraulic model (HEC-RAS 2D) for numerical evaluation. Performance metrics, including F1-score, AUC, and Kappa, ranged from 0.85 to 0.99. While F1 and AUC values (0.93–0.99) were slightly higher, Kappa (0.85–0.99) is considered a more robust indicator of spatial prediction quality. The successful transfer of the trained ESM to independent sub-regions highlights strong model generalizability and suggests that physically meaningful flood conditioning factors, rather than sheer training data volume, are the primary drivers of predictive performance. This transferability underscores the potential of the proposed framework for operational flood susceptibility mapping in partially monitored basins across Ontario and other regions of Canada, particularly in data-limited contexts.

The SHAP-based explainability analysis provides critical insights into model behavior and hydrological realism. SHAP analysis further identified eight key conditioning factors (Section 4.3), of which five (elevation, HandModel, MNDWI, NDWI, and aspect) are predominant and consistently influence flood susceptibility in the Ottawa watershed along the Ottawa–Gatineau area. Importantly, the analysis also revealed how secondary terrain and land surface attributes contribute conditionally, depending on regional geomorphology and data quality, thereby offering a nuanced understanding of spatial flood drivers. The percentage quantification of flood susceptibility reveals that approximately 13–23% of the study area may be at very high risk of flooding. The flood frequency analysis reveals that extreme floods in the Ottawa–Gatineau sub-region may occur more frequently, with water discharge of about 6000 m³/s and at least a 1% annual probability.

The flood hydrodynamic analysis by HEC-RAS 2D hydraulic model shows varying intensities in floodwater depth and velocity values attributed to topographic differences and hydraulic conditions across the Ottawa–Gatineau sub-region of interest, with higher flow velocities typically occurring in upstream areas or in locations with confined channels. Following MLIT’s flood depth hazard classifications employed in this study, the floodwater depth value of over 2 m indicates a high risk of severe flood hazard, including potential loss of life and widespread property damage. It is worth noting that a comprehensive flood risk assessment requires integrating RS, ML, and hydraulic modeling, as the delineation of flood inundation boundaries illustrated in Figure 8a was only achievable through hydraulic modeling, which can enable hydrologists and engineers to identify floodway and flood fringe areas. Beyond hydraulic simulation, this study shows that independent derivation of flood extents with HEC-RAS 2D and integration of RS and ML significantly enhances predictive capability.

The study further demonstrates the potential of these models to support policymakers in developing targeted flood mitigation strategies and strengthening infrastructure resilience. Despite their effectiveness in identifying flood severity and hydrodynamic zones, a key limitation remains the high computational cost of running high-resolution HEC-RAS 2D simulations, mainly for large watersheds or extended time-series predictions. Future work will focus on developing an ensemble framework that integrates RS indices, a hydrological model (SWAT), and hydraulic model outputs within machine learning approaches for enhanced flood prediction. This framework will also target the estimation of hydrodynamic variables, such as depth and velocity, as a less computationally demanding alternative. The research hopes to enhance the applicability of the model, regardless of computational resources or the regional variation in floodplain characteristics.

Author Contributions

Conceptualization, T.S.O. and D.C.; methodology, T.S.O.; validation and formal analysis, T.S.O., D.C. and H.M.; investigation, T.S.O.; data curation, T.S.O. and H.M.; writing—original draft preparation, T.S.O.; writing—review and editing, all authors; visualization, T.S.O.; supervision, D.C.; project administration, D.C. and H.M.; funding acquisition, D.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Office of the Vice-Principal Research (VPR) Fund of Queen’s University, ON, Canada, and the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant.

Institutional Review Board Statement

Not Applicable.

Informed Consent Statement

Not Applicable.

Data Availability Statement

Flood hazard, Ottawa watershed, and training area polygon datasets were sourced from Natural Resources Canada (NRCan). The flood conditioning factors data are extracted from satellite remote sensing data, which are available in a publicly accessible repository at https://apps.sentinel-hub.com/eo-browser (accessed on 6 April 2025). The 1 m resolution elevation data for the provinces of Quebec and Ontario can be requested from the NRCan. The 30 m resolution elevation data was derived from a digital elevation model obtained from the SRTM DEM https://earthexplorer.usgs.gov (accessed on 29 April 2025). The hydrometric data was obtained online from https://wateroffice.ec.gc.ca/map/index_e.html?type=historical (accessed on 10 July 2025). Rivers, tributaries and stream networks (in shape file format) were derived from the 1 m resolution elevation data to estimate distance to river data. Distance to road can be derived from the road network data for Ontario at https://open.ottawa.ca/datasets/road-centrelines/explore (accessed on 13 April 2025) and Quebec at https://www.gatineau.ca/portail/default.aspx?p=publications_cartes_statistiques_donnees_ouvertes/donnees_ouvertes/jeux_donnees/details&id=872107914 (accessed on 13 April 2025). Land cover data can be found at https://apps.sentinel-hub.com/eo-browser (accessed on 10 May 2025). The polygon shapefiles of the training site are available upon request from Heather McGrath (heather.mcgrath@NRCan-RNCan.gc.ca) at Natural Resources Canada (NRCan). Likewise, the study area polygon datasets showing specific boundaries of interest within which the predicted FSM results are presented can be accessed through the Flood Hazard Identification and Mapping Program (FHIMP), with the support of McGrath at NRCan.

Acknowledgments

The authors thank the Sentinel satellite data hub for making the satellite optical data available through the EO web browser. We thank NRCan for collaborating with Queen’s University and providing the required data. We also appreciate the financial support from the Queen’s VPR Fund and NSERC.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Kundzewicz, Z.W.; Kanae, S.; Seneviratne, S.I.; Handmer, J.; Nicholls, N.; Peduzzi, P.; Mechler, R.; Bouwer, L.M.; Arnell, N.; Mach, K.; et al. Flood risk and climate change: Global and regional perspectives. Hydrol. Sci. J. 2013, 59, 1–28.
Insurance Bureau of Canada. Severe Weather in 2023 Caused Over $3.1 Billion in Insured Damage. 2024. Available online: https://www.ibc.ca/news-insights/news/severe-weather-in-2023-caused-over-3-1-billion-in-insured-damage (accessed on 25 August 2025).
Government of Ontario. Independent Review of the 2019 Flood Events in Ontario: Region Specific Situations. 2019. Available online: https://www.ontario.ca/document/independent-review-2019-flood-events-ontario/region-specific-situations (accessed on 25 August 2025).
Ottawa River Regulation Planning Board. Spring 2019 Freshet and Flood Summary Report. 2019. Available online: https://ottawariver.ca/ (accessed on 25 August 2025).
McNeil, D. An Independent Review of the 2019 Flood Events in Ontario. 2019. Available online: https://files.ontario.ca/mnrf-english-ontario-special-advisor-on-flooding-report-2019-11-25.pdf (accessed on 29 May 2025).
Ontario Independent Review Report of Flood Events. “Region-Specific Situations-Including in 2017 and 2019”. Available online: https://www.ontario.ca/document/independent-review-2019-flood-events-ontario/region-specific-situations?utm_source=chatgpt.com (accessed on 29 September 2025).
Mosavi, A.; Ozturk, P.; Chau, K.-W. Flood Prediction Using Machine Learning Models: Literature Review. Water 2018, 10, 1536.
Meyer Oliveira, A.; Fleischmann, A.S.; Paiva, R.C.D. On the Contribution of Remote Sensing-Based Calibration to Model Hydrological and Hydraulic Processes in Tropical Regions. J. Hydrol. 2021, 597, 126184.
Rahmati, O.; Zeinivand, H.; Besharat, M. Flood hazard zoning in Yasooj region, Iran, using GIS and multi-criteria decision analysis (MCDA). Geomat. Nat. Hazards Risk 2015, 7, 1000–1017.
Khosravi, K.; Shahabi, H.; Pham, B.T.; Adamowski, J.; Shirzadi, A.; Pradhan, B.; Dou, J.; Ly, H.-B.; Gróf, G.; Ho, H.L. A Comparative Assessment of Flood Susceptibility Modeling Using Multi-Criteria Decision-Making Analysis and Machine Learning Methods. J. Hydrol. 2019, 573, 311–323.
Yin, J.; Yu, D.; Wilby, R.L. Modelling the Impact of Land Subsidence on Urban Pluvial Flooding: A Case Study of Downtown Shanghai, China. Sci. Total Environ. 2016, 544, 744–753.
Liu, Y.; Wang, H.; Shao, J. Advances in Hydrological Modeling with GIS and RS Integration for Flood Assessment and Management. J. Hydrol. 2021, 598, 126456.
Bhuiyan, S.A.; Bataille, C.P.; McGrath, H. Harmonizing and Extending Fragmented 100 Year Flood Hazard Maps in Canada’s Capital Region Using Random Forest Classification. Water 2022, 14, 3801.
Farhadi, H.; Najafzadeh, M. Flood Risk Mapping by Remote Sensing Data and Random Forest Technique. Water 2021, 13, 3115.
Cardi, J.; Dussel, A.; Letessier, C.; Ebtehaj, I.; Gumiere, S.J.; Bonakdari, H. Modeling Hydrodynamic Behavior of the Ottawa River: Harnessing the Power of Numerical Simulation and Machine Learning for Enhanced Predictability. Hydrology 2023, 10, 177.
Oluwadare, T.S.; Ribeiro, M.P.; Chen, D.; Babadi Ataabadi, M.; Tabesh, S.H.; Daomi, A.E. Applying Machine Learning Algorithms for Spatial Modeling of Flood Susceptibility Prediction over São Paulo Sub-Region. Land 2025, 14, 985.
Amarnath, G. An algorithm for rapid flood inundation mapping from optical data using a reflectance differencing technique. J. Flood Risk Manag. 2013, 7, 239–250.
Sajjad, A.; Lu, J.; Chen, X. Rapid assessment of riverine flood inundation in Chenab floodplain using remote sensing techniques. Geoenvironmental Disasters 2023, 10, 9. [Google Scholar] [CrossRef]
Muhammad, K.K.; Ponrahono, Z.; Ash’aari, Z.H.; Barau, A.S. Flood risk prediction and modeling in Bauchi: Leveraging machine learning models and explainable AI for urban resilience. J. Clim. Change Health 2025, 26, 100490. [Google Scholar] [CrossRef]
Razavi-Termeh, S.V.; Sadeghi-Niaraki, A.; Abba, S.I. Flood-prone area mapping using a synergistic approach with swarm intelligence and gradient boosting algorithms. Sci. Rep. 2025, 15, 27924. [Google Scholar] [CrossRef]
Quang, N.H.; Hanh, L.N.; Van An, N. Boosting vs. traditional machine learning models for flood susceptibility mapping: Insights from a case study in central Vietnam. Adv. Space Res. 2025, 76, 5058–5084. [Google Scholar] [CrossRef]
Waleed, M.; Sajjad, M. Advancing flood susceptibility prediction: A comparative assessment and scalability analysis of machine learning algorithms via artificial intelligence in high-risk regions of Pakistan. J. Flood Risk Manag. 2025, 18, e13047. [Google Scholar] [CrossRef]
Shirmohammadi, M.; Pirasteh, S.; Li, W.; Mafi-Gholami, D. Flood risk mapping and performance efficiency evaluation of machine learning algorithms: Best practice in northern Iran. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2025, 48, 1347–1352. [Google Scholar] [CrossRef]
Vashist, K.; Singh, K.K. HEC-RAS 2D modeling for flood inundation mapping: A case study of the Krishna River Basin. Water Pract. Technol. 2023, 18, 831. [Google Scholar] [CrossRef]
Djafri, S.A.; Cherhabil, S.; Hafnaoui, M.A.; Madi, M. Flood modeling using HEC-RAS 2D and IBER 2D: A comparative study. Water Supply 2024, 24, 3061. [Google Scholar] [CrossRef]
Neal, J.C.; Bates, P.D.; Fewtrell, T.J.; Hunter, N.M.; Wilson, M.D.; Horritt, M.S. Distributed whole city water level measurements from the Carlisle 2005 urban flood event and comparison with hydraulic model simulations. J. Hydrol. 2009, 368, 42–55. [Google Scholar] [CrossRef]
Faghfouri, A.; Hentati, A.; Fortin, G.; Germain, D. A novel statistical model for flood prediction in the Eel River watershed, New Brunswick, Canada. Water Sci. 2023, 37, 251–268. [Google Scholar] [CrossRef]
Zhang, Z.; Stadnyk, T.A.; Burn, D.H. Identification of a preferred statistical distribution for at-site flood frequency analysis in Canada. Can. Water Resour. J./Rev. Can. Des Ressour. Hydr. 2019, 45, 43–58. [Google Scholar] [CrossRef]
Neri, A.; Villarini, G.; Napolitano, F. Statistically-based projected changes in the frequency of flood events across the U.S. Midwest. J. Hydrol. 2020, 584, 124314. [Google Scholar] [CrossRef]
Environment and Climate Change Canada. An Examination of Governance, Existing Data, Potential Indicators and Values in the Ottawa River Watershed; Environment and Climate Change Canada: Gatineau, QC, Canada, 2019; ISBN 9780660310534.
Water Science School. Impervious Surfaces and Flooding. Available online: https://www.usgs.gov/special-topics/water-science-school/science/impervious-surfaces-and-flooding (accessed on 22 July 2022).
Ottawa River Regulation Planning Board. Summary of the 2017 Spring Flood. 2017. Available online: https://ottawariver.ca/wp-content/uploads/2019/02/2017-Spring-Flood-Summary.pdf (accessed on 25 August 2025).
Ottawa Citizen. One Flood, Five Stories: Heartbreak, Battle and Hope on the Ottawa River. 19 May 2019. Available online: https://ottawacitizen.com/feature/one-flood-five-stories-heartbreak-battle-and-hope-on-the-ottawa-river (accessed on 25 August 2025).
City of Gatineau. Road Network. Available online: https://www.gatineau.ca/portail/default.aspx?p=publications_cartes_statistiques_donnees_ouvertes/donnees_ouvertes/jeux_donnees/details&id=872107914&ref=fil-d-Ariane (accessed on 13 April 2025).
City of Ottawa. Road Centrelines. Available online: https://open.ottawa.ca/datasets/road-centrelines/explore (accessed on 13 April 2025).
Tehrany, M.S.; Jones, S.; Shabani, F. Identifying the essential flood conditioning factors for flood prone area mapping using machine learning techniques. Catena 2019, 175, 174–192. [Google Scholar] [CrossRef]
Aristizabal, F.; Chegini, T.; Petrochenkov, G.; Salas, F.; Judge, J. Effects of High-Quality Elevation Data and Explanatory Variables on the Accuracy of Flood Inundation Mapping via Height Above Nearest Drainage. Hydrol. Earth Syst. Sci. 2024, 28, 1287–1315. [Google Scholar] [CrossRef]
Al-Juaidi, A.E.M. The Interaction of Topographic Slope with Various Geo-Environmental Flood-Causing Factors on Flood Prediction and Susceptibility Mapping. Environ. Sci. Pollut. Res. 2023, 30, 59327–59348. [Google Scholar] [CrossRef] [PubMed]
Nguyen, H.D. Spatial Modeling of Flood Hazard Using Machine Learning and GIS in Ha Tinh Province, Vietnam. J. Water Clim. Change 2023, 14, 200–222. [Google Scholar] [CrossRef]
Moazzam, M.F.U.; Vansarochana, A.; Rahman, A.U. Analysis of Flood Susceptibility and Zonation for Risk Management Using Frequency Ratio Model in District Charsadda, Pakistan. Int. J. Environ. Geoinform. 2018, 5, 140–153. [Google Scholar] [CrossRef]
Altunel, A.O. The Effect of DEM Resolution on Topographic Wetness Index Calculation and Visualization: An Insight to the Hidden Danger Unraveled in Bozkurt in August, 2021. Int. J. Eng. Geosci. 2022, 7, 153–164. [Google Scholar] [CrossRef]
Yochum, S.E.; Sholtes, J.S.; Scott, J.A.; Bledsoe, B.P. Stream Power Framework for Predicting Geomorphic Change: The 2013 Colorado Front Range Flood. Geomorphology 2017, 292, 178–192. [Google Scholar] [CrossRef]
Hu, A.; Demir, I. Real-Time Flood Mapping on Client-SideWeb Systems Using HAND Model. Hydrology 2021, 8, 65. [Google Scholar] [CrossRef]
Merwade, V.; Liu, Z. Investigating the Role of Model Structure and Surface Roughness in Generating Flood Inundation Extents Using One-and Two-Dimensional Hydraulic Models. J. Flood Risk Manag. 2019, 12, e12347. [Google Scholar] [CrossRef]
Khoshkonesh, A.; Nazari, R.; Nikoo, M.R.; Karimi, M. Enhancing Flood Risk Assessment in Urban Areas by Integrating Hydrodynamic Models and Machine Learning Techniques. Sci. Total Environ. 2024, 938, 175859. [Google Scholar] [CrossRef]
Bashir, B. Morphometric Parameters and Geospatial Analysis for Flash Flood Susceptibility Assessment: A Case Study of Jeddah City along the Red Sea Coast, Saudi Arabia. Water 2023, 15, 870. [Google Scholar] [CrossRef]
Brody, S.D.; Zahran, S.; Highfield, W.E.; Grover, H.; Vedlitz, A. Identifying the impact of the built environment on flood damage in Texas. Disasters 2008, 32, 1–18. [Google Scholar] [CrossRef] [PubMed]
Merz, B.; Hall, J.; Disse, M.; Schumann, A. Fluvial flood risk management in a changing world. Nat. Hazards Earth Syst. Sci. 2010, 10, 509–527. [Google Scholar] [CrossRef]
Alves, P.B.R.; Amanguah, E.; McNally, D.; Espinoza, M.; Ghaedi, H.; Reilly, A.C.; Hendricks, M.D. Navigating the Definition of Urban Flooding: A Conceptual and Systematic Review of the Literature. Water Sci. Technol. 2024, 90, 2796–2812. [Google Scholar] [CrossRef]
USGS. NDVI, The Foundation for Remote Sensing Phenology. 2018. Available online: https://www.usgs.gov/special-topics/remote-sensing-phenology/science/ndvi-foundation-remote-sensing-phenology (accessed on 15 September 2025).
De La Iglesia Martinez, A.; Labib, S.M. Demystifying normalized difference vegetation index (NDVI) for greenness exposure assessments and policy interventions in urban greening. Environ. Res. 2023, 220, 115155. [Google Scholar] [CrossRef] [PubMed]
Wang, Y.; Zhao, H.; Fan, J.; Wang, C.; Ji, X.; Jin, D.; Chen, J. A Review of Earth’s Surface Soil Moisture Retrieval Models via Remote Sensing. Water 2023, 15, 3757. [Google Scholar] [CrossRef]
Wang, X.; Guo, Y. Spatio-temporal analysis of water area variability in Poyang Lake (2012–2021) using remote sensing. J. Comput. Methods Sci. Eng. 2024, 25, 1432–1447. [Google Scholar] [CrossRef]
Acharya, T.D.; Subedi, A.; Lee, D.H. Evaluation of Water Indices for Surface Water Extraction in a Landsat 8 Scene of Nepal. Sensors 2018, 18, 2580. [Google Scholar] [CrossRef]
Seleem, O.; Ayzel, G.; de Souza, A.C.T.; Bronstert, A.; Heistermann, M. Towards Urban Flood Susceptibility Mapping Using Data-Driven Models in Berlin, Germany. Geomatics Nat. Hazards Risk 2022, 13, 1640–1662. [Google Scholar] [CrossRef]
Pal, S.C.; Deswal, S. Flash Flood Susceptibility Modeling Using New Approaches of Hybrid and Ensemble Tree-Based Machine Learning Algorithms. Remote Sens. 2020, 12, 3568. [Google Scholar] [CrossRef]
Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 785–794. [Google Scholar] [CrossRef]
Gul, E. Urban flood hazard assessment using FLA-optimized boost algorithms in Ankara, Türkiye. Appl. Water Sci. 2025, 15, 78. [Google Scholar] [CrossRef]
Habibi, A.; Delavar, M.R.; Nazari, B.; Pirasteh, S.; Sadeghian, M.S. A novel approach for flood hazard assessment using hybridized ensemble models and feature selection algorithms. Int. J. Appl. Earth Obs. Geoinf. 2023, 122, 103443. [Google Scholar] [CrossRef]
Engineering ToolBox. Manning’s Roughness Coefficients for Open Channel Flow. 2025. Available online: https://www.engineeringtoolbox.com/mannings-roughness-d_799.html (accessed on 2 September 2025).
Ye, A.; Zhou, Z.; You, J.; Ma, F.; Duan, Q. Dynamic Manning’s roughness coefficients for hydrological modelling in basins. Hydrol. Res. 2018, 49, 1379–1395. [Google Scholar] [CrossRef]
Quiroga, V.M.; Kure, S.; Udo, K.; Mano, A. Application of 2D numerical simulation for the analysis of the February 2014 Bolivian Amazonia flood: Application of the new HEC-RAS version 5. RIBAGUA 2016, 3, 25–33. [Google Scholar] [CrossRef]
Qureshi, M.U.A.; Amiri, A.; Ebtehaj, I.; Guimere, S.J.; Cunderlik, J.; Bonakdari, H. Coupling HEC-RAS and AI for River Morphodynamics Assessment Under Changing Flow Regimes: Enhancing Disaster Preparedness for the Ottawa River. Hydrology 2025, 12, 25. [Google Scholar] [CrossRef]
Brunner, G. CEIWR-HEC. HEC-RAS 5.0 2D Modeling User’s Manual; US Army Corps Engineers, Hydrologic Engineering Center: Davis, CA, USA, 2016. [Google Scholar]
Kurugama, K.M.; Kazama, S.; Hiraga, Y.; Samarasuriya, C. A Comparative Spatial Analysis of Flood Susceptibility Mapping Using Boosting Machine Learning Algorithms in Rathnapura, Sri Lanka. J. Flood Risk Manag. 2024, 17, e12980. [Google Scholar] [CrossRef]
Viera, A.J.; Garrett, J.M. Understanding interobserver agreement: The kappa statistic. Fam. Med. 2005, 37, 360–363. [Google Scholar]
Sahraei, R.; Kanani-Sadat, Y.; Safari, A.; Homayouni, S. Flood Susceptibility Modelling using geospatial-based multi-criteria decision making in large scale areas. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2023, 10, 677–683. [Google Scholar] [CrossRef]
El Haou, M.; Ourribane, M.; Ismaili, M.; Abdelrahman, K.; Fnais, M.S.; Krimissa, S.; El Oudi, H.; Hajji, S.; El Bouzkraoui, M.; Tarchi, F.; et al. Advanced GIS-based modeling for flood hazards mapping in urban semi-arid regions: Insights from Beni Mellal, Morocco. Front. Environ. Sci. 2025, 13, 1585926. [Google Scholar] [CrossRef]
Ottawa River Regulation Planning Board. Summary of the 2017 Spring Flood. 2018. Available online: https://ottawariver.ca/wp-content/uploads/2024/02/2017-Spring-Flood-Summary-vers.a.pdf (accessed on 30 August 2025).
Ottawa Riverkeeper. 2019. Available online: https://ottawariverkeeper.ca/6-things-you-should-know-about-the-2019-flooding/ (accessed on 30 August 2025).
Gomez, V.M.Q.; Kure, S.; Udo, K.; Mano, A. Analysis of exposure to vector-borne diseases due to flood duration, for a more complete flood hazard assessment: Llanos de Moxos, Bolivia. Ribagua 2017, 5, 48–62. [Google Scholar] [CrossRef]
Udmale, P.; Tachikawa, Y.; Kobayashi, K.; Sayama, T. Flood hazard mapping in Japan. Cat. Hydrol. Anal. Asia Pac. 2019, 1, 16–36. [Google Scholar]
Guo, Y.; Yang, Y.; Yang, D.; Zhang, L.; Zheng, H.; Xiong, J.; Ruan, F.; Han, J.; Liu, Z. Warming leads to both earlier and later snowmelt floods over the past 70 years. Nat Commun. 2025, 16, 3663. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
Gu, X.; Cheng, N.; Guan, Y.; Kong, D. Seasonal diversity of global flood changes and their drivers. J. Hydrol. 2025, 662, 133976. [Google Scholar] [CrossRef]
Liang, W.; Duan, W.; Chen, Y.; Fang, G.; Zou, S.; Li, Z.; Qiu, Z.; Lyu, H. Shifted dominant flood drivers of an alpine glacierized catchment in the Tianshan region revealed through interpretable deep learning. npj Clim. Atmos. Sci. 2025, 8, 33. [Google Scholar] [CrossRef]
Huang, W.; Zhang, Y.J.; Liu, Z. Simulation of compound flooding in Japan using a nationwide model. Nat. Hazards 2023, 117, 2693–2713. [Google Scholar] [CrossRef]
Álvarez, M.; Puertas, J.; Peña, E.; Bermúdez, M. Two-Dimensional Dam-Break Flood Analysis in Data-Scarce Regions: The Case Study of Chipembe Dam, Mozambique. Water 2017, 9, 432. [Google Scholar] [CrossRef]
Khaing, Z.M.; Zhang, K.; Sawano, H.; Shrestha, B.B.; Sayama, T.; Nakamura, K. Flood hazard mapping and assessment in data-scarce Nyaungdon area, Myanmar. PLoS ONE 2019, 14, e0224558. [Google Scholar] [CrossRef] [PubMed]
Haile, E.S.; Worku, H.; De Paola, F. Flood hazard mapping using FLO-2D and local management strategies of Dire Dawa city, Ethiopia. J. Hydrol. Reg. Stud. 2018, 19, 224–239. [Google Scholar] [CrossRef]

Figure 1. Map of the study area. (a) Inset topographic map of Ottawa–Gatineau showing main roads and urban areas, with the study area outlined by a red bounding box. (b) Ottawa River watershed map showing the three sub-regions (I, II, and III) designated for ML model training, along with other features (see legend). (c) Spatial distribution of flooded and non-flooded sample points over the digital elevation model (DEM); the external validation area is shown in different colors and labeled (a–e) within the red bounding box (see legend).

Figure 2. Flowchart of the methodology. (a) Data preparation and preprocessing procedures. (b) Development of the ML ensemble model.

Figure 3. Bar charts showing the computed performance metric scores for each model tested on the 30% testing dataset across the three geographic areas: (a) Area I, (b) Area II, and (c) Area III.

Figure 4. Bar charts of the ensemble model showing the computed performance metric scores on the EVA.

Figure 5. Comparative SHAP summary plots illustrating feature contributions in Esm-1, Esm-2, and Esm-3 models.

Figure 6. SHAP bar plot illustrating the maximum absolute SHAP values of feature contributions in the Esm-1, Esm-2, and Esm-3 models.

Figure 7. Flood susceptibility maps derived from ESM model predictions and generated using GIS mapping tools.

Figure 8. (a) Flood susceptibility map for Ottawa–Gatineau, derived from ESM predictions. (b) Percentage quantification of the flood susceptibility results shown in (a).

Figure 9. Flood frequency curves for the 100-year return period analysis across two different time spans. Each panel plots peak discharge (y-axis) against return period (x-axis), with the Gumbel model (red dashed line) fitted to historical flood peaks (blue circles). (a) Extreme flood peaks approach 5500 m³/s. (b) With recent events, peaks reach ~6000 m³/s, indicating that extreme floods in the Ottawa–Gatineau sub-region may now occur more frequently than predicted, with at least a 1% annual probability.

Figure 10. Two-dimensional representation of the Ottawa–Gatineau sub-region generated in RAS Mapper within HEC-RAS.

Figure 11. Numerical simulation of flood depth for the 100-year return period along the Ottawa–Gatineau sub-region.

Figure 12. Numerical simulation of flood velocity for the 100-year return period along the Ottawa–Gatineau sub-region.

Table 1. Hypothetical Flood Flows with Annual Exceedance Probability.

Return Period (Years)	Annual Probability (%)	Hypothetical Flow (m³/s), 1960–2024
2	50	3165.50
5	20	3929.16
10	10	4434.77
25	4	5073.61
50	2	5547.54
100	1	6017.97
150	0.67	6292.29
200	0.50	6486.68
250	0.40	6637.37
300	0.33	6760.43
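
The relationship between the columns of Table 1 can be sketched in a few lines of Python. The annual probability is simply 100/T for a return period of T years. For the flows, the sketch assumes the standard Gumbel quantile form Q_T = location + scale × y_T, with y_T = −ln(−ln(1 − 1/T)), and back-solves the two parameters from the 2-year and 100-year rows. The function names and the `LOCATION` and `SCALE` values are illustrative assumptions derived from the tabulated numbers; they are not the fitted parameters reported by the authors.

```python
import math

# Return period (years) -> hypothetical flow (m³/s), copied from Table 1.
TABLE_1 = {
    2: 3165.50, 5: 3929.16, 10: 4434.77, 25: 5073.61, 50: 5547.54,
    100: 6017.97, 150: 6292.29, 200: 6486.68, 250: 6637.37, 300: 6760.43,
}


def annual_probability_percent(return_period):
    """Annual exceedance probability (%) for a return period in years."""
    return 100.0 / return_period


def reduced_variate(return_period):
    """Gumbel reduced variate y_T = -ln(-ln(1 - 1/T))."""
    return -math.log(-math.log(1.0 - 1.0 / return_period))


def solve_gumbel(t_a, q_a, t_b, q_b):
    """Back-solve location and scale from two (return period, flow) pairs.

    Assumption: this is an illustration only, not the authors' fitting method.
    """
    y_a, y_b = reduced_variate(t_a), reduced_variate(t_b)
    scale = (q_b - q_a) / (y_b - y_a)
    location = q_a - scale * y_a
    return location, scale


def gumbel_flow(return_period, location, scale):
    """Flow (m³/s) on the Gumbel curve for a given return period."""
    return location + scale * reduced_variate(return_period)


# Hypothetical parameters inferred from the 2-year and 100-year rows.
LOCATION, SCALE = solve_gumbel(2, TABLE_1[2], 100, TABLE_1[100])

for period, tabulated in TABLE_1.items():
    estimate = gumbel_flow(period, LOCATION, SCALE)
    print(f"{period:>4} yr  {annual_probability_percent(period):6.2f} %  "
          f"table {tabulated:8.2f}  gumbel {estimate:8.2f}")
```

Comparing the two printed flow columns gives a quick check of whether the remaining rows lie on the same Gumbel curve as the two rows used to solve for the parameters.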

Table 2. Flood depth hazard classification.

Flood Depth Hazard	Depth [m]	Hazard Category	Flood Hazard Implications/Remarks
H_z1	<0.5	Low	Floodwater depths do not pose a hazard to people, and on-foot evacuation is possible.
H_z2	0.5–1.0	Medium	Floodwater poses a hazard to infants, and on-foot evacuation of adults becomes difficult.
H_z3	1.0–2.0	High	Flood depth is capable of drowning people. However, people may be safe inside their homes.
H_z4	2.0–5.0	Very High	People are exposed to flood hazard even inside their homes. Evacuation via the roofs of homes is advised.
H_z5	>5.0	Extreme	Built-up structures may be submerged by the flood; people may drown even if they evacuate via the roofs of their homes.
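
As an illustration of how the classification in Table 2 could be applied to simulated depths, the following minimal Python sketch maps a depth in metres to its hazard class. The function name `classify_depth` is hypothetical. The table lists 1.0 m and 2.0 m in two adjacent classes, so the sketch assumes those shared boundaries fall in the higher class; 0.5 m and 5.0 m are assigned as the table's "<0.5" and ">5.0" rows imply.

```python
def classify_depth(depth_m):
    """Return (class code, hazard category) for a flood depth in metres.

    Thresholds follow Table 2. Assumption: depths of exactly 1.0 m and
    2.0 m are assigned to the higher class, since the table lists them
    in both neighbouring ranges.
    """
    if depth_m < 0:
        raise ValueError("flood depth cannot be negative")
    if depth_m < 0.5:
        return ("H_z1", "Low")
    if depth_m < 1.0:
        return ("H_z2", "Medium")
    if depth_m < 2.0:
        return ("H_z3", "High")
    if depth_m <= 5.0:
        return ("H_z4", "Very High")
    return ("H_z5", "Extreme")


# Example depths in metres; 15.0 is the maximum depth quoted in the abstract.
for depth in (0.3, 0.5, 1.0, 5.0, 15.0):
    code, category = classify_depth(depth)
    print(f"{depth:5.1f} m -> {code} ({category})")
```

Applied cell by cell to a depth raster exported from the hydraulic model, a function of this kind would yield a categorical hazard map; the raster handling itself is left out of this sketch.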
