Leveraging Machine Learning Flood Forecasting: A Multi-Dimensional Approach to Hydrological Predictive Modeling

Al-Rawas, Ghazi; Nikoo, Mohammad Reza; Sadra, Nasim; Al-Wardy, Malik

doi:10.3390/w18020192

¹ Department of Civil and Architectural Engineering, Sultan Qaboos University, Muscat 123, Oman

² School of Mathematical and Computational Sciences, Massey University, Palmerston North 4410, New Zealand

³ Center for Environmental Studies and Research, Sultan Qaboos University, Muscat 123, Oman

* Author to whom correspondence should be addressed.

Water 2026, 18(2), 192; https://doi.org/10.3390/w18020192

Submission received: 7 November 2025 / Revised: 22 December 2025 / Accepted: 6 January 2026 / Published: 12 January 2026

(This article belongs to the Section Hydrology)

Abstract

Flash floods are among the most life-threatening natural disasters, so reliable prediction of extreme rainfall is essential. This study introduces an LSTM model with a customized loss function designed to predict extreme rainfall events. The proposed model combines dynamic environmental variables, such as rainfall, land surface temperature (LST), and the normalized difference vegetation index (NDVI), with static variables such as soil type and proximity to infrastructure. Wavelet transformation decomposes the time series into low- and high-frequency components to isolate long-term trends and short-term events. Model performance was compared against Random Forest (RF), Support Vector Machines (SVMs), Artificial Neural Networks (ANNs), and an LSTM-RF ensemble. The custom loss LSTM achieved the best performance (MAE = 0.022 mm/day, RMSE = 0.110 mm/day, R² = 0.807, SMAPE = 7.62%), and a Kruskal–Wallis H-test confirmed that the differences among models are statistically significant. Model uncertainty is quantified using a Bayesian MCMC framework, yielding posterior estimates and credible intervals that explicitly characterize predictive uncertainty under extreme rainfall conditions. The sensitivity analysis highlights rainfall and LST as the most influential predictors, while wavelet decomposition provides multi-scale insights into environmental dynamics. The study concludes that customized loss functions can be highly effective in predicting extreme rainfall events and are therefore useful in managing flash floods.

Keywords: flood prediction; LSTM; MCMC; wavelet transformation; rainfall forecasting

1. Introduction

Extreme events such as floods have increased in intensity and frequency worldwide over the past few decades due to both climate and land use change [1,2,3]. Flash flooding is one of the most dangerous types of flooding because it has devastating impacts and develops rapidly, typically within minutes, leaving little opportunity for evacuation or a protective response [4]. Coastal and mountain regions are particularly vulnerable because of their complex hydrological features and their exposure to high-intensity rain and storm events [5,6,7]. The Al-Batinah floodplain on the northeastern coast of Oman is recognized as a high-risk flood zone. Its complex topography, which includes a coastal plain and steep foothills, promotes rapid runoff during heavy rainfall. For instance, recent cyclonic storms have produced rainfall of over 200 mm within 24 h, triggering extreme flooding that poses risks to urban infrastructure, agricultural lands, and local communities. Flood events during the period 2010–2020 in Al-Batinah cost the economy several million dollars and destroyed homes in the area [8,9,10]. The most critical challenge is not the prediction of rainfall in general but the prediction of the extreme precipitation events that cause flash floods, which requires modeling techniques capable of handling the rarity and high impact of these events [2,11]. This combination of topography and climate makes Al-Batinah a suitable case study for developing flood prediction and management solutions based on loss functions that focus on extreme events.

Popular loss functions such as Mean Squared Error (MSE) and Mean Absolute Error (MAE) optimize overall accuracy and therefore tend to underestimate rare but critical extreme events [12,13]. Recent advances in loss function engineering have addressed this imbalance, with techniques such as focal loss, weighted loss functions, and custom penalty structures showing better performance for rare event prediction across a variety of fields [14,15,16,17].

LSTM models have proven effective at identifying intricate temporal patterns in environmental time series [18]. These networks have been applied to large-scale regional hydrological forecasting with high accuracy, capturing nonlinear dynamics and temporal dependencies in river flow predictions [19]. However, studies of rare extreme events confirm that while LSTM models can reproduce overall hydrograph patterns [20], they often struggle to simulate extreme hydrological events.

Standard LSTM implementations suffer from the same class imbalance issues that affect other machine learning approaches. Their training procedure tends to favor accurate forecasts of the frequent normal precipitation instances at the expense of the rarer extreme instances [21,22,23,24]. Coupling wavelet decomposition with LSTM networks has shown considerable potential for hydrological prediction through multi-scale temporal decomposition [25,26,27,28]. Wavelet transforms split the original time series into different frequency components, which allows the models to learn long-term trends as well as the short-term fluctuations that are crucial for extreme event prediction [29,30]. Combined with customized loss functions, these features provide a more nuanced representation of hydrological signals, with the potential to enhance the model’s sensitivity to extreme precipitation patterns.

Recent works have investigated loss functions specifically designed to enhance extreme event forecasting. For example, one study proposed self-adaptive extreme penalized losses combined with LSTM networks for better modeling of rare but high-magnitude events in time series prediction [31]. Another used a tail-aware LSTM framework that statistically separates the probability distribution into a main body and a tail, each with a specialized loss design, and demonstrated improved forecasts of extreme precipitation relative to standard training objectives [32]. Alternative loss functions, such as quantile loss and weighted MSE, placed additional emphasis on tail errors during training and showed improved peak flow predictions [12,33].

Recent developments in forecasting extreme events have highlighted the value of integrated modeling approaches that combine multiple methods to address different aspects of the forecasting problem. For example, combining Long Short-Term Memory (LSTM) networks with 2D Convolutional Neural Networks (CNNs) has been shown to improve LSTM performance by integrating spatial information and simulating peak occurrences more effectively [34]. Additionally, multi-task learning with auxiliary targets, such as the actual evaporation rate, can increase the accuracy of volume prediction [34]. Finally, a large-scale application that coupled LSTM-based runoff simulation with a river routing model illustrates the flexibility of the approach, enabling higher-resolution discharge simulation and improved representation capacity [35].

Customized loss function design, multi-scale feature representation through wavelet decomposition, and uncertainty quantification through Bayesian formulations can be integrated to improve extreme rainfall forecasting. However, little research has specifically targeted optimized loss functions for rare precipitation events in arid and semi-arid regions, where the rarity of such events makes conventional modeling particularly difficult. Dynamic variables, such as land surface temperature, vegetation, and atmospheric conditions, contribute a temporal dimension, while static attributes, such as soil type, topography, and distance from infrastructure, contribute spatial and physical constraints. The correlation between Land Surface Temperature (LST), vegetation indices (such as SAVI), and the Normalized Difference Built-up Index (NDBI) emphasizes the importance of these dynamic and static land surface parameters, especially since built-up and vegetated areas strongly affect LST and thereby influence runoff generation processes [36].

This study addresses these research gaps by developing and evaluating a custom extreme loss function designed for LSTM networks to improve the accuracy of extreme rainfall prediction in Oman’s Al-Batinah region. The primary innovation is a loss function that aims to predict extreme precipitation events accurately without sacrificing overall model performance. The method combines this custom loss function with multi-scale feature extraction by wavelet decomposition and with environmental data comprising both dynamic variables (vegetation indices, land surface temperature, precipitation) and static features (soil properties, distance to infrastructure). We assume that static site characteristics such as soil texture, porosity, bulk density, and topography can influence hydrological processes such as infiltration, runoff generation, and flood propagation. The approach is complemented by Bayesian uncertainty estimation with Metropolis–Hastings MCMC sampling, which provides probabilistic outputs that quantify prediction uncertainty for improved decision-making. In addition, Random Forest, Support Vector Machines, Artificial Neural Networks, and an ensemble procedure serve as baselines to test whether the custom loss function strategy is effective, particularly in the quality and accuracy of extreme event detection.

2. Methodology

2.1. Study Area

The Al-Batinah area is located in the northeastern part of Oman, extending between latitudes 23.49° N and 24.98° N and longitudes 56.08° E and 57.65° E (Figure 1). The area covers roughly 7979 km². Topography derived from NASADEM digital elevation data [37] indicates a varied terrain with a maximum elevation of 1795 m above mean sea level. The area is classified into five elevation classes. Coastal lowlands below 50 m above mean sea level cover 22.7% of the total area, low foothills between 50 and 200 m cover around 32.8%, mid-foothills between 200 and 500 m cover 24.1%, upper foothills between 500 and 1000 m cover 18.4%, and mountainous areas above 1000 m take up 1.5% of the region. The area has a relatively moderate slope, with a mean of 9.7°. Flat ground with slopes between 0 and 2° occupies 15.7% of the area, slopes between 2 and 5° occupy 34.1%, slopes between 5 and 10° cover 17.8%, slopes between 10 and 15° cover 8.4%, and slopes steeper than 15° take up 23.6% (Figure 2).

2.2. Data Collection and Preprocessing

The dataset contains two main categories of variables (Appendix A), each with a distinct role in the analysis: static and time-varying data [38,39,40,41,42,43,44,45]. Time-varying data are variables that change over time, including precipitation (Rainfall), land surface temperature (LST), and vegetation index (NDVI), as well as derived variables such as lagged features and rolling means. The combined dataset was limited to the overlapping time frame from 2006 to 2020, for which all variables are available. Static data do not change over the study period; examples include soil characteristics; distance to infrastructure, such as nearest road distance; and seasonal NDVI means. The static data contribute spatial context and are used as predictors for various environmental processes. Missing values in time-varying features were filled by linear interpolation to maintain the continuity of the time sequences. Missing values at the beginning or end of a series were not interpolated; those samples were excluded to avoid introducing bias. Missing values in static features were imputed with the median of the available data.
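The gap-handling rules in this paragraph can be sketched as follows. This is a minimal illustration with NumPy, not the study's preprocessing code; the function names `fill_interior_gaps` and `impute_static_median` and the sample values are hypothetical.

```python
import numpy as np


def fill_interior_gaps(values):
    """Linearly interpolate interior NaNs and drop leading/trailing NaNs.

    Returns the filled values and the original indices that were kept.
    """
    x = np.asarray(values, dtype=float)
    valid = np.flatnonzero(~np.isnan(x))
    if valid.size == 0:
        return np.array([]), np.array([], dtype=int)
    # Keep only the span between the first and last observed value.
    kept = np.arange(valid[0], valid[-1] + 1)
    filled = np.interp(kept, valid, x[valid])
    return filled, kept


def impute_static_median(column):
    """Replace NaNs in a static feature with the median of observed values."""
    x = np.asarray(column, dtype=float)
    return np.where(np.isnan(x), np.nanmedian(x), x)


# Made-up series with a gap at each end and one in the middle.
series = [np.nan, 2.0, np.nan, 4.0, 5.0, np.nan]
filled, kept = fill_interior_gaps(series)
static_filled = impute_static_median([1.0, np.nan, 3.0])
```

Returning the kept indices alongside the filled values makes it possible to align other variables with the trimmed series.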

Table 1 below shows the continuous data used in this research, their categories, and descriptions.

Lagged features at 1, 3, and 7 months were computed for the critical features of rainfall and land surface temperature. These lagged variables make it easier to analyze the delayed effects of precipitation and temperature on vegetation dynamics and hydrological processes. Additionally, the Water Stress Index (WSI) was calculated as an additional feature from the data [44], as the ratio of NDVI to rainfall with a small constant ϵ added to the denominator to avoid division by zero (Equation (1)):

\mathrm{WSI} = \frac{\mathrm{NDVI}}{\mathrm{Rain} + \epsilon}

(1)
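As a rough illustration of the lagged features and of Equation (1), the sketch below uses NumPy on made-up values. The helper names `lag` and `water_stress_index` and the default `eps=1e-6` are assumptions; the source does not state the value of ϵ.

```python
import numpy as np


def lag(series, k):
    """Shift a 1-D series forward by k steps; the first k entries become NaN."""
    x = np.asarray(series, dtype=float)
    out = np.full_like(x, np.nan)
    if k < x.size:
        out[k:] = x[: x.size - k]
    return out


def water_stress_index(ndvi, rain, eps=1e-6):
    """WSI = NDVI / (Rain + eps); eps is an assumed small constant."""
    ndvi = np.asarray(ndvi, dtype=float)
    rain = np.asarray(rain, dtype=float)
    return ndvi / (rain + eps)


# Hypothetical values, for illustration only.
rain = np.array([0.0, 10.0, 2.0, 0.0, 5.0])
ndvi = np.array([0.20, 0.30, 0.25, 0.20, 0.30])

rain_lag1 = lag(rain, 1)
rain_lag3 = lag(rain, 3)
wsi = water_stress_index(ndvi, rain)
```

With a very small ϵ, dry periods produce very large WSI values, so the choice of ϵ effectively sets the upper bound of the index and is worth reporting alongside the feature.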

The static features consist of soil properties, proximity metrics, and seasonal NDVI means. Key soil properties for hydrological processes include sand and silt fractions, bulk density, and porosity. In this work, porosity was obtained using the equation given in [36]:

\Phi = 1 - \frac{\mathrm{BulkDensity}}{\mathrm{ParticleDensity}}

(2)

The particle density was estimated using the standard value of 2.65 g/cm³ for mineral soil particles, as commonly cited in the literature [44,45,46,47,48], and was used for all samples uniformly because site-specific particle density data were not available from the data given.
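A worked instance of Equation (2) with the uniform particle density of 2.65 g/cm³ is sketched below. The function name `porosity`, the range check, and the bulk density of 1.325 g/cm³ are illustrative assumptions, not values from the dataset.

```python
PARTICLE_DENSITY = 2.65  # g/cm^3, standard value for mineral soil particles


def porosity(bulk_density, particle_density=PARTICLE_DENSITY):
    """Porosity from Equation (2): 1 - bulk density / particle density."""
    if not 0.0 < bulk_density <= particle_density:
        raise ValueError("bulk density must lie in (0, particle density]")
    return 1.0 - bulk_density / particle_density


# Hypothetical sample: a bulk density of 1.325 g/cm^3 gives a porosity of 0.5.
example_porosity = porosity(1.325)
```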

2.3. Wavelet Transformation of the Time-Varying Data

Wavelet transformation (WT) is a mathematical method that decomposes time-series data into several frequency components [49]. Unlike the Fourier Transform, which analyzes signals solely in the frequency domain, WT provides a joint time-frequency representation [50,51]. This property makes WT particularly suitable for non-stationary signals, such as hydrological time series, which usually show temporal variability at several scales. WT decomposes the monthly time-series variables into approximation and detail components, which represent the low-frequency and high-frequency content, respectively. This decomposition makes long-term trends and short-term variations easier to detect, and both are crucial in flood prediction models. In this study, the Discrete Wavelet Transform (DWT) is employed to decompose a signal X(t) into approximation (A) and detail (D) components at various levels. For a monthly time series X(t), the decomposition is represented in Equation (3):

X (t) = A_{J} (t) + \sum_{j = 1}^{J} D_{j} (t)

(3)

The approximation component, A_J(t), shows the low-frequency (long-term trend) information at the J-th level of decomposition. The detail component, D_j(t) captures high-frequency variations at progressively finer scales for each level j. Approximation and detail components are computed using recursive convolution with the scaling function ϕ(t) and wavelet function ψ(t), respectively:

A_{j} = \sum_{t} X (t) \cdot ϕ_{j} (t)

(4)

D_{j} = \sum_{t} X (t) \cdot ψ_{j} (t)

(5)

The wavelet used for decomposition in this work was Daubechies-4 (db4), chosen because it is well suited to the analysis of environmental and hydrological time series [52]. db4 wavelets offer a good balance between signal localization in the time and frequency domains, which makes it possible to detect subtle shifts and transient characteristics in environmental data without excessive computational expense [53,54]. Moreover, db4 is a standard choice in environmental studies, which allows a like-for-like comparison with earlier literature. Signal reconstruction was not assessed in this work because the wavelet transform was used to extract features rather than to reconstruct signals. The extracted approximation and detail coefficients were therefore used directly as independent input features for the flood-forecasting model, allowing it to capture both short-term changes and long-term trends. D1 and D2 are the high-frequency detail coefficients, which capture rapid localized changes in the signal that are usually vital for recognizing flood events. In contrast, A4, the low-frequency approximation coefficient, retains broader-scale climatic and seasonal trends.
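To make the additive structure of Equation (3) concrete, the sketch below builds a four-level decomposition with the Haar wavelet, which reduces to block averaging and needs only NumPy. Note the substitution: this is not the db4 wavelet used in the study, and the function name `haar_mra` and the synthetic series are hypothetical. A db4 decomposition would normally be obtained from a wavelet library instead.

```python
import numpy as np


def haar_mra(x, levels):
    """Additive Haar multiresolution: returns (A_J, [D_1, ..., D_J]).

    The components satisfy x == A_J + D_1 + ... + D_J, as in Equation (3).
    """
    x = np.asarray(x, dtype=float)
    if x.size % (2 ** levels) != 0:
        raise ValueError("series length must be a multiple of 2**levels")
    details = []
    approx = x
    for j in range(1, levels + 1):
        width = 2 ** j
        # Level-j Haar approximation: mean of x over blocks of 2**j samples.
        coarser = np.repeat(x.reshape(-1, width).mean(axis=1), width)
        details.append(approx - coarser)  # D_j = A_{j-1} - A_j
        approx = coarser
    return approx, details


# Synthetic monthly series (made up): trend + seasonal cycle + one spike.
t = np.arange(48)
series = 0.05 * t + np.sin(2 * np.pi * t / 12)
series[20] += 5.0

a4, details = haar_mra(series, levels=4)
d1, d2 = details[0], details[1]
```

In this toy series the isolated spike shows up mainly in `d1` and `d2`, while `a4` carries the slow trend, mirroring the roles the text assigns to D1, D2, and A4.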

2.4. LSTM Model Implementation and Evaluation

The LSTM model is a recurrent neural network (RNN) architecture that is well suited to time series because of its ability to capture temporal trends [53]. The time series data were split using a sliding-window approach of seven timesteps. In each window, the time-varying features were concatenated with the static features so that the model would learn from both dynamic and stationary inputs simultaneously. Based on the features of the last seven timesteps, the target variable, rainfall y, was forecasted for time step t + 1. Let the time-varying feature matrix be X_time ∈ ℝ^{T×F_time}, where T is the number of timesteps and F_time is the number of time-varying features. Also, let the static feature matrix be X_static ∈ ℝ^{T×F_static}, where F_static is the number of static features. For each window of n_timesteps timesteps starting at index i, the input matrix X_i was formed as follows:

X_{i} = \begin{bmatrix} x_{i}^{time}, & x_{i}^{static} \\ \vdots & \vdots \\ x_{i + n_{timesteps} - 1}^{time}, & x_{i + n_{timesteps} - 1}^{static} \end{bmatrix} \in \mathbb{R}^{T \times (F_{time} + F_{static})}

(6)
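The window construction in Equation (6) can be sketched with NumPy as below. The function name `make_windows`, the array shapes, and the random data are assumptions for illustration; only the window length of seven and the one-step-ahead target follow the text.

```python
import numpy as np


def make_windows(x_time, x_static, y, n_timesteps=7):
    """Build sliding windows of concatenated features and next-step targets.

    x_time:   (T, F_time) time-varying features
    x_static: (T, F_static) static features repeated along time
    y:        (T,) rainfall series
    Returns X with shape (T - n_timesteps, n_timesteps, F_time + F_static)
    and targets with shape (T - n_timesteps,).
    """
    features = np.concatenate([x_time, x_static], axis=1)
    n_samples = len(y) - n_timesteps
    if n_samples <= 0:
        raise ValueError("series is shorter than the window length")
    # Window i covers rows i .. i + n_timesteps - 1; its target is the next step.
    X = np.stack([features[i:i + n_timesteps] for i in range(n_samples)])
    targets = np.asarray(y, dtype=float)[n_timesteps:]
    return X, targets


# Made-up example: 10 timesteps, 3 time-varying and 2 static features.
rng = np.random.default_rng(0)
x_time = rng.normal(size=(10, 3))
x_static = np.tile([0.4, 1.2], (10, 1))
y = rng.gamma(shape=0.5, scale=2.0, size=10)

X, targets = make_windows(x_time, x_static, y)
```

Each sample `X[i]` has the three-dimensional layout (samples, timesteps, features) that recurrent layers in common deep learning libraries expect.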

The resulting dataset was split into training (80%) and testing (20%) sets. Variables with different temporal resolutions, such as NDVI (16-day) and precipitation (daily), were combined by interpolating NDVI to a monthly resolution. Temporal trends were represented by rolling averages over 7- and 30-day intervals and by features lagged by 1-, 3-, and 7-day intervals. All continuous variables were standardized, and seasonal and interaction features were added to preserve meaningful relationships between the variables. To capture intricate temporal and nonlinear relationships in the rainfall time series, a range of machine learning models was employed, including a custom-built two-layer long short-term memory network (LSTM) [54], Random Forest (RF) [55], Support Vector Machine (SVM) [56], Artificial Neural Network (ANN) [50], and a hybrid model combining LSTM and RF predictions. RF, SVM, and ANN served as baseline models that provide a performance reference, while the customized LSTM with the weighted quantile loss was optimized to increase predictive accuracy and better handle extreme rainfall.

The adaptive LSTM (Figure 3) consisted of two stacked layers, 256 and 128 units, respectively, with dropout preceding each in order to avoid overfitting. Both layers received L1/L2 kernel regularization. The weighted quantile loss function was employed to prioritize the under-prediction errors, as follows:

L (y, \hat{y}) = \frac{1}{n} \sum_{i = 1}^{n} [1 + α \cdot \max (0, y_{i})] \cdot (y_{i} - {\hat{y}}_{i})

(7)

where y_i and ŷ_i are the true and predicted rainfall values, respectively, and α is a weight factor controlling the emphasis on larger rainfall events. Hyperparameters for the LSTM, including dropout rate, L1/L2 regularization strengths, and underprediction weight, were tuned through grid search based on validation performance [57]. Early stopping and learning rate scheduling were employed to promote training efficiency and prevent overfitting.
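The effect of the magnitude-dependent weight 1 + α·max(0, y_i) can be illustrated with a small NumPy sketch. One loud assumption: the residual is squared here so that the loss is non-negative, whereas Equation (7) as printed shows the signed residual; the exact error term used in the study is not recoverable from the text. The function name and the toy values are also hypothetical.

```python
import numpy as np


def weighted_extreme_loss(y_true, y_pred, alpha=1.0):
    """Mean of [1 + alpha * max(0, y_i)] * (y_i - yhat_i)**2.

    The squared residual is an assumption made for this sketch.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    weights = 1.0 + alpha * np.maximum(0.0, y_true)
    return float(np.mean(weights * (y_true - y_pred) ** 2))


# Two forecasts with the same plain MSE: one misses on a dry day,
# the other misses by the same amount on a 10 mm day.
y_true = np.array([0.0, 0.0, 10.0])
miss_on_dry_day = np.array([1.0, 0.0, 10.0])
miss_on_wet_day = np.array([0.0, 0.0, 9.0])

loss_dry = weighted_extreme_loss(y_true, miss_on_dry_day, alpha=0.5)
loss_wet = weighted_extreme_loss(y_true, miss_on_wet_day, alpha=0.5)
```

With α = 0.5 the miss on the wet day costs six times as much as the same miss on the dry day, which is the kind of emphasis on large events that the weight factor is meant to provide. Setting α = 0 recovers the ordinary mean squared error.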

The RF base model contained 200 trees with a depth of 10. The SVM model used a radial basis function kernel with hyperparameters C = 100, γ = 0.1, and ϵ = 0.1. The ANN model consisted of two dense layers of 128 and 64 units, each followed by a dropout layer, and a single output unit for rainfall prediction. The ensemble model combined the custom LSTM and RF predictions with a 60:40 weighting. Model performance was evaluated using mean absolute error (MAE), root mean squared error (RMSE), coefficient of determination (R²), and symmetric mean absolute percentage error (SMAPE), defined in Table 2, where n is the number of observations, y_i and ŷ_i are the observed and predicted rainfall, respectively, and ȳ is the mean observed rainfall.
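For reference, the four metrics can be computed as in the NumPy sketch below. The formulas are the common textbook definitions and are an assumption here, since Table 2 is not reproduced in this text; SMAPE in particular has several variants. The `blend` helper assumes that the 60:40 weighting assigns 0.6 to the LSTM and 0.4 to RF.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """MAE, RMSE, R^2 and SMAPE (in percent) using common definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    # SMAPE variant with the mean of |y| and |yhat| in the denominator;
    # terms where both values are zero contribute zero.
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    ratio = np.divide(np.abs(err), denom, out=np.zeros_like(err), where=denom > 0)
    smape = 100.0 * np.mean(ratio)
    return {"MAE": float(mae), "RMSE": float(rmse), "R2": float(r2), "SMAPE": float(smape)}


def blend(lstm_pred, rf_pred, w_lstm=0.6):
    """Weighted ensemble; the 0.6 LSTM share is an assumed reading of 60:40."""
    return w_lstm * np.asarray(lstm_pred, dtype=float) + (1.0 - w_lstm) * np.asarray(rf_pred, dtype=float)


# Small made-up example.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
ensemble = blend([1.0, 2.0], [2.0, 4.0])
```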

To statistically compare model performance, the Kruskal–Wallis H-test [58], a non-parametric alternative to one-way ANOVA, was applied to the absolute prediction errors of all models, where error_model = |y_true − ŷ_model|.

This test assesses whether differences in prediction accuracy among models are significant:

H, p = Kruskal (errors_RF, errors_LSTM, errors_SVM, errors_ANN, errors_Ensemble)

(8)
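Equation (8) maps directly onto `scipy.stats.kruskal`, as in the sketch below. The observations and the noise levels attached to each model name are synthetic and made up for illustration; they say nothing about the actual models in this study.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(0)

# Synthetic "observed" rainfall and five sets of synthetic predictions.
y_true = rng.gamma(shape=0.5, scale=2.0, size=500)
noise_scale = {"RF": 0.5, "LSTM": 0.1, "SVM": 2.0, "ANN": 0.6, "Ensemble": 0.15}
predictions = {
    name: y_true + rng.normal(0.0, scale, size=y_true.size)
    for name, scale in noise_scale.items()
}

# Absolute prediction errors per model, then the H-test across all groups.
errors = {name: np.abs(y_true - pred) for name, pred in predictions.items()}
H, p = kruskal(*errors.values())
```

A small p-value indicates only that at least one error distribution differs from the others. Identifying which models differ would require pairwise post hoc comparisons, which the source does not describe.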

2.5. Forecasting and Uncertainty Estimation

A Metropolis–Hastings Markov Chain Monte Carlo (MCMC) approach was used to sample from the posterior distribution of the standard deviation parameter σ, given a Gaussian likelihood function [59,60]. A half-normal prior was assigned to σ to enforce positivity. Convergence diagnostics were used to assess whether the number of iterations was adequate. Taken together, these diagnostics confirm that the specified number of iterations was sufficient for stable and consistent estimation of the posterior distribution of σ.
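A minimal random-walk Metropolis–Hastings sampler for σ under these assumptions (zero-mean Gaussian likelihood on the residuals, half-normal prior) might look like the sketch below. The prior scale, proposal step, starting value, chain length, burn-in, and the synthetic residuals are all hypothetical choices, not settings reported in the study.

```python
import numpy as np


def log_posterior(sigma, residuals, prior_scale=1.0):
    """Log posterior of sigma up to a constant: Gaussian likelihood, half-normal prior."""
    if sigma <= 0:
        return -np.inf  # the half-normal prior has no mass at or below zero
    n = residuals.size
    log_lik = -n * np.log(sigma) - np.sum(residuals ** 2) / (2.0 * sigma ** 2)
    log_prior = -sigma ** 2 / (2.0 * prior_scale ** 2)
    return log_lik + log_prior


def metropolis_hastings(residuals, n_iter=6000, step=0.005, init=0.3, seed=0):
    """Random-walk Metropolis-Hastings; returns the chain and acceptance rate."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_iter)
    current, current_lp = init, log_posterior(init, residuals)
    accepted = 0
    for i in range(n_iter):
        proposal = current + rng.normal(0.0, step)  # symmetric proposal
        proposal_lp = log_posterior(proposal, residuals)
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain[i] = current
    return chain, accepted / n_iter


# Synthetic residuals with a true standard deviation of 0.12 (made up).
residuals = np.random.default_rng(1).normal(0.0, 0.12, size=2000)
chain, acceptance_rate = metropolis_hastings(residuals)
posterior = chain[1000:]  # discard an assumed burn-in of 1000 draws
ci_low, ci_high = np.percentile(posterior, [2.5, 97.5])
```

Because the proposal is symmetric, the acceptance ratio reduces to the posterior ratio, and proposals with σ ≤ 0 are always rejected.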

This systematic approach to uncertainty quantification, along with the complete data processing and model training workflow described above, is summarized in the workflow diagram shown in Figure 4.

3. Results

3.1. Model Performance Overview

The forecasting capacity of five algorithms, Random Forest (RF), LSTM with a custom loss function, Support Vector Machine (SVM), Artificial Neural Network (ANN), and the ensemble of LSTM and RF, was evaluated based on four metrics: MAE, RMSE, R², and SMAPE. Together, these metrics measure the models’ overall accuracy and their ability to predict extreme values. Table 3 summarizes the performance results.

The LSTM model registered the lowest MAE (0.0222 mm/day), RMSE (0.1098 mm/day), and SMAPE (7.62%), and the highest R² (0.8068). The LSTM + RF ensemble also performed well, with slightly higher MAE (0.0275 mm/day) and RMSE (0.1125 mm/day), R² = 0.7971, and SMAPE = 13.65%, suggesting that the ensemble retains much of the strength of LSTM and RF while reducing individual model bias. Random Forest and ANN had average performance, but SVM performed poorly, as reflected by a negative R², an extremely high RMSE (15.1356 mm/day), and a high SMAPE (51.77%). These results suggest that SVM, even with hyperparameter tuning, was not able to learn the complex nonlinear and time-dependent patterns in the rainfall time series. The SMAPE values highlight the robustness of the LSTM and ensemble models in reducing percentage errors, particularly at high rainfall intensities.

A Kruskal–Wallis H-test was conducted to determine whether the prediction error distributions differed significantly among the models. The output shows a highly significant difference (H = 14,021.12, p < 0.001), indicating that the error distribution of at least one model differs from the others. A visual analysis of the predictions is presented in Figure 5, with both scatter plots of predicted vs. actual rainfall and QQ plots of observed vs. predicted quantiles. The scatter plot shows that LSTM predictions closely follow the 1:1 line with minimal spread, indicating improved predictions for both average and high rainfall values. The ensemble model also performs well, with minor deviations at higher levels of rainfall. Random Forest and ANN show moderate deviations that correspond to underestimates of extreme events, while SVM predictions are widely scattered around the 1:1 line, which reflects its weak performance compared to the others. These results can also be seen in the QQ plots, which show how well the models preserve the overall shape of the rainfall distribution. LSTM and the ensemble model both reproduce the observations, particularly in the upper tails, which indicates reliability for predicting extreme events. Random Forest and ANN show deviations at higher quantiles, which suggests that extreme rainfall has been underestimated.

Apart from model accuracy, another question is how different loss functions affect the LSTM model’s ability to predict extreme events. Four loss functions were used to evaluate this: MSE, MAE, Quantile Loss with q = 0.9, and the custom weighted loss function. Table 4 reports RMSE on all samples and RMSE/MAE on the top 10% of rainfall samples. The results indicate that the custom weighted loss forecasted rainfall better than the other loss functions. Quantile loss with q = 0.9 performed poorly because it targets the upper quantile of rainfall rather than overall accuracy.

In summary, the LSTM model with a custom-designed loss function has the highest predictive accuracy on all metrics, with a focus on extreme rainfall events. The traditional machine learning algorithms, particularly SVM, were not capable of explaining the nonlinear, complex rainfall processes in this case study. Statistical comparison by the Kruskal–Wallis H-test confirms that the observed differences in model performance are significant. Visual diagnostics, including scatter plots of predicted and observed rainfall and QQ plots, further support the results, highlighting the consistency and accuracy of the LSTM and ensemble predictions for both average and extreme rain events.

3.2. Uncertainty Analysis and Bayesian Estimation

Evaluating uncertainty is especially important when interpreting a rainfall prediction model on datasets that contain extreme events. In this work, the posterior distribution of the Gaussian uncertainty parameter σ is estimated using MCMC. Table 5 presents the outputs and convergence diagnostics from the MCMC simulation of the parameter σ. Figure 6 presents a graphical summary of the posterior distribution and MCMC diagnostics, including trace plots, running mean, posterior density, and the autocorrelation function. The Monte Carlo Standard Error (MCSE) is very small (0.0004), reflecting precise estimation of posterior quantities. Also, the Geweke diagnostic Z-score of 0.262 is within the standard convergence criterion (|Z| < 2), which points to acceptable convergence. Posterior summary statistics for the parameter σ indicate that the mean value is 0.118, with a standard deviation of 0.001. The 95% credible interval is between 0.117 and 0.120, which measures the uncertainty in the parameter estimate from the sampled posterior distribution.
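The |Z| < 2 criterion can be illustrated with a simplified Geweke-style score that compares the mean of the first 10% of a chain with the mean of the last 50%. The sketch below uses plain sample variances, which assumes uncorrelated draws; the full Geweke diagnostic uses spectral density estimates instead, so this is an approximation for illustration. The function name, the segment fractions as defaults, and both chains are hypothetical.

```python
import numpy as np


def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke-style z-score using naive sample variances."""
    chain = np.asarray(chain, dtype=float)
    a = chain[: int(first * chain.size)]
    b = chain[-int(last * chain.size):]
    standard_error = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return float((a.mean() - b.mean()) / standard_error)


def credible_interval(chain, level=0.95):
    """Equal-tailed credible interval from posterior draws."""
    tail = 100.0 * (1.0 - level) / 2.0
    low, high = np.percentile(chain, [tail, 100.0 - tail])
    return float(low), float(high)


# A stationary made-up chain and a clearly drifting one for contrast.
rng = np.random.default_rng(0)
stationary = 0.118 + rng.normal(0.0, 0.001, size=5000)
drifting = np.linspace(0.0, 1.0, 1000)

z_stationary = geweke_z(stationary)
z_drifting = geweke_z(drifting)
interval = credible_interval(stationary)
```

A chain that is still drifting produces a very large |Z|, whereas a well-mixed chain typically stays within the ±2 band.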

The positively skewed posterior distribution of σ reflects the model’s adaptive ability to capture variability due to extreme rainfall events. The limited range of σ values under normal rainfall conditions indicates that the model balances predictive precision and sensitivity to extremes, which minimizes uncertainty under typical conditions. Convergence diagnostics based on effective sample size, acceptance fraction, and autocorrelation confirm the validity of the posterior estimates, although some indications from the Geweke diagnostic warrant further investigation.

These results further underline the importance of advanced Bayesian methods in hydrological modeling, especially for datasets that have non-Gaussian distributions and extremes. Quantifying uncertainty is essential for informed decision-making in flood risk management and climate adaptation, and it contributes directly to modeling uncertainty in water management design. For example, the credible interval of σ may provide probabilistic thresholds for flood warnings or infrastructure design criteria.

3.3. Sensitivity Analysis Insights and Limitations

The sensitivity analysis determines how strongly the model prediction responds to variation in individual input features, which helps to identify the factors that most influence the model’s ability to predict flood risk. The results showed that rainfall and LST were the most informative variables. The LSTM model was very sensitive to rainfall; changes in the rainfall data greatly changed the predicted flood risk. LST also played a major role in determining the model’s outputs, which means that temperature fluctuations have a notable effect on flood predictions. In contrast, NDVI, while an important feature, was less influential than rainfall and LST. The accuracy and reliability of the model are therefore closely tied to the precision of the rainfall and temperature data; hence, extreme rainfall or temperature changes may drastically affect the model’s predictions.
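The source does not specify how the sensitivity analysis was carried out. One common option is a one-at-a-time perturbation, sketched below with NumPy under that assumption. The toy linear model, its coefficients, the 10% perturbation, and the feature ranges are all made up for illustration and do not represent the trained LSTM.

```python
import numpy as np


def perturbation_sensitivity(predict, X, names, rel_change=0.1):
    """Mean absolute change in prediction when one feature is scaled by (1 + rel_change)."""
    baseline = predict(X)
    scores = {}
    for j, name in enumerate(names):
        X_perturbed = X.copy()
        X_perturbed[:, j] *= 1.0 + rel_change
        scores[name] = float(np.mean(np.abs(predict(X_perturbed) - baseline)))
    return scores


def toy_model(X):
    """Stand-in predictor with made-up coefficients (not the study's model)."""
    return 2.0 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * X[:, 2]


rng = np.random.default_rng(1)
X = rng.uniform(0.5, 1.5, size=(200, 3))
scores = perturbation_sensitivity(toy_model, X, ["rainfall", "lst", "ndvi"])
ranking = sorted(scores, key=scores.get, reverse=True)
```

For a sequence model the same idea applies to one feature channel across all timesteps of each window, and the resulting scores depend on the perturbation size and on the scaling of the inputs.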

While the sensitivity analysis showed that rainfall and LST were the most important features for predicting flood risk, it is important to explore other variables as well. Some of the features had a nearly uniform distribution in the dataset, which may have led the model to treat them as unimportant. Other variables, for example, soil composition and terrain slope, may require analysis at a larger spatial or temporal scale to clearly show their effects. The absence or reduced effect of these variables in this model does not imply irrelevance. Instead, it suggests that the current dataset or study scope may not represent the conditions under which the variables would significantly impact flood risk. For example, infrastructure-related variables such as road distance could become highly important in areas that are more urbanized or heavily populated. In environmentally diverse areas, the characteristics of soil and terrain may be more important. Future studies should move beyond small-scale geographical units toward larger or more diverse ones to capture variations in terrain, soil, and infrastructure that the current study underrepresents.

Further, high-resolution datasets for features like road networks and soil properties might represent those features better in the model and thus give a more realistic evaluation of their contributions. Scenario-specific analyses focused on extreme conditions, for example, high-intensity rainfall events or highly urbanized regions, may reveal the hidden influence of these variables. These complementary steps would deepen the insight into flood risk factors and improve the robustness of predictive models.

3.4. Multi-Scale Insights from Wavelet Decomposition

The wavelet decomposition of the analyzed variables, namely WSI, Rain, and LST, gives insight into their temporal behavior by separating transient fluctuations from long-term trends. In Figure 7, the wavelet decomposition for the Water Stress Index shows high-frequency variations that are most pronounced during the initial stages of the time series. These fluctuations are captured by the detail 1 and detail 2 components, suggesting intense short-term disturbances that may be driven by localized environmental stressors or rapid shifts in hydrological conditions. Over time, these high-frequency fluctuations decrease in magnitude, which indicates that the system has stabilized. The trend approximation component shows a gradual decline, indicating a consistent reduction in the index that could be linked to climate-driven drought or reduced water availability.

Rainfall decomposition follows a very vivid pattern. The high-frequency variability concentrates at the beginning of the time series (the sharp decline inside the detail 1 and detail 2 components). This behavior shows intense rainfall events at the beginning of the record, while the following stages show reduced variability and stabilization. This long-term stabilization, combined with decreasing trends, suggests changing rainfall patterns that could have far-reaching implications for water resource management and agricultural planning.

In hydrological terms, the high-frequency components (detail 1 and detail 2) often reflect short-lived, episodic events, such as flash rain, runoff surges, or abrupt shifts in water availability caused by local disturbances. Such signals are symptomatic of stress on water systems that may be associated with land use, storm events, or irrigation cycles. The A4 approximation component, in contrast, relates to low-frequency trends that capture the impacts of long-term environmental change, such as persistent drought, climate fluctuation, or gradual warming of the land surface. These results allow us to differentiate rare extreme hazards from slow systemic shifts and provide a more nuanced understanding of environmental dynamics and the evolution of water stress.

The decomposition of LST Mean Lag1 provides a different perspective: its high-frequency components exhibit smaller-amplitude fluctuations than those of WSI and Rain. Transient variations are present, but their magnitude is relatively small, which makes the series more predictable over time. The trend component, by contrast, follows a steady, gradual, long-term course, which is a sign of broader environmental temperature change. This points to the stability of land surface temperature dynamics on a lagged timescale and, at the same time, to its role as a key indicator of gradual shifts in the underlying environmental system.

These results highlight wavelet decomposition as a valuable analytical tool for the examination of complex environmental data. By separating noise and transient fluctuations from the long-term trend, this approach can provide a more appropriate interpretation of time series data and reveal hidden structures and patterns within different temporal scales.
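To make the roles of the detail and approximation components concrete, the following minimal sketch decomposes a synthetic series over four levels. It is an illustration only: it uses a hand-written orthonormal Haar transform in plain Python rather than the pywt routines listed in Appendix A, the choice of the Haar wavelet is an assumption made for brevity, and the function names (haar_step, haar_decompose, energy) and the synthetic series are hypothetical.

```python
import math


def haar_step(values):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(values) % 2:
        raise ValueError("length must be even at every level")
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail


def haar_decompose(series, levels=4):
    """Return (coarsest approximation, [detail 1, ..., detail `levels`])."""
    approx = list(series)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)  # details[0] is the finest scale (detail 1)
    return approx, details


def energy(values):
    """Sum of squared coefficients."""
    return sum(v * v for v in values)


# Synthetic series (not study data): a slow linear decline plus a burst of
# high-frequency variability confined to the first 16 samples.
series = [10.0 - 0.05 * t + (3.0 * (-1) ** t if t < 16 else 0.0) for t in range(64)]

a4, details = haar_decompose(series, levels=4)
early_energy = energy(details[0][:8])   # detail 1 coefficients covering t < 16
late_energy = energy(details[0][8:])    # detail 1 coefficients covering t >= 16
print("A4 approximation:", [round(v, 2) for v in a4])
print("detail 1 energy, early vs late:", round(early_energy, 3), round(late_energy, 3))
```

In this toy series, almost all of the detail 1 energy sits in the coefficients covering the first 16 samples, mirroring the early-record concentration of high-frequency variability described above, while the four A4 coefficients carry the slow decline.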

4. Discussion

This paper contributes to rainfall forecasting by integrating deep learning with domain-specific tailoring. The LSTM framework, supported by a purpose-designed loss function, achieved high accuracy for both normal and extreme rainfall. In contrast to most prior models, which tend to average out extremes, our model places them at center stage and thereby offers a solution tailored to flood forecasting and disaster preparedness. The performance achieved is not only statistically significant but also of practical benefit. The small error values and high R² confirm the model’s capacity to learn complex temporal patterns and support its use in dynamic, high-risk environments.

The innovation here is less the use of LSTM, an old, tried-and-true tool for time series modeling, and more the careful redefinition of its learning target. The custom loss function does not treat all mistakes equally; instead, it mimics the way forecasters and emergency planners think about them: a 10 mm mistake in light rain is irrelevant, but the same mistake in a large extreme rainfall event would be catastrophic [61,62]. Because this one-sided importance is built explicitly into the training process, the model has more reason to learn from significant deviations than it would under conventional training paradigms. This emphasis on sensitivity to outliers is in line with real-world needs and fills a methodological gap often overlooked in rainfall modeling research.
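The idea of penalizing the same absolute error differently in light and extreme rainfall can be sketched as follows. This is a minimal illustration and not the loss function used in the study: the threshold-based weighting, the function name extreme_weighted_mse, and the values of threshold and extreme_weight are assumptions chosen for the example. In a training pipeline, the same logic would be written with tensor operations so that it can be passed to the deep learning framework as a loss.

```python
def extreme_weighted_mse(y_true, y_pred, threshold, extreme_weight=5.0):
    """Mean squared error in which errors on extreme observations count more.

    threshold and extreme_weight are hypothetical illustration parameters.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for obs, pred in zip(y_true, y_pred):
        # Observations at or above the threshold are treated as extreme events.
        weight = extreme_weight if obs >= threshold else 1.0
        total += weight * (obs - pred) ** 2
    return total / len(y_true)


# Two days of rainfall (mm): one light, one extreme. The same 10 mm error is
# placed first on the light day, then on the extreme day.
observed = [2.0, 80.0]
light_miss = extreme_weighted_mse(observed, [12.0, 80.0], threshold=50.0)
extreme_miss = extreme_weighted_mse(observed, [2.0, 70.0], threshold=50.0)
print("loss with 10 mm error on the light day:", light_miss)
print("loss with 10 mm error on the extreme day:", extreme_miss)
```

With a plain mean squared error, both cases would receive the same loss; under the weighted form, the miss on the extreme day costs extreme_weight times as much, which is the behavior the paragraph above describes.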

Compared with previous rainfall prediction models, our LSTM with a customized loss function shows a clear improvement. For example, a previous study using conventional LSTM and deep learning methods achieved an accuracy of approximately 76% with six environmental parameters [63], while other approaches such as Holt–Winters, ELM, ARIMA, and RNN models have shown higher errors in predicting rainfall extremes [64,65]. By contrast, our model achieves an MAE of 0.0222 mm/day, an RMSE of 0.1098 mm/day, an R² of 0.807, and a SMAPE of 7.62%. Among the traditional machine learning models evaluated on our dataset, Random Forest performed best, with an MAE of 0.0339 mm/day, followed closely by ANN with an MAE of 0.0342 mm/day, while SVM performed worst, with an MAE of 0.2430 mm/day. Inspection of our implementation suggests that the poor performance of SVM may be attributed to its sensitivity to noise, the high-dimensional feature space created by lagged, rolling, and interaction variables, and its inability to incorporate temporal relationships. The choice of kernel and the use of default parameters may also have reduced its accuracy relative to the other models. The LSTM + RF ensemble performed moderately well, with an MAE of 0.0275 mm/day and an RMSE of 0.1125 mm/day, but remained slightly behind the LSTM with the custom loss function. These experiments show that incorporating a loss function focused on heavy rainfall events allowed our model to outperform both the conventional baselines tested here and the deep learning methods reported in previous studies.
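For readers who wish to reproduce this kind of comparison, the four reported metrics can be computed as in the sketch below. It is written in plain Python for transparency; the function name regression_metrics and the toy values are hypothetical, and the SMAPE variant shown (absolute error divided by the mean of the absolute observed and predicted values, with zero-over-zero terms counted as zero) is an assumption, since several SMAPE definitions are in use.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R^2 and SMAPE (%) for paired observations and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [pred - obs for obs, pred in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(y_true) / n
    ss_tot = sum((obs - mean_obs) ** 2 for obs in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are identical")
    r2 = 1.0 - ss_res / ss_tot
    smape_terms = []
    for obs, pred in zip(y_true, y_pred):
        denom = (abs(obs) + abs(pred)) / 2.0
        # Dry days predicted as dry contribute zero error (assumed convention).
        smape_terms.append(0.0 if denom == 0 else abs(pred - obs) / denom)
    smape = 100.0 * sum(smape_terms) / n
    return {"mae": mae, "rmse": rmse, "r2": r2, "smape": smape}


# Toy values for illustration only (not study data).
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.0, 1.5, 2.0, 2.5]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because MAE and RMSE are in the units of the target (mm/day here) while R² and SMAPE are dimensionless, reporting all four gives complementary views of the same errors.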

While the LSTM architecture, together with the wavelet transform, offers strong capabilities for modeling complex temporal patterns in rainfall data, it also has some pitfalls. LSTMs inherently require large amounts of high-quality training data and careful hyperparameter tuning to avoid overfitting, which can limit their generalizability in data-scarce or very high-variance situations [66]. Adding wavelet-based feature extraction improves the capture of multi-scale temporal dynamics but increases model complexity and computational cost [67]. Deep learning models are also often “black boxes” in which it is difficult to explain how specific inputs contribute to the prediction, making it harder to establish trust and acceptance among stakeholders [68,69]. Despite these challenges, the strength of the hybrid approach in extreme event prediction and uncertainty estimation makes it a good match for flood forecasting, where rare but significant events must be reproduced. Future work should explore ways to improve model interpretability, reduce computational demands, and balance complexity with operational suitability.

Uncertainty, that sometimes unspoken elephant in the environmental modeling room, is addressed directly in this study through the application of a Bayesian framework [70]. Using the Metropolis–Hastings algorithm to estimate the posterior distribution of the prediction variance makes the model more interpretable [71]. The positively skewed uncertainty parameter captures the model’s greater conservatism in low-probability events, a welcome attribute when human life and equipment are at risk [72]. This probabilistic formulation makes the model more than a black-box predictor of events: it becomes a decision-support tool, capable of informing both automated systems and human decision-makers not only of what is expected to happen but also of how much confidence can be placed in that expectation. By providing a probabilistic measure of forecast uncertainty, the proposed Bayesian framework strengthens the operational use of rainfall predictions, making them more actionable for flood risk management and emergency preparedness.
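The mechanics of estimating a posterior for a Gaussian uncertainty parameter with Metropolis–Hastings can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the residuals are synthetic, the likelihood is zero-mean Gaussian, the prior is flat on log(sigma), the proposal is a symmetric random walk, and the names log_posterior and sample_sigma as well as the step size and chain length are hypothetical.

```python
import math
import random


def log_posterior(log_sigma, n, sum_sq):
    """Log posterior of log(sigma): zero-mean Gaussian likelihood, flat prior on log(sigma)."""
    sigma_sq = math.exp(2.0 * log_sigma)
    return -n * log_sigma - sum_sq / (2.0 * sigma_sq)


def sample_sigma(residuals, n_draws=5000, burn_in=1000, step=0.1, seed=0):
    """Random-walk Metropolis-Hastings on log(sigma); returns posterior draws of sigma."""
    rng = random.Random(seed)
    n = len(residuals)
    sum_sq = sum(r * r for r in residuals)
    current = 0.0  # start the chain at sigma = 1
    current_lp = log_posterior(current, n, sum_sq)
    draws = []
    for i in range(burn_in + n_draws):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, n, sum_sq)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            draws.append(math.exp(current))
    return draws


# Synthetic prediction residuals (not study data).
data_rng = random.Random(42)
residuals = [data_rng.gauss(0.0, 0.11) for _ in range(200)]

draws = sorted(sample_sigma(residuals))
posterior_mean = sum(draws) / len(draws)
lower = draws[int(0.025 * len(draws))]
upper = draws[int(0.975 * len(draws))]
rms_residual = math.sqrt(sum(r * r for r in residuals) / len(residuals))
print(f"posterior mean of sigma: {posterior_mean:.4f}")
print(f"95% credible interval: [{lower:.4f}, {upper:.4f}]")
print(f"RMS of residuals: {rms_residual:.4f}")
```

The credible interval, rather than a single point estimate, is what allows a forecast to be issued together with a statement of confidence. For operational use, convergence diagnostics and multiple chains would normally be added.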

Another important point is the sensitivity analysis, which acts somewhat like an X-ray by revealing the inner mechanics of the model’s decision-making [73]. The dominance of rainfall and land surface temperature in driving predictions aligns with physical hydrological intuition and indicates that the model is not merely accurate but also physically sound. NDVI’s relatively lower impact invites reflection; it may not be that vegetation is unimportant, but that in the context of short-term flood prediction, its signal is less immediate than the direct effects of precipitation and surface heating. This finding opens avenues for deeper investigations into when and where secondary environmental features become crucial, perhaps under different climatic regimes or land use types.
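One simple way to obtain such a ranking is a one-at-a-time perturbation analysis, sketched below. This is an illustration and not necessarily the procedure used in the study: toy_model and its coefficients are invented stand-ins for a trained predictor, the feature values are made up, and perturbation_sensitivity and the 10% perturbation size are hypothetical choices.

```python
def perturbation_sensitivity(predict, rows, feature_names, delta=0.1):
    """Mean absolute change in the prediction when one feature is scaled by (1 + delta)."""
    baseline = [predict(row) for row in rows]
    scores = {}
    for name in feature_names:
        changes = []
        for row, base in zip(rows, baseline):
            bumped = dict(row)  # copy so the original row is left untouched
            bumped[name] = row[name] * (1.0 + delta)
            changes.append(abs(predict(bumped) - base))
        scores[name] = sum(changes) / len(changes)
    return scores


def toy_model(row):
    """Hypothetical stand-in for a trained model; the coefficients are invented."""
    return 0.8 * row["rain"] + 0.3 * row["lst"] + 0.05 * row["ndvi"]


# Made-up, already-scaled feature rows (not study data).
rows = [
    {"rain": 1.0, "lst": 1.0, "ndvi": 1.0},
    {"rain": 2.0, "lst": 1.5, "ndvi": 0.5},
]
scores = perturbation_sensitivity(toy_model, rows, ["rain", "lst", "ndvi"])
ranking = sorted(scores, key=scores.get, reverse=True)
print("sensitivity ranking:", ranking)
```

A feature that is nearly constant across the dataset produces small changes under this scheme regardless of its physical importance, which is one reason a low score should not be read as irrelevance.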

Even with these advancements, the model’s difficulty in accurately reproducing the magnitude and timing of extreme rainfall is a reminder of how hard the task is. Extreme events are inherently messy, governed by nonlinear atmospheric dynamics that may not be fully reproducible in the current feature space. To address this shortfall, future models can incorporate attention mechanisms, a technique borrowed from natural language processing that allows models to “pay attention” to the most relevant pieces of data at each time step. This could enable more responsive adaptation to drastic changes in weather patterns. Ensemble learning techniques can also bring complementary strengths by averaging across different model predictions to reduce variance and enhance robustness.
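The core of an attention mechanism over time steps is small enough to sketch directly. The example below shows dot-product attention pooling in plain Python; it is a conceptual illustration only, the hidden states and query vector are made-up numbers (in a trained network both would be learned), and the names softmax and attention_pool are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Dot-product attention over time steps: returns (weights, context vector)."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context


# Made-up hidden states for four time steps; the third step stands for a
# sudden, intense event.
hidden_states = [[0.1, 0.0], [0.2, 0.1], [2.0, 1.5], [0.3, 0.1]]
query = [1.0, 1.0]  # hypothetical; learned during training in practice

weights, context = attention_pool(hidden_states, query)
print("attention weights:", [round(w, 3) for w in weights])
print("context vector:", [round(c, 3) for c in context])
```

The weights sum to one and concentrate on the time step whose hidden state aligns most strongly with the query, so the pooled context vector is dominated by the abrupt event rather than by an average over all steps.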

In addition, some input parameters, such as soil particle density, were taken to be invariant (2.65 g/cm³) because of sparse data. While this is a common assumption, differences in soil composition from site to site could influence model accuracy through their effects on soil moisture and runoff characteristics. Replacing such assumptions with site-specific soil data would further improve future model simulations. Moreover, the river basins investigated in this study are subject to varying anthropogenic influences, such as urbanization, land use change, and water management, all of which exert significant effects on hydrological processes and flood risk. Because our modeling system primarily addresses rainfall estimation from meteorological and environmental inputs alone, it does not explicitly take these human-induced processes into account. It is essential to recognize this limitation, since anthropogenic processes can alter runoff regimes, soil properties, and local microclimates, thus influencing flood behavior. Future research should aim to integrate information on human influence and land use change into the model framework to improve flood-forecasting accuracy and applicability in highly regulated or urban catchments.

5. Study Limitations

In addition to the assumptions discussed above, limitations in the input datasets should also be noted. The rainfall data used still contain gaps in time and measurement inconsistencies. Interpolation methods were applied to missing values, but this introduces uncertainty, especially during periods of extreme meteorological transition. Moreover, the density of rain gauges over the study area is heterogeneous, which may lead to uneven data quality and coverage. Areas with low gauge density may fail to represent rainfall fluctuations at the local scale, particularly under convective or extreme events. These issues highlight the need for investment in more spatially distributed and reliable monitoring networks. Future advances in model performance will also depend on integrating richer and more wide-ranging datasets, ideally combining ground observations with satellite-derived data to remove spatial biases and fill temporal gaps.
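The way gap filling can smooth over real variability is easy to see in a small example. The sketch below uses linear interpolation between the nearest observed neighbours and returns a flag for every filled value so that downstream steps can treat those points with more caution. The specific interpolation method used in the study is not restated here, so linear interpolation is an assumption, and interpolate_gaps and the sample values are hypothetical.

```python
def interpolate_gaps(values):
    """Fill None entries by linear interpolation between observed neighbours.

    Leading and trailing gaps take the nearest observed value.
    Returns (filled_values, was_filled_flags).
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series contains no observed values")
    filled = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            fraction = (i - left) / (right - left)
            filled[i] = values[left] + fraction * (values[right] - values[left])
    flags = [v is None for v in values]
    return filled, flags


# Made-up rainfall record (mm) with missing entries marked as None.
record = [12.0, None, None, 0.0, None]
filled, flags = interpolate_gaps(record)
print("filled:", filled)
print("interpolated positions:", [i for i, f in enumerate(flags) if f])
```

If a short, intense event fell inside the gap, the interpolated values would miss it entirely, which is why the uncertainty introduced by gap filling is largest during rapid meteorological transitions.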

The proposed framework performs satisfactorily in predicting extreme rainfall events in the Al-Batinah region. However, several limitations of the approach should be acknowledged. First, the model was trained and evaluated on data from a single geographic region with specific climatic and topographic characteristics. While the inclusion of static features, namely soil properties and topography, adds physical relevance, the generalization capability of the model has not been explicitly tested in other hydro-climatic regions. Future studies should examine the transferability of the proposed loss function and associated modeling framework to different climatic zones.

Second, the forecasting model and framework were based on datasets aggregated on a monthly basis, which can make them less effective at identifying the short-term rainfall occurrences associated with flash flood events. Although wavelet analysis helps to highlight high-frequency elements, a short-term forecasting model would require higher-frequency datasets. Third, although Bayesian uncertainty estimation using MCMC provides probabilistic insight into the confidence of a prediction, the computational cost of MCMC sampling could be an obstacle to fully real-time operational deployment. It would be worth considering approximate Bayesian or ensemble-based uncertainty methods that might improve computational efficiency while retaining uncertainty awareness. In addition, the weighting parameter in the custom loss function was chosen by validation-based tuning rather than a more formal optimization approach; future research could therefore investigate adaptive or data-driven optimization strategies for loss function parameters. With these limitations in mind, the framework represents a sound starting point that is open to various methodological and operational avenues for further improvement in predicting extreme rainfall.

6. Conclusions

This research has developed, evaluated, and optimized an advanced LSTM-based rainfall prediction model with Bayesian uncertainty estimation and sensitivity analysis. Its capability in predicting rainfall events accurately in real-time, even with the presence of complex temporal dependencies, has shown great promise for hydrological forecasting and flood risk management.

The customized loss function, which places strong emphasis on deviations during high-intensity rainfall, proved very useful in enhancing the predictive capabilities of the model. It allows the model to penalize large errors more heavily, especially in extreme events, and hence to capture rare but highly impactful rainfall dynamics more accurately. This enhancement is reflected in reduced error metrics that represent a substantial improvement over the earlier versions of the model. Beyond these performance gains, the Bayesian framework provides further insight into predictive uncertainty. The posterior distribution of the Gaussian uncertainty parameter underlines the variability of extreme rainfall events and points to the necessity of probabilistic approaches in hydrological modeling. This quantification of uncertainty allows for risk-informed decision making by providing actionable information for flood warnings, infrastructure design, and disaster preparedness in the context of climate variability. The sensitivity analysis has also shown that rainfall and land surface temperature are the most influential variables driving the model’s prediction, while NDVI played a lesser role. These results show that accurate, high-resolution acquisition of essential meteorological variables is necessary to obtain reliable flood risk predictions.

The customized loss function developed in this work opens new perspectives for further improvement. Future studies should be directed at developing more advanced loss functions with adaptive penalties, based either on prior domain knowledge or on time-varying thresholds. It may also be possible to obtain more precise predictions of extreme rainfall events by integrating probabilistic weighting or event-based components into the loss function. Moreover, such loss functions should be designed and tested over sufficiently large regions to capture the variation in critical factors such as soil composition, road distance, and terrain slope, so that the environmental variables influencing flood risk are better represented. Integrating other advanced techniques, such as an attention mechanism or ensemble learning, with the customized loss function could further refine model skill for complex rainfall patterns.

Although the current work is implemented in the Al-Batinah area, the underlying modeling framework and the adapted loss function were designed with broader applicability in mind. The structure of the loss function allows it to fit data-constrained and skewed rainfall regimes, thereby enhancing its potential for transfer to similar hydroclimatic contexts in other regions. Upcoming research will test the model’s performance in such settings to verify this flexibility.

In conclusion, the investigation presented here has taken up the challenge of combining machine learning with tailored loss functions and uncertainty quantification in pursuit of enhanced rainfall forecasting. By addressing the identified deficiencies in predicting extreme rainfall events through a novel loss function design, this study provides a sound framework on which to build hydrological resilience and flood mitigation as we face an increasingly uncertain climate.

Author Contributions

G.A.-R.: Conceptualization, Methodology, Formal Analysis, Resources, Investigation, Project Administration, Writing—Original Draft. M.R.N.: Conceptualization, Formal Analysis, Investigation, Review and Editing. N.S.: Formal Analysis, Investigation, Validation, Review and Editing. M.A.-W.: Conceptualization, Validation, Review and Editing. All authors have read and agreed to the published version of the manuscript.

Funding

The authors thank Sultan Qaboos University (SQU) and Diwan of Royal Court for the financial support under His Majesty’s (HM) grant number SR/DVC/CESR/22/01.

Data Availability Statement

The original data presented in the study are openly available and obtained from publicly accessible and institutional data sources. Land Surface Temperature (LST) data are derived from the MODIS/Terra Land Surface Temperature/Emissivity 8-Day L3 Global 1 km SIN Grid (V061) product, available via NASA’s Land Processes Distributed Active Archive Center (LP DAAC) (https://lpdaac.usgs.gov/). Vegetation indices (NDVI) are sourced from the MODIS/Terra Vegetation Indices 16-Day L3 Global 500 m SIN Grid (V061), also available through NASA LP DAAC. Terrain slope data are obtained from the NASADEM Merged DEM Global 1 arc-second (V001) dataset, available via NASA Earthdata (https://www.earthdata.nasa.gov/). Soil properties, including sand, silt, clay fractions, bulk density, and gravel content for surface and subsurface layers, are derived from the SoilGrids dataset provided by ISRIC (https://isric.org/).

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Computational Environment and Libraries

The analysis was conducted using Python [74] in a standard environment with commonly available open-source libraries to ensure reproducibility. The main packages used include pandas 2.2.3 [75] and numpy 1.24.3 [76] for data manipulation, matplotlib 3.10.0 [77], seaborn 0.13.2 [78], and contextily 1.6.2 [79] for visualization, and scikit-learn 1.6.0 [80] and scipy.stats 1.14.1 [81] for machine learning and statistical analysis. Deep learning models, specifically LSTM networks, were implemented using tensorflow.keras 2.13.0 [82] with layers such as Dense, Dropout, and BatchNormalization. Signal processing was performed using the pywt library 1.5.0 [83], and geospatial data handling was facilitated by geopandas 1.0.1 [84]. For model evaluation, functions from sklearn.metrics such as mean_squared_error were employed. Additional utilities like os and time were also used [85]. The code was developed and tested in a typical Python 3.10.16 environment.

References

Ebi, K.L.; Bowen, K. Extreme events as sources of health vulnerability: Drought as an example. Weather. Clim. Extrem. 2016, 11, 95–102. [Google Scholar] [CrossRef]
Zwiers, F.W.; Alexander, L.V.; Hegerl, G.C.; Knutson, T.R.; Kossin, J.P.; Naveau, P.; Nicholls, N.; Schär, C.; Seneviratne, S.I.; Zhang, X. Climate extremes: Challenges in estimating and understanding recent changes in the frequency and intensity of extreme climate and weather events. In Climate Science for Serving Society: Research, Modeling and Prediction Priorities; Springer: Dordrecht, The Netherlands, 2013; pp. 339–389. [Google Scholar]
Olowoyeye, T.; Abegunrin, G.; Sojka, M. Are Agroecosystem Services Under Threat? Examining the Influence of Climate Externalities on Ecosystem Stability. Atmosphere 2024, 15, 1480. [Google Scholar] [CrossRef]
Diakakis, M.; Deligiannakis, G.; Antoniadis, Z.; Melaki, M.; Katsetsiadou, N.; Andreadakis, E.; Spyrou, N.; Gogou, M. Proposal of a flash flood impact severity scale for the classification and mapping of flash flood impacts. J. Hydrol. 2020, 590, 125452. [Google Scholar] [CrossRef]
Terzi, S.; Torresan, S.; Schneiderbauer, S.; Critto, A.; Zebisch, M.; Marcomini, A. Multi-risk assessment in mountain regions: A review of modelling approaches for climate change adaptation. J. Environ. Manag. 2019, 232, 759–771. [Google Scholar] [CrossRef]
Murray, A.T.; Carvalho, L.; Church, R.L.; Jones, C.; Roberts, D.; Xu, J.; Zigner, K.; Nash, D. Coastal vulnerability under extreme weather. Appl. Spat. Anal. Policy 2021, 14, 497–523. [Google Scholar]
Rijal, M.; Luo, P.; Mishra, B.K.; Zhou, M.; Wang, X. Global systematical and comprehensive overview of mountainous flood risk under climate change and human activities. Sci. Total Environ. 2024, 941, 173672. [Google Scholar] [CrossRef]
Shen, C. A transdisciplinary review of deep learning research and its relevance for water resources scientists. Water Resour. Res. 2018, 54, 8558–8593. [Google Scholar] [CrossRef]
Yaseen, Z.M. A new benchmark on machine learning methodologies for hydrological processes modelling: A comprehensive review for limitations and future research directions. Knowl. Based Eng. Sci. 2023, 4, 65–103. [Google Scholar] [CrossRef]
Saha, A.; Pal, S.C. Application of machine learning and emerging remote sensing techniques in hydrology: A state-of-the-art review and current research trends. J. Hydrol. 2024, 632, 130907. [Google Scholar] [CrossRef]
Sillmann, J.; Thorarinsdottir, T.; Keenlyside, N.; Schaller, N.; Alexander, L.V.; Hegerl, G.; Seneviratne, S.I.; Vautard, R.; Zhang, X.; Zwiers, F.W. Understanding, modeling and predicting weather and climate extremes: Challenges and opportunities. Weather Clim. Extrem. 2017, 18, 65–74. [Google Scholar] [CrossRef]
Li, X.; Sun, Q.L.; Zhang, Y.; Sha, J.; Zhang, M. Enhancing hydrological extremes prediction accuracy: Integrating diverse loss functions in Transformer models. Environ. Model. Softw. 2024, 177, 106042. [Google Scholar] [CrossRef]
Verma, S.; Srivastava, K.; Tiwari, A.; Verma, S. Deep learning techniques in extreme weather events: A review. arXiv 2023, arXiv:2308.10995. [Google Scholar] [CrossRef]
Shyalika, C.; Wickramarachchi, R.; Sheth, A.P. A comprehensive survey on rare event prediction. ACM Comput. Surv. 2024, 57, 1–39. [Google Scholar] [CrossRef]
Wang, Z.; Mae, M.; Yamane, T.; Ajisaka, M.; Nakata, T.; Matsuhashi, R. Novel Custom Loss Functions and Metrics for Reinforced Forecasting of High and Low Day-Ahead Electricity Prices Using Convolutional Neural Network–Long Short-Term Memory (CNN-LSTM) and Ensemble Learning. Energies 2024, 17, 4885. [Google Scholar] [CrossRef]
Li, Y.; Xu, J.; Anastasiu, D.C. An extreme-adaptive time series prediction model based on probability-enhanced LSTM neural networks. Proc. AAAI Conf. Artif. Intell. 2023, 37, 8684–8691. [Google Scholar] [CrossRef]
Gao, X.; Xie, D.; Zhang, Y.; Wang, Z.; He, C.; Yin, H.; Zhang, W. A Comprehensive Survey on Imbalanced Data Learning. arXiv 2025, arXiv:2502.08960. [Google Scholar] [CrossRef]
O’Donncha, F.; Hu, Y.; Palmes, P.; Burke, M.; Filgueira, R.; Grant, J. A spatio-temporal LSTM model to forecast across multiple temporal and spatial scales. Ecol. Inform. 2022, 69, 101687. [Google Scholar] [CrossRef]
Leščešen, I.; Tanhapour, M.; Pekárová, P.; Miklánek, P.; Bajtek, Z. Long Short-Term Memory (LSTM) Networks for Accurate River Flow Forecasting: A Case Study on the Morava River Basin (Serbia). Water 2025, 17, 907. [Google Scholar] [CrossRef]
Andika, N.; Wongso, P.; Rohmat, F.I.W.; Wulandari, S.; Fadhil, A.; Rosi, R.; Burnama, N.S. Machine learning-based hydrograph modeling with LSTM: A case study in the Jatigede Reservoir Catchment, Indonesia. Results Earth Sci. 2025, 3, 100090. [Google Scholar] [CrossRef]
Rajashekar, P. Enhancing Weather Forecasting Precision Through Advanced Machine Learning Techniques. Doctoral Dissertation, California State University, Northridge, CA, USA, 2024. [Google Scholar]
Roy, D.K. Long short-term memory networks to predict one-step ahead reference evapotranspiration in a subtropical climatic zone. Environ. Process. 2021, 8, 911–941. [Google Scholar] [CrossRef]
Lindemann, B.; Müller, T.; Vietz, H.; Jazdi, N.; Weyrich, M. A survey on long short-term memory networks for time series prediction. Procedia CIRP 2021, 99, 650–655. [Google Scholar] [CrossRef]
Nguyen-Duc, P.; Nguyen, H.D.; Nguyen, Q.H.; Phan-Van, T.; Pham-Thanh, H. Application of Long Short-Term Memory (LSTM) Network for seasonal prediction of monthly rainfall across Vietnam. Earth Sci. Inform. 2024, 17, 3925–3944. [Google Scholar]
Huang, H.; Chen, J.; Huo, X.; Qiao, Y.; Ma, L. Effect of multi-scale decomposition on performance of neural networks in short-term traffic flow prediction. IEEE Access 2021, 9, 50994–51004. [Google Scholar] [CrossRef]
Wang, T.; Peng, D.; Wang, X.; Wu, B.; Luo, R.; Chu, Z.; Sun, H. Study on wavelet multi-scale analysis and prediction of landslide groundwater. J. Hydroinform. 2024, 26, 237–254. [Google Scholar] [CrossRef]
Shen, X.; Yu, Y.; Yan, J.; Hou, C.; Zhang, S. Water Level Prediction at Cascade Pump Stations Based on Multi-Scale Augmented Temporal Decomposition Network. In Proceedings of the 2024 IEEE/WIC International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), Bangkok, Thailand, 9–12 December 2024; pp. 556–561. [Google Scholar]
Ouma, Y.O.; Cheruyot, R.; Wachera, A.N. Rainfall and runoff time-series trend analysis using LSTM recurrent neural network and wavelet neural network with satellite-based meteorological data: Case study of Nzoia hydrologic basin. Complex Intell. Syst. 2022, 8, 213–236. [Google Scholar]
Yi, K.; Zhang, Q.; Fan, W.; Cao, L.; Wang, S.; He, H.; Long, G.; Hu, L.; Wen, Q.; Xiong, H. A survey on deep learning based time series analysis with frequency transformation. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, Toronto, ON, Canada, 3–7 August 2025; pp. 6206–6215. [Google Scholar]
Wang, J.; Wang, Z.; Li, J.; Wu, J. Multilevel wavelet decomposition network for interpretable time series analysis. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 2437–2446. [Google Scholar]
Wang, Y.; Han, Y.; Guo, Y. Self-adaptive extreme penalized loss for imbalanced time series prediction. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, Jeju, Republic of Korea, 3–9 August 2024; pp. 5135–5143. [Google Scholar]
Niu, H.; Murray, S.; Jaber, F.; Heidari, B.; Duffield, N. Tail-Aware Forecasting of Precipitation Extremes Using STL-GEV and LSTM Neural Networks. Hydrology 2025, 12, 284. [Google Scholar]
You, X.X.; Liang, Z.M.; Wang, Y.Q.; Zhang, H. A study on loss function against data imbalance in deep learning correction of precipitation forecasts. Atmos. Res. 2023, 281, 106500. [Google Scholar]
Yang, Y.; Feng, D.; Beck, H.E.; Hu, W.; Abbas, A.; Sengupta, A.; Monache, L.D.; Hartman, R.; Lin, P.; Shen, C.; et al. Global daily discharge estimation based on grid long short-term memory (LSTM) model and river routing. Water Resour. Res. 2025, 61, e2024WR039764. [Google Scholar]
Omar, P.J.; Kumar, V. Land surface temperature retrieval from TIRS data and its relationship with land surface indices. Arab. J. Geosci. 2021, 14, 1897. [Google Scholar] [CrossRef]
Sahoo, A.; Parida, S.S.; Samantaray, S.; Satapathy, D.P. Daily flow discharge prediction using integrated methodology based on LSTM models: Case study in Brahmani-Baitarani basin. HydroResearch 2024, 7, 272–284. [Google Scholar] [CrossRef]
Saxton, K.E.; Rawls, W.J.; Romberger, J.S.; Papendick, R.I. Estimating Generalized Soil-water Characteristics from Texture. Soil Sci. Soc. Am. J. 1986, 50, 1031–1036. [Google Scholar] [CrossRef]
FAO. Harmonized World Soil Database (Version 1.2); Food Agriculture Organization: Rome, Italy; IIASA: Laxenburg, Austria, 2012; Available online: https://www.fao.org/soils-portal/data-hub/soil-maps-and-databases/harmonized-world-soil-database-v12/en/ (accessed on 24 November 2024).
Wieder, W.R.; Boehnert, J.; Bonan, G.B. Evaluating soil biogeochemistry parameterizations in Earth system models with observations. Glob. Biogeochem Cycles 2014, 28, 211–222. [Google Scholar] [CrossRef]
Wan, Z.; Hook, S.; Hulley, G. MODIS/Terra Land Surface Temperature/Emissivity 8-Day L3 Global 1km SIN Grid V061. NASA EOSDIS Land Processes Distributed Active Archive Center. 2021. Available online: https://www.earthdata.nasa.gov/data/catalog/lpcloud-mod11a2-061 (accessed on 24 November 2024).
Didan, K. MODIS/Terra Vegetation Indices 16-Day L3 Global 500m SIN Grid V061. NASA EOSDIS Land Processes Distributed Active Archive Center. 2021. Available online: https://www.earthdata.nasa.gov/data/catalog/lpcloud-mod13a1-061 (accessed on 24 November 2024).
NASA JPL. NASADEM Merged DEM Global 1 arc Second nc V001. NASA EOSDIS Land Processes Distributed Active Archive Center. 2020. Available online: https://data.nasa.gov/dataset/nasadem-merged-dem-global-1-arc-second-nc-v001-c5d1f (accessed on 24 November 2024).
Tanriverdi, C.; Atilgan, A.; Degirmenci, H.; Akyuz, A. Comparasion of crop water stress index (CWSI) and water deficit index (WDI) by using remote sensing (RS). Infrastrukt. I Ekol. Teren. Wiej. 2017, 3, 879–894. [Google Scholar]
Flint, A.L.; Flint, L.E. 2.2 Particle Density. In Methods of Soil Analysis: Part 4 Physical Methods; Soil Science Society of America: Madison, WI, USA, 2002; Volume 5, pp. 229–240. [Google Scholar]
Blake, G.R.; Hartge, K.H. Particle density. In Methods of Soil Analysis: Part 1 Physical and Mineralogical Methods; Soil Science Society of America: Madison, WI, USA, 1986; Volume 5, pp. 377–382. [Google Scholar]
Eluozo, S.N. Predictive model to monitor the rate of bulk density in fine and coarse soil formation influenced variation of porosity in coastal area of Port Harcourt. Am. J. Eng. Sci. Technol. Res. 2013, 1, 115–127. [Google Scholar]
Sparks, D.L.; Singh, B.; Siebecker, M.G. Environmental Soil Chemistry; Elsevier: Amsterdam, The Netherlands, 2022. [Google Scholar]
Rhif, M.; Ben Abbes, A.; Farah, I.R.; Martínez, B.; Sang, Y. Wavelet transform application for/in non-stationary time-series analysis: A review. Appl. Sci. 2019, 9, 1345. [Google Scholar] [CrossRef]
Sang, Y.F. A review on the applications of wavelet transform in hydrology time series analysis. Atmos. Res. 2013, 122, 8–15. [Google Scholar] [CrossRef]
Liu, P.; Zhou, Z.; Gu, F.; LuyangZhang; Song, Y.; Lu, S. AW-SARIMA: Efficient Hybrid Framework for Nonstationary Time Series Forecasting via DWT and Adaptive Thresholding. In Proceedings of the International Conference on Intelligent Computing, Ningbo, China, 26–29 July 2025; pp. 37–47. [Google Scholar]
Nourani, V.; Baghanam, A.H.; Adamowski, J.; Kisi, O. Applications of hybrid wavelet–artificial intelligence models in hydrology: A review. J. Hydrol. 2014, 514, 358–377.
Serravalle Reis Rodrigues, V.H.; de Melo Barros Junior, P.R.; dos Santos Marinho, E.B.; Lima de Jesus Silva, J.L. Wavelet gated multiformer for groundwater time series forecasting. Sci. Rep. 2023, 13, 12726.
Bharadiya, J.P. Exploring the use of recurrent neural networks for time series forecasting. Int. J. Innov. Sci. Res. Technol. 2023, 8, 2023–2027.
Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
Thakkar, A.; Lohiya, R. Analyzing fusion of regularization techniques in the deep learning-based intrusion detection system. Int. J. Intell. Syst. 2021, 36, 7340–7388.
Kruskal, W.H.; Wallis, W.A. Use of ranks in one-criterion variance analysis. J. Am. Stat. Assoc. 1952, 47, 583–621.
Metropolis, N.; Rosenbluth, A.W.; Rosenbluth, M.N.; Teller, A.H.; Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 1953, 21, 1087–1092.
Hastings, W.K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 1970, 57, 97–109.
Elsberry, R.L. Predicting hurricane landfall precipitation: Optimistic and pessimistic views from the symposium on precipitation extremes. Bull. Am. Meteorol. Soc. 2002, 83, 1333–1339.
Engelbrecht, C.J.; Engelbrecht, F.A.; Dyson, L.L. High-resolution model-projected changes in mid-tropospheric closed-lows and extreme rainfall events over southern Africa. Int. J. Climatol. 2013, 33, 173–187.
Poornima, S.; Pushpalatha, M. Prediction of rainfall using intensified LSTM based recurrent neural network with weighted linear units. Atmosphere 2019, 10, 668.
Terbuch, A. LSTM Hyperparameter Optimization: Impact of the Selection of Hyperparameters on Machine Learning Performance when Applied to Time Series in Physical Systems. Master’s Thesis, Technical University of Leoben, Leoben, Austria, 2021.
Munoz, A.; Ertlé, R.; Unser, M. Continuous wavelet transform with arbitrary scales and O(N) complexity. Signal Process. 2002, 82, 749–757.
Guidotti, R.; Monreale, A.; Ruggieri, S.; Turini, F.; Giannotti, F.; Pedreschi, D. A survey of methods for explaining black box models. ACM Comput. Surv. 2018, 51, 1–42.
Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215.
Bastin, L.; Cornford, D.; Jones, R.; Heuvelink, G.B.; Pebesma, E.; Stasch, C.; Nativi, S.; Mazzetti, P.; Williams, M. Managing uncertainty in integrated environmental modelling: The UncertWeb framework. Environ. Model. Softw. 2013, 39, 116–134.
Kaji, T.; Ročková, V. Metropolis–Hastings via classification. J. Am. Stat. Assoc. 2023, 118, 2533–2547.
Hannachi, A. Quantifying changes and their uncertainties in probability distribution of climate variables using robust statistics. Clim. Dyn. 2006, 27, 301–317.
Insua, D.R. Sensitivity analysis in multi-objective decision making. In Sensitivity Analysis in Multi-Objective Decision Making; Springer: Berlin/Heidelberg, Germany, 1990; pp. 74–126.
Van Rossum, G.; Drake, F.L. Python/C Api Manual-Python 3; CreateSpace: Scotts Valley, CA, USA, 2009.
McKinney, W. Data structures for statistical computing in Python. Scipy 2010, 445, 51–56.
Harris, C.R.; Millman, K.J.; van der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362.
Hunter, J.D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 2007, 9, 90–95.
Waskom, M.L. Seaborn: Statistical data visualization. J. Open Source Softw. 2021, 6, 3021.
Arribas-Bel, D. Contextily: Context Geo Tiles in Python, Version 1.6.2; GitHub. 2023. Available online: https://contextily.readthedocs.io (accessed on 15 November 2024).
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods 2020, 17, 261–272.
Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv 2016, arXiv:1603.04467.
Lee, G.; Gommers, R.; Waselewski, F.; Wohlfahrt, K.; O’Leary, A. PyWavelets: A Python package for wavelet analysis. J. Open Source Softw. 2019, 4, 1237.
Jordahl, K.; Van Den Bossche, J.; Wasserman, J.; McBride, J.; Fleischmann, M.; Gerard, J.; Tratner, J.; Perry, M.; Farmer, C.; Hjelle, G.A.; et al. Geopandas/Geopandas: V0.1.0 [Computer software]. Zenodo. 2020. Available online: https://zenodo.org/records/3669853 (accessed on 15 November 2024).
Python Software Foundation. Os—Miscellaneous Operating System Interfaces (Python Standard Library). Python. 2023. Available online: https://docs.python.org/3/library/os.html (accessed on 15 November 2024).
Python Software Foundation. Time—Time Access and Conversions (Python Standard Library). Python. 2023. Available online: https://docs.python.org/3/library/time.html (accessed on 15 November 2024).
Python Software Foundation. Python Language Reference, Version 3.10.16. Available online: https://www.python.org/ (accessed on 15 November 2024).

Figure 1. Location map of the Al Batinah region, Oman, showing the distribution of rain gauge stations (red dots). The main map highlights the Al Batinah region (blue border) on the northeastern coast of Oman; the inset maps show its position within Oman (top right) and within the Middle East (bottom right). The map was produced using Python (see Appendix A).

Figure 2. The topographic and slope characteristics of the study area (Appendix A).

Figure 3. Architecture of the proposed two-layer Long Short-Term Memory (LSTM) model. The network consists of two stacked LSTM layers followed by dense and output layers, designed to capture temporal dependencies in flood-related time series data. Dropout and batch normalization were used as regularization to improve generalization and reduce overfitting (Appendix A).

Figure 4. Flowchart of the LSTM-based methodology. It illustrates the complete workflow from raw data preprocessing through feature engineering, wavelet decomposition, model training, and uncertainty quantification using MCMC.

Figure 5. Scatter plot and QQ plot of model predictions for rainfall: (a) predicted versus actual rainfall values for LSTM, RF, SVM, ANN, and the ensemble model showing alignment with the ideal 1:1 line; (b) QQ plot comparing predicted and observed quantiles.

Figure 6. (a) Trace plot of the posterior samples of σ from the MCMC chain, with the burn-in period marked by a dashed red line. (b) Running mean of the σ samples, showing convergence to the posterior mean. (c) Posterior distribution of σ, with the posterior mean marked by a dashed red line and the 95% credible interval shaded in blue. (d) Autocorrelation function of the σ chain.

Figure 7. Wavelet decomposition of the Water Stress Index (WSI). The detail components capture short-term changes whose amplitude decreases over time, indicating stabilization, while the trend component indicates a steady long-term decrease in water stress, possibly due to climate-driven drought or reduced water supply.
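
The split into a trend component and detail components described in Figure 7 can be illustrated with a minimal multi-level discrete wavelet transform. The sketch below is only illustrative: it assumes a Haar wavelet and an input whose length is divisible by 2 to the power of the number of levels, neither of which is stated in this section, and it uses plain Python rather than the PyWavelets package cited in Appendix A. The function names and the toy series are hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar transform (assumes even length)."""
    scale = 1.0 / math.sqrt(2.0)
    # Pairwise sums give the smooth (low-frequency) approximation.
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    # Pairwise differences give the detail (high-frequency) coefficients.
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Repeatedly split the approximation to obtain a trend plus per-level details."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)  # details[0] is the finest scale
    return approx, details


# Hypothetical monthly index values, used only to exercise the functions.
series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
trend, details = haar_decompose(series, levels=2)
```

Because the Haar transform is orthonormal, the sum of squared coefficients across the trend and all detail levels equals the sum of squares of the original series, which is a convenient sanity check on any decomposition of this kind.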

Table 1. Description of datasets and variables used in the study, including soil properties, precipitation, vegetation, land surface temperature, and topographic data, along with their sources, spatial resolution, and temporal resolution.

Data	Description	Source	Spatial Resolution	Temporal Resolution
S_SAND, S_SILT, S_CLAY, T_SAND, T_SILT, T_CLAY, S_BULK_DEN, T_BULK_DEN, S_GRAVEL, T_GRAVEL	Surface and subsurface soil texture (% sand, silt, clay), bulk density (g/cm³), and gravel content (%)	Derived from FAO-HWSD (FAO, 2012) [38], Saxton et al. (1986) [37], and Wieder et al. (2014) [39]	~1 km (rasterized)	Static
PRCP	Precipitation (mm)	In situ measurements from the Ministry of Regional Municipalities and Water Resources	Station-based	Daily (aggregated to monthly)
NDVI	Normalized Difference Vegetation Index	MODIS/Terra MOD13A1 (Didan, 2021) [41]	500 m	16-day (aggregated to monthly)
LST	Land Surface Temperature (°C)	MODIS/Terra MOD11A2 (Wan et al., 2021) [40]	1 km	8-day (aggregated to monthly)
DEM	Elevation (m) and Derived Topography	NASADEM (NASA JPL, 2020) [36]	30 m (~1 arc-second)	Static

Table 2. Performance evaluation metrics used to assess model accuracy include Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Coefficient of Determination (R²), and Symmetric Mean Absolute Percentage Error (SMAPE).

Metric	Symbol	Equation	Units	Description/Interpretation
Mean Absolute Error	MAE	$\frac{1}{n} \sum_{i = 1}^{n} \left| y_{i} - \hat{y}_{i} \right|$	mm	Average magnitude of prediction errors. Lower values indicate better predictive accuracy.
Root Mean Squared Error	RMSE	$\sqrt{\frac{1}{n} \sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}$	mm	Square root of the mean squared differences between observed and predicted values. Penalizes large errors more heavily.
Coefficient of Determination	R²	$1 - \frac{\sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i = 1}^{n} (y_{i} - \bar{y})^{2}}$	–	Proportion of variance in observed data explained by the model. Values closer to 1 indicate a better fit.
Symmetric Mean Absolute Percentage Error	SMAPE	$\frac{100}{n} \sum_{i = 1}^{n} \frac{\left| y_{i} - \hat{y}_{i} \right|}{(\left| y_{i} \right| + \left| \hat{y}_{i} \right|) / 2}$	percentage (%)	Measures relative prediction error, normalized by the mean of observed and predicted values. Useful for comparing errors across different scales.
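
The four metrics in Table 2 translate directly into code. The following is a minimal plain-Python sketch, not the authors' implementation. One detail is an assumption: when both the observed and predicted values are zero, the SMAPE term is undefined, and the sketch counts it as zero error. The sample values are hypothetical.

```python
import math


def mae(y, y_hat):
    """Mean Absolute Error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r2(y, y_hat):
    """Coefficient of determination, using the mean of the observations."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def smape(y, y_hat):
    """Symmetric MAPE in percent; 0/0 terms are treated as zero (assumption)."""
    total = 0.0
    for a, b in zip(y, y_hat):
        denom = (abs(a) + abs(b)) / 2.0
        if denom > 0.0:
            total += abs(a - b) / denom
    return 100.0 * total / len(y)


# Hypothetical observed and predicted rainfall values.
observed = [0.0, 1.0, 2.0, 4.0]
predicted = [0.0, 1.5, 2.0, 3.0]
scores = {
    "MAE": mae(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "R2": r2(observed, predicted),
    "SMAPE": smape(observed, predicted),
}
```

Note that R² is unbounded below, so a model that predicts far worse than the observed mean can produce a large negative value.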

Table 3. Performance metrics of the rainfall prediction models. The Kruskal–Wallis H-test indicates a statistically significant difference in prediction errors among the models (H = 14,021.12, p < 0.001); the custom-loss LSTM and the LSTM + RF ensemble achieve the best scores on every metric.

Ranking	Model	MAE (mm/day)	RMSE (mm/day)	R² (–)	SMAPE (%)
1	LSTM (Custom Loss)	0.0222	0.1098	0.8068	7.62
2	Ensemble (LSTM + RF)	0.0275	0.1125	0.7971	13.65
3	Random Forest	0.0339	0.1257	0.747	21.19
4	ANN	0.0342	0.155	0.615	32.05
5	SVM	0.243	15.1356	−3668.31	51.77
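
The Kruskal–Wallis H statistic reported with Table 3 is computed from the ranks of the pooled prediction errors. The sketch below shows the standard rank-based formula with tie correction in plain Python. It is not the authors' code, and in practice a library routine such as `scipy.stats.kruskal` would normally be used. The per-model error lists are hypothetical.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    # Pool all values, remembering which group each one came from.
    pooled = sorted((value, g) for g, group in enumerate(groups) for value in group)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    h = 12.0 / (n * (n + 1)) * sum(
        rs ** 2 / len(group) for rs, group in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


# Hypothetical absolute errors for three models.
errors_a = [0.01, 0.02, 0.03]
errors_b = [0.04, 0.05, 0.06]
errors_c = [0.07, 0.08, 0.09]
h_stat = kruskal_wallis_h(errors_a, errors_b, errors_c)
```

Under the null hypothesis, H is compared against a chi-square distribution with k − 1 degrees of freedom, where k is the number of groups. A significant result only indicates that at least one group differs; identifying which models differ requires pairwise post hoc comparisons.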

Table 4. Comparison of LSTM model performance with different loss functions.

Loss Function	RMSE (All)	RMSE (Top 10%)	MAE (Top 10%)
MSE	0.1289	0.3052	0.158
MAE	0.1188	0.2783	0.148
Quantile_0.9	1.2205	2.9165	1.095
Custom Weighted Loss	0.1201	0.2816	0.1335
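
The exact form of the custom weighted loss is not given in this section. As a rough illustration of the general idea, the sketch below assumes a weighted squared error in which observations at or above a threshold receive a larger weight. The threshold, the weight of 5, the function names, and the definition of the "top 10%" subset (the largest observed values) are all assumptions made for illustration, not the authors' definitions.

```python
import math


def weighted_mse(y_true, y_pred, threshold, extreme_weight=5.0):
    """Weighted MSE: errors on observations >= threshold count more (assumed form)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for yt, yp in zip(y_true, y_pred):
        w = extreme_weight if yt >= threshold else 1.0
        weighted_sum += w * (yt - yp) ** 2
        weight_total += w
    return weighted_sum / weight_total


def rmse_top_fraction(y_true, y_pred, fraction=0.10):
    """RMSE restricted to the largest `fraction` of observed values (assumed definition)."""
    k = max(1, math.ceil(fraction * len(y_true)))
    top = sorted(zip(y_true, y_pred), key=lambda pair: pair[0], reverse=True)[:k]
    return math.sqrt(sum((a - b) ** 2 for a, b in top) / k)


# Hypothetical values: one extreme event that the model underestimates.
y_true = [0.0, 0.0, 10.0]
y_pred = [1.0, 0.0, 8.0]
loss_weighted = weighted_mse(y_true, y_pred, threshold=5.0)
loss_plain = weighted_mse(y_true, y_pred, threshold=5.0, extreme_weight=1.0)
```

With these toy values the weighted loss exceeds the unweighted one because the underestimated extreme event dominates, which is the behavior such a loss is meant to encourage during training.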

Table 5. Summary of MCMC simulation output and convergence statistics for the parameter σ. The table reports the number of iterations, burn-in, step size, acceptance rate, effective sample size, Monte Carlo standard error, Geweke diagnostic Z-score, and posterior statistics, including the mean, standard deviation, and 95% credible interval. The convergence diagnostics (Geweke Z-score, effective sample size) and an acceptance rate within the recommended range indicate that the posterior estimates are statistically acceptable.

Statistic	Value	Notes
Monte Carlo Standard Error	0.0004	Precision of posterior estimates
Geweke Z-score	0.262	Convergence diagnostic
Posterior Mean of σ	0.118	Mean estimate after burn-in
Posterior Std Dev of σ	0.001	Uncertainty (standard deviation)
95% Credible Interval	[0.117, 0.120]	Interval covering 95% of posterior mass
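
Posterior summaries such as those in Table 5 can be obtained with a random-walk Metropolis sampler. The sketch below is a minimal illustration under stated assumptions, not the authors' configuration: residuals are modeled as zero-mean Gaussian with unknown standard deviation σ, the prior on σ is flat for σ > 0, and the iteration count, burn-in, step size, starting value, and synthetic residuals are hypothetical.

```python
import math
import random


def log_posterior(sigma, n, sse):
    """Log posterior of sigma for zero-mean Gaussian residuals with a flat prior."""
    if sigma <= 0.0:
        return float("-inf")
    return -n * math.log(sigma) - sse / (2.0 * sigma ** 2)


def metropolis_sigma(residuals, n_iter=5000, burn_in=1000, step=0.01,
                     start=0.2, seed=0):
    """Random-walk Metropolis sampler for sigma; returns summary statistics."""
    rng = random.Random(seed)
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    current = start
    lp_current = log_posterior(current, n, sse)
    samples = []
    accepted = 0
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)  # symmetric proposal
        lp_proposal = log_posterior(proposal, n, sse)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_proposal - lp_current)):
            current, lp_current = proposal, lp_proposal
            accepted += 1
        samples.append(current)
    kept = sorted(samples[burn_in:])
    mean = sum(kept) / len(kept)
    lower = kept[int(0.025 * len(kept))]
    upper = kept[int(0.975 * len(kept)) - 1]
    return mean, (lower, upper), accepted / n_iter


# Synthetic residuals with a known standard deviation of 0.12 (hypothetical data).
data_rng = random.Random(42)
residuals = [data_rng.gauss(0.0, 0.12) for _ in range(500)]
post_mean, interval, acceptance_rate = metropolis_sigma(residuals)
```

The posterior mean is the average of the post-burn-in samples, and the 95% credible interval is taken from their empirical 2.5% and 97.5% quantiles. Diagnostics such as the Geweke Z-score and effective sample size would be computed from the same retained chain.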

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

