Deep Learning-Based River Flow Forecasting with MLPs: Comparative Exploratory Analysis Applied to the Tejo and the Mondego Rivers
Abstract
1. Introduction
“Can the systematic deployment of MLP models—optimized through extensive hyperparameter tuning and evaluated in comparison with alternative machine learning (ML) techniques—provide reliable and robust short-term river flow forecasts in dam-regulated rivers, under both normal and raised flow conditions?”
2. Related Work
2.1. ML and DL Models for River Flow Forecasting
- Autoregressive Integrated Moving Average (ARIMA): ARIMA models are traditional time series forecasting tools that capture temporal trends, seasonality, and time-dependent patterns in data. They are particularly effective in analyzing historical river flow data and providing baseline forecasts [11,12]. Despite their interpretability, ARIMA models often struggle with non-stationary data and the nonlinear patterns typical of hydrological processes [9].
- Linear Regression (LR): LR models the relationship between river flow and external factors such as precipitation and temperature through a linear equation. Although simple and interpretable, LR is limited in its ability to capture nonlinear interactions and complex dependencies that often characterize river flow processes [13].
- Multilayer Perceptron (MLP): MLPs are feedforward neural networks capable of capturing nonlinear relationships in data. They are well suited for short-term river flow predictions, especially when high temporal resolution data are available. While computationally efficient and flexible, MLPs require extensive hyperparameter tuning, and their performance is sensitive to the quality of input data. Furthermore, the default optimization algorithm (gradient descent) commonly used in MLPs often converges to local minima, particularly when handling highly stochastic time series data like river streamflow. This limitation, along with the risk of overfitting, can lead to inaccurate predictions, thus necessitating more advanced optimization strategies [14].
- Support Vector Machines (SVMs): SVMs are supervised learning models that, in their regression form, predict continuous outcomes by fitting a function within a tolerance margin around the training data. They effectively model complex nonlinear relationships using kernel functions, but their performance is sensitive to the choice of kernel and hyperparameters and to noise in the data. In hydrological applications like river flow forecasting, this sensitivity may result in unreliable forecasts when the data display significant variations or noise, thereby decreasing overall forecasting accuracy [15].
- Random Forests (RFs): RFs are ensemble learning methods that construct multiple decision trees and merge their results to improve predictive accuracy and control overfitting [16]. As noted in [15], RF-based approaches sometimes underestimate extreme values in longer forecast horizons (e.g., 2- and 3-h projections), thereby diminishing their ability to accurately predict severe hydrological phenomena.
- Extreme Learning Machines (ELMs): ELMs are single-hidden-layer feedforward neural networks with randomly assigned weights and biases, offering rapid training speeds and good generalization performance. Despite their efficiency, ELMs can face accuracy limitations when modeling complex and highly dynamic hydrological scenarios. Additional disadvantages include their dependence on random initialization, which can lead to inconsistent performance across experiments, and the need for a large number of hidden neurons to achieve competitive accuracy, which increases both computational complexity and the risk of overfitting [17].
- Recurrent Neural Networks (RNNs): RNNs are designed to handle sequential data by maintaining a memory of previous inputs, making them effective for modeling dynamic systems such as river flow. However, standard RNNs often face challenges such as vanishing gradients that hinder their ability to learn long-term dependencies. Furthermore, they can suffer from instability issues such as gradient explosion, require the fine-tuning of learning rates and other meta-parameters, and generally struggle to capture long-term dynamics in complex hydrological time series [18].
- Long Short-Term Memory (LSTM): LSTMs, an advanced RNN architecture, overcome the vanishing gradient problem and capture long-term dependencies in time series data [19]. However, the complexity and computational requirements of LSTMs, as highlighted by Rahimzad et al. (2021), can become significant challenges, especially with large datasets [20].
- Convolutional Neural Networks (CNNs): Originally developed for spatial data, CNNs have been adapted for time series forecasting by capturing local patterns in hydrological data, such as precipitation and topography. Because CNNs are inherently designed to extract spatial features, they can struggle with modeling long-term temporal dependencies in hydrological processes and may require large amounts of training data and careful kernel size design to effectively capture the sequential nature of the data [21].
- Gated Recurrent Unit (GRU): GRUs are streamlined versions of LSTMs with fewer parameters, resulting in faster training while still effectively capturing temporal relationships. They may not capture the fine temporal dynamics as well as LSTMs in all cases, but they offer a good compromise between complexity and performance. Furthermore, GRUs may sometimes fall short when modeling very complex or long-term dependencies in hydrological time series compared to more sophisticated architectures [22].
- Positive and Negative Perceptron (PNP): PNPs incorporate both positive and negative contributions within their architecture, aiming to capture diverse hydrological characteristics more effectively. Being relatively new, further research is needed to establish their stability and reliability in river flow prediction. Furthermore, their innovative structure may require more complex tuning and extensive validation to ensure robustness under different hydrological conditions, and their relative performance against traditional models remains to be comprehensively evaluated [23].
- Attention-Based Neural Networks (AttNet): Recent studies have shown that attention-based models can significantly enhance streamflow forecasting by focusing on the most relevant temporal features [24,25]. Despite promising results, the increased complexity and computational demands of these models can be challenging in operational settings. Moreover, as highlighted by Liu et al. (2024) and Lee et al. (2024), these models often require extensive data segmentation, hyperparameter tuning, and significant computational resources to effectively manage and interpret complex hydrological data, which can impede their real-time application.
- Hybrid Models: Hybrid models integrate multiple methodologies—such as combining wavelet transforms with ML algorithms—to capture both linear and nonlinear patterns in river flow data, thereby exploiting both components of hydrological datasets and delivering more advanced forecasting capabilities [26,27].
2.2. Research Overview
3. Methodology
3.1. Case Studies: Tejo and Mondego Rivers
3.1.1. Tejo River
3.1.2. Mondego River
3.2. AI Model Construction
3.2.1. Data Collection
3.2.2. Preprocessing Steps
- Data Synchronization: Initially, datasets from various stations, including hydrometric and meteorological stations monitored by the Sistema Nacional de Informação de Recursos Hídricos (SNIRH), are acquired and formatted into a consistent structure. This involves parsing date–time information, normalizing measurement units, and synchronizing records from different stations to establish a unified timeline. Typically, the dataset is loaded from a comma-separated values (CSV) file downloaded from the SNIRH website; the date column is converted to a date–time format and set as the index of the DataFrame to facilitate time series analysis.
- Missing Data Handling: Ensuring the completeness of the data is essential to preserve the accuracy and reliability of the dataset. To handle missing values properly, several techniques are employed, including linear interpolation and forward and backward filling. However, due to the nature of the river flow data, filling in missing values can introduce inaccuracies; we therefore implemented a set of functions to assess the extent and distribution of missing data before filling.
- Feature Selection: It is crucial to identify and select the key variables that significantly influence the predictive model’s performance. These variables include historical river flow discharge measurements, meteorological data (such as precipitation), and other relevant factors, such as dam discharge rates. For this study, we evaluated several feature selections based on the performance of the constructed model.
- Temporal Resampling: To achieve consistency, the data’s resolution is standardized through temporal resampling. The dataset is resampled to a daily frequency to ensure uniform time intervals. This step aggregates the data and fills any missing dates with interpolated values.
- Alignment of Common Periods: It is crucial to synchronize datasets from several stations so they overlap within the same time periods. This guarantees that models are trained on datasets encompassing all pertinent characteristics within the same time frame. Due to the presence of significant missing values, we focus on combining periods with minimal missing data to ensure the integrity of the dataset. The get_common_periods_sections function is used to identify periods with a maximum of 10 missing values. This approach helps maintain the validity of the river flow data while ensuring enough data are available for model training. The steps are shown in Algorithm 1, which includes combining datasets, counting missing values, and selecting acceptable time segments based on the stated criteria.
Algorithm 1 Identify common periods with minimal missing values and fill missing data.
Require:
– dataframes: List of DataFrames from different stations
– max_missing: Maximum allowed missing values per day (e.g., 10)
– min_required_period: Minimum number of consecutive days required
Ensure:
– common_periods: List of start and end dates with minimal missing data
– filled_data: DataFrames with missing values filled by interpolation
1: Initialize common_periods as an empty list
2: Initialize filled_data as an empty list
3: Merge all dataframes on the date–time index using an outer join
4: Calculate the total number of missing values per day
5: Create a Boolean mask where missing values ≤ max_missing
6: Find continuous True segments in the mask
7: for each continuous segment do
8:  if length of segment ≥ min_required_period then
9:   Append (start_date, end_date) to common_periods
10:   Extract data for the segment
11:   Fill missing values in the segment using interpolation
12:   Append filled segment to filled_data
13:  end if
14: end for
15: return common_periods, filled_data
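For concreteness, a minimal pandas sketch of Algorithm 1 is given below. The function name get_common_periods_sections comes from the text; the loader helper, the "date" column name, and the default arguments are illustrative assumptions rather than the authors’ code.

```python
# Sketch of Algorithm 1, assuming SNIRH CSV exports with a "date" column
# (an assumption) and uniquely named measurement columns per station.
from functools import reduce
import pandas as pd

def load_station(path):
    # Parse the date column and use it as the DatetimeIndex (Section 3.2.2).
    return pd.read_csv(path, parse_dates=["date"], index_col="date")

def get_common_periods_sections(dataframes, max_missing=10, min_required_period=30):
    # Step 3: merge all DataFrames on the date-time index (outer join),
    # then resample to a daily frequency so every calendar day appears.
    merged = reduce(lambda a, b: a.join(b, how="outer"), dataframes)
    merged = merged.resample("D").mean()
    # Steps 4-5: Boolean mask of days whose missing-value count is acceptable.
    ok = merged.isna().sum(axis=1) <= max_missing
    # Step 6: label continuous True segments in the mask.
    segment_id = (ok != ok.shift()).cumsum()
    common_periods, filled_data = [], []
    # Steps 7-14: keep segments that are long enough and interpolate their gaps.
    for _, segment in merged[ok].groupby(segment_id[ok]):
        if len(segment) >= min_required_period:
            common_periods.append((segment.index[0], segment.index[-1]))
            filled_data.append(segment.interpolate(method="time"))
    return common_periods, filled_data
```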
- Transformation to Supervised Learning Format: Time series data are converted into a format suitable for supervised learning. The series_to_supervised function transforms the time series into a supervised learning problem by creating lagged versions of the input features, allowing the model to learn temporal dependencies in the data. The function creates input sequences of length n_in and output sequences of length n_out. Algorithm 2 generates the input–output pairings by methodically shifting the data: the lag features (var(t − n), …, var(t)) serve as model inputs, and the future values (var(t + 1), …, var(t + n_out)) serve as targets.
Algorithm 2 Transform time series to supervised learning format.
Require:
– data: Time series data as a DataFrame
– n_in: Number of lag observations as input (e.g., 3)
– n_out: Number of observations as output (e.g., 1)
Ensure:
– supervised_data: Transformed DataFrame suitable for supervised learning
1: Initialize cols as an empty list
2: for i = n_in down to 0 do
3:  cols.append(data.shift(i))
4: end for
5: for j = 1 to n_out do
6:  cols.append(data.shift(−j))
7: end for
8: Concatenate cols along the column axis
9: Drop all rows with NaN values
10: Rename columns appropriately (e.g., var(t − n), …, var(t), var(t + 1), …)
11: return supervised_data
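The same transformation can be sketched in Python. The series_to_supervised name is taken from the text, while the signature and the column-naming scheme are assumptions consistent with the pseudocode.

```python
# Sketch of Algorithm 2: frame a time series as a supervised learning problem.
import pandas as pd

def series_to_supervised(data, n_in=3, n_out=1):
    df = pd.DataFrame(data)
    cols, names = [], []
    # Lag features var(t - n_in), ..., var(t) serve as model inputs.
    for i in range(n_in, -1, -1):
        cols.append(df.shift(i))
        suffix = f"(t-{i})" if i > 0 else "(t)"
        names += [f"var{j + 1}{suffix}" for j in range(df.shape[1])]
    # Future values var(t+1), ..., var(t+n_out) serve as targets.
    for j in range(1, n_out + 1):
        cols.append(df.shift(-j))
        names += [f"var{k + 1}(t+{j})" for k in range(df.shape[1])]
    supervised = pd.concat(cols, axis=1)
    supervised.columns = names
    # Drop the rows left incomplete by the shifting.
    return supervised.dropna()
```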
3.2.3. Model Development
- Data Partitioning: The combined supervised data are split into training and testing sets using an 80–20 ratio. The training set is used to train the model, while the testing set is used to evaluate its performance.
- Model Architecture Definition: The structure of the ML models is determined according to the particular needs of the forecasting task. This involves choosing the appropriate number of layers, neurons per layer, activation functions, and other architectural characteristics for models such as LSTMs, MLPs, ELMs, RFs, and SVMs. We mainly used the Keras library to define the models. Multiple instances of these models were trained and selected based on performance. For example, for MLPs, we used multiple dense layers with ReLU activation functions and dropout layers to avoid overfitting. The input dimension of the initial layer was set to the number of features in the training data, while the output dimension was set to the number of forecasting steps (three days). A minimal sketch of this construction is given after this list.
- Hyperparameter Optimization: To improve model performance, hyperparameters were fine-tuned using grid search. The goal of this stage was to determine the combination of hyperparameters that maximizes the predictive accuracy of the model. Depending on the model configuration, we varied the number of neurons per layer, the number of epochs, L2 regularization values, dropout rates, batch sizes, optimizers, and early stopping patience. Although comprehensive hyperparameter tuning improves model accuracy for a specific dataset, it raises issues of scalability and adaptability: a grid search that enhances performance under one set of conditions may fail to account for seasonal fluctuations, climate change, or human activities over time. Other tuning methodologies, such as Bayesian optimization or meta-learning, can be pursued to counteract the diminishing model efficacy that arises when hyperparameters are not re-adjusted.
- Model Training: Our models are trained using the Adam optimizer and mean squared error (MSE) as the loss function. Early stopping is used to monitor the validation loss and prevent overfitting. The model is trained for a specified number of epochs, and the best model weights are saved based on the validation loss.
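The construction and training loop described in this list can be sketched in Keras as follows. The [90, 90] layout and L2 coefficient echo MLP configurations reported in Section 5, but the dropout rate, patience, and validation split are illustrative assumptions, not the exact settings used.

```python
# Minimal Keras sketch of the MLP definition and training described above.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_mlp(n_features, n_steps=3):
    model = keras.Sequential([
        layers.Input(shape=(n_features,)),           # one input per lagged feature
        layers.Dense(90, activation="relu",
                     kernel_regularizer=regularizers.l2(0.01)),
        layers.Dropout(0.2),                         # dropout to limit overfitting
        layers.Dense(90, activation="relu",
                     kernel_regularizer=regularizers.l2(0.01)),
        layers.Dense(n_steps, activation="linear"),  # one output per forecast day
    ])
    model.compile(optimizer=keras.optimizers.Adam(), loss="mse")
    return model

# Early stopping monitors the validation loss and keeps the best weights.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True)
# model = build_mlp(n_features=X_train.shape[1])
# model.fit(X_train, y_train, validation_split=0.1,
#           epochs=100, batch_size=16, callbacks=[early_stop])
```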
3.2.4. Model Validation and Forecasting
- Prediction Generation on Test Data: The trained models generate predictions based on the test data. To evaluate model performance, the root mean squared error (RMSE) is calculated for each forecasting step (today, tomorrow, and the day after tomorrow); a short sketch of this computation is given after this list.
- Validation with Future Data: The models are validated on a separate validation set containing recent data. For example, we currently use the entire 2023 dataset. The validation data are preprocessed and transformed in the same manner as the training data. Each model generates predictions for the validation period, and the RMSE is calculated for each forecasting step.
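As referenced in the first item above, the per-step RMSE computation can be sketched briefly, assuming y_true and y_pred are arrays of shape (n_samples, 3) holding observed and predicted discharges for the three forecast days:

```python
import numpy as np

def rmse_per_step(y_true, y_pred):
    # One RMSE per forecasting step (today, tomorrow, day after tomorrow).
    return np.sqrt(np.mean((y_pred - y_true) ** 2, axis=0))

# Example: rmse_today, rmse_tomorrow, rmse_after = rmse_per_step(y_val, preds)
```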
4. Theoretical Background
4.1. Multilayer Perceptron
4.2. Long Short-Term Memory Networks
4.3. Support Vector Machine
4.4. Extreme Learning Machine
4.5. Random Forest
5. Application to the Case Studies
5.1. Comparison of Model Performance for the Tejo River and Selection of MLP
- 0. Castelo de Bode: average daily dam outflow discharge (m³/s)
- 1. Castelo de Bode: reservoir water level (m)
- 2. Fratel: average daily dam outflow discharge (m³/s)
- 3. Fratel: reservoir water level (m)
- 4. Almourol: daily average discharge (m³/s)
- Scenarios a, b, and c: Validation period from 2022-08-07 to 2023-09-04.
- Scenario d: Validation period from 2003-03-31 to 2004-11-07.
- Scenario a (Input: [0,2] → [4], 2-day forecast):
  – The MLP configurations yielded RMSE values of 162.65 (today) and 227.02 (tomorrow) in one configuration, while LSTM models produced slightly higher errors (168.70 and 239.71, respectively).
  – SVM reported RMSE values of 367.23 and 367.31 in one configuration and 203.01 and 231.81 in another, indicating sensitivity to hyperparameter selection.
  – Although RF achieved an RMSE of 155.71 for the 1-day forecast, its error increased to 264.51 for the 2-day prediction.
  – ELM results were generally higher, with one configuration reporting 326.42 and 334.82, and another with 307.51 and 308.27.
- Scenario b (Input: [0,2,3] → [4], 2-day forecast):
  – The MLP produced RMSE values of 152.71 and 213.88 in one configuration and 149.14 and 227.87 in another, suggesting that including an additional input (from station 3) improved performance.
  – The LSTM model yielded RMSE values of 141.85 and 216.48, while SVM and RF achieved 198.12, 230.45 and 148.79, 261.46, respectively.
  – ELM reported values of 218.08 and 255.18.
- Scenario c (Input: [0,1,2,3,4] → [4], 2-day forecast):
  – The MLP achieved RMSE values as low as 136.71 and 206.49 in one configuration and 141.39 and 217.94 in another.
  – LSTM values were 136.02 and 212.92, while SVM produced 153.17 and 200.09.
  – RF and ELM in this scenario recorded RMSE values of 149.53, 257.72 and 103.82, 179.47 (with the latter configuration for ELM highlighting the potential for lower error in one output), respectively.
- Scenario d (Input: [0,2,4] → [4], 1-day forecast):
  – The MLP reported an RMSE of 104.53 for the 1-day forecast, which is slightly better than that of SVM, at 105.75, and notably lower than those of LSTM, at 121.51, and RF, at 119.11.
  – ELM, however, showed a higher RMSE of 223.36 (with one additional configuration at 356.93 in the LSTM column).
- Forecast Horizon: A two-day forecast inherently introduces more uncertainty, leading to higher RMSE values than a one-day forecast.
- Validation Period and Training Duration: The experimental configurations are validated over different periods. A validation period that captures higher discharge variability or extreme events results in higher RMSE values compared to a period with more stable flows.
- Input Features and Model Configuration: Although both configurations are based on dam data, variations in the number of input features (40 in scenario a vs. 20 in scenario d, with subsequent flattening to 60) and differences in the hyperparameter settings impact the model’s ability to capture the underlying hydrological dynamics.
- Hyperparameter Optimization: MLP requires the fine-tuning of parameters such as the number of neurons in hidden layers, learning rate, and regularization coefficients. Identifying the optimal configuration can be time-consuming and necessitates extensive experimentation, particularly with large and complex datasets [28,54].
- Overfitting: MLP is susceptible to overfitting when the model complexity exceeds the available training data. Overfitting can lead to excellent performance on the training dataset but poor generalization to unseen data. Although regularization techniques such as L2 regularization and dropout can mitigate this issue, they require meticulous calibration to balance model complexity and performance [55].
5.2. Model Configurations and Forecasting Results
5.2.1. Mondego River
5.2.2. Tejo River
- Scenario (1): Two hidden layers with 90 neurons each, trained for 100 epochs.
- Scenario (2): Two hidden layers with 150 neurons in the first hidden layer and 40 neurons in the second hidden layer, trained for 300 epochs.
5.3. Performance Metrics
- RMSE (Equation (9)) quantifies the overall magnitude of prediction errors by squaring individual differences before averaging, thus placing a heavier penalty on larger errors. A lower RMSE indicates better agreement between predictions and observations.
- Bias (Equation (10)) measures the systematic offset between the model and the observations. A positive bias means the model tends to overpredict, while a negative bias indicates underprediction.
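Equations (9) and (10) themselves are not reproduced in this excerpt; the standard definitions consistent with the descriptions above are

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl(\hat{y}_i - y_i\bigr)^2},
\qquad
\mathrm{Bias} = \frac{1}{N}\sum_{i=1}^{N}\bigl(\hat{y}_i - y_i\bigr),
```

where y_i are the observed discharges, ŷ_i the predictions, and N the number of samples; with this sign convention, a positive bias indicates overprediction.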
5.4. Model Evaluation
– Tejo River Results: For the Tejo River, the model captures short-term trends, especially today’s flow, with acceptable accuracy. However, as predictions extend further into the future, slight increases in RMSE reflect growing uncertainty. The model effectively tracks peaks and troughs but shows some under- and overpredictions as the forecast horizon grows. Both scenarios demonstrate reliable short-term forecasting, and Scenario (2) incorporates rainfall data to provide an alternative view of the influence of precipitation.
– Mondego River Results: In contrast, the Mondego River models exhibit lower RMSE values, with 49.05 m³/s for 1-day forecasts (Scenario (1)). This indicates a higher predictive accuracy compared to the Tejo River. Similar trends of increasing RMSE are observed with longer forecast horizons, although the absolute errors remain smaller. The bias values are closer to zero, suggesting more balanced predictions.
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Explanation of Table Notations and Abbreviations
- Sc: Scenario.
- Cfg.: Configuration.
- Units: The number of neurons (or units) in a given layer.
- Act.: Activation function (e.g., ReLU, SELU, and Softplus).
- T/F: In the context of LSTM layers, these denote the return_sequences parameter. “T” indicates return_sequences=True (the layer returns the full sequence of outputs), and “F” indicates return_sequences=False (only the final output is returned).
- The Arrow Symbol (→): The arrow separates the specifications of successive layers within the network architecture. For example, in the configuration “96, ReLU, T → 96, ReLU, F → 96, ReLU, F”, the notation represents a sequence of three LSTM layers:
  – The first LSTM layer has 96 units, uses the ReLU activation function, and returns the full sequence (T).
  – The second LSTM layer also has 96 units with ReLU activation but returns only the final output (F).
  – The third LSTM layer similarly has 96 units, uses ReLU activation, and returns only the final output (F).
- L2 Reg: L2 regularization coefficient.
- lr: Learning rate, which determines the step size at each iteration while moving toward a minimum of the loss function.
- n_Estimators: Number of trees used in the Random Forest.
- max_Depth: Maximum depth allowed for each tree in the Random Forest.
- min_Samples_Split: Minimum number of samples required to split an internal node in the Random Forest.
- Gamma: Kernel coefficient for SVM.
- Epsilon: Epsilon parameter in the epsilon-SVR model, which defines the margin within which no penalty is associated with the training loss.
References
- Mosavi, A.; Ozturk, P.; Chau, K.W. Flood prediction using machine learning models: Literature review. Water 2018, 10, 1536. [Google Scholar] [CrossRef]
- Chen, Y.; Song, L.; Liu, Y.; Yang, L.; Li, D. A Review of the Artificial Neural Network Models for Water Quality Prediction. Appl. Sci. 2020, 10, 5776. [Google Scholar] [CrossRef]
- Egawa, T.; Suzuki, K.; Ichikawa, Y.; Iizaka, T.; Matsui, T.; Shikagawa, Y. A water flow forecasting for dam using neural networks and regression models. In Proceedings of the 2011 IEEE Power and Energy Society General Meeting, Detroit, MI, USA, 24–28 July 2011; pp. 1–5. [Google Scholar] [CrossRef]
- Oliveira, A.; Fortunato, A.B.; Rodrigues, M.; Azevedo, A.; Rogeiro, J.; Bernardo, S.; Lavaud, L.; Bertin, X.; Nahon, A.; de Jesus, G.; et al. Forecasting contrasting coastal and estuarine hydrodynamics with OPENCoastS. Environ. Model. Softw. 2021, 143, 105132. [Google Scholar] [CrossRef]
- Oliveira, A.; Fortunato, A.; Rogeiro, J.; Teixeira, J.; Azevedo, A.; Lavaud, L.; Bertin, X.; Gomes, J.; David, M.; Pina, J.; et al. OPENCoastS: An open-access service for the automatic generation of coastal forecast systems. Environ. Model. Softw. 2020, 124, 104585. [Google Scholar] [CrossRef]
- Korani, Z.M.; Challenger, M.; Moin, A.; Ferreira, J.C.; da Silva, A.R.; Jesus, G.; Alves, E.L.; Correia, R. From ML2 to ML2+: Integrating Time Series Forecasting in Model-Driven Engineering of Smart IoT Applications. In Proceedings of the 13th International Conference on Model-Based Software and Systems Engineering SCITEPRESS, Porto, Portugal, 25–27 February 2025; pp. 458–465. [Google Scholar] [CrossRef]
- Yaseen, Z.M.; El-Shafie, A.; Jaafar, O.; Afan, H.A.; Sayl, K.N. Artificial intelligence based models for stream-flow forecasting: 2000–2015. J. Hydrol. 2015, 530, 829–844. [Google Scholar] [CrossRef]
- Costa Silva, D.F.; Galvão Filho, A.R.; Carvalho, R.V.; de Souza L Ribeiro, F.; Coelho, C.J. Water Flow Forecasting Based on River Tributaries Using Long Short-Term Memory Ensemble Model. Energies 2021, 14, 7707. [Google Scholar] [CrossRef]
- Jain, A.; Sharma, B.; Gupta, C. A Brief Review of Flood Forecasting Techniques and Their Applications. Int. J. River Basin Manag. 2018, 15, 245–260. [Google Scholar] [CrossRef]
- Kumar, V.; Kedam, N.; Sharma, K.V.; Mehta, D.J.; Caloiero, T. Advanced Machine Learning Techniques to Improve Hydrological Prediction: A Comparative Analysis of Streamflow Prediction Models. Water 2023, 15, 2572. [Google Scholar] [CrossRef]
- Box, G.E.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control; John Wiley & Sons: Hoboken, NJ, USA, 2015. [Google Scholar]
- Musarat, M.A.; Alaloul, W.S.; Rabbani, M.B.A.; Ali, M.; Altaf, M.; Fediuk, R.; Vatin, N.; Klyuev, S.; Bukhari, H.; Sadiq, A.; et al. Kabul River Flow Prediction Using Automated ARIMA Forecasting: A Machine Learning Approach. Sustainability 2021, 13, 10720. [Google Scholar] [CrossRef]
- Montgomery, D.C.; Runger, G.C. Applied Statistics and Probability for Engineers; John Wiley & Sons: Hoboken, NJ, USA, 2021. [Google Scholar]
- Pham, Q.B.; Afan, H.A.; Mohammadi, B.; Ahmed, A.N.; Linh, N.T.T.; Vo, N.D.; Moazenzadeh, R.; Yu, P.S.; El-Shafie, A. Hybrid Model to Improve the River Streamflow Forecasting Utilizing Multi-Layer Perceptron-Based Intelligent Water Drop Optimization Algorithm. Soft Comput. 2020, 24, 18039–18056. [Google Scholar] [CrossRef]
- Zhang, H.; Chen, Y.; Zhao, Y. Comparison of Random Forests and Support Vector Machine for Real-Time Radar-Derived Rainfall Forecasting. Water Resour. Manag. 2019, 33, 1543–1556. [Google Scholar]
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar]
- Huang, G.B.; Zhu, Z.; Siew, K.M. Extreme learning machine: A new learning scheme for feedforward neural networks. IEEE Trans. Neural Netw. 2006, 17, 825–836. [Google Scholar]
- Ley, A.; Bormann, H.; Casper, M. Intercomparing LSTM and RNN to a Conceptual Hydrological Model for a Low-Land River with a Focus on the Flow Duration Curve. Water 2023, 15, 505. [Google Scholar] [CrossRef]
- Belvederesi, C.; Dominic, J.A.; Hassan, Q.K.; Gupta, A.; Achari, G. Predicting River Flow Using an AI-Based Sequential Adaptive Neuro-Fuzzy Inference System. Water 2020, 12, 1622. [Google Scholar] [CrossRef]
- Rahimzad, M.; Nia, A.M.; Zolfonoon, H.; Soltani, J.; Mehr, A.D.; Kwon, H. Performance Comparison of an LSTM-based Deep Learning Model versus Conventional Machine Learning Algorithms for Streamflow Forecasting. Water Resour. Manag. 2021, 35, 4167–4187. [Google Scholar] [CrossRef]
- Zhao, X.; Wang, H.; Bai, M.; Xu, Y.; Dong, S.; Rao, H.; Ming, W. A Comprehensive Review of Methods for Hydrological Forecasting Based on Deep Learning. Water 2024, 16, 1407. [Google Scholar] [CrossRef]
- Le, X.H.; Ho, H.V.; Lee, G. Application of gated recurrent unit (GRU) network for forecasting river water levels affected by tides. In Proceedings of the APAC 2019: 10th International Conference on Asian and Pacific Coasts, Hanoi, Vietnam, 25–28 September 2019; Springer: Singapore, 2020; pp. 673–680. [Google Scholar]
- Doe, J.; Smith, J.; Johnson, A. Deep Learning Algorithm Development for River Flow Prediction: PNP Algorithm. J. Hydrol. Eng. 2024, 58, 123–145. [Google Scholar] [CrossRef]
- Liu, X.; Zhang, Y.; Wang, L. Improving streamflow forecasting in semi-arid basins by combining data segmentation and attention-based deep learning. J. Hydrol. 2024, 615, 127–145. [Google Scholar] [CrossRef]
- Lee, J.; Kim, S.; Park, H. Enhanced streamflow forecasting using attention-based neural network models: A comparative study in MOPEX basins. Model. Earth Syst. Environ. 2024, 10, 145–159. [Google Scholar] [CrossRef]
- Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. Adv. Neural Inf. Process. Syst. 2015, 28, 802–810. [Google Scholar]
- Ni, L.; Wang, D.; Singh, V.; Wu, J.; Wang, Y.; Tao, Y.; Zhang, J. Streamflow and rainfall forecasting by two long short-term memory-based models. J. Hydrol. 2020, 583, 124296. [Google Scholar] [CrossRef]
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
- Kratzert, F.; Klotz, D.; Hochreiter, S.; Nearing, G. Rainfall-Runoff Modelling with Long Short-Term Memory Networks. J. Hydrol. 2018, 560, 93–104. [Google Scholar]
- Ostadkalayeh, F.B.; Moradi, S.; Asadi, A.; Nia, A.M.; Taheri, S. Performance Improvement of LSTM-based Deep Learning Model for Streamflow Forecasting Using Kalman Filtering. Water Resour. Manag. 2023, 37, 3111–3127. [Google Scholar] [CrossRef]
- Ho, H.V.; Nguyen, D.; Le, X.H.; Lee, G. Multi-step-ahead water level forecasting for operating sluice gates in Hai Duong, Vietnam. Environ. Monit. Assess. 2022, 194, 442. [Google Scholar] [CrossRef]
- Cho, K.; Kim, Y. Improving Streamflow Prediction in the WRF-Hydro Model with LSTM Networks. J. Hydrol. 2021, 605, 127297. [Google Scholar] [CrossRef]
- Xie, K.; Liu, P.; Zhang, J.; Han, D.; Wang, G.; Shen, C. Physics-guided deep learning for rainfall-runoff modeling by considering extreme events and monotonic relationships. J. Hydrol. 2021, 603, 127043. [Google Scholar] [CrossRef]
- Xiang, Z.; Demir, I. Distributed long-term hourly streamflow predictions using deep learning - A case study for State of Iowa. Environ. Model. Softw. 2020, 131, 104761. [Google Scholar] [CrossRef]
- Nguyen, T.T.H.; Vu, D.Q.; Mai, S.T.; Dang, T. Streamflow Prediction in the Mekong River Basin Using Deep Neural Networks. IEEE Access 2023, 11, 97930–97943. [Google Scholar] [CrossRef]
- Hunt, K.M.R.; Matthews, G.R.; Pappenberger, F.; Prudhomme, C. Using a long short-term memory (LSTM) neural network to boost river streamflow forecasts over the western United States. Hydrol. Earth Syst. Sci. 2022, 26, 5449–5472. [Google Scholar] [CrossRef]
- Bărbulescu, A.; Zhen, L. Forecasting the River Water Discharge by Artificial Intelligence Methods. Water 2024, 16, 1248. [Google Scholar] [CrossRef]
- Ahmad, A.; Reza, M.; Khan, S. River water flow prediction rate based on machine learning algorithms: A case study of Dez River, Iran. J. Hydrol. Stud. 2023, 58, 123–135. [Google Scholar] [CrossRef]
- Islam, K.I.; Elias, E.; Carroll, K.C.; Brown, C. Exploring Random Forest Machine Learning and Remote Sensing Data for Streamflow Prediction: An Alternative Approach to a Process-Based Hydrologic Modeling in a Snowmelt-Driven Watershed. Remote Sens. 2023, 15, 3999. [Google Scholar] [CrossRef]
- Mahmood, O.A.; Sulaiman, S.; Al-Jumeily, D. Forecasting for Haditha reservoir inflow in the West of Iraq using Support Vector Machine (SVM). PLoS ONE 2024, 19, e0308266. [Google Scholar] [CrossRef]
- Dibike, Y.; Solomatine, D. River flow forecasting using artificial neural networks. Phys. Chem. Earth Part B Hydrol. Ocean. Atmos. 2001, 26, 1–7. [Google Scholar] [CrossRef]
- Brandão, A.R.A.; de Menezes Filho, F.C.M.; Oliveira, P.T.S.; Fava, M.C. Artificial neural networks applied for flood forecasting in ungauged basin—The Paranaíba river study case. Proc. IAHS 2024, 386, 81–86. [Google Scholar] [CrossRef]
- SNIRH. SNIRH—Sistema Nacional de Informação de Recursos Hídricos. 2024. Available online: https://snirh.apambiente.pt/ (accessed on 30 December 2024).
- Fernández-Nóvoa, D.; Ramos, A.M.; González-Cao, J.; García-Feal, O.; Catita, C.; Gómez-Gesteira, M.; Trigo, R.M. How to mitigate flood events similar to the 1979 catastrophic floods in the lower Tagus. Nat. Hazards Earth Syst. Sci. 2024, 24, 609–630. [Google Scholar] [CrossRef]
- Rodrigues, M.; Cravo, A.; Freire, P.; Rosa, A.; Santos, D. Temporal assessment of the water quality along an urban estuary (Tagus estuary, Portugal). Mar. Chem. 2020, 223, 103824. [Google Scholar] [CrossRef]
- Alves, E.; Mendes, L.S. Modelação da inundação fluvial do Baixo Mondego. Recur. Hídricos 2014, 35, 41–54. [Google Scholar] [CrossRef]
- Bishop, C.M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995. [Google Scholar]
- Haykin, S. Neural Networks and Learning Machines; Pearson: London, UK, 2009. [Google Scholar]
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar]
- Vapnik, V.N. Statistical Learning Theory; John Wiley & Sons: Hoboken, NJ, USA, 1998. [Google Scholar]
- Khan, M.T.; Shoaib, M.; Hammad, M.; Salahudin, H.; Ahmad, F.; Ahmad, S. Application of Machine Learning Techniques in Rainfall–Runoff Modelling of the Soan River Basin, Pakistan. Water 2021, 13, 3528. [Google Scholar] [CrossRef]
- Seyam, M.; Othman, F. Hourly stream flow prediction in tropical rivers by multi-layer perceptron network. Desalin. Water Treat. 2017, 93, 187–194. [Google Scholar] [CrossRef]
- Wegayehu, E.B.; Muluneh, F.B. Short-Term Daily Univariate Streamflow Forecasting Using Deep Learning Models. Adv. Meteorol. 2022, 2022, 1–21. [Google Scholar] [CrossRef]
- Kumar, K.S.R.; Biradar, R.V. An Intelligent Flood Prediction System Using Deep Learning Techniques and Fine Tuned MobileNet Architecture. SN Comput. Sci. 2024, 5, 317. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Wilks, D.S. Statistical Methods in the Atmospheric Sciences, 3rd ed.; Academic Press: Cambridge, MA, USA, 2011. [Google Scholar]
Sc | Cfg. | LSTM Layers (Units, Act., Seq.) | Dropout | Output (Units, Act.), Optimizer, Batch, Epochs
---|---|---|---|---
a | 1 | 96, ReLU, T → 96, ReLU, F → 96, ReLU, F | 0.3/-/0.2 | 2, Softplus, Adam (lr = 0.00929), 32, 50
b | 1 | 128, ReLU, T → 128, ReLU, F → 96, ReLU, F | 0.4/0.3/- | 2, Linear, Adam (lr = 0.00691), 32, 50
b | 2 | 50, ReLU, F | - | 2, Linear, Adam, 32, 100
c | 1 | 64, ReLU, T → 128, ReLU, F → 40, ReLU, F | 0.1/-/0.1 | 2, Linear, Adam (lr = 0.00576), 32, 50
d | 1 | 32, LSTM, T → 90, LSTM, F → 96, LSTM, F | 0.1/-/- | 1, Linear, Adam (lr = 0.00022), 32, 50
d | 2 | 50, ReLU, T → 30, ReLU, F | 0.1 | 1, SELU, Adam, 16, 1000
Sc | Cfg. | Hidden Layers (Units, Act.) | L2 Reg | Output (Units, Act.), Optimizer, Batch, Epochs
---|---|---|---|---
a | 1 | [90, 90], ReLU | 0.01 | 2, SELU, Adam (lr = 0.001), 16, 100
b | 1 | [150, 40], ReLU | 0.01 | 2, SELU, Adam (lr = 0.001), 16, 13
b | 2 | [150, 40], ReLU | 0.01 | 2, SELU, Adam (lr = 0.001), 16, 300
c | 1 | [90, 90], ReLU | 0.01 | 2, SELU, Adam (lr = 0.001), 16, 100
c | 2 | 64, ReLU → 90, ReLU | 0.01 | 2, ReLU, Adam (lr = 0.0003916), 16, 100
d | 1 | 150, ReLU → 150, ReLU → 90, ReLU | 0.01 | 1, ReLU, Adam (lr = 0.0003097), 16, 100
Sc | Cfg. | Hidden Neurons | Activation
---|---|---|---
a | 1 | 90 | sigm
a | 2 | 150 | sigm
b | 1 | 200 | tanh
c | 1 | 50 | tanh
d | 1 | 90 | sigm
Sc | Cfg. | C | Gamma | Epsilon
---|---|---|---|---
a | 1 | 3 | scale | 0.02
a | 2 | 5 | 0.001 | 0.02
b | 1 | 10 | scale | 0.02
c | 1 | 100 | scale | 0.2
c | 2 | 100 | scale | 0.2
d | 1 | 100 | scale | 0.5
Sc | Cfg. | n_Estimators, max_Depth, min_Samples_Split
---|---|---
a | 1 | 100, Default (None), Default (2)
b | 1 | 100, Default (None), Default (2)
c | 1 | 100, None (Day-1) → 10 (Day-2), 10
d | 1 | 100, 30, 10
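For reference, the tabulated RF and SVM hyperparameters map directly onto standard regression estimators. The paper does not name the library used for these models, so the scikit-learn mapping below is an assumption for concreteness.

```python
# Illustrative mapping of the scenario d (Cfg. 1) hyperparameters from the
# tables above onto scikit-learn estimators (library choice assumed).
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR

rf = RandomForestRegressor(n_estimators=100, max_depth=30, min_samples_split=10)
svr = SVR(C=100, gamma="scale", epsilon=0.5)
```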
Sc. | Validation Period | Input/Output | Days | MLP | LSTM | SVM | RF | ELM
---|---|---|---|---|---|---|---|---
a | 2022-08 to 2023-09 | [0,2] → [4] | 2 | 162.65, 227.02 | 168.70, 239.71 | 367.23, 367.31 (Cfg. 1); 203.01, 231.81 (Cfg. 2) | 155.71, 264.51 | 326.42, 334.82 (Cfg. 1); 307.51, 308.27 (Cfg. 2)
b | 2022-08 to 2023-09 | [0,2,3] → [4] | 2 | 152.71, 213.88 (Cfg. 1); 149.14, 227.87 (Cfg. 2) | 141.85, 216.48 | 198.12, 230.45 | 148.79, 261.46 | 218.08, 255.18
c | 2022-08 to 2023-09 | [0,1,2,3,4] → [4] | 2 | 136.71, 206.49 (Cfg. 1); 141.39, 217.94 (Cfg. 2) | 136.02, 212.92 | 153.17, 200.09 | 149.53, 257.72 | 103.82, 179.47
d | 2003-03 to 2004-11 | [0,2,4] → [4] | 1 | 104.53 | 121.51; 356.93 | 105.75 | 119.11 | 223.36

RMSE values are listed as 1-day, 2-day; where two configurations were tested, both are shown.
Scenario | Station/Features
---|---
(1) Inputs | Fratel (R.E.) (16K/02A)/Average daily dam outflow discharge (m³/s)
 | Almourol (17G/02H)/Average daily river discharge (m³/s)
 | Castelo de Bode (R.E.) (16H/01A)/Average daily dam outflow discharge (m³/s)
(1) Output | Almourol (17G/02H)/Average daily river discharge (m³/s)
(2) Inputs | Abrantes (17H/01C)/Daily precipitation (mm)
 | Alvaiázere (15G/01UG)/Daily precipitation (mm)
 | Covilhã (12L/03G)/Daily precipitation (mm)
 | Ladoeiro (14N/02UG)/Daily precipitation (mm)
 | Almourol (17G/02H)/Average daily river discharge (m³/s)
(2) Output | Almourol (17G/02H)/Average daily river discharge (m³/s)
Scenario | Station/Features
---|---
(1) Inputs | Albufeira da Aguieira (R.E.) (11H/01A)/Average daily dam outflow discharge (m³/s)
 | Albufeira da Raiva (R.E.) (12H/01A)/Average daily dam outflow discharge (m³/s)
 | Albufeira de Fronhas (R.E.) (12I/01A)/Average daily dam outflow discharge (m³/s)
 | Açude Ponte Coimbra (12G/01AE)/Average daily weir outflow discharge (m³/s)
(1) Output | Açude Ponte Coimbra (12G/01AE)/Average daily weir outflow discharge (m³/s)
River | Scenario | Training Set | Testing Set | Validation Set
---|---|---|---|---
Tejo | (1) | 8042 samples, 60 features | 2011 samples, 60 features | 350 samples, 60 features
Tejo | (2) | 7120 samples, 100 features | 1780 samples, 100 features | 350 samples, 100 features
Mondego | (1) | 3643 samples, 80 features | 911 samples, 80 features | 1873 samples, 80 features
Sc. | Cfg. | Hidden Layers (Units, Act.) | L2 Reg | Output (Units, Act.), Optimizer, Batch, Epochs
---|---|---|---|---
(1) | MLP1 | [50], ReLU | 0.1 | 3, linear, Adam, 32, 50
(1) | MLP2 | [90, 90], ReLU | 0.1 | 3, linear, Adam, 32, 100
(1) | MLP3 | [150, 40], ReLU | 0.1 | 3, linear, Adam, 32, 300
Model | RMSE (m³/s) Today | RMSE (m³/s) Tomorrow | RMSE (m³/s) Day After | Validation Period
---|---|---|---|---
MLP1 | 26.79 | 27.91 | 28.24 | 2024-01-01 to 2024-08-12
MLP2 | 24.51 | 44.87 | 66.21 | 2024-01-01 to 2024-08-12
MLP3 | 34.01 | 45.54 | 54.04 | 2024-01-01 to 2024-08-12
River | Scenario | Forecast Horizon | RMSE (m³/s) | Bias (m³/s)
---|---|---|---|---
Tejo | (1) | 1-Day (Today) | 163.1 | −22.5
Tejo | (1) | 2-Day (Tomorrow) | 212.1 | −20.0
Tejo | (1) | 3-Day (After Tomorrow) | 228.4 | −18.9
Tejo | (2) | 1-Day (Today) | 169.9 | −5.6
Tejo | (2) | 2-Day (Tomorrow) | 215.1 | −25.8
Tejo | (2) | 3-Day (After Tomorrow) | 232.6 | −21.5
Mondego | (1) | 1-Day (Today) | 49.05 | 2.08
Mondego | (1) | 2-Day (Tomorrow) | 72.80 | −0.76
Mondego | (1) | 3-Day (After Tomorrow) | 86.50 | −3.62