Streamflow Forecasting: A Comparative Analysis of ARIMAX, Rolling Forecasting LSTM Neural Network and Physically Based Models in a Pristine Catchment
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Dataset
2.3. Experimental Setup
2.4. ARIMAX Model
2.5. Long Short-Term Memory (LSTM) Recurrent Neural Network
2.6. Physically Based Model
- Canopy-interception storage represents the precipitation that is captured on trees, shrubs, and grasses and does not reach the soil surface. Precipitation is the only inflow. Water in canopy-interception storage is removed by evaporation;
- Surface-interception storage is the volume of water held in shallow surface depressions. Inflows come from precipitation not captured by canopy interception and in excess of the infiltration rate. Outflows can be due to infiltration and to evapotranspiration (ET);
- Soil-profile storage represents the water stored in the top layer of the soil. Inflow is infiltration from the surface. Outflows include percolation to a groundwater layer and ET. The soil profile is subdivided into two distinct zones. The upper zone is the portion of the soil profile that loses water to ET and/or percolation, and represents the water held in the pores of the soil. The tension zone is the portion that loses water to ET only, and represents the water attached to soil particles. ET draws from the upper zone first and from the tension zone last;
- Groundwater storage layers in the SMA represent horizontal interflow processes. In this application, only one groundwater layer was used. Water percolates into groundwater storage layers from the soil profile. Losses from the groundwater storage layer are due to groundwater flow or to deep percolation. In the latter case, this water is considered lost from the system.
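The storage cascade described above can be illustrated with a short water-balance step. This is a minimal sketch under stated assumptions, not the HEC-HMS SMA algorithm: the names `Params`, `State`, and `step` are hypothetical, rates are treated as simple upper bounds, ET is drawn from canopy, then surface, then soil in that fixed order, and lateral groundwater flow and the impervious fraction are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Params:
    canopy_max: float        # canopy storage capacity [mm]
    surface_max: float       # surface storage capacity [mm]
    soil_max: float          # total soil storage [mm]
    tension_max: float       # part of soil storage drained by ET only [mm]
    gw_max: float            # groundwater storage capacity [mm]
    max_infiltration: float  # upper bound on infiltration [mm/h]
    soil_percolation: float  # upper bound on soil percolation [mm/h]
    gw_percolation: float    # upper bound on deep percolation [mm/h]


@dataclass
class State:
    canopy: float = 0.0
    surface: float = 0.0
    soil: float = 0.0
    groundwater: float = 0.0


def step(s: State, p: Params, precip: float, pet: float, dt: float = 1.0):
    """Advance the stores by one step (precip and pet in mm per step, dt in h)."""
    # 1. Canopy fills first; the excess becomes throughfall.
    to_canopy = min(precip, p.canopy_max - s.canopy)
    canopy = s.canopy + to_canopy
    throughfall = precip - to_canopy

    # 2. Surface water infiltrates up to the rate limit and the free soil
    #    capacity; what exceeds the surface storage becomes runoff.
    available = s.surface + throughfall
    infiltration = min(available, p.max_infiltration * dt, p.soil_max - s.soil)
    remaining = available - infiltration
    surface = min(remaining, p.surface_max)
    runoff = remaining - surface

    # 3. Only soil water above the tension storage can percolate.
    soil = s.soil + infiltration
    drainable = max(soil - p.tension_max, 0.0)
    percolation = min(drainable, p.soil_percolation * dt,
                      p.gw_max - s.groundwater)
    soil -= percolation

    # 4. Deep percolation leaves the system (lateral flow omitted here).
    groundwater = s.groundwater + percolation
    deep = min(groundwater, p.gw_percolation * dt)
    groundwater -= deep

    # 5. ET demand is met from canopy, then surface, then soil (assumed order).
    et = 0.0
    stores = [canopy, surface, soil]
    for i, stored in enumerate(stores):
        take = min(stored, pet - et)
        stores[i] -= take
        et += take
    canopy, surface, soil = stores

    fluxes = {"runoff": runoff, "deep_percolation": deep, "et": et}
    return State(canopy, surface, soil, groundwater), fluxes
```

Calling `step` repeatedly over a precipitation and potential-ET series gives a continuous simulation. Each step conserves mass: precipitation equals the change in storage plus runoff, deep percolation, and ET.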
- Canopy Interception
  - Canopy storage [mm]: the maximum amount of water that can be held on leaves before throughfall to the surface begins.
- Surface Interception
  - Surface storage [mm]: the maximum amount of water that can be held on the soil surface before surface runoff begins.
- Soil Moisture Accounting
  - Soil storage [mm]: the total storage available in the soil layer;
  - Tension storage [mm]: the amount of soil storage that is not drained by percolation but only by evapotranspiration;
  - Groundwater storage [mm]: the total storage in the groundwater layer;
  - Impervious percentage [%]: the percentage of the subcatchment that produces direct runoff (no infiltration);
  - Maximum infiltration rate [mm/h]: the upper bound on infiltration from the surface storage into the soil;
  - Soil percolation [mm/h]: the upper bound on percolation from the soil storage into the groundwater;
  - Groundwater percolation rate [mm/h]: the upper bound on deep percolation.
- Clark Unit Hydrograph
  - Time of concentration [h]: the maximum response time in the sub-basin;
  - Storage coefficient [h]: accounts for storage effects within the subcatchment surface.
- Linear Reservoir Baseflow
  - Groundwater coefficient [h]: the time lag of a linear reservoir that transforms water in storage into lateral outflow;
  - Number of steps [−]: the number of times the reservoir release is repeated; attenuation of baseflow is minimal with a single step and increases with each additional step.
- Reach Routing
  - Lag [min]: the time by which the inflow hydrograph is translated.
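The roles of the groundwater (or storage) coefficient, the number of steps, and the reach lag can be sketched as follows. This is an illustration under assumptions rather than the HEC-HMS implementation: the function names `linear_reservoir` and `lag_route` are hypothetical, the reservoir uses a common finite-difference form with coefficient c = Δt / (k + 0.5 Δt), and the lag is rounded to a whole number of time steps.

```python
def linear_reservoir(inflow, k_hours, dt_hours=1.0, steps=1):
    """Route a series through `steps` identical linear reservoirs in series.

    k_hours is the reservoir coefficient [h]; it should be at least
    0.5 * dt_hours so that the routing coefficient c stays at or below 1.
    """
    c = dt_hours / (k_hours + 0.5 * dt_hours)
    series = list(inflow)
    for _ in range(steps):
        routed, q = [], 0.0
        for i in series:
            # Outflow is a weighted mix of current inflow and previous outflow.
            q = c * i + (1.0 - c) * q
            routed.append(q)
        series = routed  # output of one reservoir feeds the next
    return series


def lag_route(inflow, lag_minutes, dt_minutes=60.0):
    """Translate a hydrograph in time without attenuating it."""
    shift = round(lag_minutes / dt_minutes)
    inflow = list(inflow)
    # Pad the start with zeros and keep the original series length.
    return ([0.0] * shift + inflow)[:len(inflow)]


pulse = [10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
one_step = linear_reservoir(pulse, k_hours=2.0, steps=1)
three_steps = linear_reservoir(pulse, k_hours=2.0, steps=3)
delayed = lag_route(one_step, lag_minutes=120.0)
```

Comparing `max(one_step)` with `max(three_steps)` shows the effect of the number of steps: each additional pass lowers and delays the peak. `lag_route` only shifts the series and leaves its shape unchanged.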
2.7. Performance Evaluation Criteria
2.7.1. Nash–Sutcliffe Efficiency Index (NSE)
2.7.2. Kling–Gupta Efficiency Index (KGE)
2.7.3. Mean Absolute Error (MAE)
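The three criteria named in this section can be computed as sketched below. The code assumes the standard definitions: NSE as one minus the ratio of squared error to the variance of the observations, KGE in the 2009 form of Gupta et al. (correlation r, variability ratio α, bias ratio β), and MAE as the mean absolute difference. The function names are illustrative, and the paper's own implementation may differ in detail.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = _mean(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance_sum = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / variance_sum


def kge(obs, sim):
    """Kling-Gupta efficiency (2009 form) from correlation, variability, and bias."""
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    std_obs = math.sqrt(_mean([(o - mean_obs) ** 2 for o in obs]))
    std_sim = math.sqrt(_mean([(s - mean_sim) ** 2 for s in sim]))
    covariance = _mean([(o - mean_obs) * (s - mean_sim)
                        for o, s in zip(obs, sim)])
    r = covariance / (std_obs * std_sim)  # linear correlation
    alpha = std_sim / std_obs             # variability ratio
    beta = mean_sim / mean_obs            # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def mae(obs, sim):
    """Mean absolute error, in the units of the input series."""
    return _mean([abs(o - s) for o, s in zip(obs, sim)])


observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]  # made-up values, constant bias of +1
scores = {"NSE": nse(observed, simulated),
          "KGE": kge(observed, simulated),
          "MAE": mae(observed, simulated)}
```

In this made-up example the simulated series has the right shape but a constant offset, so the correlation and variability terms of KGE are perfect and only the bias term lowers the score, while NSE penalises the same offset more heavily.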
3. Results
3.1. ARIMAX Forecasting Results
3.2. LSTM Forecasting Results
3.3. Physically Based Model Forecasting Results
3.4. Models Performance During Significant Flood Events
3.4.1. Model Comparison Flood Event October–November 2018
3.4.2. Model Comparison Flood Event October–November 2010
3.4.3. Model Comparison Occasional Rain Event July 2021
4. Discussion and Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
LSTM | Long Short-Term Memory |
ARIMAX | Autoregressive Integrated Moving Average with Exogenous inputs |
RNN | Recurrent Neural Network |
HEC-HMS | Hydrologic Engineering Center-Hydrologic Modeling System |
NSE | Nash–Sutcliffe Efficiency |
KGE | Kling–Gupta Efficiency |
MAE | Mean Absolute Error |
IDF | Intensity–Duration–Frequency |
SMA | Soil Moisture Accounting |
ET | Evapotranspiration |
AVG | Average |
ADF | Augmented Dickey–Fuller |
IDW | Inverse Distance Weighting |
FDC | Flow Duration Curve |
Metric | ARIMAX | LSTM | Physically Based Hydrological Model |
---|---|---|---|
NSE [−] | 0.67 | 0.93 | 0.82 |
KGE [−] | 0.50 | 0.82 | 0.85 |
MAE [AVG m³/s] | 1.16 | 0.75 | 1.27 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Perazzolo, D.; Lazzaro, G.; Fiume, A.; Fanton, P.; Grisan, E. Streamflow Forecasting: A Comparative Analysis of ARIMAX, Rolling Forecasting LSTM Neural Network and Physically Based Models in a Pristine Catchment. Water 2025, 17, 2341. https://doi.org/10.3390/w17152341