Hybrid Streamflow Forecasting with ERA5 and Machine Learning Across Daily and Monthly Time Scales

França, Gutemberg Borges; Almeida, Vinícius Albuquerque de; Senna, Mônica Carneiro Alves; Souza, Enio Pereira de; Silva, Madson Tavares; Aranha, Thaís Regina Benevides Trigueiro; Silva, Maurício Soares da; Araujo, Afonso Augusto Magalhães de; Melo, Gabriel Titara Silva de; Almeida, Manoel Valdonel de; Velho, Haroldo Fraga Campos; Frota, Mauricio Nogueira; Freitas, Gabriel Gomes; Anochi, Juliana Aparecida; Moreno Aldana, Emanuel Alexander; Viana, Lude Quieto

doi:10.3390/w18111337


by Gutemberg Borges França ¹, Vinícius Albuquerque de Almeida ¹, Mônica Carneiro Alves Senna ²,*, Enio Pereira de Souza ³, Madson Tavares Silva ³, Thaís Regina Benevides Trigueiro Aranha ³, Maurício Soares da Silva ¹, Afonso Augusto Magalhães de Araujo ⁴, Gabriel Titara Silva de Melo ⁴, Manoel Valdonel de Almeida ¹, Haroldo Fraga Campos Velho ⁵, Mauricio Nogueira Frota ⁶, Gabriel Gomes Freitas ¹, Juliana Aparecida Anochi ⁵, Emanuel Alexander Moreno Aldana ⁶ and Lude Quieto Viana ⁷

¹ Department of Meteorology, Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro 21941-916, RJ, Brazil

² Department of Geoenvironmental Analysis, Federal Fluminense University (UFF), Niterói 24220-900, RJ, Brazil

³ Department of Atmospheric Sciences, Federal University of Campina Grande (UFCG), Campina Grande 58429-900, PB, Brazil

⁴ Department of Water Resources and Environment, Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro 21941-909, RJ, Brazil

⁵ National Institute for Space Research (INPE), São José dos Campos 12227-010, SP, Brazil

⁶ Postgraduate Programme in Metrology, Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Rio de Janeiro 22451-900, RJ, Brazil

⁷ Light Energia S.A., Electric Power Concessionaire, Rio de Janeiro 20080-002, RJ, Brazil

* Author to whom correspondence should be addressed.

Water 2026, 18(11), 1337; https://doi.org/10.3390/w18111337

Submission received: 19 March 2026 / Revised: 14 May 2026 / Accepted: 25 May 2026 / Published: 1 June 2026

(This article belongs to the Special Issue Climate Modeling and Impacts of Climate Change on Hydrological Cycle)


Abstract

This study presents an updated Hybrid Hydrological Forecasting System (HHFS) for streamflow prediction at the Santa Branca outlet, located in the upper Paraíba do Sul River Basin in southeastern Brazil, aiming to support hydropower-oriented water resources management. It is explicitly framed as a companion to the paper that introduced the original HHFS framework and demonstrated the feasibility of combining deterministic and probabilistic machine-learning approaches for monthly streamflow forecasting. Building upon that foundation, the present study develops and validates a substantially enhanced and operationally oriented version of the system. The upgraded HHFS replaces the original BR-DWGD forcing strategy (a Brazilian gridded meteorological dataset useful for research applications but not routinely updated for sustained operations) with ERA5, the fifth-generation global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), which provides temporally consistent and operationally updated meteorological fields. This transition renders the framework fully operational while preserving the original dual-stage architecture, composed of a deterministic forecasting module (GA₁) and a hydro-adaptive uncertainty module (GA₂). In addition, the study introduces a daily short-term forecasting extension using a single multi-output XGBoost 2.1.1 model to predict streamflow from D+1 to D+10. Predictive uncertainty is quantified using split conformal prediction, a distribution-free uncertainty method that provides valid prediction intervals with empirical coverage guarantees. Coverage represents the proportion of observed values falling within the prediction intervals and is used here as a reliability metric.
For the monthly product, the ERA5-based methodology maintained and slightly improved deterministic skill relative to the original BR-DWGD benchmark, with independent-test NSE increasing to 0.798, KGE to 0.878, and RMSE decreasing to 18.778 m³/s. The probabilistic component preserved a high hit rate and similar relative width, although coverage declined modestly to 0.838, indicating slight undercoverage relative to the previous reliability target. For the daily forecasts, predictive skill decreased progressively with lead time, from NSE = 0.881 at D+1 to 0.394 at D+10, accompanied by coherent widening of the uncertainty intervals. Taken together, these results demonstrate that ERA5 is a robust and operationally practical forcing source for the HHFS, preserving monthly forecasting skill while enabling a promising multi-day extension for anticipatory streamflow prediction across multiple temporal scales.

Keywords:

streamflow forecasting; machine learning; ERA5 reanalysis; operational hydrology; probabilistic forecasting; uncertainty quantification; Paraíba do Sul River Basin

1. Introduction

Reliable streamflow forecasting is essential for reservoir operation, hydropower scheduling, flood preparedness, drought mitigation, and anticipatory water-resources management. In hydro-economically strategic basins, forecast information supports operational and long-term decisions under strong hydroclimatic variability and growing pressure on water and energy systems. These challenges are particularly relevant in southeastern Brazil, where the Paraíba do Sul River Basin supplies major metropolitan regions and sustains an important fraction of the national hydropower infrastructure. The basin exhibits pronounced seasonal variability, nonlinear rainfall–runoff behaviour, and strong dependence on large-scale atmospheric forcing mechanisms, including the South Atlantic Convergence Zone (SACZ), making streamflow forecasting particularly challenging under both wet- and dry-season conditions [1,2,3].

Traditionally, streamflow forecasting has relied on physically based and conceptual hydrological models such as SWAT (Soil and Water Assessment Tool), HBV (Hydrologiska Byråns Vattenbalansavdelning), VIC (Variable Infiltration Capacity), and MGB (Modelo de Grandes Bacias). These models provide physically interpretable representations of infiltration, evapotranspiration, groundwater recharge, and runoff generation processes [4,5,6]. Despite their hydrological consistency, such approaches often require extensive calibration and may exhibit reduced flexibility under strongly nonlinear hydroclimatic conditions, particularly in tropical and subtropical basins characterized by pronounced seasonality and high interannual variability [7].

Over the last decade, machine-learning and deep-learning approaches have emerged as promising alternatives for streamflow prediction. Methods based on Random Forest, Gradient Boosting, XGBoost, Support Vector Machines, Multilayer Perceptrons (MLP), Long Short-Term Memory (LSTM), and hybrid neural architectures have demonstrated strong capacity to represent nonlinear rainfall–runoff relationships and long-memory hydrological behaviour [8,9,10,11,12,13,14]. In particular, deep-learning rainfall–runoff systems have shown remarkable predictive performance at multiple temporal scales, especially in large-sample hydrology applications [11,12]. However, many of these approaches still exhibit important limitations, including reduced physical interpretability, sensitivity to forcing inconsistencies, limited treatment of predictive uncertainty, and weak adaptation to hydroclimatic regime changes [15].

Parallel to deterministic forecasting advances, probabilistic hydrological forecasting has become increasingly important for operational water management. Probabilistic systems attempt to represent forecast uncertainty through ensemble spread, predictive intervals, quantile estimation, or post-processing techniques such as Ensemble Model Output Statistics (EMOS), Bayesian approaches, quantile regression, and conformal prediction [16,17,18,19,20]. Among these, conformal prediction has recently attracted attention because it provides distribution-free prediction intervals with empirical coverage guarantees under relatively weak assumptions [20]. Nevertheless, most probabilistic streamflow forecasting approaches remain focused on pointwise uncertainty estimation and often fail to explicitly represent hydrological seasonality, antecedent basin memory, or intra-period variability. Moreover, relatively few studies have explored probabilistic forecasting frameworks capable of simultaneously combining operational interpretability, multi-scale forecasting, and hydro-adaptive uncertainty behaviour [15,16,17,18,19].

An additional challenge concerns the dependence of hydrological forecasting systems on meteorological forcing products. Changes in forcing datasets may alter precipitation statistics, thermodynamic distributions, spatial representativeness, and temporal aggregation properties, thereby modifying the learned relationships among atmospheric forcing, antecedent hydrological memory, and streamflow response [21,22]. Consequently, forecasting systems trained under one forcing regime may not necessarily preserve their predictive structure or probabilistic behaviour after migration to another meteorological product. Despite the operational importance of this issue, relatively few studies have explicitly investigated how hybrid machine-learning hydrological forecasting systems respond to forcing replacement while maintaining hydrological coherence and predictive stability.

Recently, França et al. (2026) [2] introduced the Hybrid Hydrological Forecasting System (HHFS), a dual-stage framework for monthly streamflow forecasting at the Santa Branca outlet in the upper Paraíba do Sul River Basin. The original HHFS combined a deterministic hybrid machine-learning forecasting stage (GA₁) with a hydro-adaptive probabilistic uncertainty stage (GA₂). The deterministic component integrated weighted nonlinear regression, regime-aware ensemble fusion, rolling bias correction, and month-wise mean–variance calibration, while the probabilistic stage generated hydro-adaptive uncertainty intervals calibrated against the observed intra-month discharge envelope. In practical terms, this structure allows the HHFS not only to forecast the expected mean monthly streamflow, but also to estimate the anticipated intra-month hydrological amplitude represented by the observed minimum–maximum discharge envelope. The framework demonstrated robust independent performance and introduced an interpretable uncertainty structure capable of dynamically adapting to seasonal hydrological variability.

However, despite its scientific robustness, the original HHFS remained limited from an operational perspective because it relied on BR-DWGD (Brazilian Daily Weather Gridded Data), a high-resolution Brazilian meteorological dataset primarily designed for research applications rather than sustained operational forecasting [23]. Although BR-DWGD provides valuable hydroclimatic information over Brazil, it is not continuously updated with the temporal consistency and operational availability required for routine forecasting execution. BR-DWGD was adopted in the original HHFS study because of its high-quality retrospective hydroclimatic representation over Brazil, but its limited operational continuity motivated the transition toward a meteorological forcing source capable of supporting long-term operational deployment.

To address this limitation, the present study replaces BR-DWGD with ERA5, the fifth-generation global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) [21]. ERA5 provides globally consistent meteorological fields with continuous temporal coverage, systematic updates, physically coherent atmospheric reanalysis, and operational availability suitable for hydrological forecasting systems. Unlike static or intermittently updated datasets, ERA5 supports reproducible operational workflows because new atmospheric analyses are continuously incorporated into the database. However, this transition is not a simple data-source substitution. Replacing BR-DWGD with ERA5 modifies predictor distributions, precipitation statistics, thermodynamic descriptors, and hydrological-memory relationships, requiring full retraining and reassessment of the forecasting system.

Building upon the HHFS framework introduced by França et al. (2026) [2], the present study advances the system toward operational implementation while preserving the original hybrid conceptual architecture. Whereas the previous work established the methodological foundations of the HHFS for monthly streamflow forecasting, the present manuscript focuses on three major developments: (i) replacing BR-DWGD forcing with ERA5 reanalysis products to enable sustained operational applicability; (ii) retraining and revalidating the complete deterministic (GA₁) and probabilistic (GA₂) forecasting structure under the new forcing regime; and (iii) extending the framework toward direct multi-output daily forecasting from D+1 to D+10 using a unified XGBoost-based forecasting pipeline with split conformal uncertainty intervals. In addition, the present study evaluates whether the ERA5-based HHFS can simultaneously preserve deterministic predictive skill, hydro-adaptive probabilistic behaviour, and hydrological interpretability across both monthly and daily forecasting horizons.

The daily forecasting extension constitutes an additional contribution of this work. Unlike recursive forecasting approaches, direct multi-output forecasting predicts all lead times simultaneously within a single forecasting structure, reducing error propagation between horizons [24,25,26]. In this study, the daily extension was designed as a proof-of-concept operational prototype to investigate whether the ERA5-based framework could preserve useful predictive skill across short-term horizons while maintaining coherent uncertainty behaviour. Predictive uncertainty was quantified using split conformal prediction intervals, whose width varies with lead time and empirical residual behaviour.

Accordingly, the main objective of this study is to evaluate whether the HHFS can preserve deterministic performance, probabilistic consistency, and hydrological interpretability after migration from BR-DWGD to ERA5 forcing. A secondary objective is to assess the feasibility of extending the ERA5-based framework toward daily direct multi-output forecasting from D+1 to D+10 with uncertainty quantification.

More broadly, this study contributes to the literature on hybrid hydrological forecasting by addressing three unresolved challenges: (i) operational transition of machine-learning hydrological forecasting systems under meteorological forcing replacement; (ii) hydro-adaptive uncertainty representation under seasonal hydrological variability; and (iii) integrated multi-scale forecasting combining monthly and daily horizons within a unified and physically interpretable framework.

2. Study Area and Data

2.1. Study Area and Hydrological Targets

This study focuses on the Santa Branca outlet in the upper Paraíba do Sul River Basin, southeastern Brazil. A thorough description of the region can be found in [5]. The contributing drainage area upstream of this outlet comprises fourteen monitored subbasins and is characterized by pronounced hydroclimatic seasonality, with a wet season concentrated during the austral summer and a dry season dominated by baseflow and delayed hydrological contributions. Under these conditions, streamflow variability reflects both short-term meteorological forcing and basin processes associated with longer hydrological memory, making the basin an ideal test case for hybrid forecasting approaches. In addition, the Santa Branca sector plays an important role in regional hydropower operation and water-resources management within the Paraíba do Sul system, further reinforcing its relevance for operational streamflow forecasting studies. Figure 1 presents the geographical location of the Santa Branca outlet and the corresponding upstream monitored subbasins.

The monthly component of this study adopts the same hydrological closure and target definitions as the original HHFS formulation. The deterministic target is the mean monthly discharge at the Santa Branca outlet, while the probabilistic component is evaluated against the observed intra-month discharge envelope, defined by the monthly minimum and maximum daily discharges. This dual definition preserves the original HHFS distinction between prediction of central tendency and representation of within-month hydrological variability. In this manuscript, the terms “flow”, “streamflow”, and “discharge” are used interchangeably to refer to the same river-flow variable at the Santa Branca outlet and its corresponding basin-scale aggregated indicators.

In contrast, the daily extension is based on a basin-aggregated daily dataset derived from the fourteen upstream subbasins. Accordingly, the daily targets should be interpreted as aggregated basin-scale streamflow indicators rather than as a strict reformulation of the monthly outlet target. This distinction is important because the monthly and daily products address different forecasting scales and are constructed using different aggregation logics, although both are derived from the same broader hydrological system.

2.2. Streamflow Observations and Temporal Partitioning

For the monthly framework, discharge observations were organized as a continuous, chronological series and partitioned into training (1998–2019) and independent test (2020–2023) subsets, consistent with the validation design adopted in the original HHFS study. This chronology-preserving split avoids information leakage and ensures that all forecasts are generated using only information available up to the forecast issue time.

For the daily extension, the ERA5-based basin dataset spans 1998–2024. The chronology-preserving partition follows the same logic as in the monthly analysis, with model development restricted to the pre-2020 period and independent evaluation conducted from 2020 onward. Within the training portion, additional fitting and calibration subsets were later defined for model selection and conformal uncertainty calibration, as described in Section 3.

2.3. Meteorological Forcing: Transition from BR-DWGD to ERA5

In the original HHFS formulation presented in [2], the monthly forecasting workflow relied on BR-DWGD (Brazilian Daily Weather Gridded Data), a high-resolution Brazilian meteorological gridded dataset derived from station observations and spatial interpolation procedures. This product provided valuable hydroclimatic information for predictor construction and contributed to the successful performance of the original monthly HHFS. However, despite its scientific usefulness, BR-DWGD was not designed as a continuously updated operational product, which limits its long-term applicability for routine forecasting systems.

Because sustained operational forecasting requires reproducible access, regular updates, temporal continuity, and stable data availability, the present study replaces BR-DWGD with ERA5-family reanalysis products. ERA5 is the fifth-generation global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), providing physically consistent and regularly updated meteorological fields with broad spatial coverage and long historical continuity.

The transition from BR-DWGD to ERA5 should not be interpreted as a simple data-source substitution. Changes in forcing products may alter variable definitions, spatial representativeness, temporal aggregation behaviour, and the statistical distribution of predictors. Consequently, the learned relationships between atmospheric drivers, basin memory, and streamflow response must be reassessed under the new forcing regime.

For the monthly product, ERA5 variables were aggregated into descriptors consistent with the original HHFS predictor philosophy, including monthly minima, maxima, means, and accumulated quantities according to variable type. The final predictor set includes precipitation-related variables, near-surface thermodynamic conditions, radiation and surface-energy indicators, and 10 m wind components, together with hydrological-memory descriptors.

This strategy preserves methodological continuity with the companion paper while enabling the main objective of the present manuscript: to evaluate whether the HHFS can maintain predictive skill and probabilistic consistency after migration to a meteorological forcing source that is substantially more suitable for operational deployment.

3. Methods

3.1. General Methodological Structure

The methodological framework adopted in this study consists of two coordinated forecasting products driven by ERA5-based meteorological predictors, both derived from the original Hybrid Hydrological Forecasting System (HHFS) introduced by França et al. (2026) [2]. While the previous companion paper focused on the conceptual development of the monthly HHFS architecture, the present study extends that framework toward operational implementation and multi-horizon forecasting.

The first product is the monthly ERA5-based HHFS, which remains the primary focus of this manuscript. This system preserves the original dual-stage structure composed of: (a) GA₁, the deterministic forecasting module, responsible for predicting the central estimate of next-month streamflow; and (b) GA₂, the hydro-adaptive uncertainty module, responsible for generating probabilistic prediction intervals consistent with seasonal hydrological variability.

Together, GA₁ and GA₂ reconcile deterministic predictive skill with hydro-adaptive probabilistic interpretability while preserving seasonal hydrological consistency and maintaining the conceptual logic of the original HHFS under the new ERA5 forcing regime.

The second product is a daily short-term forecasting extension developed as an exploratory operational component of the ERA5-based HHFS framework. This component uses a single direct multi-output XGBoost model to simultaneously predict streamflow from D+1 to D+10, together with lead-time-dependent uncertainty intervals estimated through split conformal prediction.
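As a minimal sketch of lead-time-dependent split conformal intervals, the snippet below computes one absolute-residual quantile per lead time from a held-out calibration set and applies it symmetrically around a new forecast. The function names, the symmetric absolute-residual score, and the clipping of the lower bound at zero are illustrative assumptions, not the authors' implementation.

```python
import math

def conformal_quantile(abs_residuals, alpha):
    """Finite-sample split-conformal quantile of absolute calibration residuals."""
    n = len(abs_residuals)
    # Use the ceil((n + 1) * (1 - alpha))-th smallest score; unbounded if n is too small.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(abs_residuals)[rank - 1]

def lead_time_intervals(cal_obs, cal_pred, new_pred, alpha=0.1):
    """cal_obs / cal_pred: rows with one value per lead time; new_pred: a single row."""
    bounds = []
    for h in range(len(new_pred)):
        # Residuals are pooled separately for each lead time h.
        scores = [abs(o[h] - p[h]) for o, p in zip(cal_obs, cal_pred)]
        q = conformal_quantile(scores, alpha)
        # Assumption: the lower bound is clipped at zero because discharge is non-negative.
        bounds.append((max(new_pred[h] - q, 0.0), new_pred[h] + q))
    return bounds
```

Because residuals are pooled per lead time, horizons with larger calibration errors receive wider intervals, which is the behaviour one would expect as skill decays from D+1 to D+10.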

Thus, the methodology combines two complementary forecasting horizons: (1) Monthly horizon (D+30): oriented toward reservoir management, hydropower planning, and strategic water-resources decisions; and (2) Daily horizon (D+1 to D+10): oriented toward short-term operations, event anticipation, and tactical management.

Because the monthly system is the central objective of the manuscript, the methodological description is presented first for the monthly HHFS transition from BR-DWGD to ERA5, followed by the daily forecasting extension. This organization allows the reader to distinguish between the validated operational migration of the original HHFS and the new exploratory short-term forecasting component.

3.2. Monthly ERA5-Based HHFS

Step 1—Data loading, harmonization, and temporal organization: The monthly workflow began with an ERA5-based basin dataset containing meteorological descriptors and next-month discharge information for Santa Branca. Column names were standardized, precipitation variables were harmonized when necessary, and the monthly records were organized chronologically by year and month. When duplicate records existed for the same month, they were aggregated into a single monthly entry using variable-specific rules: minima for “_min” variables, maxima for “_max” variables, accumulation for precipitation totals, and means for the remaining fields. A monthly timestamp was then assigned to each record to preserve time ordering throughout model development and evaluation.
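The variable-specific aggregation of duplicate monthly records can be sketched as follows. The record layout (dictionaries with `year` and `month` keys) and the `_sum` suffix used to detect accumulated precipitation are hypothetical conventions chosen for illustration.

```python
from collections import defaultdict
from statistics import fmean

def aggregation_rule(column):
    """Pick a reducer from the column-name suffix (suffix convention is assumed)."""
    if column.endswith("_min"):
        return min
    if column.endswith("_max"):
        return max
    if column.endswith("_sum"):  # accumulated quantities such as precipitation totals
        return sum
    return fmean  # remaining fields are averaged

def collapse_duplicates(records):
    """Merge records sharing the same (year, month) into one chronological entry each."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["year"], rec["month"])].append(rec)
    merged = []
    for year, month in sorted(groups):
        rows = groups[(year, month)]
        out = {"year": year, "month": month}
        for col in rows[0]:
            if col not in ("year", "month"):
                out[col] = aggregation_rule(col)([r[col] for r in rows])
        merged.append(out)
    return merged
```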

Step 2—Target definition and seasonal normalization: The deterministic monthly target was defined as the mean discharge of the following month at the Santa Branca outlet. To reduce the effect of seasonal nonstationarity, the monthly climatology of discharge was first computed, and an anomaly target was constructed by subtracting the month-specific climatological mean from the observed discharge for the following month. This anomaly-based representation preserves intermonth predictability while reducing the influence of the seasonal baseline prior to model fitting.
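A minimal sketch of the anomaly target is given below. It assumes aligned lists of calendar months and monthly mean discharges; whether the climatology is computed from the full record or from the training years only is not stated in this section, so restricting it to training data is left to the caller.

```python
from collections import defaultdict
from statistics import fmean

def monthly_climatology(months, flows):
    """Mean discharge for each calendar month (1..12)."""
    by_month = defaultdict(list)
    for m, q in zip(months, flows):
        by_month[m].append(q)
    return {m: fmean(values) for m, values in by_month.items()}

def next_month_anomaly(months, flows, climatology):
    """Target for record t: discharge of month t+1 minus that month's climatological mean."""
    return [flows[t + 1] - climatology[months[t + 1]] for t in range(len(flows) - 1)]
```

After inference, adding the same climatological mean back to the predicted anomaly recovers the forecast in discharge units.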

Step 3—Reduced ERA5 predictor set and hydrological-memory features: A reduced predictor set was then constructed from ERA5-derived monthly variables and hydrological-memory descriptors. The final predictor vector included precipitation descriptors ($pr_{min}$, $pr_{max}$, $pr_{sum}$), near-surface thermodynamic variables ($d2m$, $t2m$), radiation-related terms ($ssrd$), surface flux information ($e$), and 10 m wind components ($u10$, $v10$). To represent antecedent hydrological and precipitation memory, lagged discharge and lagged precipitation terms were added, along with rolling precipitation accumulations, rolling flow means, and an antecedent precipitation index (API5). Additional predictors included the minimum and maximum discharge observed in the previous month and harmonic encodings of the annual cycle through sine and cosine transforms of the month index. This reduced ERA5 feature space was designed to preserve the physical logic of the original HHFS while remaining operationally practical [2].
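The memory and seasonal descriptors named in Step 3 could be derived along the following lines. This is a sketch under stated assumptions: the helper names are hypothetical, the window lengths are chosen by the caller, and the API5 term is omitted because its exact definition is not given in this section.

```python
import math

def lag(series, k):
    """Shift a series k steps into the past; the first k entries have no history."""
    if k <= 0:
        return list(series)
    return [None] * k + list(series[:-k])

def rolling(series, window, reducer):
    """Trailing-window statistic (e.g. sum or mean); None until the window is full."""
    return [
        reducer(series[i - window + 1 : i + 1]) if i >= window - 1 else None
        for i in range(len(series))
    ]

def month_harmonics(month):
    """Sine and cosine encoding of the annual cycle for a month index 1..12."""
    angle = 2.0 * math.pi * month / 12.0
    return math.sin(angle), math.cos(angle)
```

The harmonic pair places December next to January in feature space, which a raw month index from 1 to 12 would not.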

Step 4—Chronological split and target transformation: The monthly dataset was split chronologically into a training period ending in 2019 and an independent test period beginning in 2020. Predictor variables were standardized using a scaler fitted only on the training subset. The anomaly target was transformed using a Yeo–Johnson power transformation [27] estimated exclusively on the training data, and all predictions were back-transformed to physical discharge units after model inference.
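The leakage-free logic of Step 4 can be illustrated with the sketch below: the split is made by year, scaling statistics come from the training subset only, and the Yeo–Johnson transform is paired with its inverse for back-transformation. The power parameter `lam` is passed in as a fixed value here; estimating it from the training data, as the text describes, is not shown.

```python
import math
from statistics import fmean, pstdev

def chronological_split(years, rows, last_train_year=2019):
    """Training rows end in last_train_year; everything later is the independent test set."""
    train = [r for y, r in zip(years, rows) if y <= last_train_year]
    test = [r for y, r in zip(years, rows) if y > last_train_year]
    return train, test

def fit_scaler(train_values):
    """Mean and population standard deviation from the training subset only."""
    return fmean(train_values), pstdev(train_values)

def yeo_johnson(y, lam):
    """Yeo-Johnson power transform of a single value for a given lambda."""
    if y >= 0:
        return math.log1p(y) if abs(lam) < 1e-12 else ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-y)
    return -((1.0 - y) ** (2.0 - lam) - 1.0) / (2.0 - lam)

def yeo_johnson_inverse(z, lam):
    """Back-transform to the original (anomaly) scale."""
    if z >= 0:
        return math.expm1(z) if abs(lam) < 1e-12 else (lam * z + 1.0) ** (1.0 / lam) - 1.0
    if abs(lam - 2.0) < 1e-12:
        return -math.expm1(-z)
    return 1.0 - (1.0 - (2.0 - lam) * z) ** (1.0 / (2.0 - lam))
```

Because the anomaly target takes both signs, a transform defined for negative values (unlike Box-Cox) is needed, which is consistent with the choice of Yeo–Johnson.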

Step 5—Deterministic forecasting with GA₁: The deterministic stage (GA₁) followed the hybrid logic of the original HHFS and combined two complementary base learners: a multilayer perceptron (MLP) and an XGBoost regressor (XGB) [8]. The MLP was used to capture smoother nonlinear relationships and longer-memory effects, whereas XGBoost was used to represent sharper nonlinear responses and threshold-like behaviour. To increase the influence of hydrologically relevant higher-flow situations during training, flow-dependent weights were introduced. The MLP fitting used a replicated training representation derived from these weights, while XGBoost received the same flow-based weights directly during training.

The outputs of the two base learners were then combined through Ridge-based meta-learning. In addition to a global meta-learner, separate wet-season and dry-season Ridge formulations were calibrated to preserve regime-dependent hydrological behaviour. The final deterministic forecast for the test period was thus obtained by fusing the base learners in a regime-aware manner.
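To make the regime-aware fusion concrete, the sketch below fits one Ridge meta-learner per season on the two base-learner outputs and selects the matching one at prediction time. Several details are assumptions made for illustration: the wet-season month set, the regularization strength, and the omission of the global meta-learner, whose combination with the seasonal models is not specified here.

```python
from statistics import fmean

WET_MONTHS = {10, 11, 12, 1, 2, 3}  # assumed wet-season definition, for illustration only

def fit_ridge_2(x1, x2, y, alpha=1.0):
    """Ridge with an unpenalized intercept on two inputs (closed-form 2x2 solve)."""
    m1, m2, my = fmean(x1), fmean(x2), fmean(y)
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    t = [v - my for v in y]
    s11 = sum(u * u for u in a) + alpha
    s22 = sum(u * u for u in b) + alpha
    s12 = sum(u * v for u, v in zip(a, b))
    r1 = sum(u * v for u, v in zip(a, t))
    r2 = sum(u * v for u, v in zip(b, t))
    det = s11 * s22 - s12 * s12
    w1 = (r1 * s22 - r2 * s12) / det
    w2 = (r2 * s11 - r1 * s12) / det
    return w1, w2, my - w1 * m1 - w2 * m2

def fit_regime_meta(mlp_pred, xgb_pred, obs, months, alpha=1.0):
    """Fit separate wet-season and dry-season meta-learners."""
    models = {}
    for name, wet in (("wet", True), ("dry", False)):
        idx = [i for i, m in enumerate(months) if (m in WET_MONTHS) == wet]
        models[name] = fit_ridge_2(
            [mlp_pred[i] for i in idx],
            [xgb_pred[i] for i in idx],
            [obs[i] for i in idx],
            alpha,
        )
    return models

def fuse(models, mlp_value, xgb_value, month):
    """Blend the two base forecasts with the weights of the active regime."""
    w1, w2, intercept = models["wet" if month in WET_MONTHS else "dry"]
    return w1 * mlp_value + w2 * xgb_value + intercept
```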

Step 6—Deterministic post-processing and monthly calibration: After back-transformation to physical units and reintroduction of the monthly climatological mean, the deterministic forecast underwent post-processing to reduce low-frequency drift and improve seasonal consistency. This included a rolling bias-adjustment strategy and a month-wise mean–variance calibration derived from the training subset. Finally, non-negativity was enforced to preserve physical realism in the predicted discharge series.
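One plausible reading of the post-processing chain is sketched below. The exact rolling window, the form of the bias adjustment, and the calibration formula are not given in this section, so the choices here (mean error of the last few verified forecasts, and a per-month rescaling to the training mean and standard deviation) are assumptions.

```python
from collections import defaultdict
from statistics import fmean, pstdev

def rolling_bias_correct(preds, obs, window=3):
    """Subtract the mean error of the previous `window` forecasts already verified."""
    out = []
    for t, p in enumerate(preds):
        past = [preds[i] - obs[i] for i in range(max(0, t - window), t)]
        out.append(p - fmean(past) if past else p)
    return out

def fit_monthly_moments(months, values):
    """Per-calendar-month mean and standard deviation, fitted on training data."""
    by_month = defaultdict(list)
    for m, v in zip(months, values):
        by_month[m].append(v)
    return {m: (fmean(v), pstdev(v)) for m, v in by_month.items()}

def calibrate(pred, month, obs_moments, pred_moments):
    """Match the forecast's monthly mean and variance to the observed ones, then clip at zero."""
    mu_o, sd_o = obs_moments[month]
    mu_p, sd_p = pred_moments[month]
    ratio = sd_o / sd_p if sd_p > 0 else 1.0
    return max(mu_o + (pred - mu_p) * ratio, 0.0)  # non-negativity constraint
```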

Step 7—Deterministic evaluation: The deterministic performance of GA₁ was quantified separately for the training and test periods using Nash–Sutcliffe efficiency (NSE) [6], Kling–Gupta efficiency (KGE) [7], coefficient of determination (R²), Pearson correlation, mean absolute error (MAE), root mean square error (RMSE), and bias. To characterize uncertainty in the monthly deterministic metrics, a bootstrap procedure was applied over the test period.
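For reference, the main deterministic metrics can be computed as in the sketch below, which uses the standard NSE definition and the original three-component KGE (correlation, variability ratio, bias ratio). The function names are illustrative, and the bootstrap procedure is not shown.

```python
import math
from statistics import fmean, pstdev

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_obs = fmean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def pearson(obs, sim):
    """Pearson correlation coefficient."""
    mo, ms = fmean(obs), fmean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    r = pearson(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)
    beta = fmean(sim) / fmean(obs)
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)

def rmse(obs, sim):
    """Root mean square error in discharge units."""
    return math.sqrt(fmean([(o - s) ** 2 for o, s in zip(obs, sim)]))

def bias(obs, sim):
    """Mean error (simulated minus observed)."""
    return fmean([s - o for o, s in zip(obs, sim)])
```

An NSE of 0 corresponds to a forecast no better than the observed mean, which gives a useful reference point when reading the lead-time decay of the daily product.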

Step 8—Hydro-adaptive uncertainty modelling with GA₂: The probabilistic stage (GA₂) was designed to construct hydro-adaptive prediction intervals around the GA₁ forecast, with explicit reference to the observed intra-month discharge envelope. For this purpose, the observed minimum and maximum discharges of the previous month were used as antecedent hydrological range information. The base width of the uncertainty band was linked to the robust residual spread of GA₁, while its final asymmetry and scaling were allowed to vary across seasonal regimes and precipitation anomaly conditions.

A genetic algorithm was then used to optimize the interval structure under a reliability–sharpness objective. For each generation, empirical coverage statistics were computed by evaluating the fraction of observed streamflow values contained within the lower and upper uncertainty bounds generated by the candidate interval solutions. These diagnostics were subsequently used to monitor the convergence behaviour.

Several coverage-related diagnostics were monitored during the GA₂ calibration. True COV represents the empirical interval coverage obtained from the validation dataset, corresponding to the fraction of observed streamflow values contained within the forecast interval. Mean COV corresponds to the mean coverage obtained across all candidate solutions within a given generation, whereas Max COV represents the maximum coverage achieved among the evaluated individuals during that generation. In addition, a composite metric termed COVcomp was internally used to balance interval reliability and interval sharpness simultaneously during the hydro-adaptive calibration process.

The calibration searched for a balanced solution that maximized comparative coverage of the observed discharge envelope while penalizing excessive interval width. After the genetic search, a scale adjustment was applied to move the final interval toward the target comparative coverage adopted in the monthly HHFS design (approximately 0.88). The resulting lower and upper bounds were constrained to remain physically consistent, including non-negativity of the lower limit.
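The reliability and sharpness diagnostics, together with a post-hoc scale adjustment, might look like the sketch below. It is a simplified stand-in: coverage is computed pointwise on observed values rather than against the full minimum–maximum envelope, relative width is taken as mean width over mean observed flow, and the grid of candidate scales is arbitrary. None of these choices is confirmed by the text.

```python
def coverage(obs, lower, upper):
    """Fraction of observations falling inside [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return inside / len(obs)

def relative_width(obs, lower, upper):
    """Mean interval width divided by mean observed flow (assumed definition)."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / sum(obs)

def rescale(center, lower, upper, scale):
    """Stretch both half-widths around the central forecast; keep the lower bound >= 0."""
    new_lower = [max(c - scale * (c - lo), 0.0) for c, lo in zip(center, lower)]
    new_upper = [c + scale * (hi - c) for c, hi in zip(center, upper)]
    return new_lower, new_upper

def adjust_scale(obs, center, lower, upper, target=0.88):
    """Pick the narrowest candidate scale whose coverage is closest to the target."""
    candidates = [k / 20 for k in range(10, 61)]  # 0.50, 0.55, ..., 3.00

    def gap(scale):
        lo, hi = rescale(center, lower, upper, scale)
        return abs(coverage(obs, lo, hi) - target)

    return min(candidates, key=gap)  # min() keeps the first (narrowest) tie
```

Scaling the two half-widths separately preserves any asymmetry already present in the interval.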

Step 9—Probabilistic evaluation and monthly diagnostics: The final monthly uncertainty intervals were evaluated using comparative coverage ($COV_{comp}$), hit rate, and relative width. An additional Monte Carlo perturbation was used to estimate uncertainty in these probabilistic diagnostics. For diagnostic analysis and manuscript preparation, the monthly workflow also generated a set of graphical products, including: (i) the GA₂ convergence plot, (ii) the final hydro-adaptive uncertainty band, (iii) a deterministic observed-versus-predicted scatter plot, (iv) a deterministic forecast series with a simple residual-based $\pm 1\sigma$ band, and (v) monthly boxplots comparing observed and predicted discharge distributions.

3.3. Daily ERA5-Based Short-Term Forecasting Extension

Step 1—Data loading, parsing, and chronological organization: The daily forecasting workflow started from an ERA5-based subbasin dataset containing up to 14 records per day, one for each monitored subbasin. The time coordinate was converted to calendar date format, invalid subbasin or date entries were removed, and the full dataset was sorted chronologically within each subbasin. This ordering was required to consistently generate the lagged hydrological predictors $Vazao_{D-1}$, $Vazao_{D-2}$, and $Vazao_{D-3}$ from the same-day streamflow $Vazao_{D0}$.

Step 2—Construction of antecedent flow memory at the subbasin level: Before any spatial aggregation, antecedent flow memory was introduced by generating lagged versions of the same-day streamflow for each subbasin. Specifically, three lag terms were created for each subbasin to encode short-term basin memory at the daily scale.
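A minimal sketch of this lag construction follows, assuming a long-format pandas table with one row per subbasin and date. The column names `subbasin` and `date`, the function name, and the toy values are hypothetical. The row-wise shift also assumes that each subbasin has no missing days.

```python
import pandas as pd

def add_flow_lags(df, flow_col="Vazao_D0", n_lags=3):
    """Add Vazao_D-1 ... Vazao_D-n columns, lagged within each subbasin."""
    out = df.sort_values(["subbasin", "date"]).copy()
    grouped = out.groupby("subbasin")[flow_col]
    for k in range(1, n_lags + 1):
        # shift(k) moves values down by k rows inside each subbasin group,
        # so no value leaks from one subbasin into another.
        out[f"Vazao_D-{k}"] = grouped.shift(k)
    return out

# Toy table with two subbasins and four days (illustrative values only).
toy = pd.DataFrame({
    "subbasin": ["A"] * 4 + ["B"] * 4,
    "date": list(pd.date_range("2020-01-01", periods=4)) * 2,
    "Vazao_D0": [10.0, 11.0, 12.0, 13.0, 20.0, 21.0, 22.0, 23.0],
})
lagged = add_flow_lags(toy)
```

Grouping before shifting is the key design choice: it is what keeps the first rows of each subbasin empty instead of borrowing flow from the previous subbasin in the table.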

Step 3—Selection of complete daily records: To ensure spatial consistency in the aggregated daily predictors, only dates with complete spatial coverage across all monitored subbasins were retained. After removing rows with missing values in the core predictor set, a day was considered valid only when all 14 subbasins were simultaneously present. This step avoided distortions caused by incomplete daily spatial sampling.
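The completeness filter can be sketched as below, again assuming a long-format table with hypothetical `date` and `subbasin` columns. The toy example uses three subbasins instead of 14 to stay compact.

```python
import numpy as np
import pandas as pd

def keep_complete_days(df, core_cols, n_subbasins=14):
    """Keep only dates on which every subbasin has a fully populated row."""
    # First drop rows with missing core predictors ...
    clean = df.dropna(subset=core_cols)
    # ... then count how many distinct subbasins remain on each date.
    n_present = clean.groupby("date")["subbasin"].transform("nunique")
    return clean[n_present == n_subbasins]

# Toy table: day 1 is complete, day 2 has a missing value, day 3 lacks a subbasin.
toy = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 3 + ["2020-01-03"] * 2,
    "subbasin": ["A", "B", "C", "A", "B", "C", "A", "B"],
    "tp": [1.0, 2.0, 3.0, 1.0, np.nan, 3.0, 1.0, 2.0],
})
complete = keep_complete_days(toy, core_cols=["tp"], n_subbasins=3)
```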

Step 4—Construction of the 44-feature mean–range predictor block: For each complete day, a set of 22 base variables was retained, including 18 meteorological variables ($u10$, $v10$, $d2m$, $t2m$, $ssrd$, and $tp$ statistics) and 4 hydrological variables ($Vazao_{D0}$, $Vazao_{D-1}$, $Vazao_{D-2}$, and $Vazao_{D-3}$). For each of these variables, two spatial summaries were computed across the 14 subbasins: the spatial mean ($meanSB$) and the spatial range ($rangeSB = maxSB - minSB$). This yielded a 44-feature predictor block representing both basin-wide average conditions and daily spatial heterogeneity.
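The mean–range aggregation reduces to two reductions over the subbasin axis. The sketch below assumes the complete days are stored as a NumPy array of shape (days, subbasins, variables). That layout, the function name, and the random toy data are assumptions made for illustration.

```python
import numpy as np

def mean_range_block(x):
    """Collapse (n_days, n_subbasins, n_vars) into (n_days, 2 * n_vars)."""
    mean_sb = x.mean(axis=1)                   # spatial mean per variable
    range_sb = x.max(axis=1) - x.min(axis=1)   # spatial range = max - min
    return np.concatenate([mean_sb, range_sb], axis=1)

# 5 days, 14 subbasins, 22 base variables (random placeholder values).
rng = np.random.default_rng(0)
x = rng.random((5, 14, 22))
block = mean_range_block(x)  # shape (5, 44)
```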

Step 5—Addition of six temporal and hydrological descriptors: To complement the 44 mean–range predictors, six additional low-dimensional descriptors were computed from the daily aggregated series. Two of them represented seasonal cyclicity through sine and cosine transforms of the day of year. The remaining four represented short-memory basin conditions: the 7-day and 14-day rolling means of aggregated contemporaneous flow, and the 3-day and 7-day rolling sums of aggregated precipitation. The final daily predictor vector, therefore, contained 50 inputs.
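The six descriptors could be computed as in the following sketch. It assumes a date-indexed pandas table of basin-aggregated flow and precipitation, trailing windows that include the current day, and a 365.25-day seasonal period. The column names `flow` and `precip` and the output names are hypothetical.

```python
import numpy as np
import pandas as pd

def temporal_descriptors(daily):
    """Seasonal and short-memory descriptors from a date-indexed table."""
    doy = daily.index.dayofyear.to_numpy()
    angle = 2.0 * np.pi * doy / 365.25  # assumed seasonal period
    out = pd.DataFrame(index=daily.index)
    out["doy_sin"] = np.sin(angle)
    out["doy_cos"] = np.cos(angle)
    # Trailing windows; the first (window - 1) rows are NaN by construction.
    out["flow_mean_7d"] = daily["flow"].rolling(7).mean()
    out["flow_mean_14d"] = daily["flow"].rolling(14).mean()
    out["precip_sum_3d"] = daily["precip"].rolling(3).sum()
    out["precip_sum_7d"] = daily["precip"].rolling(7).sum()
    return out

# Toy daily series (illustrative values only).
dates = pd.date_range("2021-01-01", periods=20)
daily = pd.DataFrame({"flow": np.arange(20.0), "precip": np.ones(20)}, index=dates)
desc = temporal_descriptors(daily)
```

Using the sine and cosine together gives late December and early January nearly identical encodings, which a raw day-of-year index would not.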

Step 6—Construction of the direct multi-output target matrix: The daily predictand was defined as a direct multi-output streamflow target for horizons D+1 to D+10. After spatial aggregation using the daily mean across subbasins, the target for horizon $h$ was taken as the aggregated flow at day $t + h$, so that each row of the target matrix was formed by the vector $[Vazao_{D1}, Vazao_{D2}, \dots, Vazao_{D10}]$. Only dates for which all target horizons were simultaneously available were retained, ensuring a complete multi-output training matrix.
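A minimal sketch of the target construction is given below. It assumes the basin-aggregated flow is a one-dimensional array on consecutive calendar days, so that index offsets correspond to lead times. The function name and toy series are hypothetical.

```python
import numpy as np

def build_targets(flow, n_horizons=10):
    """Return (Y, idx) with Y[i, h - 1] = flow[idx[i] + h] for h = 1..n_horizons."""
    flow = np.asarray(flow, dtype=float)
    idx = np.arange(len(flow) - n_horizons)  # issue days with all horizons in range
    Y = np.column_stack([flow[idx + h] for h in range(1, n_horizons + 1)])
    keep = ~np.isnan(Y).any(axis=1)          # drop rows with any missing horizon
    return Y[keep], idx[keep]

flow = np.arange(15.0)                # toy aggregated flow
Y, idx = build_targets(flow)          # 5 issue days x 10 horizons

flow_gap = flow.copy()
flow_gap[12] = np.nan                 # one missing day removes every row that needs it
Y_gap, idx_gap = build_targets(flow_gap)
```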

Step 7—Chronological partition into FIT, CAL, and TEST subsets: The final daily dataset was split chronologically into a development period ending in 2019 and an independent test period beginning in 2020. Within the development subset, the earliest 80% of samples were used for model fitting (FIT), and the last 20% were reserved for calibration (CAL), preserving time order throughout the procedure. This chronology-preserving partition was adopted to emulate operational forecasting conditions and avoid information leakage.
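The partition can be sketched as follows, reading "ending in 2019" as a cut-off of 31 December 2019. The function name and the toy date range are hypothetical.

```python
import numpy as np

def chrono_split(dates, dev_end="2019-12-31", fit_frac=0.8):
    """Return index arrays (fit, cal, test) that preserve time order."""
    dates = np.asarray(dates, dtype="datetime64[D]")
    order = np.argsort(dates, kind="stable")
    is_dev = dates[order] <= np.datetime64(dev_end)
    dev, test = order[is_dev], order[~is_dev]
    n_fit = int(len(dev) * fit_frac)   # earliest 80% of the development period
    return dev[:n_fit], dev[n_fit:], test

# Ten days at the end of 2019 and five days at the start of 2020.
dates = np.arange("2019-12-22", "2020-01-06", dtype="datetime64[D]")
fit, cal, test = chrono_split(dates)
```

No shuffling occurs at any point, so every CAL sample is later than every FIT sample and every TEST sample is later than both.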

Step 8—Horizon-wise target stabilization: To reduce asymmetry and stabilize the distribution of daily targets, a Yeo–Johnson transformation was applied independently to each output horizon. The transformation parameters were estimated using only the FIT subset and then transferred to CAL and TEST. After model prediction, all outputs were back-transformed to original discharge units before metric computation and graphical analysis.
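For reference, the forward and inverse Yeo–Johnson transforms can be written directly from their standard definition, as sketched below. The per-horizon parameters in `lambdas` are placeholder values chosen for illustration. In practice they would be estimated on the FIT subset only, for example with `scipy.stats.yeojohnson` or scikit-learn's `PowerTransformer`. The helper names are hypothetical.

```python
import numpy as np

def yeo_johnson(y, lam):
    """Forward Yeo-Johnson transform of a 1-D array for a fixed lambda."""
    y = np.asarray(y, dtype=float)
    out = np.empty_like(y)
    pos = y >= 0
    if abs(lam) > 1e-12:
        out[pos] = ((y[pos] + 1.0) ** lam - 1.0) / lam
    else:
        out[pos] = np.log1p(y[pos])
    if abs(lam - 2.0) > 1e-12:
        out[~pos] = -(((1.0 - y[~pos]) ** (2.0 - lam) - 1.0) / (2.0 - lam))
    else:
        out[~pos] = -np.log1p(-y[~pos])
    return out

def yeo_johnson_inverse(x, lam):
    """Inverse transform; the sign of x selects the branch."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    if abs(lam) > 1e-12:
        out[pos] = (x[pos] * lam + 1.0) ** (1.0 / lam) - 1.0
    else:
        out[pos] = np.expm1(x[pos])
    if abs(lam - 2.0) > 1e-12:
        out[~pos] = 1.0 - (1.0 - (2.0 - lam) * x[~pos]) ** (1.0 / (2.0 - lam))
    else:
        out[~pos] = -np.expm1(-x[~pos])
    return out

def transform_targets(Y, lambdas):
    """Apply one lambda per horizon (column)."""
    return np.column_stack([yeo_johnson(Y[:, h], lam) for h, lam in enumerate(lambdas)])

def inverse_targets(Z, lambdas):
    return np.column_stack([yeo_johnson_inverse(Z[:, h], lam) for h, lam in enumerate(lambdas)])

# Two horizons with placeholder lambdas (not values from the study).
Y = np.array([[10.0, 12.0], [50.0, 45.0], [200.0, 180.0]])
lambdas = [0.2, 0.0]
Z = transform_targets(Y, lambdas)
Y_back = inverse_targets(Z, lambdas)
```

One caveat of this sketch: a model prediction in transformed space can fall outside the range that the inverse maps back to real values (for instance when lambda is negative), so an operational implementation would need to guard against that case.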

Step 9—Training of the single multi-output XGBoost model: The daily forecasting core was based exclusively on XGBoost regression. A single multi-output model was trained to produce the entire forecast vector from D+1 to D+10 in one unified pipeline. When the installed XGBoost version supported native multi-output regression, this capability was used directly; otherwise, the implementation automatically fell back to a MultiOutputRegressor(XGBRegressor) wrapper. Although this fallback internally fits one estimator per horizon, it preserves the same operational input structure and unified workflow.

To emphasize hydrologically more relevant higher-flow situations during fitting, sample weights proportional to the mean magnitude of the target vector were used. Model selection was not performed independently for each horizon; instead, a predefined set of global XGBoost presets was tested, and the final configuration was selected according to the mean NSE across all horizons on the CAL subset, optionally penalizing overfitting when performance on FIT substantially exceeded that on CAL.
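The weighting and selection logic can be sketched independently of the regressor. In the example below, the normalization of the weights, the tolerance `gap_tol`, and the `penalty` factor are hypothetical choices, since the text only states that weights were proportional to the mean target magnitude and that overfitting was optionally penalized.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency for one horizon."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def mean_nse(Y, P):
    """Average NSE over all horizons (columns)."""
    return float(np.mean([nse(Y[:, h], P[:, h]) for h in range(Y.shape[1])]))

def flow_weights(Y):
    """Sample weights proportional to the mean magnitude of each target vector."""
    w = np.abs(Y).mean(axis=1)
    return w / w.mean()  # normalized to mean 1 (an assumption)

def selection_score(Y_fit, P_fit, Y_cal, P_cal, gap_tol=0.10, penalty=0.5):
    """Mean CAL NSE, reduced when FIT skill exceeds CAL skill by more than gap_tol."""
    nse_fit, nse_cal = mean_nse(Y_fit, P_fit), mean_nse(Y_cal, P_cal)
    excess = max(0.0, nse_fit - nse_cal - gap_tol)
    return nse_cal - penalty * excess

# Toy case: perfect on FIT, climatology-like on CAL (illustrative values only).
Y = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
P_mean = np.tile(Y.mean(axis=0), (3, 1))
weights = flow_weights(np.array([[1.0, 3.0], [2.0, 2.0], [6.0, 6.0]]))
score_overfit = selection_score(Y, Y, Y, P_mean)
score_clean = selection_score(Y, Y, Y, Y)
```

Each candidate preset would be scored this way after fitting, and the preset with the highest score retained.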

Step 10—Split conformal uncertainty quantification: Predictive uncertainty was quantified separately for each forecast horizon using split conformal prediction intervals. For each horizon $h$, the absolute residuals on the CAL subset, $|y_{cal,h} - \hat{y}_{cal,h}|$, were computed and the $(1 - \alpha)$-quantile was extracted with $\alpha = 0.10$. This yielded a conformal half-width $q_{h}$ in physical units of streamflow (m³/s). Test-period prediction intervals were then formed as $[\hat{y}_{h} - q_{h}, \hat{y}_{h} + q_{h}]$. This approach provides a transparent and distribution-free uncertainty estimate for each lead time.
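The interval construction above can be sketched in a few lines. By default the example takes the plain empirical $(1 - \alpha)$-quantile, as the text describes. The optional `finite_sample` flag applies the $\lceil (n+1)(1-\alpha) \rceil$-th order statistic that is common in the conformal literature. That flag, the function names, and the toy residuals are assumptions, and it is not stated which variant the study used.

```python
import math
import numpy as np

def conformal_half_widths(y_cal, yhat_cal, alpha=0.10, finite_sample=False):
    """Per-horizon half-width q_h from absolute CAL residuals (rows = samples)."""
    resid = np.abs(y_cal - yhat_cal)
    if not finite_sample:
        return np.quantile(resid, 1.0 - alpha, axis=0)
    n = resid.shape[0]
    k = math.ceil((n + 1) * (1.0 - alpha))  # rank of the corrected quantile
    if k > n:
        return np.full(resid.shape[1], np.inf)
    return np.sort(resid, axis=0)[k - 1]

def conformal_intervals(yhat, q):
    """Symmetric intervals [yhat - q_h, yhat + q_h] for each horizon."""
    return yhat - q, yhat + q

# Toy calibration set: 11 samples, 2 horizons, residuals 0..10 and 0..20.
y_cal = np.zeros((11, 2))
yhat_cal = np.column_stack([np.arange(11.0), 2.0 * np.arange(11.0)])
q = conformal_half_widths(y_cal, yhat_cal)
q_fs = conformal_half_widths(y_cal, yhat_cal, finite_sample=True)
lower, upper = conformal_intervals(np.array([[5.0, 5.0]]), q)
```

Because $q_h$ is a single constant per horizon, the resulting bands have the same width for every forecast day at a given lead time and widen only across horizons.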

Step 11—Daily evaluation and visualization of products: The final daily forecasts were evaluated separately for each horizon using NSE, KGE, the coefficient of determination (R²), Pearson correlation, MAE, RMSE, and bias, along with empirical coverage and the average interval width of the conformal bands. To support retrospective and operational interpretation, the workflow also generated: (i) skill-versus-lead-time plots, (ii) error-growth plots, (iii) complete test-period hydrographs for D+1 to D+10, and (iv) a rainy-season operational panel combining the previous 10 days of observed flow, a forecast issuance date, and the subsequent 10-day forecast with uncertainty bounds. For clarity, the daily hydrographs were plotted on the verification axis, $(t + h)$, although the prediction was issued at an earlier time $t$. For operational visualization purposes, the workflow also included a history band representing the recent observed streamflow variability preceding the forecast issuance date.
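The per-horizon verification could be computed as in the sketch below. It assumes the 2009 KGE formulation of Gupta et al. and a bias defined as the mean of predicted minus observed flow. R² is omitted because its exact definition is not given in this section. The function names and toy values are hypothetical.

```python
import numpy as np

def verification_metrics(obs, sim):
    """Deterministic scores for one forecast horizon."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    err = sim - obs                      # assumed sign convention for bias
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # mean ratio
    return {
        "NSE": 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2),
        "KGE": 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2),
        "r": r,
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "bias": err.mean(),
    }

def interval_diagnostics(obs, lower, upper):
    """Empirical coverage and mean width of a prediction interval."""
    obs, lower, upper = (np.asarray(a, dtype=float) for a in (obs, lower, upper))
    inside = (obs >= lower) & (obs <= upper)
    return float(inside.mean()), float(np.mean(upper - lower))

# Toy horizon with a constant +1 offset (illustrative values only).
obs = np.array([1.0, 2.0, 3.0, 4.0])
scores = verification_metrics(obs, obs + 1.0)
coverage, width = interval_diagnostics(obs, np.zeros(4), np.full(4, 2.0))
```

Looping these two functions over the ten target columns yields per-horizon tables of skill, error, coverage, and interval width.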

4. Results

The results are structured according to the methodological sequence described in Section 3. We first examine the monthly ERA5-based HHFS, beginning with the deterministic forecasting stage (GA₁), followed by the hydro-adaptive uncertainty stage (GA₂). We then assess the daily ERA5-based extension, implemented as a single, direct, multi-output forecasting system for horizons from D+1 to D+10 with split conformal uncertainty. This organization mirrors the methodology and enables both the monthly operational transition and the daily proof-of-concept extension to be interpreted within a unified forecasting framework.

4.1. Operational Motivation and Dataset Transition (BR-DWGD → ERA5)

The primary objective of this study was to evaluate whether the monthly HHFS could be successfully migrated from BR-DWGD to ERA5 forcing without altering its conceptual structure. As described in Section 3, this transition required more than a simple data replacement. It involved reconstruction of the monthly predictor set, preservation of anomaly-based target formulation, regeneration of rainfall-memory and antecedent-flow descriptors, retraining of the deterministic hybrid forecasting stage (GA₁), and recalibration of the hydro-adaptive uncertainty stage (GA₂).

This transition is operationally significant because it tests whether a scientifically validated forecasting system can be transferred to a meteorological forcing source with superior update continuity and production consistency. The monthly ERA5-based results indicate that this transition successfully preserved the core predictive structure of the HHFS. The overall outcome of the migration is summarized in Table 1, while the diagnostic behaviour of the ERA5-based monthly solution is illustrated in Figure 2.

4.2. Deterministic Performance with ERA5 Forcing

The monthly deterministic results should be interpreted in the context of the GA₁ workflow described in Section 3, which integrates a reduced ERA5 predictor set, hydrological-memory descriptors, Yeo–Johnson target transformation, weighted base-learner fitting, regime-aware Ridge fusion, rolling bias correction, and month-wise mean–variance calibration into a single hybrid forecasting pipeline. Within this framework, the deterministic target remained the mean monthly discharge at the Santa Branca outlet.

The ERA5-based implementation preserved, and slightly improved, the deterministic performance of the original monthly system. As shown in Table 1, the test-period results reached NSE = 0.798, KGE = 0.878, coefficient of determination R² = 0.798, and Pearson correlation = 0.896, with MAE = 13.629 m³/s and RMSE = 18.778 m³/s. Compared with the BR-DWGD baseline, these values indicate modest gains in explained variance, hydrological consistency, and magnitude error, although mean bias increased slightly in absolute terms. This finding underscores that the HHFS’s deterministic forecasting core is robust to forcing replacement and can be transferred to ERA5 without loss of hydrological consistency.

The graphical diagnostics reinforce this interpretation. In Figure 2c, the deterministic monthly forecast reproduces the main seasonal oscillations and follows the timing of the principal discharge variations with good fidelity. Similarly, Figure 2e shows that the observed-versus-predicted relationship remains well structured around the 1:1 line, despite some dispersion at higher flows. A slight underprediction tendency is observed under some higher-flow conditions, likely associated with the increased nonlinear complexity of extreme runoff response, smoothing effects inherent to monthly aggregation, and differences in precipitation representation between BR-DWGD and ERA5 forcing. Together, Table 1 and Figure 2c,e demonstrate that the monthly ERA5-based GA₁ solution preserves the essential predictive behaviour of the original HHFS while remaining fully compatible with operational implementation.

4.3. Probabilistic Performance of GA₂ Under ERA5 Forcing

The monthly probabilistic results should be interpreted in the context of the GA₂ methodology detailed in Section 3, where the deterministic GA₁ forecast is complemented by a hydro-adaptive uncertainty interval calibrated against the observed intra-month discharge envelope. In the ERA5-based implementation, interval width and asymmetry were linked to residual spread, seasonal regime, and precipitation anomaly conditions, and then optimized through a genetic algorithm under a reliability–sharpness objective. Within this framework, COVcomp was used as a composite optimization metric, whereas True COV corresponds to the final empirical validation coverage obtained from the calibrated uncertainty intervals. A final scale adjustment was applied to move the interval toward the target comparative coverage adopted in the HHFS design.

As shown in Table 1, the ERA5-based GA₂ solution yielded $\mathrm{COV}_{comp} = 0.838$, hit rate $p = 0.979$, and relative width $r = 2.388$. These results indicate that the uncertainty intervals remained sharp and operationally informative, with the hit rate preserved at a high level and a relative width similar to that of the BR-DWGD baseline. The principal change after migration is therefore not interval inflation, but a modest reduction in comparative coverage relative to the hydrologically balanced range adopted in the original HHFS design.

This interpretation is supported by the diagnostic plots in Figure 2a,b,d. In Figure 2a, the GA₂ optimization converges stably but levels off slightly below the target reliability range, indicating that the new forcing configuration leads to a somewhat lower reliability plateau. In Figure 2b, the final hydro-adaptive interval retains the expected seasonal behaviour, expanding during wetter, more variable periods and contracting under lower-flow conditions, confirming that the hydro-adaptive logic of the HHFS was preserved. However, this panel also shows that the observed monthly envelope is not fully captured in all situations, which is consistent with the comparative coverage reduction reported in Table 1. Similarly, Figure 2d demonstrates that the predicted monthly distributions follow the observed seasonal cycle reasonably well, although some differences in spread remain visible in selected months.

Taken together, these results indicate that the ERA5-based GA₂ solution remains hydrologically coherent and operationally useful, but still undercovers the intra-month discharge envelope under some conditions. In other words, the forcing transition preserved the conceptual integrity of the uncertainty module but shifted the reliability–sharpness balance slightly toward undercoverage.

4.4. Comparative Summary of the Monthly Transition

The overall effect of the monthly forcing transition is summarized in Table 1, which compares the original BR-DWGD-based HHFS with the ERA5-based implementation. The comparison shows that the monthly deterministic component improved slightly after migration, whereas the probabilistic component retained sharpness and hit rate but exhibited a modest loss in comparative coverage. This trade-off is also reflected qualitatively in Figure 2, where the deterministic behaviour remains robust and the uncertainty bands remain hydro-adaptive, though slightly less reliable with respect to the observed monthly discharge envelope.

From a methodological perspective, this result is consistent with the transition described in Section 3. The ERA5 migration preserved the anomaly-based monthly target, the hydrological-memory formulation, the hybrid GA₁ structure, and the hydro-adaptive GA₂ logic. Therefore, the differences observed after migration are best interpreted as the effects of forcing replacement and predictor redistribution, rather than as a structural change in the HHFS itself.

4.5. Daily Short-Term Forecast Performance (Direct Multi-Output, D+1 to D+10)

The daily results should be interpreted in the context of the core daily workflow described in Section 3. In summary, the ERA5-based daily database was organized chronologically at the subbasin level, short-term antecedent flow memory was introduced through lagged flow terms, only days with complete spatial coverage across the fourteen monitored subbasins were retained, and the predictors were aggregated into the 44-feature mean–range block augmented by six temporal and hydrological descriptors. A single multi-output XGBoost model was then trained to produce direct forecasts from D+1 to D+10, and predictive uncertainty was quantified separately for each lead time using split conformal calibration. This daily forecasting component represents one of the new contributions of the present companion paper, extending the monthly HHFS philosophy introduced by França et al. (2026) [2] toward short-range operational horizons.

Using this framework, the model generated basin-aggregated daily streamflow forecasts for lead times from D+1 to D+10. The lead-time-dependent evolution of deterministic performance is summarized in Table 2 and Figure 3, while Figure 4 and Figure 5 provide time-domain and operational illustrations of forecast behaviour. The skill values shown in Figure 3a were computed independently for each forecast horizon using deterministic verification metrics calculated between observed and predicted daily streamflow during the TEST period. As expected, predictive skill decreases progressively with forecast horizon. According to Table 2 and Figure 3a, NSE declines from 0.881 at D+1 to 0.394 at D+10, while Pearson correlation decreases from 0.944 to 0.630 over the same interval. Even so, the persistence of moderate positive correlation at D+10 indicates that the model still retains useful information on flow phasing and the temporal organization of wetter and drier periods. This progressive loss of skill with lead time is physically coherent, as longer horizons become increasingly sensitive to future precipitation uncertainty and nonlinear basin response.

The growth in forecast error with lead time is shown in Figure 3b and quantified in Table 2. MAE increases from 9.04 m³/s at D+1 to 25.12 m³/s at D+10, whereas RMSE rises from 18.42 to 42.23 m³/s. In Figure 3b, ‘Error’ refers specifically to deterministic forecast error quantified through MAE and RMSE calculated independently for each lead time. The larger absolute increase in RMSE (about 23.8 m³/s) than in MAE (about 16.1 m³/s) suggests that longer lead times are increasingly influenced by larger event-scale deviations, especially under more variable hydrological conditions. Bias remains small and predominantly negative across horizons, indicating limited mean drift and suggesting that the main effect of increasing lead time is a widening of forecast dispersion rather than the emergence of a systematic offset.

The uncertainty analysis is consistent with this behaviour. As shown in Table 2, the conformal half-width increases with lead time, from 14.80 m³/s at D+1 to 49.87 m³/s at D+10, indicating that the uncertainty envelope expands coherently as the forecasting problem becomes more difficult. The empirical coverage values remain close to the nominal target, supporting the practical interpretability of split conformal prediction intervals without requiring strong distributional assumptions. This result is particularly relevant operationally, as it indicates that uncertainty estimates remain informative even when deterministic skill decreases at longer horizons.

These statistical results are reflected in the time-domain visualizations. In Figure 4, the shorter lead times show the closest agreement with observed daily discharge, particularly in terms of temporal tracking and recession behaviour, whereas longer horizons exhibit smoother responses, larger peak-amplitude differences, and wider uncertainty bands. The largest discrepancies are generally associated with maximum-flow periods, when rapid runoff response and rainfall uncertainty make peak discharge forecasting substantially more challenging than low-flow or recession regimes. In some high-flow situations, the deterministic forecasts exhibit moderate peak underestimation, particularly at longer lead times. This behaviour is consistent with the smoothing tendency of the direct multi-output framework under highly nonlinear and rapidly evolving runoff conditions. Figure 5 provides a complementary rainy-season operational perspective, illustrating how the 10-day forecasting window behaves under more hydrologically variable conditions. In that context, the widening of split conformal prediction intervals with horizon becomes especially important, as it shows that the uncertainty estimates scale consistently with predictive difficulty.

Overall, the daily results indicate that the ERA5-based extension provides strong short-range predictive skill, particularly from D+1 to approximately D+4, while still retaining useful basin-scale hydrological information out to D+10. Taken together, these results suggest that the daily module is a promising operational complement to the monthly HHFS product, enabling anticipatory streamflow prediction across multiple temporal scales.

4.6. Interpretation for Operational Deployment

From an operational hydrology perspective, the monthly and daily results are complementary. The monthly ERA5-based HHFS demonstrates that the original hybrid framework can be transferred to a forcing source with greater operational sustainability while preserving deterministic skill and hydro-adaptive uncertainty structure. The daily extension shows that the same ERA5-based framework can also support shorter lead times through a unified direct multi-output system, yielding strong short-range performance and coherent uncertainty expansion with horizon.

The main difference between the two products lies in their remaining limitations. At the monthly scale, the principal issue is the modest undercoverage of the probabilistic intervals, which suggests that the ERA5-based GA₂ module still requires targeted recalibration to recover the original reliability–sharpness balance. At the daily scale, the main limitation is the expected loss of amplitude accuracy at longer lead times, particularly under hydrologically more variable conditions. Neither of these limitations invalidates the proposed framework; rather, they highlight where further refinement is most likely to produce additional gains.

A related point is the role of precipitation representation. In the current monthly ERA5 experiment, accumulated precipitation still relied partly on a proxy formulation, which may have more strongly influenced the reliability of the hydro-adaptive intervals than the deterministic forecast itself. Because both the monthly uncertainty module and the daily short-term extension depend on an adequate representation of rainfall accumulation and antecedent wetness, future refinements should prioritize the direct derivation of these predictors from native ERA5 variables and, subsequently, from higher-resolution atmospheric products.

5. Discussion

The results demonstrate that the ERA5-based HHFS preserves the essential predictive structure of the original framework while improving operational sustainability. At the monthly scale, the transition from BR-DWGD to ERA5 did not degrade deterministic performance; instead, Table 1 shows modest gains in NSE, KGE, R², Pearson correlation, MAE, and RMSE. This robustness aligns with the diagnostics in Figure 2c,e, where the ERA5-based implementation continues to reproduce the main seasonal oscillations and maintains a coherent observed-versus-predicted relationship. Nevertheless, some underestimation remains visible during higher-flow conditions, which is physically consistent with the greater uncertainty associated with extreme precipitation events and with the smoothing behaviour typically introduced by monthly aggregation and hybrid regression structures.

Although the principal benchmark of this study is the original BR-DWGD-based HHFS, owing to the controlled objective of evaluating the forcing transition, the present framework may also be interpreted in relation to other streamflow forecasting approaches. Deep learning models such as LSTM are often effective in capturing long temporal dependencies, while Random Forest models provide robust nonlinear regression baselines. Traditional hydrological models such as SWAT and HBV offer process-based interpretability but often require greater calibration effort. In contrast, the HHFS combines deterministic skill, probabilistic interpretability, and operational flexibility within a hybrid data-driven structure. These findings are particularly relevant because they indicate that the forecasting skill of the original HHFS, first introduced in [2], can be preserved after migration to an operationally sustainable forcing product.

The probabilistic behaviour is more nuanced. Although Table 1 shows that the ERA5-based GA₂ solution preserved a high hit rate and nearly unchanged relative width, comparative coverage fell slightly below the previously adopted hydrologically balanced range. This is consistent with the convergence behaviour in Figure 2a and the partial under-capture of the observed monthly min–max envelope in Figure 2b. Overall, these results indicate that the ERA5 migration preserved the hydro-adaptive logic of the uncertainty module but shifted the reliability–sharpness balance slightly toward undercoverage. In practical terms, this means that the probabilistic product remains useful for decision support, although additional calibration may further improve reliability.

The daily results reinforce this interpretation at shorter forecast horizons. As shown in Table 2 and Figure 3, deterministic skill decreases progressively from D+1 to D+10, with declining NSE and correlation and increasing MAE and RMSE, as expected for direct multi-step hydrological forecasting. Even so, correlation remains above 0.60 at D+10, indicating that the model still retains useful information on flow phasing and the sequencing of wetter and drier periods. This behaviour is consistent with the expected accumulation of atmospheric and hydrological uncertainty as lead time increases. The conformal uncertainty analysis points the same way: interval width increases monotonically with lead time, mirroring the growth in forecast error, while empirical coverage remains reasonably close to the nominal target. In this context, Figure 4 and Figure 5 are important because they translate the statistical results summarized in Table 2 and Figure 3 into time-domain and operational forecasting behaviour. The wider intervals observed at longer horizons and during high-flow periods indicate that the uncertainty model appropriately recognizes lower-predictability conditions rather than providing overconfident deterministic estimates. Nevertheless, some peak underestimation remains visible during extreme-flow situations, indicating that further refinement is still necessary to better represent rapid runoff amplification under highly variable hydrometeorological conditions.

Overall, the combined monthly and daily results show that the ERA5-based HHFS remains scientifically coherent across temporal scales, preserving seasonal predictability, hydro-adaptive uncertainty structure, and useful short-range forecasting skill. The main remaining challenges are recalibrating monthly probabilistic reliability and addressing the expected reduction in amplitude accuracy at longer daily lead times, but neither alters the conclusion that the ERA5 transition represents a viable and operationally consistent evolution of the HHFS framework.

This study has limitations, including ERA5 precipitation bias, slight probabilistic undercoverage, and reduced performance at longer lead times. Future work will focus on improving forcing resolution and probabilistic calibration. Additional directions include testing higher-resolution atmospheric products (e.g., WRF), extending the framework to other basins, and further refining the daily forecasting component under extreme-flow conditions.

6. Conclusions

This study evaluated the transition of the Hybrid Hydrological Forecasting System (HHFS) from BR-DWGD to ERA5 forcing for streamflow forecasting at Santa Branca and examined a daily short-term forecasting extension based on the same ERA5-driven framework. The results demonstrate that the ERA5-based implementation preserves the essential scientific structure of the original HHFS while providing a more operationally sustainable forcing basis for routine forecasting applications.

At the monthly scale, the deterministic component remained robust after migration and showed modest improvement relative to the BR-DWGD baseline, as indicated by gains in NSE, KGE, R², and Pearson correlation, together with reductions in MAE and RMSE. These results confirm that the predictive core of the HHFS can be transferred to ERA5 without loss of hydrological consistency, preserving both the seasonal cycle and the magnitude of monthly discharge variability. The probabilistic component also retained its hydro-adaptive character, maintaining a high hit rate and nearly unchanged relative width, although comparative coverage decreased slightly below the previously adopted hydrologically balanced range. This indicates that the uncertainty module remains physically coherent and operationally useful but still requires targeted recalibration to restore the original reliability–sharpness balance.

At the daily scale, the single multi-output XGBoost framework produced useful direct forecasts from D+1 to D+10, with the strongest performance at short lead times and a gradual degradation of skill as the horizon increased. Even at D+10, the model retained meaningful correlation, indicating persistent information on flow phasing and the temporal organization of wetter and drier periods. Split conformal prediction intervals expanded coherently with lead time and maintained empirical coverage reasonably close to the nominal target, supporting the practical interpretability of the daily forecasts as predictive uncertainty increased.

Taken together, these findings indicate that the ERA5-based HHFS is robust across temporal scales. At the monthly scale, it preserves seasonal predictability and hydro-adaptive uncertainty structure; at the daily scale, it provides strong short-range forecasting skill and coherent lead-time-dependent uncertainty quantification. The main limitations identified in this study are the need to refine monthly probabilistic reliability under ERA5 and the expected reduction in amplitude accuracy at longer daily horizons, particularly under more variable flow conditions. The latter includes a tendency to underestimate some higher-flow peaks at longer daily forecast horizons.

Future work should focus on four complementary directions: first, deriving precipitation accumulation and lagged rainfall-memory predictors directly from native ERA5 variables; second, recalibrating the monthly GA₂ module to restore comparative coverage toward the target range without sacrificing sharpness; third, incorporating high-resolution WRF-based meteorological fields to better resolve spatial and temporal variability in precipitation and related atmospheric drivers, particularly during rainy-season and high-flow conditions; and fourth, further improving the daily multi-output framework where uncertainty representation and event-amplitude errors become most critical.

Overall, these results support the ERA5 transition not merely as a technical replacement of meteorological forcing, but as a scientifically consistent and operationally viable evolution of the HHFS framework for multi-scale streamflow forecasting.

Author Contributions

Conceptualization, G.B.F., V.A.d.A. and M.C.A.S.; Data curation, G.B.F.; Funding acquisition, M.N.F., G.B.F. and E.P.d.S.; Supervision, M.N.F., G.B.F. and E.P.d.S.; Writing—review and editing, G.B.F., V.A.d.A., M.C.A.S., M.N.F., E.P.d.S., M.T.S., T.R.B.T.A., M.S.d.S., A.A.M.d.A., G.T.S.d.M., M.V.d.A., H.F.C.V., G.G.F., J.A.A., E.A.M.A. and L.Q.V. All authors have read and agreed to the published version of the manuscript.

Funding

The present work was carried out with the financial support provided through the R&D contract 4600010353, reference PD 05161-0025/2024, approved by the concessionaire Light Energia S.A. This support is part of the Regulated Research and Development (R&D) Program managed by the National Electric Energy Agency of the Brazilian Government (ANEEL).

Data Availability Statement

The original data and the scripts developed for this study are openly available at https://github.com/aa-vinicius/ufrj-hhfs-sb, accessed on 24 May 2026.

Acknowledgments

This paper presents the results of the R&D project MoVaSC-II, which aims to enhance seasonal precipitation forecasting. This improvement is considered a vital tool for energy planning and water resource management in the Paraíba do Sul hydrological basin. The authors are grateful to all institutions involved in this development.

Conflicts of Interest

Author Lude Quieto Viana is employed by the company Light Energia. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Location of the fourteen upstream sub-basins contributing to the Santa Branca outlet, situated in the upper Paraíba do Sul River Basin, southeastern Brazil. The highlighted Funil Alto region corresponds to the hydrologically aggregated sub-basin system used for the ERA5-based forecasting framework. The figure also shows the surrounding regional context, hydrographic network, and geographic coordinate grid used for spatial reference.

Figure 2. Diagnostic and forecast visualizations of the proposed hybrid hydrological forecasting system during the test period. (a) GA₂ convergence during the genetic optimization process, showing the evolution of coverage-related diagnostics (True COV, Mean COV, Max COV, and COVcomp) across successive generations. The highlighted region indicates the target reliability interval adopted for hydro-adaptive uncertainty calibration. (b) Final hydro-adaptive uncertainty band from GA₂, compared with the observed monthly discharge range (min–max envelope) and the GA₁ central forecast. (c) GA₁ deterministic forecast with an approximate ±1 standard deviation uncertainty band derived from test-period residuals, compared with observed discharge. (d) Monthly streamflow distributions for observed and predicted discharges, showing seasonal behaviour and differences in spread and central tendency by month. Month numbering follows the conventional calendar sequence from January (1) to December (12). (e) GA₁ observed-versus-predicted dispersion plot for the test period, including the 1:1 reference line, fitted regression line, and summary performance statistics.

Figure 3. (a) Deterministic forecasting skill versus lead time for the daily multi-output HHFS, represented by the evolution of NSE, KGE, and Pearson correlation across forecast horizons (TEST 2020–2024). (b) Deterministic forecast error growth versus lead time for the daily multi-output HHFS, represented by MAE and RMSE computed independently for each forecast horizon (TEST 2020–2024).

Figure 4. Daily hydrographs for D+1 … D+10 (verification axis t + h): observed aggregated streamflow, point forecasts, and split conformal prediction intervals (~90%).

Figure 5. Operational rainy-season example of daily multi-lead streamflow forecasting (D+1 to D+10) at Santa Branca. Observed discharge is compared with forecasts issued for successive lead times during a hydrologically active period. The forecast band corresponds to the split conformal prediction intervals generated for each forecast horizon, whereas the history band represents the recent observed streamflow variability preceding the forecast issuance date. The figure illustrates forecast behaviour under wetter and more variable basin conditions, highlighting the progressive widening of the split conformal prediction intervals with increasing lead time. Shorter horizons show closer agreement with observations, whereas longer horizons present smoother responses and larger uncertainty envelopes, consistent with the increased predictive difficulty associated with high-variability rainy-season conditions.

Table 1. Comparative performance of HHFS before (BR-DWGD) and after (ERA5) forcing substitution for Santa Branca during the independent test period.

Metric	BR-DWGD (Baseline)	ERA5 (Migration)
GA₁ NSE	0.77	0.798
GA₁ KGE	0.85	0.878
GA₁ R²	0.77	0.798
GA₁ Pearson	0.88	0.896
GA₁ MAE (m³/s)	14.36	13.629
GA₁ RMSE (m³/s)	20.24	18.778
GA₁ Bias (m³/s)	0.84	1.58
GA₂ coverage, COVcomp	0.881	0.838
GA₂ HIT rate, p	0.976	0.979
GA₂ Relative width, r	2.425	2.388

Table 2. Daily forecasting skill on the TEST period (2020–2024) by lead time (D+1 … D+10).

Lead Time (Days)	NSE	KGE	Pearson r	MAE (m³/s)	RMSE (m³/s)	Bias (m³/s)	q (m³/s)	Coverage
1	0.881	0.832	0.944	9.04	18.42	−1.12	14.80	0.842
2	0.749	0.737	0.872	13.92	26.74	−1.46	26.24	0.863
3	0.634	0.645	0.802	17.50	32.29	−1.49	33.86	0.871
4	0.542	0.582	0.740	20.28	36.09	−1.34	40.51	0.872
5	0.481	0.564	0.696	22.45	38.00	−1.22	42.75	0.872
6	0.457	0.531	0.677	23.42	39.00	−0.73	45.02	0.871
7	0.440	0.491	0.665	23.83	39.99	−0.98	47.83	0.873
8	0.415	0.467	0.646	24.40	41.16	−1.40	48.30	0.865
9	0.401	0.454	0.635	24.89	41.88	−1.38	49.38	0.861
10	0.394	0.448	0.630	25.12	42.23	−1.76	49.87	0.866
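
The q and Coverage columns of Table 2 can be read against a minimal sketch of split conformal prediction, given below under stated assumptions: a symmetric absolute-residual score, a nominal miscoverage of alpha = 0.10 (matching the ~90% intervals mentioned for Figure 4), and one quantile q estimated separately per lead time on a held-out calibration set. Whether the q column is exactly this quantity is an assumption, and the function names are hypothetical rather than the authors' code.

```python
import math


def conformal_quantile(cal_obs, cal_pred, alpha=0.10):
    """Finite-sample (1 - alpha) quantile of absolute calibration residuals."""
    scores = sorted(abs(o - p) for o, p in zip(cal_obs, cal_pred))
    n = len(scores)
    # Rank ceil((n + 1) * (1 - alpha)); the small offset guards against
    # floating-point round-up when the product is an exact integer.
    k = math.ceil((n + 1) * (1 - alpha) - 1e-9)
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return scores[k - 1]


def conformal_bands(pred, q):
    """Symmetric interval [pred - q, pred + q] around each point forecast."""
    return [(p - q, p + q) for p in pred]


def empirical_coverage(obs, bands):
    """Fraction of observations that fall inside their interval."""
    hits = sum(lo <= o <= hi for o, (lo, hi) in zip(obs, bands))
    return hits / len(obs)


# Synthetic example for a single lead time (values are not study data).
cal_obs = [float(v) for v in range(1, 20)]  # 19 calibration observations
cal_pred = [0.0] * 19                       # residuals are 1, 2, ..., 19
q = conformal_quantile(cal_obs, cal_pred, alpha=0.10)
bands = conformal_bands([12.0, 20.0, 40.0], q=5.0)
print(q, empirical_coverage([10.0, 20.0, 30.0], bands))
```

The marginal coverage guarantee of split conformal prediction relies on exchangeability between calibration and test samples, which streamflow series only approximately satisfy; empirical coverage on a later test period can therefore differ from the nominal level.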

Share and Cite

MDPI and ACS Style

França, G.B.; Almeida, V.A.d.; Senna, M.C.A.; Souza, E.P.d.; Silva, M.T.; Aranha, T.R.B.T.; Silva, M.S.d.; Araujo, A.A.M.d.; Melo, G.T.S.d.; Almeida, M.V.d.; et al. Hybrid Streamflow Forecasting with ERA5 and Machine Learning Across Daily and Monthly Time Scales. Water 2026, 18, 1337. https://doi.org/10.3390/w18111337
