Article

Deep Learning-Based Daily Streamflow Prediction Model for the Hanjiang River Basin

1
School of Civil Engineering, Sun Yat-sen University, Guangzhou 510275, China
2
School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
3
Guangdong Provincial Key Laboratory for Marine Civil Engineering, Sun Yat-sen University, Guangzhou 510275, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Hydrology 2025, 12(7), 168; https://doi.org/10.3390/hydrology12070168
Submission received: 25 May 2025 / Revised: 20 June 2025 / Accepted: 24 June 2025 / Published: 27 June 2025

Abstract

The sharp decline in streamflow prediction accuracy with increasing lead times remains a persistent challenge for effective water resources management and flood mitigation. In this study, we developed a coupled deep learning model for daily streamflow prediction in the Hanjiang River Basin, China. The proposed model integrates self-attention (SA), a one-dimensional convolutional neural network (1D-CNN), and bidirectional long short-term memory (BiLSTM). The model’s effectiveness was assessed during flood events, and its predictive uncertainty was quantified using kernel density estimation (KDE). The results demonstrate that the proposed model consistently outperforms baseline models across all lead times. It achieved Nash-Sutcliffe Efficiency (NSE) scores of 0.92, 0.86, and 0.79 at lead times of 1, 3, and 5 days, respectively, with its advantage most evident at the longer lead times. During major flood events, the model demonstrated an enhanced capacity to capture peak magnitudes and timings. It achieved the highest NSE values of 0.924, 0.862, and 0.797 for the 1-, 3-, and 5-day forecasting horizons, respectively, thereby showcasing the strengths of integrating CNN and SA mechanisms for recognizing local hydrological patterns. Furthermore, KDE-based uncertainty analysis yielded high prediction interval coverage across forecast horizons together with relatively narrow prediction intervals, indicating the strong robustness of the proposed model. Overall, the proposed SA-CNN-BiLSTM model demonstrates significantly improved accuracy, especially for extended lead times and flood events, and provides robust uncertainty quantification, thereby offering a more reliable tool for reservoir operation and flood risk management.

1. Introduction

Accurate and timely streamflow prediction is fundamental to sustainable water resource management, underpinning critical applications ranging from short-term flood control and emergency response to mid- and long-term drought mitigation, hydropower optimization, and infrastructure planning [1,2,3]. Consequently, enhancing the reliability of streamflow forecasts remains a cornerstone of hydrological research. Over recent decades, methodologies for streamflow simulation and prediction have broadly diverged into two main paradigms: physical process-based models (PBMs) and data-driven models [4,5].
PBMs such as the Soil and Water Assessment Tool (SWAT) [6] and the Variable Infiltration Capacity (VIC) [7] model employ extensive mathematical equations to describe physical processes, allowing a clear explanation of model behavior and providing strong physical interpretability [8]. However, these models require comprehensive knowledge—including physical, biological, and socioeconomic aspects—to properly define model structures, and any deficiencies in this information can amplify uncertainty and error propagation [9]. Additionally, the increasing complexity of streamflow generation mechanisms—driven by rapid urbanization, land-use alterations, and climate change—has further constrained the practical application of PBMs in certain contexts [10,11,12]. In contrast, the emergence of data-driven approaches, particularly deep learning (DL), has revolutionized hydrological modeling by effectively capturing nonlinear relationships in complex environmental systems without explicit physical assumptions [13,14]. Among these, Long Short-Term Memory (LSTM) networks [15,16], convolutional neural networks (CNNs) [17], and transformer-based models [18] have demonstrated superior performance in handling hydrological data. Unlike traditional statistical models, DL methods excel at learning intricate nonlinear patterns from observational data and offer an alternative approach for streamflow prediction, though they typically lack the ability to provide physical information about hydrological processes [19].
Despite the successes of DL in hydrological modeling, several persistent challenges impede their broader operational adoption and reliability. Firstly, a significant concern is the degradation of predictive accuracy over longer forecasting horizons (i.e., increasing lead times), largely attributable to the accumulation of errors [20]. This issue is often exacerbated during extreme events, such as floods, where models may underperform due to imbalanced data distribution and insufficient learning of extreme hydrological dynamics [21]. While techniques like LSTM aim to capture temporal dependencies and mitigate error propagation, and hybrid models (e.g., CNN-LSTM [22,23], SA-BiLSTM [24]) show promise by combining architectural strengths for enhanced feature extraction, maintaining robust performance at extended lead times remains a key objective. Secondly, the “black-box” nature of many DL models limits their interpretability, hindering the understanding of how model decisions are made and which hydrological drivers are most influential [25]. Although post-hoc explanation methods such as Shapley Additive Explanations (SHAP) [26,27] and Local Interpretable Model-agnostic Explanations (LIME) [28] have been applied to uncover mechanisms captured by deep learning models in hydrology, a systematic understanding of basin-specific streamflow drivers across different temporal scales remains underdeveloped. Thirdly, the majority of DL-based streamflow studies focus on deterministic point predictions, often neglecting the crucial aspect of uncertainty quantification [29]. Reliable prediction intervals are essential for risk-informed decision-making, yet methods like kernel density estimation (KDE) for constructing these intervals face challenges, notably the critical selection of optimal bandwidth, which is rarely addressed systematically in hydrological contexts.
To address the critical challenges of maintaining predictive accuracy at extended lead times and providing reliable uncertainty quantification, this study proposed an integrated deep learning framework. This framework uniquely combined advanced architectural designs and adaptive uncertainty methods for robust daily streamflow prediction in the Hanjiang River Basin, China. The main contributions of this study are as follows: (1) We propose a novel hybrid architecture, SA-CNN-BiLSTM, which synergistically combines Self-Attention (SA), a 1D Convolutional Neural Network (1D-CNN), and a Bidirectional Long Short-Term Memory (BiLSTM) network. This design aims to enhance multiscale feature extraction from hydrometeorological time series and improve predictive accuracy, particularly over extended forecasting horizons, by effectively capturing both local patterns and long-range dependencies. (2) We implement a robust uncertainty quantification approach based on KDE with an adaptive bandwidth selection strategy, aiming to generate reliable and informative prediction intervals. The efficacy of the proposed model is rigorously evaluated against several baseline models, with a specific focus on its performance during flood events and its ability to provide well-calibrated uncertainty estimates.
The remainder of this paper is organized as follows: Section 2 describes the study area, dataset, the architecture of the proposed SA-CNN-BiLSTM model, and evaluation methodologies used in this study. Section 3 presents the data analysis and discussion, while Section 4 concludes this study.

2. Methodology

2.1. Study Area

The Hanjiang River originates from Shangfeng in Zijin County, Guangdong Province, and is the second largest river basin in Guangdong outside the Pearl River Basin. The upper reaches of the Hanjiang River are formed by the confluence of the Meijiang River and the Ting River, after which the main stream of the Hanjiang River flows into the Hanjiang River delta river network before ultimately discharging into the South China Sea. The main stream of the Hanjiang River is 470 km long and drains a total area of approximately 30,100 km². The basin is distributed across three provinces: Guangdong (59.4%), Fujian (40.1%), and Jiangxi (0.5%).
The Hanjiang River Basin, geographically located in eastern Guangdong and southwestern Fujian, is situated within 115.22–117.15° E longitude and 23.28–26.08° N latitude (Figure 1). The region experiences a subtropical monsoon climate, characterized by mild temperatures, abundant precipitation, and high vegetation coverage. The multi-year mean temperature ranges from 20 °C to 21.5 °C, and the mean annual precipitation is approximately 1620 mm. However, influenced by the topography, precipitation exhibits marked spatial variability and uneven seasonal distribution. This variability leads to substantial streamflow fluctuations between wet and dry seasons, increasing the risk of flood events and posing a significant challenge to water resource management. Currently, the water management infrastructure in the Hanjiang River Basin includes four major reservoirs: the Cotton Beach Reservoir, with a capacity of approximately 1 billion m³, and three reservoirs on Meijiang River tributaries with a combined capacity of approximately 200 million m³. However, the effective operation and dispatching of these reservoirs depend on the availability of accurate runoff forecasts.

2.2. Dataset

Daily streamflow data were sourced from the Hanjiang River Basin Management Bureau, encompassing three hydrological stations: Chaoan, Hengshan, and Xikou. Basic information on these stations is summarized in Table 1. These hydrological records span from 2001 to 2010, providing a reliable foundation for model development due to their high data quality and temporal continuity. Meteorological data were obtained from the China Meteorological Data Network (CMDN) daily dataset V3 [30], which encompasses observations from 699 national benchmark and basic stations from 1951 to 2010. Finally, six meteorological stations within the basin—Changting, Shanghang, Yongding, Dabu, Meixian, and Wuhua—were selected (Table 2). To address missing values, the cubic spline interpolation method was applied. Soil water content data were retrieved from the ERA5-Land reanalysis dataset provided by the European Centre for Medium-Range Weather Forecasts (ECMWF). This dataset provides hourly estimations for four soil layers: 0–7 cm, 7–28 cm, 28–100 cm, and 100–289 cm at a spatial resolution of 0.1°. Subsequently, catchment-level averages for the upstream hydrometeorological variables were derived using the Thiessen polygon method.
The final dataset compiled for the model comprises the target streamflow at the Chaoan station, 21 hydrometeorological factors, and upstream streamflow (detailed in Table 3). In total, this dataset consists of 3532 daily records, covering the period from 1 May 2001 to 31 December 2010.
To improve model efficiency and reduce multicollinearity, feature selection was conducted using the Maximal Information Coefficient (MIC) method [31]. This process yielded nine predictors drawn from seven variable groups: temperature, air pressure, precipitation, evapotranspiration, relative humidity, soil water content, and upstream streamflow. These variables, detailed in Table 4, demonstrated strong associations with the Chaoan station streamflow; precipitation, soil water content, and upstream streamflow exhibited particularly high MIC values, underscoring their predictive relevance. Further details regarding the MIC calculation and selection process are provided in the Supplementary Materials.

2.3. Deep Learning Models

In this study, the baseline models were selected from commonly used streamflow prediction models, including Multilayer Perceptron (MLP), one-dimensional CNN (1D-CNN, hereinafter referred to as CNN), gated recurrent unit (GRU), BiLSTM, and SA mechanisms.

2.3.1. MLP

MLP is a classic feed-forward neural network architecture composed of an input layer, one or more hidden layers with nonlinear activation functions, and an output layer [32,33]. Each layer comprises interconnected neurons through which data propagate unidirectionally, from input to output. The network is trained using the backpropagation algorithm and is well-suited for capturing complex nonlinear relationships in data. In hydrology, MLPs have been extensively employed for daily streamflow forecasting and have frequently demonstrated superior performance compared to traditional statistical models [34,35,36]. In this study, an MLP with several hidden layers was implemented to model the nonlinear relationships between hydroclimatic inputs (e.g., precipitation and temperature) and daily streamflow.

2.3.2. CNN

Unlike conventional two-dimensional convolutional neural networks designed for image processing, the 1D-CNNs used in this study are capable of processing one-dimensional vector input, enabling efficient feature extraction and temporal pattern extraction. Due to their simplified structures and fewer parameters, 1D-CNNs can employ larger convolutional kernels to achieve broader receptive fields while maintaining a relatively small number of network parameters [37]. Specifically, 1D-CNNs have been demonstrated as effective models for daily runoff prediction [38]. The mathematical formulation of the 1D-CNN is given by the following:
$$y_i = f\left(\sum_{m=1}^{k} w_m x_{i+m} + b\right)$$
where $y_i$ denotes the $i$-th element of the output sequence; $x_{i+m}$ is the corresponding input element; $f$ is the activation function; $w_m$ is the $m$-th weight in the convolution kernel of size $k$; and $b$ is the bias term. Additionally, multiple convolution kernels can be used to extract a variety of features from the input sequence.
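As a minimal sketch of the convolution formula above, the following plain-Python function slides a kernel over a short series. The input values, the kernel, and the choice of ReLU as $f$ are illustrative assumptions, and the code uses zero-based indices, so `w[m]` pairs with `x[i + m]` for `m = 0, ..., k - 1`.

```python
def relu(z):
    """Rectified Linear Unit, used here as the activation f."""
    return max(0.0, z)


def conv1d(x, w, b=0.0, activation=relu):
    """Valid (unpadded) 1D convolution: y[i] = f(sum_m w[m] * x[i + m] + b)."""
    k = len(w)
    return [
        activation(sum(w[m] * x[i + m] for m in range(k)) + b)
        for i in range(len(x) - k + 1)
    ]


# Hypothetical input series and a difference-like kernel of size k = 3.
x = [1.0, 2.0, 4.0, 3.0, 1.0]
w = [-1.0, 0.0, 1.0]
y = conv1d(x, w)
print(y)  # [3.0, 1.0, 0.0]
```

The last output is zero because the raw response (−3) is clipped by ReLU; with several kernels, each would produce its own output sequence of this kind.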

2.3.3. GRU and BiLSTM

Traditional neural networks struggle to capture long-term dependencies within sequential data. To address this, the LSTM network was developed, offering enhanced capabilities for modeling such temporal relationships [39]. Furthermore, analyzing both forward and backward temporal patterns within sequence data is an effective strategy for enhancing model performance in time series prediction. Unlike a conventional LSTM, a BiLSTM comprises distinct forward and backward LSTMs, so that sequence information flows in both directions and temporal dependencies within the data are captured more comprehensively. Therefore, BiLSTM was used to simulate streamflow in this study. Compared with LSTM, GRU is faster to train and less prone to overfitting due to its simpler structure with fewer parameters, but it may be less effective than LSTM at capturing long-term dependencies in complex sequences.

2.3.4. SA

The attention mechanism, inspired by the human cognitive ability for selective focus, assigns differential weights to input data, thereby prioritizing more salient information. In time series forecasting, such mechanisms have led to significant improvements in both model performance and generalization capabilities [40]. Conventional attention mechanisms typically operate on intermediate hidden states or outputs when applied within neural networks, which may limit their capacity to capture global contextual information effectively. In contrast, the SA mechanism directly processes the entire input sequence by calculating pairwise importance scores between all elements, thereby effectively capturing long-term dependencies without relying on sequential processing. In the self-attention framework, the input vector X is projected into three representations: queries (Q), keys (K), and values (V), using shared weight matrices Wq, Wk, and Wv, respectively. The process of SA can be summarized as follows:
$$Q = XW_q$$
$$K = XW_k$$
$$V = XW_v$$
$$\mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
$$\mathrm{Output} = \mathrm{Attention}(Q, K, V)$$
where $d_k$ is the dimensionality of the key vectors, used for scaling to stabilize gradients. This formulation allows the model to capture complex dependencies and dynamically adjust the importance of different time steps in the input sequence.
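The scaled dot-product form of self-attention can be illustrated with a small plain-Python sketch. The three-step input and the identity projection matrices below are hypothetical placeholders chosen so the result is easy to inspect; in a trained model the projections are learned.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Return (output, weights) for Softmax(Q K^T / sqrt(d_k)) V."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Hypothetical input: 3 time steps, 2 features; identity projections.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to one and indicates how strongly one time step attends to every other step, which is the quantity the text describes as the pairwise importance score.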

2.3.5. Coupled Model

As the forecasting horizon extends, the predictive performance of standalone models may be insufficient for practical application. To address this limitation, a hybrid SA-CNN-BiLSTM model was proposed for multi-step daily streamflow prediction. Within this hybrid architecture, the 1D-CNN is first employed to extract local patterns from the time series input. These patterns are then processed by the SA to assess the relative importance of different time steps for the prediction task. Finally, the BiLSTM captures long-term temporal dependencies from the SA output. The output from the BiLSTM layer is subsequently fed into a fully connected layer, mapping the learned representations to the final multi-step streamflow predictions.

2.3.6. Training and Hyperparameter Optimization

In this study, the dataset was divided into training, validation, and test sets sequentially by time at a ratio of 7:1:2, prior to Z-score normalization. The input sequence length for the BiLSTM model was fixed at 10 steps, while the output sequence length corresponded to the selected prediction horizons. Each model was optimized using the AdamW optimizer with a mean-squared error (MSE) loss function, and the Rectified Linear Unit (ReLU) was employed as the activation function. AdamW [41] is a variant of the Adam optimization algorithm that decouples weight decay from the gradient-based parameter update, which has been reported to improve training stability and convergence [42]. The initial learning rate was set to 0.001, and the number of training epochs was set to 50 to ensure sufficient model convergence. Additionally, L1 regularization was incorporated into the loss function to mitigate overfitting.
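The data preparation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the series is a synthetic placeholder, the function names are hypothetical, and the normalization statistics are taken from the training split only, which is a common convention that the text does not explicitly confirm.

```python
import statistics


def chronological_split(series, ratios=(7, 1, 2)):
    """Split a series in time order without shuffling (default 7:1:2)."""
    n, total = len(series), sum(ratios)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    return (series[:n_train],
            series[n_train:n_train + n_val],
            series[n_train + n_val:])


def zscore(values, mean, std):
    return [(v - mean) / std for v in values]


def make_windows(values, input_len=10, horizon=1):
    """Pair each input window with the next `horizon` values as targets."""
    pairs = []
    for start in range(len(values) - input_len - horizon + 1):
        window = values[start:start + input_len]
        target = values[start + input_len:start + input_len + horizon]
        pairs.append((window, target))
    return pairs


# Synthetic placeholder series standing in for daily streamflow.
series = [float(i % 30) for i in range(100)]
train, val, test = chronological_split(series)

# Assumption: mean and standard deviation come from the training split only.
mu, sigma = statistics.mean(train), statistics.pstdev(train)
train_z = zscore(train, mu, sigma)
windows = make_windows(train_z, input_len=10, horizon=3)
```

Splitting by time rather than at random keeps future observations out of the training set, which matters for an honest estimate of forecasting skill.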
To obtain the optimal model, Bayesian optimization (BO) was employed to search for the best hyperparameter combination according to predefined ranges [43]. Further details on the BO method and hyperparameter search space are provided in the Supplementary Materials. Specifically, this study utilized a BO implementation known as Optuna [44], an open-source automated hyperparameter optimization framework based on the Tree-structured Parzen Estimator (TPE) algorithm. TPE efficiently explores the hyperparameter space by modeling conditional probabilities. Additionally, Optuna incorporates an asynchronous successive halving algorithm for pruning, which terminates unpromising trials early, thereby focusing computational resources on more promising hyperparameter combinations [44]. The optimization process aimed to maximize the Nash-Sutcliffe Efficiency (NSE) on the validation set and consisted of 500 trials, with pruning initiated after the first 10 trials.

2.4. Model Evaluation Method

In this study, model performance on the test set was evaluated using a combination of three commonly applied metrics: NSE, root-mean-square error (RMSE), and mean-absolute error (MAE). NSE measures the overall consistency between model predictions and observations, with values closer to 1 indicating a higher accuracy. RMSE and MAE quantify the average degree of discrepancy between predicted and observed values. While NSE is bounded above by 1, RMSE and MAE are unbounded but provide intuitive measures of error magnitude; lower values indicate better model performance. Furthermore, RMSE is more sensitive to large errors due to its squared term, whereas MAE provides a linear measure of average absolute error. In addition to these metrics, the Diebold-Mariano (DM) test was employed to determine whether the performance differences between models are statistically significant. The DM test is a statistical hypothesis test designed to compare the forecasting performance of two time series forecasting models, described by Diebold and Mariano (2002) [45]. Additional details regarding the formulas for metrics and the DM test are provided in the Supplementary Materials.
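For reference, the three metrics can be written in a few lines using their standard textbook definitions. The observed and simulated values below are hypothetical numbers chosen so the results are easy to verify by hand.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of obs about its mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def rmse(obs, sim):
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


# Hypothetical observed and simulated streamflow.
obs = [100.0, 200.0, 300.0, 400.0]
sim = [110.0, 190.0, 310.0, 390.0]
print(nse(obs, sim), rmse(obs, sim), mae(obs, sim))  # about 0.992, 10.0, 10.0
```

Because every error in this toy case has the same magnitude, RMSE and MAE coincide; a single large error would raise RMSE more than MAE.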

2.5. Flood Event Recognition

Accurate prediction of peak streamflow is crucial for effective flood forecasting and mitigation. To assess the proposed model’s effectiveness in flood prediction, this study employed the Peak Over Threshold (POT) method to identify flood events during the test period. The model’s performance was then specifically assessed on these sequences. Compared with the widely used Annual Maximum series approach [46], the POT method is better suited to the relatively short test period, as it allows multiple flood events to be identified within a given timeframe. Additionally, it should be noted that, in this context, the term “flood” refers to peak streamflow events identified by the POT method and does not necessarily correspond to events causing catastrophic or destructive impacts in a socio-economic sense.
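A simple way to picture POT-based event identification is sketched below. The threshold, the `min_gap` separation rule, and the flow values are hypothetical; the study does not report which threshold or independence criterion it used, so this only illustrates the general idea.

```python
def peaks_over_threshold(flow, threshold, min_gap=1):
    """Group threshold exceedances into events and return each event's peak.

    Exceedances separated by more than `min_gap` time steps are treated as
    belonging to different events (an assumed independence rule).
    """
    events, current = [], []
    for i, q in enumerate(flow):
        if q > threshold:
            if current and i - current[-1] > min_gap:
                events.append(current)
                current = []
            current.append(i)
    if current:
        events.append(current)
    # For each event keep the (index, value) pair with the largest flow.
    return [max(((i, flow[i]) for i in ev), key=lambda p: p[1])
            for ev in events]


# Hypothetical daily flows (m^3/s) and threshold.
flow = [100, 500, 900, 700, 200, 150, 800, 1200, 600, 100]
peaks = peaks_over_threshold(flow, threshold=450)
print(peaks)  # [(2, 900), (7, 1200)]
```

Unlike an annual-maximum series, which would keep a single value per year, this approach can return several peaks from the same year.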

2.6. Prediction Interval Estimation

Methods for estimating probability density functions (PDFs) are broadly classified into parametric and non-parametric. Among these methods, KDE is a widely used non-parametric technique. A key advantage of non-parametric approaches is their suitability when the underlying data distribution is unknown, as they avoid potential biases stemming from incorrect distributional assumptions [47]. In this study, KDE was utilized to construct prediction intervals (PIs) for streamflow forecast errors at designated confidence levels with a Gaussian kernel function. A critical parameter in KDE is the bandwidth, which influences the smoothness of the density estimate. Optimal bandwidths for each horizon were determined experimentally using the training set, yielding values of 13.565 (1-day), 15.272 (3-day), and 16.686 (5-day). Furthermore, the Prediction Interval Coverage Probability (PICP) and the Prediction Interval Normalized Average Width (PINAW) metrics were then used to quantitatively evaluate the prediction interval. More details can be found in the Supplementary Materials.
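The KDE-based interval construction can be sketched in plain Python. With a Gaussian kernel, the estimated error CDF is the average of normal CDFs centred on each training error, and the interval bounds are quantiles of that CDF. The error sample is hypothetical, the bandwidth reuses the 1-day value quoted above, and the use of a central (equal-tailed) interval is an assumption, since the text does not specify how the tails were allocated.

```python
import math


def kde_cdf(e, errors, bandwidth):
    """CDF of a Gaussian-kernel density estimate evaluated at e."""
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((e - ei) / scale))
               for ei in errors) / len(errors)


def kde_quantile(p, errors, bandwidth, tol=1e-6):
    """Invert the KDE CDF by bisection."""
    lo = min(errors) - 10.0 * bandwidth
    hi = max(errors) + 10.0 * bandwidth
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, errors, bandwidth) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def error_interval(errors, bandwidth, level=0.90):
    """Central interval of the error distribution (assumed equal tails)."""
    alpha = 1.0 - level
    return (kde_quantile(alpha / 2.0, errors, bandwidth),
            kde_quantile(1.0 - alpha / 2.0, errors, bandwidth))


# Hypothetical forecast errors (observed minus predicted, m^3/s).
errors = [-300.0, -120.0, -40.0, -10.0, 0.0, 10.0, 40.0, 120.0, 300.0]
bandwidth = 13.565  # 1-day bandwidth reported in the text
lower_err, upper_err = error_interval(errors, bandwidth, level=0.90)
```

Under the sign convention assumed here, a prediction interval for a forecast `q_hat` would be `[q_hat + lower_err, q_hat + upper_err]`.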

2.7. Shapley Additive Explanations

SHAP is a model-agnostic interpretability framework proposed by Lundberg and Lee [26] in 2017, grounded in cooperative game theory. It provides a consistent and theoretically sound method for quantifying the contribution of each input feature to the model’s predictions. In SHAP, each feature is treated as a “player” in a game, and its contribution is measured as its SHAP value, reflecting the change it causes in the predicted output when added to different combinations of features. A positive SHAP value indicates a positive influence on the prediction, while a negative value indicates a negative effect. Furthermore, the absolute magnitude of a feature’s SHAP value reflects its importance in the model, with larger values indicating greater influence [48].
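The game-theoretic idea behind SHAP can be made concrete by computing exact Shapley values for a tiny model through full enumeration of feature subsets. This is not the SHAP library, which relies on approximations for realistic models; the toy linear model, the feature values, and the baseline-substitution rule for "absent" features are all illustrative assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values; absent features are set to baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical linear model of upstream flow, precipitation, soil water."""
    upstream, precip, soil = z
    return 0.8 * upstream + 0.5 * precip + 0.1 * soil


x = [500.0, 20.0, 30.0]        # instance to explain (hypothetical)
baseline = [300.0, 5.0, 25.0]  # reference input (hypothetical)
phi = shapley_values(toy_model, x, baseline)
print(phi)  # approximately [160.0, 7.5, 0.5]
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the largest absolute value marks the most influential feature, here the upstream flow.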

3. Results and Discussion

3.1. Model Performance Evaluation

Figure 2, Figure 3 and Figure 4 illustrate the predictive performance of different models at the Chaoan station over 1-, 3-, and 5-day forecasting horizons, respectively. At the 1-day lead time, most models achieved an NSE higher than 0.7, with MAE below 200 m³ s⁻¹ and RMSE below 300 m³ s⁻¹. The integration of advanced components such as BiLSTM, SA, and CNN consistently enhanced model performance. Among all models, the SA-CNN-BiLSTM model demonstrated superior performance, achieving the highest NSE of 0.92, with the lowest MAE and RMSE.
As the forecasting horizon increased, all models experienced a noticeable decline in performance. Specifically, benchmark models exhibited NSE reductions ranging from 6.82% to 20.77% when the lead time increased from 1 day to 3 days, while the SA-CNN-BiLSTM model showed a comparatively smaller decrease of 6.52%. Similarly, when the lead time increased from 3 days to 5 days, NSE for benchmarks decreased by 4.88% to 10.91%, compared to a more modest decline of 4.65% for the proposed model. These results indicate that incorporating either SA or CNN can efficiently improve the ability to maintain temporal information, thereby mitigating the performance degradation of BiLSTM over longer forecasting horizons. Moreover, the combination of SA and 1D-CNN yielded additional gains in predictive accuracy.
To statistically validate these performance differences, the DM test was employed. As shown in Table 5, the SA-CNN-BiLSTM model significantly outperformed all baseline models at all forecast horizons (1-, 3-, and 5-day), with p-values < 0.05 indicating statistical significance. Positive DM values further confirm that the proposed model consistently yielded more accurate predictions than its counterparts.
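A simplified version of the DM statistic is sketched below. It assumes a squared-error loss, a sign convention in which positive values favour the proposed model, and no autocovariance correction in the variance, which is appropriate only for one-step forecasts. The exact formulation used in the study is given in its Supplementary Materials and may differ, and the error series here are hypothetical.

```python
import math


def diebold_mariano(err_baseline, err_proposed):
    """Simplified DM test on squared-error loss differentials.

    Returns (statistic, two-sided p-value from the normal approximation).
    A positive statistic means the baseline has the larger loss.
    """
    d = [eb ** 2 - ep ** 2 for eb, ep in zip(err_baseline, err_proposed)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)
    stat = mean_d / math.sqrt(var_d / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Hypothetical forecast errors for a baseline and the proposed model.
err_baseline = [30.0, -40.0, 25.0, -35.0, 45.0, -20.0, 38.0, -28.0]
err_proposed = [10.0, -12.0, 8.0, -15.0, 11.0, -9.0, 14.0, -7.0]
dm_stat, p_value = diebold_mariano(err_baseline, err_proposed)
```

For multi-step horizons such as 3 or 5 days, forecast errors are typically serially correlated, so the variance term would normally include autocovariances of the loss differential.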

3.2. Flood Event Performance Analysis

Deep learning models such as LSTM are well-suited for general streamflow prediction but often struggle with extreme events, which pose challenges for flood control applications. Therefore, all models were explicitly evaluated during flood events. The four largest flood peaks were identified by the POT method within the test period: 2 June 2010; 13 July 2010; 21 July 2010; and 29 September 2010. These peaks occurred within three separate flood events, with the second event (from approximately 9 July to 25 July 2010) exhibiting a bimodal structure.
Figure 5, Figure 6 and Figure 7 present the streamflow prediction results for all models across the different forecasting horizons, focusing specifically on these three flood events. The results show that all models manage to capture these four peaks at a 1-day lead time. However, as the lead time increased, the accuracy of the standalone models (MLP, GRU, and BiLSTM) declined significantly. The MLP model produced significantly skewed predictions, while GRU and BiLSTM exhibited noticeable discrepancies compared to the observations. In contrast, the coupled models demonstrated superior performance in capturing streamflow dynamics during high-flow periods, even as the forecasting horizon extended to 5 days. Among these models, the SA-CNN-BiLSTM model consistently outperformed the others by effectively integrating the strengths of SA and CNN. Specifically, the SA-CNN-BiLSTM model achieved the highest prediction accuracy during the three flood events, with average NSE values of 0.924, 0.862, and 0.797 for the 1-, 3-, and 5-day forecasting horizons, respectively. These results highlight the model’s robustness in forecasting extreme hydrological events and underscore its potential value for operational flood prediction.

3.3. Interval Prediction of Daily Streamflow for Different Lead Times

The SA-CNN-BiLSTM model’s capability in generating daily streamflow intervals was evaluated across multiple confidence intervals (CIs) (80%, 85%, 90%, 95%) and forecast horizons (1-day, 3-day, 5-day) in Table 6. As expected, both PICP and PINAW increase with higher confidence levels and longer forecast periods. The results consistently highlight the model’s effectiveness, with the key indicator of reliability, PICP, consistently exceeding the corresponding nominal confidence levels (NCLs) in all tested scenarios. Notably, even under the stringent 95% NCL, the model achieved empirical coverage rates of 96.13%, 97.17%, and 96.57% for 1-, 3-, and 5-day forecasts, respectively. This consistent performance (PICP > NCL) suggests that the generated PIs are highly reliable and slightly conservative, ensuring that the observed streamflow is captured within the predicted bounds more often than nominally required. Additionally, PINAW increases with both higher confidence levels (e.g., from 10.33% at 80% CI to 28.24% at 95% CI for a 5-day forecast) and longer forecast horizons (e.g., from 13.58% at 1-day to 18.39% at 5-day under 90% CI), reflecting the expected growth in predictive uncertainty. Importantly, PINAW remains within reasonable bounds, indicating that the intervals are informative without being overly wide. For instance, a 13.58% PINAW for the 90% CI at the 1-day forecast indicates a relatively tight bound, underscoring the model’s ability to balance high reliability with practical precision.
Moreover, at lower confidence levels (e.g., 80%), the model maintains relatively narrow intervals (e.g., 8.54% PINAW for the 1-day lead time) while still achieving reasonable coverage (PICP = 82.71%), suggesting good calibration. The increasing PINAW across the forecast horizons further confirms that the model appropriately represents the growing uncertainty over time. In summary, the model demonstrates well-calibrated and adaptive uncertainty quantification, maintaining a strong trade-off between interval sharpness (PINAW) and reliability (PICP) across different settings.
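The two interval metrics can be computed as in the sketch below, using commonly cited definitions: PICP is the fraction of observations falling inside their intervals, and PINAW is the mean interval width divided by the range of the observations. The study's exact formulas are in its Supplementary Materials. The observed and predicted flows are hypothetical, while the error bounds reuse the 1-day, 90% interval reported in the text.

```python
def picp(obs, lower, upper):
    """Prediction Interval Coverage Probability."""
    inside = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return inside / len(obs)


def pinaw(obs, lower, upper):
    """Prediction Interval Normalized Average Width."""
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(obs)
    return mean_width / (max(obs) - min(obs))


# Hypothetical observed and predicted flows (m^3/s).
obs = [500.0, 1200.0, 3000.0, 800.0, 5000.0]
pred = [550.0, 1000.0, 3400.0, 820.0, 4900.0]

# 90% error bounds for the 1-day horizon, as reported in the text.
lower = [p - 323.22 for p in pred]
upper = [p + 296.14 for p in pred]

coverage = picp(obs, lower, upper)
width = pinaw(obs, lower, upper)
```

In this toy case four of the five observations fall inside their intervals, so `coverage` is 0.8, and `width` is roughly 0.14 because the interval width of about 619 m³/s is small relative to the 4500 m³/s range of the observations.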
Figure 8 illustrates the cumulative distribution functions (CDFs) of the prediction errors for each horizon and highlights the derivation of the 90% confidence interval as an example. The CDFs were calculated by integrating the estimated error densities, and the upper and lower bounds corresponding to the target confidence levels were then determined from them. Specifically, for the 90% confidence level, the derived error intervals were [−323.22, 296.14] m³/s (1-day), [−381.49, 333.33] m³/s (3-day), and [−440.75, 398.34] m³/s (5-day). These results show a broadening of the error range with increasing lead time, which is reflected in the PINAW values. The complete daily streamflow prediction intervals at the 90% confidence level were then reconstructed in Figure 9. The 95% PIs adapt to flow conditions, widening during high-flow and high-variability periods (e.g., mid-2010 flood events) and narrowing during stable, low-flow periods. This heteroscedastic behavior is consistent with hydrological expectations. In summary, the proposed model not only delivers accurate streamflow predictions but also generates reliable and context-aware uncertainty intervals, making it a valuable tool for operational hydrological forecasting and risk-informed water resource management at Chaoan Station.

3.4. Feature Importance

To enhance model interpretability, the SHAP method was employed to quantify the relative importance of each feature for the 1-day forecasting period. Figure 10 reveals that streamflow on the previous day at Hengshan Station (RHS, 1D) and Chaoan Station (RCA, 1D) exerts the most significant influence on streamflow at the Chaoan Station. Other key features included precipitation (DP1D), streamflow at Xikou and Chaoan stations two days prior (RCA, 2D; RXK, 2D), and soil water content (SMC1D). Generally, antecedent streamflow exhibits greater importance than precipitation. Additionally, variables such as minimum daily surface temperature (e.g., ST2D), average daily air pressure (e.g., AP2D), and evapotranspiration (e.g., ET1D) in the previous period also have a measurable influence on streamflow predictions. Notably, variables temporally closer to the prediction time do not necessarily have a stronger influence on the forecast, suggesting that temporal proximity is not the sole determinant of feature importance in the model.

3.5. Research Gaps and Future Work

Despite the promising results, several limitations offer opportunities for further research. First, potential impacts of human activities (e.g., land use changes, reservoir operations) or subsurface hydrological conditions on streamflow are not explicitly modeled in this study. Incorporating broader datasets encompassing these factors may further enhance prediction accuracy and offer more comprehensive insights into predicting streamflow dynamics. Second, the model’s spatial generalization remains limited, as it primarily captures temporal dependencies through a data-driven lens. Integrating spatially explicit architectures, such as graph neural networks or GANs, could better leverage topological information across river networks and meteorological stations. Additionally, hybrid modeling approaches that integrate deep learning components with process-based hydrological models may improve model interpretability and promote physically consistent predictions. Such hybrid frameworks could bridge the gap between data-driven flexibility and process-based transparency, yielding more reliable tools for operational hydrological forecasting.

4. Conclusions

This study established an integrated SA-CNN-BiLSTM framework for daily streamflow prediction, demonstrating significant advancements in both deterministic accuracy and uncertainty quantification in the Hanjiang River Basin. The key findings and conclusions are summarized as follows:
The proposed model consistently achieved superior deterministic prediction performance across the 1-, 3-, and 5-day horizons, outperforming all benchmark models. The DM test confirmed that this improvement was statistically significant at every tested horizon (p < 0.05). The model achieved NSE values of 0.92, 0.86, and 0.79 for 1-, 3-, and 5-day lead times, respectively, and its advantage over the benchmarks was particularly pronounced at the longer horizons. Beyond general performance, the model remained robust during major flood events, achieving average NSE values of 0.924, 0.862, and 0.797 for 1-, 3-, and 5-day forecasts during flood periods, respectively. These results underscore its potential for operational flood prediction and early warning systems, an area where conventional deep learning models often struggle because flood samples are underrepresented in the training data.
Furthermore, the adaptive KDE approach generated reliable and informative prediction intervals. The proposed model achieved a PICP that exceeded the nominal confidence level at every setting tested (e.g., 96.13% at the 95% NCL for the 1-day forecast) while maintaining a relatively narrow PINAW. Additionally, the SHAP-based feature importance analysis enhanced the interpretability of the integrated model. The results revealed that previous-day streamflow at the upstream Hengshan Station and at Chaoan Station itself exerts the strongest influence on the prediction, followed by previous-day precipitation.
In conclusion, the SA-CNN-BiLSTM framework significantly improves the performance of short-to-medium-range streamflow forecasting (up to 5 days), while effectively quantifying predictive uncertainty. These advancements provide valuable tools for hydrological forecasting and decision-making in water resources management, particularly in flood-prone regions.
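For readers who want to reproduce the headline metric, the Nash-Sutcliffe Efficiency can be computed as one minus the ratio of the squared prediction error to the variance of the observations. The sketch below is a minimal illustration; the function name `nse` and the sample values are assumptions made here and are not data from the study.

```python
def nse(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 - SSE / sum of squared deviations from the observed mean."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when all observations are identical")
    return 1.0 - sse / spread


# Made-up daily flows in m3/s, for illustration only.
obs = [100.0, 150.0, 400.0, 250.0, 120.0]
pred = [110.0, 140.0, 380.0, 260.0, 125.0]
score = nse(obs, pred)
```

An NSE of 1 corresponds to a perfect match, and an NSE of 0 means the model is no better than always predicting the observed mean, which is why values such as 0.79 at a 5-day lead time still indicate useful skill.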

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/hydrology12070168/s1, Figure S1: Heat map of maximum information coefficient between different variables; Table S1: Hyperparameter search range for each streamflow prediction model. References [31,45,47,49,50,51,52,53,54,55] are cited in the Supplementary Materials.

Author Contributions

Conceptualization, H.H. and X.C.; funding acquisition, X.C.; methodology, J.H., J.C. and X.C.; software, J.H. and J.C.; validation, J.C.; visualization, H.H.; writing—original draft, J.H. and H.H.; writing—review and editing, X.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (42375165) and the National Key Research and Development Program of China (2023YFF0805501). We thank the National Large Scientific and Technological Infrastructure “Earth System Numerical Simulation Facility” (https://cstr.cn/31134.02.EL) for technical support.

Data Availability Statement

Daily meteorological data were obtained from the China Meteorological Data Service Center (http://data.cma.cn/en, accessed on 22 May 2024). Daily streamflow data were provided by the Hanjiang River Basin Management Bureau through a project collaboration. Soil moisture data were obtained from the European Centre for Medium-Range Weather Forecasts ReAnalysis 5-Land (ERA5-Land) dataset (https://cds.climate.copernicus.eu/datasets/reanalysis-era5-land-timeseries, accessed on 22 May 2024).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Granata, F.; Di Nunno, F. Neuroforecasting of Daily Streamflows in the UK for Short- and Medium-Term Horizons: A Novel Insight. J. Hydrol. 2023, 624, 22.
  2. Hapuarachchi, H.A.P.; Bari, M.A.; Kabir, A.; Hasan, M.M.; Woldemeskel, F.M.; Gamage, N.; Sunter, P.D.; Zhang, X.S.; Robertson, D.E.; Bennett, J.C.; et al. Development of a National 7-Day Ensemble Streamflow Forecasting Service for Australia. Hydrol. Earth Syst. Sci. 2022, 26, 4801–4821.
  3. Matrenin, P.; Safaraliev, M.; Dmitriev, S.; Kokin, S.; Eshchanov, B.; Rusina, A. Adaptive Ensemble Models for Medium-Term Forecasting of Water Inflow When Planning Electricity Generation under Climate Change. Energy Rep. 2022, 8, 439–447.
  4. Zhang, X.; Peng, Y.; Zhang, C.; Wang, B. Are Hybrid Models Integrated with Data Preprocessing Techniques Suitable for Monthly Streamflow Forecasting? Some Experiment Evidences. J. Hydrol. 2015, 530, 137–152.
  5. Kratzert, F.; Klotz, D.; Brenner, C.; Schulz, K.; Herrnegger, M. Rainfall–Runoff Modelling Using Long Short-Term Memory (LSTM) Networks. Hydrol. Earth Syst. Sci. 2018, 22, 6005–6022.
  6. Arnold, J.G.; Srinivasan, R.; Muttiah, R.S.; Williams, J.R. Large Area Hydrologic Modeling and Assessment Part I: Model Development. JAWRA J. Am. Water Resour. Assoc. 1998, 34, 73–89.
  7. Liang, X.; Lettenmaier, D.P.; Wood, E.F.; Burges, S.J. A Simple Hydrologically Based Model of Land Surface Water and Energy Fluxes for General Circulation Models. J. Geophys. Res. Atmos. 1994, 99, 14415–14428.
  8. Fatichi, S.; Vivoni, E.R.; Ogden, F.L.; Ivanov, V.Y.; Mirus, B.; Gochis, D.; Downer, C.W.; Camporese, M.; Davison, J.H.; Ebel, B.; et al. An Overview of Current Applications, Challenges, and Future Trends in Distributed Process-Based Models in Hydrology. J. Hydrol. 2016, 537, 45–60.
  9. Shen, C.P.; Appling, A.P.; Gentine, P.; Bandai, T.; Gupta, H.; Tartakovsky, A.; Baity-Jesi, M.; Fenicia, F.; Kifer, D.; Li, L.; et al. Differentiable Modelling to Unify Machine Learning and Physical Models for Geosciences. Nat. Rev. Earth Environ. 2023, 4, 552–567.
  10. Freire, P.K.D.M.; Santos, C.A.G.; da Silva, G.B.L. Analysis of the Use of Discrete Wavelet Transforms Coupled with ANN for Short-Term Streamflow Forecasting. Appl. Soft Comput. 2019, 80, 494–505.
  11. Dehghani, A.; Moazam, H.M.Z.H.; Mortazavizadeh, F.; Ranjbar, V.; Mirzaei, M.; Mortezavi, S.; Ng, J.L.; Dehghani, A. Comparative Evaluation of LSTM, CNN, and ConvLSTM for Hourly Short-Term Streamflow Forecasting Using Deep Learning Approaches. Ecol. Inform. 2023, 75, 12.
  12. Wagena, M.B.; Goering, D.; Collick, A.S.; Bock, E.; Fuka, D.R.; Buda, A.; Easton, Z.M. Comparison of Short-Term Streamflow Forecasting Using Stochastic Time Series, Neural Networks, Process-Based, and Bayesian Models. Environ. Model. Softw. 2020, 126, 10.
  13. Chen, Y.Q.; Niu, J.; Sun, Y.Q.; Liu, Q.; Li, S.; Li, P.; Sun, L.Q.; Li, Q.L. Study on Streamflow Response to Land Use Change over the Upper Reaches of Zhanghe Reservoir in the Yangtze River Basin. Geosci. Lett. 2020, 7, 12.
  14. Mohammed Ji, B.G. Streamflow Modeling under the Impact of Climate Change. (Case Study of Dabus River Sub-Basin, Ethiopia). Topology 2020, 12, 7.
  15. Williams, A.P.; Livneh, B.; McKinnon, K.A.; Hansen, W.D.; Mankin, J.S.; Cook, B.I.; Smerdon, J.E.; Varuolo-Clarke, A.M.; Bjarke, N.R.; Juang, C.S.; et al. Growing Impact of Wildfire on Western US Water Supply. Proc. Natl. Acad. Sci. USA 2022, 119, 8.
  16. Huang, H.; Feng, G.; Cao, Y.; Feng, G.; Dai, Z.; Tian, P.; Wei, J.; Cai, X. Simulation and Driving Factor Analysis of Satellite-Observed Terrestrial Water Storage Anomaly in the Pearl River Basin Using Deep Learning. Remote Sens. 2023, 15, 3983.
  17. Ahmed, Y.; Al-Faraj, F.; Scholz, M.; Soliman, A. Assessment of Upstream Human Intervention Coupled with Climate Change Impact for a Transboundary River Flow Regime: Nile River Basin. Water Resour. Manag. 2019, 33, 2485–2500.
  18. Yin, H.; Guo, Z.; Zhang, X.; Chen, J.; Zhang, Y. RR-Former: Rainfall-Runoff Modeling Based on Transformer. J. Hydrol. 2022, 609, 127781.
  19. Awchi, T.A. River Discharges Forecasting in Northern Iraq Using Different ANN Techniques. Water Resour. Manag. 2014, 28, 801–814.
  20. Fidal, J.; Kjeldsen, T.R. Accounting for Soil Moisture in Rainfall-Runoff Modelling of Urban Areas. J. Hydrol. 2020, 589, 125122.
  21. Malakoutian, M.M.A.; Samaei, S.Y.; Khaksar, M.; Malakoutian, Y. A Prediction of Future Flows of Ephemeral Rivers by Using Stochastic Modeling (AR Autoregressive Modeling). Sustain. Oper. Comput. 2022, 3, 330–335.
  22. Li, P.; Zhang, J.; Krebs, P. Prediction of Flow Based on a CNN-LSTM Combined Deep Learning Approach. Water 2022, 14, 993.
  23. Ghimire, S.; Yaseen, Z.M.; Farooque, A.A.; Deo, R.C.; Zhang, J.; Tao, X. Streamflow Prediction Using an Integrated Methodology Based on Convolutional Neural Network and Long Short-Term Memory Networks. Sci. Rep. 2021, 11, 17497.
  24. Zhou, F.; Chen, Y.; Liu, J. Application of a New Hybrid Deep Learning Model That Considers Temporal and Feature Dependencies in Rainfall–Runoff Simulation. Remote Sens. 2023, 15, 1395.
  25. Ghaith, M.; Siam, A.; Li, Z.; El-Dakhakhni, W. Hybrid Hydrological Data-Driven Approach for Daily Streamflow Forecasting. J. Hydrol. Eng. 2020, 25, 9.
  26. Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 10.
  27. Zhao, Z.; Huang, H.; Wang, J.; Feng, G.; Li, L.; Sun, T.; Li, Y.; Wei, J.; Cai, X. Impacts of the Grain for Green Project on Soil Moisture in the Yellow River Basin, China. Hydrol. Process. 2025, 39, e70112.
  28. Tulio Ribeiro, M.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. arXiv 2016, arXiv:1602.04938.
  29. Tiwari Dk, T.N.R. Geomorphology-Wavelet Based Approach to Rainfall Runoff Modeling for Data Scarce Semi-Arid Regions, Kolar River Catchment, India. J. Eng. Res. 2022, 10, 29–40.
  30. Wu, Z.Y.; Feng, H.H.; He, H.; Zhou, J.H.; Zhang, Y.L. Evaluation of Soil Moisture Climatology and Anomaly Components Derived from ERA5-Land and GLDAS-2.1 in China. Water Resour. Manag. 2021, 35, 629–643.
  31. Reshef, D.N.; Reshef, Y.A.; Finucane, H.K.; Grossman, S.R.; McVean, G.; Turnbaugh, P.J.; Lander, E.S.; Mitzenmacher, M.; Sabeti, P.C. Detecting Novel Associations in Large Data Sets. Science 2011, 334, 1518–1524.
  32. Murtagh, F. Multilayer Perceptrons for Classification and Regression. Neurocomputing 1991, 2, 183–197.
  33. Hasan, M.M.; Nilay, M.S.M.; Jibon, N.H.; Rahman, R.M. LULC Changes to Riverine Flooding: A Case Study on the Jamuna River, Bangladesh Using the Multilayer Perceptron Model. Results Eng. 2023, 18, 101079.
  34. Granata, F.; Di Nunno, F.; Pham, Q.B. A Novel Additive Regression Model for Streamflow Forecasting in German Rivers. Results Eng. 2024, 22, 102104.
  35. Sammen, S.S.; Ehteram, M.; Abba, S.I.; Abdulkadir, R.A.; Ahmed, A.N.; El-Shafie, A. A New Soft Computing Model for Daily Streamflow Forecasting. Stoch. Environ. Res. Risk Assess. 2021, 35, 2479–2491.
  36. Köyceğiz, C.; Büyükyıldız, M. Estimation of Streamflow Using Different Artificial Neural Network Models. Osman. Korkut Ata Üniv. Fen Bilim. Enst. Derg. 2022, 5, 1141–1154.
  37. Wang, K.; Ma, C.; Qiao, Y.; Lu, X.; Hao, W.; Dong, S. A Hybrid Deep Learning Model with 1DCNN-LSTM-Attention Networks for Short-Term Traffic Flow Prediction. Phys. A Stat. Mech. Its Appl. 2021, 583, 126293.
  38. Xie, Y.; Sun, W.; Ren, M.; Chen, S.; Huang, Z.; Pan, X. Stacking Ensemble Learning Models for Daily Runoff Prediction Using 1D and 2D CNNs. Expert Syst. Appl. 2023, 217, 119469.
  39. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
  40. Sathi, K.A.; Hosain, M.K.; Hossain, M.A.; Kouzani, A.Z. Attention-Assisted Hybrid 1D CNN-BiLSTM Model for Predicting Electric Field Induced by Transcranial Magnetic Stimulation Coil. Sci. Rep. 2023, 13, 2494.
  41. Srivastava, R.; Mittal, V. ADAW: Age Decay Accuracy Weighted Ensemble Method for Drifting Data Stream Mining. Intell. Data Anal. 2021, 25, 1131–1152.
  42. Loshchilov, I.; Hutter, F. Decoupled Weight Decay Regularization. arXiv 2017, arXiv:1711.05101.
  43. Frazier, P.I. A Tutorial on Bayesian Optimization. arXiv 2018, arXiv:1807.02811.
  44. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-Generation Hyperparameter Optimization Framework. arXiv 2019, arXiv:1907.10902.
  45. Diebold, F.X.; Mariano, R.S. Comparing Predictive Accuracy (Reprinted). J. Bus. Econ. Stat. 2002, 20, 134–144.
  46. Mangini, W.; Viglione, A.; Hall, J.; Hundecha, Y.; Ceola, S.; Montanari, A.; Rogger, M.; Salinas, J.L.; Borzì, I.; Parajka, J. Detection of Trends in Magnitude and Frequency of Flood Peaks across Europe. Hydrol. Sci. J. 2018, 63, 493–512.
  47. Terrell, G.R.; Scott, D.W. Variable Kernel Density Estimation. Ann. Stat. 1992, 20, 1236–1265.
  48. Aumann, R.J.; Hart, S. Handbook of Game Theory with Economic Applications; Elsevier: Amsterdam, The Netherlands, 1992; Volume 2.
  49. Liu, H.; Motoda, H. Feature Selection for Knowledge Discovery and Data Mining; Springer Science & Business Media: New York, NY, USA, 2012; Volume 454.
  50. Brochu, E.; Cora, V.M.; de Freitas, N. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. arXiv 2010, arXiv:1012.2599.
  51. Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian Optimization of Machine Learning Algorithms. Adv. Neural Inf. Process. Syst. 2012, 25.
  52. Shahriari, B.; Swersky, K.; Wang, Z.Y.; Adams, R.P.; de Freitas, N. Taking the Human out of the Loop: A Review of Bayesian Optimization. Proc. IEEE 2016, 104, 148–175.
  53. Jones, D.R.; Schonlau, M.; Welch, W.J. Efficient Global Optimization of Expensive Black-Box Functions. J. Glob. Optim. 1998, 13, 455–492.
  54. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Routledge: London, UK, 2018.
  55. Abebe, N.A.; Ogden, F.L.; Pradhan, N.R. Sensitivity and Uncertainty Analysis of the Conceptual HBV Rainfall–Runoff Model: Implications for Parameter Estimation. J. Hydrol. 2010, 389, 301–310.
Figure 1. The location and DEM of the Hanjiang River Basin.
Figure 2. Scatter density plots of comparisons between predicted and observed streamflow across different models at a 1-day forecast period: (a–f) represent MLP, GRU, BiLSTM, SA-BiLSTM, CNN-BiLSTM, and SA-CNN-BiLSTM, respectively.
Figure 3. Scatter density plots of comparisons between predicted and observed discharge across different models at a 3-day forecast period: (a–f) represent MLP, GRU, BiLSTM, SA-BiLSTM, CNN-BiLSTM, and SA-CNN-BiLSTM, respectively.
Figure 4. Scatter density plots of comparisons between predicted and observed discharge across different models at a 5-day forecast period: (a–f) represent MLP, GRU, BiLSTM, SA-BiLSTM, CNN-BiLSTM, and SA-CNN-BiLSTM, respectively.
Figure 5. Comparison of streamflow predictions during the test period (a) and three flood events (b–d) between different models under a 1-day lead time. The shaded areas represent the three flood events identified by the POT method, corresponding to subplots (b), (c), and (d), respectively.
Figure 6. Comparison of streamflow predictions during the test period (a) and three flood events (b–d) between different models under a 3-day lead time. The shaded areas represent the three flood events identified by the POT method, corresponding to subplots (b), (c), and (d), respectively.
Figure 7. Comparison of streamflow predictions during the test period (a) and three flood events (b–d) between different models under a 5-day lead time. The shaded areas represent the three flood events identified by the POT method, corresponding to subplots (b), (c), and (d), respectively.
Figure 8. CDF curves of prediction errors under different lead times (α = 90%): (a) 1 day, (b) 3 days, and (c) 5 days.
Figure 9. Daily streamflow prediction interval of the Chaoan Station with forecast periods of 1 d (a), 3 d (b), and 5 d (c) (α = 90%).
Figure 10. Feature importance ranked by SHAP at a 1-day lead time (the number in the variable subscript indicates the lead time).
Table 1. Basic information of major streamflow control stations of Hanjiang River.
Station Name | Station Code | Water Resources Zone IV | Catchment Area (km²) | Mean Annual Streamflow (billion m³)
Chaoan | 81500650 | Lower reaches of the Hanjiang River | 29,077 | 22.580
Hengshan | 81500360 | Meijiang River | 12,624 | 9.698
Xikou | 81503050 | Tingjiang River | 9228 | 8.197
Note: The Hengshan station data exclude the catchment areas of the three reservoirs, namely, Changtan, Yitang, and Heshui.
Table 2. Basic information of meteorological stations in Hanjiang River Basin.
Station Name | Station Code | Station Coordinates
Changting | 58911 | 25.51° N, 116.22° E
Shanghang | 58918 | 25.03° N, 116.25° E
Yongding | 59113 | 24.44° N, 116.43° E
Dabu | 59116 | 24.20° N, 116.42° E
Meixian | 59117 | 24.16° N, 116.06° E
Wuhua | 59303 | 23.56° N, 115.46° E
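Table 3. Streamflow prediction dataset information.
Number | Variable | Unit | Data Source
F1 | Daily precipitation | mm | Dataset of daily values of surface climate data in China (V3.0)
F2 | Average daily relative humidity | % | Dataset of daily values of surface climate data in China (V3.0)
F3 | Daily average surface temperature | °C | Dataset of daily values of surface climate data in China (V3.0)
F4 | Daily maximum surface temperature | °C | Dataset of daily values of surface climate data in China (V3.0)
F5 | Daily minimum surface temperature | °C | Dataset of daily values of surface climate data in China (V3.0)
F6 | Average daily temperature | °C | Dataset of daily values of surface climate data in China (V3.0)
F7 | Daily maximum temperature | °C | Dataset of daily values of surface climate data in China (V3.0)
F8 | Daily lowest temperature | °C | Dataset of daily values of surface climate data in China (V3.0)
F9 | Daily average air pressure | hPa | Dataset of daily values of surface climate data in China (V3.0)
F10 | Daily maximum air pressure | hPa | Dataset of daily values of surface climate data in China (V3.0)
F11 | Daily minimum pressure | hPa | Dataset of daily values of surface climate data in China (V3.0)
F12 | Sunshine hours | h | Dataset of daily values of surface climate data in China (V3.0)
F13 | Average wind speed | m s⁻¹ | Dataset of daily values of surface climate data in China (V3.0)
F14 | Maximum wind speed | m s⁻¹ | Dataset of daily values of surface climate data in China (V3.0)
F15 | Daily evapotranspiration | mm | Dataset of daily values of surface climate data in China (V3.0)
F16 | Soil water content (0–7 cm) | m³ m⁻³ | ERA5-Land
F17 | Soil water content (7–28 cm) | m³ m⁻³ | ERA5-Land
F18 | Soil water content (28–100 cm) | m³ m⁻³ | ERA5-Land
F19 | Soil water content (100–289 cm) | m³ m⁻³ | ERA5-Land
F20 | Average daily streamflow at Hengshan Station | m³ s⁻¹ | Hanjiang River Basin Management Bureau
F21 | Average daily streamflow at Xikou Station | m³ s⁻¹ | Hanjiang River Basin Management Bureau
F22 | Average daily streamflow at Chaoan Station | m³ s⁻¹ | Hanjiang River Basin Management Bureau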
Table 4. Results of feature screening for streamflow prediction models.
Number | Variable | Acronym | MIC
F1 | Daily precipitation | DP | 0.44
F2 | Daily average relative humidity | RH | 0.19
F5 | Daily minimum surface temperature | ST | 0.33
F8 | Daily minimum air temperature | T | 0.32
F9 | Daily average air pressure | AP | 0.31
F15 | Daily evapotranspiration | ET | 0.22
F17 | Soil water content | SMC | 0.48
F20 | Average daily streamflow at Hengshan Station | RHS | 0.51
F21 | Average daily streamflow at Xikou Station | RXK | 0.36
F22 | Average daily streamflow at Chaoan Station | RCA | 1.00
Note: MIC stands for Maximum Information Coefficient between the variable and the average daily streamflow at the Chaoan station.
Table 5. DM test results comparing the proposed model and other benchmark models across three lead times at the Chaoan station.
Forecast Period | Base Model | DM Value | p
1d | MLP | 6.71 | 4.09 × 10⁻¹⁰
1d | GRU | 4.41 | 1.21 × 10⁻⁵
1d | BiLSTM | 3.13 | 1.82 × 10⁻⁴
1d | CNN-BiLSTM | 2.44 | 6.08 × 10⁻³
1d | SA-BiLSTM | 2.59 | 9.71 × 10⁻⁴
3d | MLP | 6.35 | 3.92 × 10⁻¹⁰
3d | GRU | 5.98 | 3.64 × 10⁻⁹
3d | BiLSTM | 4.21 | 2.90 × 10⁻⁵
3d | CNN-BiLSTM | 2.14 | 3.27 × 10⁻³
3d | SA-BiLSTM | 2.46 | 1.43 × 10⁻³
5d | MLP | 5.34 | 1.30 × 10⁻⁷
5d | GRU | 5.16 | 3.33 × 10⁻⁷
5d | BiLSTM | 5.02 | 6.57 × 10⁻⁷
5d | CNN-BiLSTM | 2.18 | 2.95 × 10⁻³
5d | SA-BiLSTM | 3.65 | 2.83 × 10⁻⁴
Note: The significance level is 0.05; a positive DM value indicates that the proposed model outperforms the benchmark model.
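The sign convention in the note above can be made concrete with a small sketch of a Diebold-Mariano statistic. This is an illustration under stated assumptions, not the authors' implementation: it uses squared-error loss, a long-run variance built from autocovariances up to lag `horizon - 1`, and a one-sided normal approximation for the p-value. The source does not state which variant was used, and the function name `dm_test` and the error series are hypothetical.

```python
from math import erfc, sqrt


def dm_test(errors_benchmark, errors_proposed, horizon=1):
    """Diebold-Mariano statistic with squared-error loss.

    The loss differential is benchmark minus proposed, so a positive
    statistic means the proposed model has the smaller loss.
    """
    if len(errors_benchmark) != len(errors_proposed):
        raise ValueError("error series must have equal length")
    d = [eb ** 2 - ep ** 2 for eb, ep in zip(errors_benchmark, errors_proposed)]
    n = len(d)
    mean_d = sum(d) / n

    def autocov(lag):
        return sum((d[t] - mean_d) * (d[t - lag] - mean_d) for t in range(lag, n)) / n

    # Long-run variance of the loss differential (assumed truncation at horizon - 1).
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0:
        raise ValueError("non-positive variance estimate; statistic undefined")
    dm = mean_d / sqrt(long_run_var / n)
    p_one_sided = 0.5 * erfc(dm / sqrt(2.0))  # upper-tail normal probability
    return dm, p_one_sided


# Made-up forecast errors, for illustration only.
bench = [3.0, -2.5, 4.0, -3.5, 2.0, -4.5, 3.0, 2.5]
prop = [1.0, -0.5, 1.5, -1.0, 0.5, -2.0, 1.0, 0.5]
dm_value, p_value = dm_test(bench, prop)
```

Swapping the two error series flips the sign of the statistic, which is why the direction of the loss differential has to be stated alongside any reported DM value.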
Table 6. Performance evaluation of daily streamflow interval predictions.
Confidence Interval | Forecast Period | PICP | PINAW
80% | 1d | 82.71% | 8.54%
80% | 3d | 80.92% | 9.27%
80% | 5d | 84.20% | 10.33%
85% | 1d | 86.29% | 10.51%
85% | 3d | 85.84% | 12.30%
85% | 5d | 88.52% | 13.97%
90% | 1d | 92.55% | 13.58%
90% | 3d | 93.29% | 15.67%
90% | 5d | 93.00% | 18.39%
95% | 1d | 96.13% | 20.25%
95% | 3d | 97.17% | 23.96%
95% | 5d | 96.57% | 28.24%
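The two interval metrics in Table 6 can be sketched as follows, using their commonly cited definitions: PICP is the share of observations that fall inside the prediction interval, and PINAW is the mean interval width divided by the range of the observations. The function names and sample values are assumptions for illustration, and the interval bounds are taken as given rather than derived from a KDE of the prediction errors.

```python
def picp(observed, lower, upper):
    """Prediction interval coverage probability: share of observations inside [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


def pinaw(observed, lower, upper):
    """Prediction interval normalized average width: mean width over the observed range."""
    obs_range = max(observed) - min(observed)
    if obs_range == 0:
        raise ValueError("PINAW is undefined when all observations are identical")
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)
    return mean_width / obs_range


# Made-up flows and interval bounds in m3/s, for illustration only.
obs = [100.0, 150.0, 400.0, 250.0, 120.0]
low = [90.0, 130.0, 300.0, 240.0, 125.0]
high = [115.0, 160.0, 380.0, 280.0, 140.0]
coverage = picp(obs, low, high)
width = pinaw(obs, low, high)
```

The two metrics pull in opposite directions: widening every interval raises PICP but also raises PINAW, so a useful interval forecast is one whose coverage meets the nominal level while the normalized width stays small.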
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
