Dynamic Graph Transformer with Spatio-Temporal Attention for Streamflow Forecasting
Abstract
1. Introduction
- (1) A multi-channel dynamic graph constructor that builds three complementary adjacency matrices (hydrological topology, spatial vector similarity, and statistical correlation patterns) and adaptively fuses them via learnable weights to capture evolving node interactions and guide information propagation between nodes.
- (2) A lightweight Local Temporal Pattern Enhancement (LTPE) module that integrates extended convolutional downsampling with multi-head self-attention to better capture short-term fluctuations and local anomalies while maintaining awareness of global dependencies.
- (3) Implementation and validation through ablation studies and comparative experiments on data from the Delaware River Basin, demonstrating the effectiveness of the dynamic graph strategy and the local–temporal synergy mechanism under diverse hydrological conditions.
2. Materials and Methods
2.1. Methodology
2.1.1. Dynamic Adaptive Spatial Graph Constructor for Multi-Perspective Spatio-Temporal Graph Construction
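A minimal sketch of how three adjacency matrices could be fused with learnable weights is given below. It assumes that the spatial vector similarity channel is a cosine similarity and the statistical correlation channel is a Pearson correlation between node flow series, and that the fusion weights are softmax-normalized scalars. The function names, the toy data, and the softmax choice are illustrative assumptions, not the authors' implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cosine(x, y):
    """Cosine similarity between two equally long vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def similarity_graph(series, metric):
    """Dense adjacency matrix with metric(series[i], series[j]) as entries."""
    n = len(series)
    return [[metric(series[i], series[j]) for j in range(n)] for i in range(n)]

def softmax(logits):
    """Numerically stable softmax, so the fusion weights sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_graphs(adjacencies, logits):
    """Weighted sum of adjacency matrices; the logits stand in for learnable parameters."""
    weights = softmax(logits)
    n = len(adjacencies[0])
    return [
        [sum(w * adj[i][j] for w, adj in zip(weights, adjacencies)) for j in range(n)]
        for i in range(n)
    ]

# Toy example with three nodes (all values are hypothetical).
flows = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.1, 5.9, 8.0], [4.0, 3.0, 2.5, 1.0]]
static = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]  # hypothetical upstream-to-downstream links
fused = fuse_graphs(
    [static, similarity_graph(flows, cosine), similarity_graph(flows, pearson)],
    logits=[0.0, 0.0, 0.0],  # equal logits give a plain average of the three graphs
)
```

In a trained model the logits would be updated by gradient descent together with the other parameters; here they are fixed to keep the sketch dependency-free.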
2.1.2. Dual-Stream Enhanced Temporal Predictor for Enhanced Global and Local Feature Extraction in Temporal Forecasting
1. Multi-scale convolutional downsampling:
2. Local attention enhancement:
3. Temporal dimension recovery:
2.2. Study Area
2.3. Data Sources and Preprocessing
- Data download and preliminary screening: Water-level and flow observations were downloaded for the gauges at a 15 min resolution, and gauges with more than 5% missing data were excluded.
- Water-level time-series gap filling: Linear interpolation was applied to reconstruct isolated or small-scale gaps in the water-level records.
- Streamflow data supplement: For periods with complete water-level data but missing flow measurements, a multi-year comprehensive stage–discharge relationship model was applied, using the recorded water level to calculate the corresponding missing streamflow.
- Time-scale aggregation: After the above steps, all streamflow data (original 15 min resolution) were aggregated into a 12 h time series by taking the arithmetic mean, for use in subsequent model training and testing.
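The four preprocessing steps can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the maximum gap length treated as "small-scale" (`max_gap`), the power-law form of the stage–discharge relationship, and its parameters `a`, `b`, and `h0` are all hypothetical. The only fixed quantity is the window of 48 samples, since a 12 h period contains 48 readings at 15 min resolution.

```python
def missing_fraction(values):
    """Share of missing (None) samples, used for the 5% screening threshold."""
    return sum(v is None for v in values) / len(values)

def fill_small_gaps(levels, max_gap=4):
    """Linearly interpolate interior runs of None no longer than max_gap samples."""
    out = list(levels)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # index of the first valid sample after the gap (or n)
        gap = end - start
        if start > 0 and end < n and gap <= max_gap:
            left, right = out[start - 1], out[end]
            for k in range(start, end):
                frac = (k - start + 1) / (gap + 1)
                out[k] = left + frac * (right - left)
    return out

def rating_curve(level, a, b, h0):
    """Assumed power-law stage-discharge relation Q = a * (h - h0) ** b."""
    return a * max(level - h0, 0.0) ** b

def supplement_flow(flows, levels, a, b, h0):
    """Fill missing flows from the water level wherever the level is available."""
    filled = []
    for q, h in zip(flows, levels):
        if q is None and h is not None:
            q = rating_curve(h, a, b, h0)
        filled.append(q)
    return filled

def aggregate_mean(values, window=48):
    """Arithmetic mean over consecutive windows (48 x 15 min = 12 h)."""
    means = []
    for s in range(0, len(values) - window + 1, window):
        chunk = values[s:s + window]
        complete = all(v is not None for v in chunk)
        means.append(sum(chunk) / window if complete else None)
    return means
```

A gauge would be dropped when `missing_fraction(series) > 0.05`; the remaining gauges pass through `fill_small_gaps`, `supplement_flow`, and `aggregate_mean` in that order.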
2.4. Experimental Design
3. Results
3.1. Overall Model Performance
3.2. Ablation Study Results
3.3. Comparative Analysis
3.3.1. Basin-Scale Benchmark Performance Evaluation
3.3.2. Validation of Generalizability Across Representative Nodes
3.3.3. Sensitivity Analysis by Hydrological Attributes Grouping
3.3.4. Robustness Testing in Extreme Scenarios
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| ANN | Artificial Neural Network |
| A3T-GCN | Attention-Based Temporal Graph Convolutional Network |
| CV | Coefficient of Variation |
| DL | Deep Learning |
| DynaSTG-Former | Dynamic Spatio-Temporal Graph Transformer |
| DRB | Delaware River Basin |
| GNN | Graph Neural Network |
| GCN | Graph Convolutional Network |
| GAT | Graph Attention Mechanisms |
| GRU | Gated Recurrent Unit |
| KGE | Kling–Gupta Efficiency |
| LSTM | Long Short-Term Memory |
| LTPE | Local Temporal Pattern Enhancement |
| MAE | Mean Absolute Error |
| IPTA | Innovative Polygon Trend Analysis |
| RMSE | Root Mean Square Error |
| RAPS | Rescaled Adjusted Partial Sums |
| TGCN | Temporal Graph Convolutional Network |
| Trans | Transformer |
Appendix A. Methodology for Gauge Node Classification
| Indicator | Q1 | Median | Q3 | IQR | Normal Value Range |
|---|---|---|---|---|---|
| KGE (1-step) | 0.866 | 0.900 | 0.931 | 0.065 | [0.769, 1.029] |
| KGE decay rate | 22.2% | 35.6% | 41.2% | 19.0% | [6.3%, 69.7%] |
| Q_CV | 0.18 | 0.20 | 0.25 | 0.07 | [0.08, 0.36] |
- 1-step KGE ∈ [0.769, 1.0], resulting in the exclusion of one node (node_18) with an abnormally low value;
- KGE decay rate ∈ [6.3%, 69.7%], a criterion met by all nodes;
- Flow coefficient of variation (Q_CV) ∈ [0.08, 0.36], resulting in the exclusion of one node (node_14) with an abnormally high value;
- Q_mean > 0, a criterion met by all stations.
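
The normal value ranges for the 1-step KGE and Q_CV are consistent with Tukey's fences, Q1 − 1.5·IQR to Q3 + 1.5·IQR, if the first and third numeric values of each table row are read as the lower and upper quartiles. That reading is an assumption, and the function name below is illustrative. A minimal sketch:

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences from the quartiles, with IQR = q3 - q1."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles taken from the table rows for the 1-step KGE and for Q_CV.
kge_low, kge_high = tukey_fences(0.866, 0.931)
cv_low, cv_high = tukey_fences(0.18, 0.25)

print(f"1-step KGE fences: [{kge_low:.3f}, {kge_high:.3f}]")
print(f"Q_CV fences:       [{cv_low:.3f}, {cv_high:.3f}]")
```

Because KGE cannot exceed 1, the upper KGE fence is capped at 1.0 in the screening criteria.
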
| K | Silhouette Score | Feature Coverage |
|---|---|---|
| 3 | 0.3565 | 0.9535 |
| 4 | 0.3739 | 0.8372 |
| 5 | 0.3700 | 0.8837 |
| 6 | 0.3777 | 0.9535 |
| 7 | 0.3912 | 0.9767 |
| 9 | 0.3823 | 0.9302 |
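
For reference, the silhouette score reported for each K can be computed as sketched below, following the standard definition s(i) = (b − a) / max(a, b), where a is the mean distance from a point to the other members of its own cluster and b is the smallest mean distance to any other cluster. The clustering algorithm and the feature set are not stated in this section, so the Euclidean distance and the toy points are assumptions.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (requires at least two clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    total = 0.0
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # a singleton cluster contributes s(i) = 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)

# Toy data: two well-separated groups give a score close to 1.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
toy_labels = [0, 0, 1, 1]
score = silhouette_score(toy_points, toy_labels)
```

Running the clustering for each candidate K and keeping the K with the highest score is the usual selection procedure.
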
| Node Number | Cluster | Q_Mean | Q_CV | KGE_1-step | KGE_Decay (1- to 6-Step) | Stream |
|---|---|---|---|---|---|---|
| node_30 | 0 | 16.000 | 0.180 | 0.869 | 40.78% | Tributary |
| node_6 | 1 | 16.117 | 0.170 | 0.939 | 20.77% | Tributary |
| node_45 | 2 | 428.000 | 0.210 | 0.958 | 15.33% | Mainstem |
| node_10 | 3 | 2.001 | 0.254 | 0.866 | 41.15% | Tributary |
| node_19 | 4 | 186.000 | 0.205 | 0.916 | 27.28% | Mainstem |
| node_9 | 5 | 89.001 | 0.134 | 0.966 | 35.60% | Tributary |
| node_41 | 6 | 3.300 | 0.200 | 0.883 | 17.66% | Tributary |

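| Metrics | Equation | Range |
|---|---|---|
| Root Mean Squared Error (RMSE) | $\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Q_{\mathrm{obs},i}-Q_{\mathrm{sim},i}\right)^{2}}$ | $[0,+\infty)$, close to 0 is better |
| Mean Absolute Error (MAE) | $\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left\lvert Q_{\mathrm{obs},i}-Q_{\mathrm{sim},i}\right\rvert$ | $[0,+\infty)$, close to 0 is better |
| Kling–Gupta Efficiency (KGE) | $\mathrm{KGE}=1-\sqrt{(r-1)^{2}+(\alpha-1)^{2}+(\beta-1)^{2}}$ | $(-\infty,1]$, close to 1 is better |
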
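
The three metrics in the table above have standard definitions, sketched here in plain Python. KGE is assumed to be the original three-component form of Gupta et al. (2009), with correlation r, variability ratio α = σ_sim/σ_obs, and bias ratio β = μ_sim/μ_obs; whether the study used this form or a later variant is not stated in this section.

```python
import math

def rmse(obs, sim):
    """Root mean squared error between observed and simulated flows."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error between observed and simulated flows."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    n = len(obs)
    mean_obs, mean_sim = sum(obs) / n, sum(sim) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Toy check: a simulation that is uniformly 1 m3/s too high.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, simulated), mae(observed, simulated), kge(observed, simulated))
```

In the toy check, r and α both equal 1, so the KGE penalty comes from the bias term alone.
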
| Cluster ID | Representative Node | Hydrological Response Pattern | Hydrological Signatures |
|---|---|---|---|
| Cluster 0 | node_30 | Medium-Flow High-Decay Predictability | Medium flow, medium coefficient of variation (CV), and good short-term forecast accuracy, but significant performance degradation in long-term forecasts. |
| Cluster 1 | node_6 | Medium-Flow Stable Prediction | Medium flow, low flow variability, and excellent short-term forecasting capability, with stability maintained in long-term forecasts. |
| Cluster 2 | node_45 | High-Flow Control Gauge | Extremely high flow, high short-term forecasting accuracy, and a low performance decay rate in long-term forecasts. |
| Cluster 3 | node_10 | Low-Flow High-Variability Challenge | Low flow, high hydrological variability, and acceptable short-term forecasting accuracy, but severe performance degradation in long-term forecasts. |
| Cluster 4 | node_19 | High-Flow General Performance | High flow, moderate CV, and good short-term forecasting performance, with moderate decay in long-term forecasts. |
| Cluster 5 | node_9 | High Accuracy Low-Decay Excellence | Exceptional short-term forecast accuracy, outstanding long-term forecasting stability, and stable hydrological conditions with low variability. |
| Cluster 6 | node_41 | Low-Flow Low-Decay Stability | Low-flow conditions, moderate variability, and good short-term forecasting performance, with a relatively low decay rate in long-term forecasts. |

| Horizon | Node | RMSE (m³/s) | MAE (m³/s) | KGE (-) |
|---|---|---|---|---|
| 1-step (12 h) | Overall | 16.88 | 5.33 | 0.961 |
| 1-step (12 h) | node_6 | 7.47 | 3.36 | 0.939 |
| 1-step (12 h) | node_9 | 19.13 | 8.81 | 0.966 |
| 1-step (12 h) | node_10 | 1.41 | 0.47 | 0.866 |
| 1-step (12 h) | node_19 | 28.6 | 18.08 | 0.916 |
| 1-step (12 h) | node_30 | 7.43 | 3.59 | 0.869 |
| 1-step (12 h) | node_41 | 0.53 | 0.28 | 0.883 |
| 1-step (12 h) | node_45 | 53.74 | 30.08 | 0.958 |
| 3-step (36 h) | Overall | 31.87 | 8.98 | 0.956 |
| 3-step (36 h) | node_6 | 11.18 | 5.21 | 0.885 |
| 3-step (36 h) | node_9 | 43.48 | 20.11 | 0.915 |
| 3-step (36 h) | node_10 | 2.45 | 1.04 | 0.797 |
| 3-step (36 h) | node_19 | 58.65 | 27.63 | 0.929 |
| 3-step (36 h) | node_30 | 11.72 | 6.14 | 0.734 |
| 3-step (36 h) | node_41 | 0.8 | 0.4 | 0.866 |
| 3-step (36 h) | node_45 | 113.15 | 60.28 | 0.892 |
| 6-step (72 h) | Overall | 57.60 | 14.10 | 0.855 |
| 6-step (72 h) | node_6 | 16.43 | 7.61 | 0.744 |
| 6-step (72 h) | node_9 | 82.18 | 34.77 | 0.622 |
| 6-step (72 h) | node_10 | 3.72 | 1.46 | 0.510 |
| 6-step (72 h) | node_19 | 125.33 | 52.90 | 0.666 |
| 6-step (72 h) | node_30 | 14.86 | 8.33 | 0.515 |
| 6-step (72 h) | node_41 | 1.03 | 0.61 | 0.727 |
| 6-step (72 h) | node_45 | 184.07 | 82.99 | 0.811 |

| Model Configuration | RMSE (m³/s) | MAE (m³/s) | KGE (-) | Abbreviation | Purpose |
|---|---|---|---|---|---|
| Full Model | 31.87 | 8.98 | 0.956 | Full | Complete proposed model |
| Static + Pearson + Cosine + Transformer | 31.88 | 8.49 | 0.940 | w/o LTPE | Ablates local perception |
| Static + Cosine + LTPE + Transformer | 35.89 | 10.04 | 0.937 | w/o Pearson | Ablates Pearson graph |
| Static + Pearson + LTPE + Transformer | 33.26 | 8.83 | 0.929 | w/o Cosine | Ablates cosine graph |
| Static Graph + LTPE + Transformer | 34.52 | 9.94 | 0.907 | w/o Pearson and Cosine | Ablates cosine graph and Pearson graph |
| LTPE + Transformer | 36.85 | 9.99 | 0.885 | w/o Graph | Tests temporal module only |
| Transformer Baseline | 40.28 | 9.73 | 0.900 | Base | Baseline model |

| Cluster ID | Full | w/o Cosine | w/o Pearson | w/o Pearson and Cosine | w/o Graph |
|---|---|---|---|---|---|
| Cluster_0 | 0.886 | 0.813 | 0.800 | 0.704 | 0.709 |
| Cluster_1 | 0.862 | 0.849 | 0.865 | 0.819 | 0.794 |
| Cluster_2 | 0.938 | 0.899 | 0.865 | 0.868 | 0.784 |
| Cluster_3 | 0.802 | 0.781 | 0.790 | 0.746 | 0.694 |
| Cluster_4 | 0.938 | 0.886 | 0.868 | 0.831 | 0.752 |
| Cluster_5 | 0.883 | 0.845 | 0.861 | 0.805 | 0.760 |
| Cluster_6 | 0.882 | 0.820 | 0.814 | 0.816 | 0.740 |

| Method | Horizon | RMSE (m³/s) | MAE (m³/s) | KGE (-) |
|---|---|---|---|---|
| LSTM | 1-step (12 h) | 24.48 | 6.85 | 0.951 |
| LSTM | 3-step (36 h) | 58.37 | 16.14 | 0.849 |
| LSTM | 6-step (72 h) | 62.54 | 15.99 | 0.808 |
| GRU | 1-step (12 h) | 19.77 | 6.40 | 0.946 |
| GRU | 3-step (36 h) | 46.06 | 12.77 | 0.890 |
| GRU | 6-step (72 h) | 67.20 | 18.91 | 0.781 |
| Standard Transformer (Trans) | 1-step (12 h) | 20.97 | 5.02 | 0.941 |
| Standard Transformer (Trans) | 3-step (36 h) | 40.28 | 9.73 | 0.900 |
| Standard Transformer (Trans) | 6-step (72 h) | 68.54 | 16.34 | 0.812 |
| TGCN | 1-step (12 h) | 25.80 | 6.39 | 0.951 |
| TGCN | 3-step (36 h) | 42.79 | 11.90 | 0.835 |
| TGCN | 6-step (72 h) | 59.44 | 14.75 | 0.781 |
| A3T-GCN | 1-step (12 h) | 24.60 | 5.96 | 0.943 |
| A3T-GCN | 3-step (36 h) | 47.90 | 11.01 | 0.874 |
| A3T-GCN | 6-step (72 h) | 63.23 | 16.32 | 0.783 |
| DynaSTG-Former (Ours) | 1-step (12 h) | 16.88 | 5.33 | 0.961 |
| DynaSTG-Former (Ours) | 3-step (36 h) | 31.87 | 8.98 | 0.956 |
| DynaSTG-Former (Ours) | 6-step (72 h) | 57.60 | 14.10 | 0.855 |

| Node_ID | Q_Mean (m³/s) | Q_CV (-) | Stream | KGE 1-Step (-) | KGE 3-Step (-) | KGE 6-Step (-) | KGE_Decay (1- to 6-Step) |
|---|---|---|---|---|---|---|---|
| node_45 | 428.0 | 0.210 | Mainstem | 0.958 | 0.892 | 0.811 | 15.33% |
| node_37 | 3.8 | 0.200 | Tributary | 0.914 | 0.638 | 0.417 | 54.34% |
| node_17 | 7.0 | 0.167 | Tributary | 0.773 | 0.693 | 0.461 | 40.41% |
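
The KGE_Decay column appears to be the relative drop in KGE between the 1-step and 6-step horizons, (KGE₁ − KGE₆) / KGE₁. This formula is inferred from the tabulated values rather than quoted from the text, so it should be read as an assumption. A short check against the three rows above:

```python
def kge_decay(kge_first, kge_last):
    """Relative loss of KGE between the first and the last forecast horizon."""
    return (kge_first - kge_last) / kge_first

# (KGE 1-step, KGE 6-step) pairs copied from the table above.
rows = {
    "node_45": (0.958, 0.811),
    "node_37": (0.914, 0.417),
    "node_17": (0.773, 0.461),
}

for node, (kge_1, kge_6) in rows.items():
    print(f"{node}: {100 * kge_decay(kge_1, kge_6):.2f}%")
```

Because the tabulated KGE values are rounded to three decimals, the recomputed percentages agree with the KGE_Decay column only to within about 0.1 percentage points.
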
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, B.; Li, Q.; Zhou, X.; Deng, M.; Ling, H. Dynamic Graph Transformer with Spatio-Temporal Attention for Streamflow Forecasting. Hydrology 2025, 12, 322. https://doi.org/10.3390/hydrology12120322

