Rapid Prediction Approach for Water Quality in Plain River Networks: A Data-Driven Water Quality Prediction Model Based on Graph Neural Networks
Abstract
1. Introduction
2. Methods and Model
2.1. Basic Theory
2.1.1. Multivariate Time Series Forecasting
2.1.2. Graph Neural Network
2.2. Construction of Spatiotemporal Graph Neural Network Water Quality Prediction Model
2.2.1. Overall Model Architecture
2.2.2. Adaptive Multiperiod Enhancement Module
2.2.3. Temporal Period Dependency Module
2.2.4. Multivariate Spatial Dependency Module
2.2.5. Prediction Strategy
2.2.6. Hybrid Loss Function Module
3. Case Study
3.1. Study Area
3.2. Station Selection
3.3. Dataset and Data Preprocessing
3.4. Prediction Tasks
3.5. Baseline Models
3.6. Evaluation Metrics
4. Results and Discussion
4.1. Spatiotemporal Characteristics of Water Quality Data
4.2. Hyperparameter Sensitivity Analysis
4.2.1. Top Frequency
4.2.2. Time Node
4.2.3. Spatial Adjacent Node
4.2.4. Time–Frequency Domain Error Adjustment Coefficient
4.3. Model Prediction Performance Comparison
4.4. Long-Term Prediction Ability of Models
4.5. Ablation Experiment Results
4.6. Elimination of Label Autocorrelation
5. Conclusions and Outlook
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Data Type | Number of Stations | Sampling Frequency | Data Length | Indicators | Data Volume |
---|---|---|---|---|---|
Water Quality Data | 8 | 4 h | 5806 | pH, DO, CODMn, NH3-N, TP, TN, WT, EC, NTU | 418,032 |
Hydrological Data | 2 | 1 day | 968 | Water level, Flow rate | 1936 |
Reservoir Operation Data | 1 | 1 day | 968 | Reservoir water level, Discharge flow | 1936 |
Total | 11 | - | - | - | 421,904 |
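The data volumes in the table above can be reproduced as the number of series multiplied by the series length. The short check below is only a sketch: the water quality series count (8 stations × 9 indicators) follows directly from the table, whereas the count of two series for the hydrological and reservoir rows is inferred from the reported volumes, and how the two hydrological indicators map onto the two stations is not stated here. The names `datasets`, `volumes`, `total_volume` and `span_days` are illustrative.

```python
# Series count per data type. Water quality: 8 stations x 9 indicators.
# For the other two rows the count of 2 is inferred from the reported volume
# (1936 / 968), which is an assumption rather than something the table states.
datasets = {
    "Water Quality Data": {"series": 8 * 9, "length": 5806},
    "Hydrological Data": {"series": 2, "length": 968},
    "Reservoir Operation Data": {"series": 2, "length": 968},
}

# Data volume = number of series x number of records per series.
volumes = {name: d["series"] * d["length"] for name, d in datasets.items()}
total_volume = sum(volumes.values())

# Time span covered by the 4 h water quality record, expressed in days.
span_days = 5806 * 4 / 24

print(volumes)
print(total_volume)
print(round(span_days, 1))
```

The 4 h record spans just under 968 days, which agrees with the 968 daily records listed for the hydrological and reservoir data.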
Prediction Task Category | Retrospective Time | Prediction Time Step | Target Prediction Indicators |
---|---|---|---|
Short-term Prediction | 42 (7 days) | 1 (4 h) | CODMn, DO, TP, TN |
Short-term Prediction | 42 (7 days) | 6 (1 day) | CODMn, DO, TP, TN |
Short-term Prediction | 42 (7 days) | 12 (2 days) | CODMn, DO, TP, TN |
Long-term Prediction | 42 (7 days) | 24 (4 days) | CODMn, DO, TP, TN |
Long-term Prediction | 42 (7 days) | 42 (7 days) | CODMn, DO, TP, TN |
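The step counts in the table above follow from the 4 h sampling interval (for example, 42 steps × 4 h = 168 h = 7 days). As a minimal sketch of how such tasks are commonly turned into training samples, the snippet below slides a window over a single series and pairs 42 past values with the next `horizon` values. Treating the whole horizon as the target, and the helper names `steps_to_hours` and `make_windows`, are assumptions made for illustration; the table does not specify how the samples were actually built.

```python
from typing import List, Sequence, Tuple

SAMPLING_HOURS = 4              # sampling interval of the water quality record
LOOKBACK = 42                   # retrospective steps (7 days)
HORIZONS = (1, 6, 12, 24, 42)   # prediction steps listed in the table


def steps_to_hours(steps: int) -> int:
    """Convert a number of 4 h steps into hours."""
    return steps * SAMPLING_HOURS


def make_windows(
    series: Sequence[float], lookback: int, horizon: int
) -> List[Tuple[List[float], List[float]]]:
    """Pair each run of `lookback` values with the `horizon` values that follow it."""
    samples = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        past = list(series[start:start + lookback])
        future = list(series[start + lookback:start + lookback + horizon])
        samples.append((past, future))
    return samples


# Toy series standing in for one indicator at one station.
toy_series = [float(i) for i in range(100)]
windows = make_windows(toy_series, LOOKBACK, 6)
print(len(windows), steps_to_hours(LOOKBACK))
```

A series of length n yields n − lookback − horizon + 1 samples, so longer horizons leave slightly fewer samples from the same record.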
Symbol | Hyperparameter Description | Setting Range | Symbol | Hyperparameter Description | Setting Range |
---|---|---|---|---|---|
d_model | Number of GCN layers | 32 | dropout | Dropout rate | 0.1 |
C | Data dimensionality expansion | 8 | batch_size | Batch size | 4 |
train_epochs | Number of training epochs | 10 | learning_rate | Learning rate | 0.01 |
z | Number of cycles | 1–10 | L | Number of heterogeneous nodes | 5–30 |
M | The coefficient for time-adjacent node selection per cycle | 5–40 | β | Frequency loss ratio | 0–1 |
H | Number of homogeneous nodes | 5–30 | Lossfeq_mode | Frequency domain loss transformation model | FFT |
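The table lists a frequency loss ratio β in the range 0–1 and FFT as the frequency-domain transform. One plausible reading, sketched below, is a convex combination of a time-domain error and a frequency-domain error. The weighting (1 − β) · time + β · frequency, the use of mean absolute error in both domains, and the function names are all assumptions for illustration and are not taken from the source. A naive discrete Fourier transform is used so the example needs only the standard library; it computes the same transform as an FFT, only more slowly.

```python
import cmath
from typing import List, Sequence


def dft(values: Sequence[float]) -> List[complex]:
    """Naive discrete Fourier transform (same result as an FFT, O(n^2) cost)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def mean_abs_error(a: Sequence[complex], b: Sequence[complex]) -> float:
    """Mean absolute difference; works for real and complex sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def hybrid_loss(pred: Sequence[float], target: Sequence[float], beta: float) -> float:
    """Assumed form: (1 - beta) * time-domain MAE + beta * frequency-domain MAE."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    time_loss = mean_abs_error(pred, target)
    freq_loss = mean_abs_error(dft(pred), dft(target))
    return (1.0 - beta) * time_loss + beta * freq_loss


pred = [1.0, 2.0, 3.0, 4.0]
target = [1.0, 2.0, 3.0, 5.0]
print(hybrid_loss(pred, target, beta=0.0))  # time-domain error only
print(hybrid_loss(pred, target, beta=0.5))  # equal weighting of both domains
```

With β = 0 the loss reduces to the ordinary time-domain error, and with β = 1 it depends only on the spectra, which is one way to interpret the 0–1 setting range.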
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yuan, M.; Li, Y.; Zhang, L.; Zhao, W.; Zhang, X.; Li, J. Rapid Prediction Approach for Water Quality in Plain River Networks: A Data-Driven Water Quality Prediction Model Based on Graph Neural Networks. Water 2025, 17, 2543. https://doi.org/10.3390/w17172543