Comparison of Multiple Machine Learning Methods for Correcting Groundwater Levels Predicted by Physics-Based Models
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area and Datasets
2.2. Methods for Predicting Groundwater Levels
2.2.1. Physics-Based Models
2.2.2. Random Forest Model
2.2.3. Extreme Gradient Boost Model
2.2.4. Long Short-Term Memory Model
2.3. Comparative Experimental Setup
2.4. Model Configuration and Parameterization
2.5. Evaluation Metrics
3. Results
3.1. Verification of MODFLOW Model
3.2. Performance Evaluation of Multiple Machine Learning Models
3.3. Comparison of the Correction Effect during Prediction Period
3.3.1. Comparison of Correlation and Error
3.3.2. Comparison of Dynamic Trends
4. Discussion
4.1. Temporal Variation Characteristics of the Correction Effect on Accuracy
4.2. Spatial Variation Characteristics of the Correction Effect on Accuracy
4.3. Applicability of Different Methods and Feature Variables for Correcting Predicted GWLs
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References

| Data Types | Spatial Scale | Time Series | Time Scale |
|---|---|---|---|
| Precipitation | 12 meteorological observation stations | January 2015–May 2019 | Monthly |
| Ecological water replenishment | | September 2018–May 2019 | Monthly |
| Groundwater exploitation | 12 districts | January 2015–May 2019 | Monthly |
| Groundwater level | 39 wells | January 2015–May 2019 | Monthly |

| Model | Hyperparameter | Ranges |
|---|---|---|
| RF | Max_depth | 1–20 |
| RF | Min_samples_leaf | 1–5 |
| RF | N_estimators | 1–500 |
| XGBoost | Colsample_bytree | 0–0.9 |
| XGBoost | Eta | 0.001–0.1 |
| XGBoost | Gamma | 0.1–0.5 |
| XGBoost | Max_depth | 2–10 |
| XGBoost | Min_child_weight | 1–8 |
| LSTM | Time step | 1–25 |
| LSTM | Number of neurons | {, , , , , } |

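
The hyperparameter ranges in the table above can be written down as a search space and sampled. The following is a minimal sketch only: the table does not say how the ranges were searched, so uniform random sampling is used here as a stand-in, and the names `SEARCH_SPACE` and `sample_config` are hypothetical. Treating Max_depth, Min_samples_leaf, N_estimators, Min_child_weight, and the time step as integers is also an assumption. The candidate neuron counts for the LSTM are not recoverable from the table, so they are left out rather than guessed.

```python
import random

# Ranges copied from the hyperparameter table: (low, high, kind).
# The "int"/"float" kinds are assumptions, not stated in the table.
SEARCH_SPACE = {
    "RF": {
        "max_depth": (1, 20, "int"),
        "min_samples_leaf": (1, 5, "int"),
        "n_estimators": (1, 500, "int"),
    },
    "XGBoost": {
        # The table lists 0 as the lower bound; XGBoost itself requires
        # colsample_bytree > 0, so 0 should be read as a limit, not a value.
        "colsample_bytree": (0.0, 0.9, "float"),
        "eta": (0.001, 0.1, "float"),
        "gamma": (0.1, 0.5, "float"),
        "max_depth": (2, 10, "int"),
        "min_child_weight": (1, 8, "int"),
    },
    "LSTM": {
        "time_step": (1, 25, "int"),
        # Number of neurons: candidate set missing from the table, omitted.
    },
}


def sample_config(model, rng):
    """Draw one hyperparameter set uniformly from the listed ranges."""
    config = {}
    for name, (low, high, kind) in SEARCH_SPACE[model].items():
        if kind == "int":
            config[name] = rng.randint(low, high)  # inclusive on both ends
        else:
            config[name] = rng.uniform(low, high)
    return config


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draw is repeatable
    for model in SEARCH_SPACE:
        print(model, sample_config(model, rng))
```

Each sampled dictionary would then be passed to whichever model constructor is in use and scored on a validation split; a grid or Bayesian search over the same ranges would be an equally valid reading of the table.
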
| Model | NSE | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| MODFLOW | 0.98 | | | | | | | | | |
| RF1 | 0.85 | 0.90 | 0.66 | 0.93 | 0.87 | 0.89 | 0.97 | 0.53 | 0.66 | 0.84 |
| RF2 | 0.92 | 0.92 | 0.73 | 0.96 | 0.94 | 0.93 | 0.98 | 0.73 | 0.84 | 0.80 |
| XGBoost1 | 0.92 | 0.92 | 0.79 | 0.96 | 0.89 | 0.91 | 0.98 | 0.51 | 0.70 | 0.77 |
| XGBoost2 | 0.87 | 0.89 | 0.70 | 0.95 | 0.93 | 0.93 | 0.96 | 0.76 | 0.83 | 0.90 |
| LSTM1 | 0.87 | 0.54 | 0.79 | 0.72 | 0.73 | 0.66 | 0.81 | 0.66 | 0.91 | 0.57 |
| LSTM2 | 0.97 | 0.92 | 0.98 | 0.95 | 0.95 | 0.89 | 0.91 | 0.64 | 0.97 | 0.73 |

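
For readers unfamiliar with the NSE values reported in the table above, the sketch below computes the Nash–Sutcliffe efficiency in its standard form, NSE = 1 − Σ(obs − sim)² / Σ(obs − mean(obs))². It is an illustration of the standard definition, not the authors' code; the function name `nse` and the sample groundwater levels are hypothetical.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the
    skill of always predicting the observed mean, negative is worse."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the simulation against the observations.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Spread of the observations around their own mean.
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - residual / variance


if __name__ == "__main__":
    # Hypothetical monthly groundwater levels in metres (illustration only).
    observed = [10.0, 12.0, 14.0, 16.0]
    simulated = [11.0, 12.0, 13.0, 16.0]
    print(round(nse(observed, simulated), 3))
```

With these made-up numbers the residual sum is 2 and the observed variance sum is 20, so the result is about 0.9, in the same range as most entries in the table.
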
| Models | Metrics | Well Numbers | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| RF1 | PR | −0.40 | −0.61 | −0.17 | −0.91 | 0.23 | 0.44 | −0.09 | −0.48 | −0.48 | 0.60 |
| RF2 | PR | −0.45 | −0.32 | −0.18 | −1.57 | 0.75 | 0.90 | 0.37 | −0.02 | −0.34 | 0.62 |
| XGBoost1 | PR | −0.54 | −0.42 | −0.18 | −0.91 | 0.22 | 0.44 | −0.09 | −0.73 | −0.99 | 0.57 |
| XGBoost2 | PR | −0.68 | −0.84 | −0.11 | −1.19 | 0.46 | 1.08 | 0.43 | 0.07 | −0.05 | −0.41 |
| LSTM1 | PR | 0.01 | −0.08 | −0.05 | 0.08 | 0.02 | 0.35 | −0.02 | −0.27 | 0.13 | 0.35 |
| LSTM2 | PR | 0.06 | 0.02 | −0.04 | −0.04 | 0.97 | 1.26 | 0.76 | 0.21 | 0.06 | 0.98 |
| RF1 | RMSE | 0.27 | 0.46 | 0.52 | 0.46 | 0.18 | 0.20 | 1.23 | −0.21 | −1.22 | 0.28 |
| RF2 | RMSE | 0.22 | 0.45 | 0.84 | 0.43 | 0.27 | 0.07 | 0.87 | −0.48 | −1.23 | 0.31 |
| XGBoost1 | RMSE | 0.19 | 0.48 | 0.87 | 0.45 | 0.12 | 0.32 | 1.02 | −0.03 | −0.80 | 0.25 |
| XGBoost2 | RMSE | 0.41 | 0.49 | 0.72 | 0.47 | 0.41 | 0.11 | 1.26 | −0.51 | −1.49 | 0.46 |
| LSTM1 | RMSE | −0.71 | 0.19 | 0.39 | −0.12 | 0.11 | 0.44 | −0.42 | −0.18 | −0.62 | 0.32 |
| LSTM2 | RMSE | −0.97 | −0.51 | −0.57 | −0.17 | −0.19 | −0.41 | −0.59 | −0.83 | −1.59 | −0.36 |

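
Since an RMSE can never be negative, the signed entries in the table above presumably report a change in each metric relative to a baseline rather than the metric itself. The sketch below shows one way such a change could be computed. Several things here are assumptions and not taken from the source: that "PR" stands for the Pearson correlation coefficient, that the baseline is the uncorrected physics-based prediction, and that the sign convention is corrected minus baseline (under which a negative RMSE entry would mean the error went down). All function names and sample values are hypothetical.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between two equal-length series."""
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def metric_change(observed, baseline, corrected):
    """Change in each metric, computed as corrected minus baseline.
    The sign convention is an assumption, not stated in the table."""
    return {
        "PR": pearson_r(observed, corrected) - pearson_r(observed, baseline),
        "RMSE": rmse(observed, corrected) - rmse(observed, baseline),
    }


if __name__ == "__main__":
    # Hypothetical levels in metres for one well (illustration only).
    observed = [10.0, 12.0, 14.0, 16.0]
    baseline = [11.0, 11.0, 15.0, 15.0]   # uncorrected prediction
    corrected = [10.0, 12.0, 14.0, 16.0]  # idealised perfect correction
    print(metric_change(observed, baseline, corrected))
```

In this made-up case the correction raises the correlation and lowers the RMSE by 1 m, so under the assumed convention the PR entry is positive and the RMSE entry is negative.
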
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Shuai, G.; Zhou, Y.; Shao, J.; Cui, Y.; Zhang, Q.; Jin, C.; Xu, S. Comparison of Multiple Machine Learning Methods for Correcting Groundwater Levels Predicted by Physics-Based Models. Sustainability 2024, 16, 653. https://doi.org/10.3390/su16020653