Machine Learning-Driven Calibration of MODFLOW Models: Comparing Random Forest and XGBoost Approaches
Abstract
1. Introduction
2. The Study Area and Its Geology
3. Materials and Methods
3.1. Methodology
3.2. MODFLOW Model
3.3. Machine Learning Models
3.3.1. Random Forest Model
3.3.2. XGBoost Model
4. Results
4.1. RF Model Results
4.2. XGBoost Model Results
5. Discussion
6. Conclusions
Funding
Data Availability Statement
Conflicts of Interest

| Performance Measure | Training Dataset | Testing Dataset |
|---|---|---|
| Mean Absolute Error (MAE) | 1.4 | 3.78 |
| Root Mean Square Error (RMSE) | 4.7 | 12.93 |
| Coefficient of Determination (R²) | 0.99 | 0.93 |

| Performance Measure | Training Dataset | Testing Dataset |
|---|---|---|
| Mean Absolute Error (MAE) | 8.9 | 9.3 |
| Root Mean Square Error (RMSE) | 22.1 | 23.3 |
| Coefficient of Determination (R²) | 0.86 | 0.85 |
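
The three performance measures reported in the tables above follow their standard definitions. The following is a minimal sketch of how they could be computed from paired observed and predicted values. The function name `regression_metrics` and the sample numbers are illustrative assumptions, not the code or data used in the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, RMSE and R^2 for paired observed/predicted values.

    Hypothetical helper for illustration; not taken from the article.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]

    # Mean Absolute Error: average magnitude of the residuals.
    mae = sum(abs(r) for r in residuals) / n

    # Root Mean Square Error: penalizes large residuals more than MAE does.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: 1 - (residual variation / total variation).
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "rmse": rmse, "r2": r2}


# Made-up values, used only to exercise the function.
example = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

Because RMSE squares each residual before averaging, it is never smaller than MAE for the same data, which matches the pattern in both tables.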

| Feature | RF | XGBoost |
|---|---|---|
| Column | 0.27 | 0.22 |
| Row | 0.44 | 0.49 |
| Head | 0.1 | 0.1 |
| Head squared | 0.09 | 0.1 |
| Log head | 0.1 | 0.09 |
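
The feature names in the table above suggest that each sample combines a grid location (row, column) with the head and two transforms of it. The sketch below shows one way such a feature vector could be assembled, and checks that the tabulated values for each model sum to one. The function `build_features`, the use of the natural logarithm, and the requirement of a positive head are assumptions made for illustration; the page does not specify them.

```python
import math


def build_features(row, column, head):
    """Assemble the five features named in the table for one model cell.

    Hypothetical helper: the log base (natural log here) and the
    positive-head requirement are assumptions, not stated in the source.
    """
    if head <= 0:
        raise ValueError("log transform assumes a strictly positive head")
    return {
        "Column": column,
        "Row": row,
        "Head": head,
        "Head squared": head ** 2,
        "Log head": math.log(head),
    }


# Values copied from the table above.
rf_importance = {
    "Column": 0.27, "Row": 0.44, "Head": 0.1,
    "Head squared": 0.09, "Log head": 0.1,
}
xgb_importance = {
    "Column": 0.22, "Row": 0.49, "Head": 0.1,
    "Head squared": 0.1, "Log head": 0.09,
}

# Sum of the tabulated values per model.
rf_total = sum(rf_importance.values())
xgb_total = sum(xgb_importance.values())

# Share of the total carried by the two location features.
rf_location_share = rf_importance["Row"] + rf_importance["Column"]
xgb_location_share = xgb_importance["Row"] + xgb_importance["Column"]
```

In both columns the row and column features together account for 0.71 of the total, with the three head-based features sharing the remainder.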
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Baalousha, H.M. Machine Learning-Driven Calibration of MODFLOW Models: Comparing Random Forest and XGBoost Approaches. Geosciences 2025, 15, 303. https://doi.org/10.3390/geosciences15080303