Selecting a Time-Series Model to Predict Drinking Water Extraction in a Semi-Arid Region in Chihuahua, Mexico
Abstract
1. Introduction
1.1. Literature Review
1.2. Overview of Relevant Research
2. Materials and Methods
2.1. Study Site
2.2. Junta Municipal de Aguas y Saneamiento (JMAS) Meoqui
2.3. Data Collection
2.3.1. Use of Urban Wells
2.3.2. Total Extraction
2.3.3. Extraction by Type of Well
2.4. Data Analyses
2.5. Time-Series Methods
2.5.1. Train/Test
- Split time series into training and testing sets.
- Make a train/test set with a 12-month testing window.
- Visualize the train/test split.
- Modeling.
- ‘Auto ARIMA’ function from the forecast package.
- ‘Prophet’ algorithm from the prophet package.
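The split described above can be illustrated with a minimal sketch. It is written in Python for illustration only (the study itself used R packages), and the helper names `month_range` and `train_test_split`, as well as the placeholder series, are assumptions rather than part of the original analysis. The sketch holds out the final 12 monthly observations as the testing set and keeps everything before them for training.

```python
from datetime import date


def month_range(start_year, start_month, n_months):
    """Return n_months consecutive first-of-month dates."""
    months = []
    for i in range(n_months):
        year_offset, month_index = divmod(start_month - 1 + i, 12)
        months.append(date(start_year + year_offset, month_index + 1, 1))
    return months


def train_test_split(series, test_months=12):
    """Hold out the final `test_months` observations as the testing set."""
    if len(series) <= test_months:
        raise ValueError("series must be longer than the testing window")
    return series[:-test_months], series[-test_months:]


# Hypothetical monthly series: placeholder values, not the study data.
dates = month_range(2019, 1, 48)
series = [(d, 300_000 + 1_000 * i) for i, d in enumerate(dates)]

train, test = train_test_split(series, test_months=12)
print(len(train), "training months;", len(test), "testing months")
```

Because the split is chronological rather than random, every testing observation lies after every training observation, which is what allows the testing-set predictions to be read as a forecast.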
2.5.2. Machine Learning Models
- Create preprocessing recipe.
- Create model specifications.
- Use workflow to combine model specifications and preprocessing, and fit model.
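The three steps above follow a general pattern that can be sketched independently of any particular library. The following Python example is only an illustration of that pattern, not the study's R code: the `recipe` function, the `SeasonalMeanSpec` model, and the `Workflow` class are hypothetical names, the calendar features are an assumed choice, and the data are placeholders.

```python
from collections import defaultdict
from datetime import date


def recipe(d):
    """Preprocessing step: derive calendar features from a date."""
    return {"year": d.year, "month": d.month}


class SeasonalMeanSpec:
    """Model specification: predict the training mean of each calendar month."""

    def fit(self, features, targets):
        sums, counts = defaultdict(float), defaultdict(int)
        for f, y in zip(features, targets):
            sums[f["month"]] += y
            counts[f["month"]] += 1
        self.month_mean = {m: sums[m] / counts[m] for m in sums}
        self.overall = sum(targets) / len(targets)
        return self

    def predict(self, features):
        # Fall back to the overall mean for months unseen in training.
        return [self.month_mean.get(f["month"], self.overall) for f in features]


class Workflow:
    """Combine a preprocessing recipe with a model specification."""

    def __init__(self, recipe, spec):
        self.recipe, self.spec = recipe, spec

    def fit(self, dates, targets):
        self.spec.fit([self.recipe(d) for d in dates], targets)
        return self

    def predict(self, dates):
        return self.spec.predict([self.recipe(d) for d in dates])


# Placeholder data: two years of monthly values, not the study data.
dates = [date(2021 + i // 12, i % 12 + 1, 1) for i in range(24)]
values = [100.0 + d.month * 10 + (d.year - 2021) * 5 for d in dates]

wf = Workflow(recipe, SeasonalMeanSpec()).fit(dates, values)
forecast = wf.predict([date(2023, 1, 1), date(2023, 7, 1)])
print(forecast)
```

Keeping the recipe and the model specification separate means the same preprocessing can be reused with a different model, which is the main reason for bundling them in a workflow object.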
2.5.3. Preprocessing Recipe
2.5.4. Prophet Boost
2.5.5. The Modeltime Workflow
- Modeltime table. The modeltime table assigns each model an identification number and a generic description, which helps organize and track the models.
- Calibration. Calibration quantifies errors and estimates confidence intervals. It was conducted on the out-of-sample data (the testing set) to obtain the actual values, fitted values, and residuals for that set.
- Forecast (testing set). The calibrated data allow the testing-set predictions to be visualized, and these predictions may be regarded as a forecast. The accuracy on the testing set is then calculated so that the models can be compared.
- Analyze results. The optimal model is selected based on an evaluation of the accuracy measures and forecast results.
- Model evaluation and selection. Assessing the performance of predictive models is essential. For regression problems, where the objective is to predict continuous numerical values, one of the fundamental evaluation metrics is the mean absolute error (MAE).
- Refitting. As a best practice, the models are refitted before forecasting the future, that is, they are retrained on the full data set. Because the models depend on the ‘date’ feature, the option ‘h’ (horizon) is used for forecasting. This value was set to ‘12 months’ in order to forecast the following 12 months of data.
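The calibration, accuracy, and selection steps above can be illustrated with a short sketch. It is written in Python for illustration only, with placeholder numbers and hypothetical model names (`model_a`, `model_b`). It assumes that RSq is the squared correlation between actual and predicted values, which is the convention of the `rsq()` metric in the yardstick package listed among the software used; if a different definition was applied, that function would need to change.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def rsq(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov ** 2 / (var_a * var_p)


# Placeholder testing-set values and predictions, not the study data.
actual = [10.0, 12.0, 14.0, 16.0]
calibration = {
    "model_a": [11.0, 11.0, 15.0, 15.0],
    "model_b": [13.0, 15.0, 17.0, 19.0],
}

accuracy = {
    name: {
        "mae": mae(actual, pred),
        "rmse": rmse(actual, pred),
        "rsq": rsq(actual, pred),
    }
    for name, pred in calibration.items()
}

# Select the model with the lowest MAE on the testing set.
best = min(accuracy, key=lambda name: accuracy[name]["mae"])
print(best, accuracy[best])
```

In this toy example `model_b` is perfectly correlated with the actual values but biased by a constant offset, so it has the higher RSq and the higher MAE. This is one reason to compare models on error measures such as MAE and RMSE rather than on RSq alone.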
3. Results
3.1. Time-Series Analysis
3.1.1. Extraction from All the Wells
3.1.2. Raw Water Wells
3.1.3. Urban Wells
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AICc | Akaike information criterion corrected |
ARIMA | Autoregressive integrated moving average |
CONAHCYT | Consejo Nacional de Humanidades, Ciencia y Tecnología |
CI | Confidence interval |
IFAI | Instituto Federal de Acceso a la Información |
JMAS | Junta Municipal de Aguas y Saneamiento |
MAE | Mean absolute error |
RSq | R-squared (coefficient of determination) |
RMSE | Root mean square error |
SARIMA | Seasonal autoregressive integrated moving average |
WHO | World Health Organization |
References
- Peydayesh, M.; Mezzenga, R. Protein nanofibrils for next generation sustainable water purification. Nat. Commun. 2021, 12, 3248. [Google Scholar] [CrossRef] [PubMed]
- Buttinelli, R.; Cortignani, R.; Caracciolo, F. Irrigation water economic value and productivity: An econometric estimation for maize grain production in Italy. Agric. Water Manag. 2024, 295, 108757. [Google Scholar] [CrossRef]
- Postel, S.L. Water and world population growth. J. Am. Water Work. Assoc. 2000, 92, 131–138. [Google Scholar] [CrossRef]
- Stockholm Environment Institute. 6 Clean Water and Sanitation. 2024. Available online: https://www.government.se/contentassets/0be76988b3444b0881b6513daaf5bb26/6---Clean-water-and-sanitation.pdf (accessed on 1 September 2024).
- MacAllister, D.J. Groundwater Decline is Global but Not Universal. 2024. Available online: https://www.nature.com/articles/d41586-024-00070-3 (accessed on 1 September 2024).
- Henao, C.; Lis-Gutiérrez, J.P.; Lis-Gutiérrez, M.; Ariza-Salazar, J. Determinants of efficient water use and conservation in the Colombian manufacturing industry using machine learning. Humanit. Soc. Sci. Commun. 2024, 11, 1–11. [Google Scholar] [CrossRef]
- Roy, S.; Taloor, A.K.; Bhattacharya, P. A geospatial approach for understanding the spatio-temporal variability and projection of future trend in groundwater availability in the Tawi basin, Jammu, India. Groundw. Sustain. Dev. 2023, 21, 100912. [Google Scholar] [CrossRef]
- Yagbasan, O.; Demir, V.; Yazicigil, H. Trend Analyses of Meteorological Variables and Lake Levels for Two Shallow Lakes in Central Turkey. Water 2020, 12, 414. [Google Scholar] [CrossRef]
- Tejada, A., Jr.; Talento, M.S.; Ebal, L.P.; Villar, C.; Dinglasan, B.L. Forecasting of Monthly Closing Water Level of Angat Dam in the Philippines: SARIMA Modeling Approach. J. Environ. Sci. Manag. 2023, 26, 42–51. [Google Scholar] [CrossRef]
- Niknam, A.; Zare, H.K.; Hosseininasab, H.; Mostafaeipour, A.; Herrera, M. A Critical Review of Short-Term Water Demand Forecasting Tools—What Method Should I Use? Sustainability 2022, 14, 5412. [Google Scholar] [CrossRef]
- Zafra-Mejía, C.A.; Rondón-Quintana, H.A.; Urazán-Bonells, C.F. ARIMA and TFARIMA Analysis of the Main Water Quality Parameters in the Initial Components of a Megacity’s Drinking Water Supply System. Hydrology 2024, 11, 10. [Google Scholar] [CrossRef]
- Agaj, T.; Budka, A.; Janicka, E.; Bytyqi, V. Using ARIMA and ETS models for forecasting water level changes for sustainable environmental management. Sci. Rep. 2024, 14, 22444. [Google Scholar] [CrossRef]
- Barrientos-Torres, D.; Martinez-Ríos, E.A.; Navarro-Tuch, S.A.; Pablos-Hach, J.L.; Bustamante-Bello, R. Water Flow Modeling and Forecast in a Water Branch of Mexico City through ARIMA and Transfer Function Models for Anomaly Detection. Water 2023, 15, 2792. [Google Scholar] [CrossRef]
- Silva, A.C.d.; Silva, F.d.G.B.d.; Valério, V.E.d.M.; Silva, A.T.Y.L.; Marques, S.M.; Reis, J.A.T.d. Application of data prediction models in a real water supply network: Comparison between arima and artificial neural networks. RBRH 2024, 29, e12. [Google Scholar] [CrossRef]
- Cheema, M.A.; Hanif, M.; Albalawi, O.; Mahmoud, E.E.; Nabi, M. Evaluating water-related health risks in East and Central Asian Islamic Nations using predictive models (2020–2030). Sci. Rep. 2024, 14, 16837. [Google Scholar] [CrossRef] [PubMed]
- Zuo, H.; Gou, X.; Wang, X.; Zhang, M. A Combined Model for Water Quality Prediction Based on VMD-TCN-ARIMA Optimized by WSWOA. Water 2023, 15, 4227. [Google Scholar] [CrossRef]
- Jesus, E.d.S.d. Modelos de aprendizagem de máquina para previsão da demanda de água da região metropolitana de Salvador, Bahia. Neural Comput. Appl. 2023, 35, 19669–19683. [Google Scholar] [CrossRef]
- Niknam, A.R.R.; Sabaghzadeh, M.; Barzkar, A.; Shishebori, D. Comparing ARIMA and various deep learning models for long-term water quality index forecasting in Dez River, Iran. Environ. Sci. Pollut. Res. 2024. [CrossRef]
- Jaya, N.A.; Arsyad, M.; Palloan, P. Estimation of Groundwater River Availability in Leang Lonrong Cave Using ARIMA Model and Econophysics Valuation Approach. Adv. Soc. Humanit. Res. 2024, 2, 737–754. [Google Scholar] [CrossRef]
- Xu, J. Forecasting Water Demand With the Long Short-Term Memory Deep Learning Mode. Int. J. Inf. Technol. Syst. Approach (IJITSA) 2024, 17, 1–18. [Google Scholar] [CrossRef]
- Drogkoula, M.; Kokkinos, K.; Samaras, N. A Comprehensive Survey of Machine Learning Methodologies with Emphasis in Water Resources Management. Appl. Sci. 2023, 13, 2147. [Google Scholar] [CrossRef]
- Aquil, M.A.I.; Ishak, W.H.W. Comparison of Machine Learning Models in Forecasting Reservoir Water Level. J. Adv. Res. Appl. Sci. Eng. Technol. 2023, 31, 137–144. [Google Scholar] [CrossRef]
- Pires, C.; Martins, M.V. Enhancing Water Management: A Comparative Analysis of Time Series Prediction Models for Distributed Water Flow in Supply Networks. Water 2024, 16, 1827. [Google Scholar] [CrossRef]
- Dinerstein, E.; Olson, D.; Atchley, J.; Loucks, C.; Contreras-Balderas, S.; Abell, R.; Iñigo-Elias, E.; Enkerlin, E.; Williams, C.; Castilleja, G. Ecoregion-Based Conservation in the Chihuahuan Desert: A Biological Assessment, 2nd ed.; World Wildlife Fund (WWF): Washington, DC, USA, 2001; p. 92. [Google Scholar]
- Legarreta-González, M.A.; Meza-Herrera, C.A.; Rodríguez-Martínez, R.; Chávez-Tiznado, C.S.; Véliz-Deras, F.G. Time Series Analysis to Estimate the Volume of Drinking Water Consumption in the City of Meoqui, Chihuahua, Mexico. Water 2024, 16, 2634. [Google Scholar] [CrossRef]
- JMAS Meoqui. Junta Municipal de Aguas y Saneamiento Meoqui. Available online: http://www.jmasmeoqui.gob.mx/historia.html (accessed on 1 September 2024).
- R Core Team. R: A Language and Environment for Statistical Computing. 2024. Available online: https://www.R-project.org/ (accessed on 1 September 2024).
- Robinson, D.; Hayes, A.; Couch, S. broom: Convert Statistical Objects into Tidy Tibbles. R Package Version 1.0.6. 2024. Available online: https://CRAN.R-project.org/package=broom (accessed on 19 September 2024).
- Kuhn, M.; Frick, H. dials: Tools for Creating Tuning Parameter Values. 2024. R Package Version 1.3.0. Available online: https://CRAN.R-project.org/package=dials (accessed on 19 September 2024).
- Wickham, H.; François, R.; Henry, L.; Müller, K.; Vaughan, D. dplyr: A Grammar of Data Manipulation. 2023. R Package Version 1.1.4. Available online: https://CRAN.R-project.org/package=dplyr (accessed on 19 September 2024).
- Kahle, D.; Wickham, H. ggmap: Spatial Visualization with ggplot2. R J. 2013, 5, 144–161. [Google Scholar] [CrossRef]
- Wickham, H. ggplot2: Elegant Graphics for Data Analysis; Springer: New York, NY, USA, 2016. [Google Scholar]
- Couch, S.P.; Bray, A.P.; Ismay, C.; Chasnovski, E.; Baumer, B.S.; Çetinkaya Rundel, M. infer: An R package for tidyverse-friendly statistical inference. J. Open Source Softw. 2021, 6, 3661. [Google Scholar] [CrossRef]
- Grolemund, G.; Wickham, H. Dates and Times Made Easy with lubridate. J. Stat. Softw. 2011, 40, 1–25. [Google Scholar] [CrossRef]
- Kuhn, M. modeldata: Data Sets Useful for Modeling Examples. R Package Version 1.4.0. 2024. Available online: https://CRAN.R-project.org/package=modeldata (accessed on 19 September 2024).
- Dancho, M. modeltime: The Tidymodels Extension for Time Series Modeling. R Package Version 1.3.0. 2024. Available online: https://CRAN.R-project.org/package=modeltime (accessed on 19 September 2024).
- Kuhn, M.; Vaughan, D. parsnip: A Common API to Modeling and Analysis Functions. R Package Version 1.2.1. 2024. Available online: https://CRAN.R-project.org/package=parsnip (accessed on 19 September 2024).
- Wickham, H.; Henry, L. purrr: Functional Programming Tools. R Package Version 1.0.2. 2023. Available online: https://CRAN.R-project.org/package=purrr (accessed on 19 September 2024).
- Wickham, H.; Hester, J.; Bryan, J. readr: Read Rectangular Text Data. R Package Version 2.1.5. 2024. Available online: https://CRAN.R-project.org/package=readr (accessed on 19 September 2024).
- Kuhn, M.; Wickham, H.; Hvitfeldt, E. recipes: Preprocessing and Feature Engineering Steps for Modeling. R Package Version 1.1.0. 2024. Available online: https://CRAN.R-project.org/package=recipes (accessed on 19 September 2024).
- Wickham, H. Reshaping Data with the reshape Package. J. Stat. Softw. 2007, 21, 1–20. [Google Scholar] [CrossRef]
- Frick, H.; Chow, F.; Kuhn, M.; Mahoney, M.; Silge, J.; Wickham, H. rsample: General Resampling Infrastructure. R Package Version 1.2.1. 2024. Available online: https://CRAN.R-project.org/package=rsample (accessed on 19 September 2024).
- Wickham, H.; Pedersen, T.L.; Seidel, D. scales: Scale Functions for Visualization. R Package Version 1.3.0. 2023. Available online: https://CRAN.R-project.org/package=scales (accessed on 19 September 2024).
- Wickham, H. stringr: Simple, Consistent Wrappers for Common String Operations. 2023. R Package Version 1.5.1. Available online: https://CRAN.R-project.org/package=stringr (accessed on 19 September 2024).
- Müller, K.; Wickham, H. tibble: Simple Data Frames. R Package Version 3.2.1. 2023. Available online: https://CRAN.R-project.org/package=tibble (accessed on 19 September 2024).
- Kuhn, M.; Wickham, H. Tidymodels: A Collection of Packages for Modeling and Machine Learning Using Tidyverse Principles. 2020. Available online: https://www.tidymodels.org (accessed on 19 September 2024).
- Wickham, H.; Vaughan, D.; Girlich, M. tidyr: Tidy Messy Data. 2024. R Package Version 1.3.1. Available online: https://CRAN.R-project.org/package=tidyr (accessed on 19 September 2024).
- Wickham, H.; Averick, M.; Bryan, J.; Chang, W.; McGowan, L.D.; François, R.; Grolemund, G.; Hayes, A.; Henry, L.; Hester, J.; et al. Welcome to the tidyverse. J. Open Source Softw. 2019, 4, 1686. [Google Scholar] [CrossRef]
- Dancho, M.; Vaughan, D. timetk: A Tool Kit for Working with Time Series. 2023. R Package Version 2.9.0. Available online: https://CRAN.R-project.org/package=timetk (accessed on 19 September 2024).
- Barth, M. tinylabels: Lightweight Variable Labels. 2023. R Package Version 0.2.4. Available online: https://CRAN.R-project.org/package=tinylabels (accessed on 19 September 2024).
- Pohlert, T. trend: Non-Parametric Trend Tests and Change-Point Detection. 2023. R Package Version 1.1.6. Available online: https://CRAN.R-project.org/package=trend (accessed on 19 September 2024).
- Kuhn, M. tune: Tidy Tuning Tools. 2024. R Package Version 1.2.1. Available online: https://CRAN.R-project.org/package=tune (accessed on 19 September 2024).
- Vaughan, D.; Couch, S. workflows: Modeling Workflows. 2024. R Package Version 1.1.4. Available online: https://CRAN.R-project.org/package=workflows (accessed on 19 September 2024).
- Kuhn, M.; Couch, S. workflowsets: Create a Collection of ‘tidymodels’ Workflows. 2024. R Package Version 1.1.0. Available online: https://CRAN.R-project.org/package=workflowsets (accessed on 19 September 2024).
- Kuhn, M.; Vaughan, D.; Hvitfeldt, E. yardstick: Tidy Characterizations of Model Performance. 2024. Available online: https://CRAN.R-project.org/package=yardstick (accessed on 19 September 2024).
- Alsharif, M.H.; Younes, M.K.; Kim, J. Time series ARIMA model for prediction of daily and monthly average global solar radiation: The case study of Seoul, South Korea. Symmetry 2019, 11, 240. [Google Scholar] [CrossRef]
- Purnaningrum, E.; Athoillah, M. SVM approach for forecasting international tourism arrival in East Java. J. Phys. Conf. Ser. 2021, 1863, 012060. [Google Scholar] [CrossRef]
- Neudakhina, Y.; Trofimov, V. An ANN-based intelligent system for forecasting monthly electric energy consumption. In Proceedings of the 2021 3rd International Conference on Control Systems, Mathematical Modeling, Automation and Energy Efficiency (SUMMA), Lipetsk, Russia, 10–12 November 2021; IEEE: New York, NY, USA, 2021; pp. 544–547. [Google Scholar] [CrossRef]
- de Medrano, R.; de Buen Remiro, V.; Aznarte, J.L. SOCAIRE: Forecasting and monitoring urban air quality in Madrid. Environ. Model. Softw. 2021, 143, 105084. [Google Scholar] [CrossRef]
- Deb, C.; Zhang, F.; Yang, J.; Lee, S.E.; Shah, K.W. A review on time series forecasting techniques for building energy consumption. Renew. Sustain. Energy Rev. 2017, 74, 902–924. [Google Scholar] [CrossRef]
- Taylor, S.J.; Letham, B. Forecasting at scale. Am. Stat. 2018, 72, 37–45. [Google Scholar] [CrossRef]
- Toharudin, T.; Pontoh, R.S.; Caraka, R.E.; Zahroh, S.; Lee, Y.; Chen, R.C. Employing long short-term memory and Facebook prophet model in air temperature forecasting. Commun. Stat.-Simul. Comput. 2023, 52, 279–290. [Google Scholar] [CrossRef]
- Satrio, C.B.A.; Darmawan, W.; Nadia, B.U.; Hanafiah, N. Time series analysis and forecasting of coronavirus disease in Indonesia using ARIMA model and PROPHET. Procedia Comput. Sci. 2021, 179, 524–532. [Google Scholar] [CrossRef]
- Amber, K.P.; Aslam, M.W.; Mahmood, A.; Kousar, A.; Younis, M.Y.; Akbar, B.; Chaudhary, G.Q.; Hussain, S.K. Energy consumption forecasting for university sector buildings. Energies 2017, 10, 1579. [Google Scholar] [CrossRef]
- Paudel, S.; Elmitri, M.; Couturier, S.; Nguyen, P.H.; Kamphuis, R.; Lacarrière, B.; Le Corre, O. A relevant data selection method for energy consumption prediction of low energy building based on support vector machine. Energy Build. 2017, 138, 240–256. [Google Scholar] [CrossRef]
- Aslam, S.; Herodotou, H.; Mohsin, S.M.; Javaid, N.; Ashraf, N.; Aslam, S. A survey on deep learning methods for power load and renewable energy forecasting in smart microgrids. Renew. Sustain. Energy Rev. 2021, 144, 110992. [Google Scholar] [CrossRef]
- Voyant, C.; Notton, G.; Kalogirou, S.; Nivet, M.L.; Paoli, C.; Motte, F.; Fouilloy, A. Machine learning methods for solar radiation forecasting: A review. Renew. Energy 2017, 105, 569–582. [Google Scholar] [CrossRef]
- Fernández-Delgado, M.; Cernadas, E.; Barro, S.; Amorim, D. Do we need hundreds of classifiers to solve real world classification problems? J. Mach. Learn. Res. 2014, 15, 3133–3181. [Google Scholar]
- Callens, A.; Morichon, D.; Abadie, S.; Delpey, M.; Liquet, B. Using Random forest and Gradient boosting trees to improve wave forecast at a specific location. Appl. Ocean. Res. 2020, 104, 102339. [Google Scholar] [CrossRef]
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar] [CrossRef]
- Ordóñez, C.; Lasheras, F.S.; Roca-Pardiñas, J.; de Cos Juez, F.J. A hybrid ARIMA–SVM model for the study of the remaining useful life of aircraft engines. J. Comput. Appl. Math. 2019, 346, 184–191. [Google Scholar] [CrossRef]
- Shahriar, S.A.; Kayes, I.; Hasan, K.; Hasan, M.; Islam, R.; Awang, N.R.; Hamzah, Z.; Rak, A.E.; Salam, M.A. Potential of ARIMA-ANN, ARIMA-SVM, DT and CatBoost for atmospheric PM2.5 forecasting in Bangladesh. Atmosphere 2021, 12, 100. [Google Scholar] [CrossRef]
- Dave, E.; Leonardo, A.; Jeanice, M.; Hanafiah, N. Forecasting Indonesia exports using a hybrid model ARIMA-LSTM. Procedia Comput. Sci. 2021, 179, 480–487. [Google Scholar] [CrossRef]
- Shi, J.; Guo, J.; Zheng, S. Evaluation of hybrid forecasting approaches for wind speed and power generation time series. Renew. Sustain. Energy Rev. 2012, 16, 3471–3480. [Google Scholar] [CrossRef]
- Azad, A.S.; Sokkalingam, R.; Daud, H.; Adhikary, S.K.; Khurshid, H.; Mazlan, S.N.A.; Rabbani, M.B.A. Water Level Prediction through Hybrid SARIMA and ANN Models Based on Time Series Analysis: Red Hills Reservoir Case Study. Sustainability 2022, 14, 1843. [Google Scholar] [CrossRef]
- Ahmadpour, A.; Mirhashemi, S.H.; Panahi, M. Comparative evaluation of classical and SARIMA-BL time series hybrid models in predicting monthly qualitative parameters of Maroon river. Appl. Water Sci. 2023, 13, 71. [Google Scholar] [CrossRef]
- Yang, Z.; Dong, D.; Chen, Y.; Wang, R. Water Inflow Forecasting Based on Visual MODFLOW and GS-SARIMA-LSTM Methods. Water 2024, 16, 2749. [Google Scholar] [CrossRef]
- Liu, G.; Savic, D.; Fu, G. Short-term water demand forecasting using data-centric machine learning approaches. J. Hydroinformatics 2023, 25, 895–911. [Google Scholar] [CrossRef]
- Rajballie, A.; Tripathi, V.; Chinchamee, A. Water consumption forecasting models—A case study in Trinidad (Trinidad and Tobago). Water Supply 2022, 22, 5434–5447. [Google Scholar] [CrossRef]
- Monir, M.M.; Sarker, S.C.; Islam, M.N. Assessing the changing trends of groundwater level with spatiotemporal scale at the northern part of Bangladesh integrating the MAKESENS and ARIMA models. Model. Earth Syst. Environ. 2024, 10, 443–464. [Google Scholar] [CrossRef]
- Montgomery, M.A.; Elimelech, M. Water and sanitation in developing countries: Including health in the equation. Environ. Sci. Technol. 2007, 41, 17–24. [Google Scholar] [CrossRef] [PubMed]
- Neme Castillo, O.; Valderrama Santibañez, A.L.; Chiatchoua, C. Determinants of productive water consumption and effects on economic activity in Mexico. Econ. Soc. Territ. 2021, 21, 505–537. [Google Scholar] [CrossRef]
- Fondo Mexicano para la Conservación de la Naturaleza; Fundación Este País; Fondo para la Comunicación y Educación Ambiental. Libro Verde; Fondo Mexicano para la Conservación de la Naturaleza: Benito Juárez, Mexico, 2017. [Google Scholar]
- Larraz, B.; García-Rubio, N.; Gámez, M.; Sauvage, S.; Cakir, R.; Raimonet, M.; Pérez, J.M.S. Socio-Economic Indicators for Water Management in the South-West Europe Territory: Sectorial Water Productivity and Intensity in Employment. Water 2024, 16, 959. [Google Scholar] [CrossRef]
- Rahim, M.S.; Nguyen, K.A.; Stewart, R.A.; Giurco, D.; Blumenstein, M. Machine learning and data analytic techniques in digital water metering: A review. Water 2020, 12, 294. [Google Scholar] [CrossRef]
- Shannon, M.A.; Bohn, P.W.; Elimelech, M.; Georgiadis, J.G.; Mariñas, B.J.; Mayes, A.M. Science and technology for water purification in the coming decades. Nature 2008, 452, 301–310. [Google Scholar] [CrossRef]
- Katic, P.; Grafton, R.Q. Optimal groundwater extraction under uncertainty: Resilience versus economic payoffs. J. Hydrol. 2011, 406, 215–224. [Google Scholar] [CrossRef]
Project; Year; Country | Methodology | Results |
---|---|---|
ARIMA and TFARIMA analysis of the main water quality parameters in the initial components of a megacity’s drinking water supply system; 2024; Colombia [11]. | ARIMA and TFARIMA | The autoregressive term of the models is a valuable tool for examining the transfer of effects between components of a drinking water supply system. The moving average term is similarly useful for investigating the impact of external factors on water quality in each drinking water supply system component. |
Using ARIMA and ETS models for forecasting water level changes for sustainable environmental management; 2024; Kosovo [12]. | ARIMA and error trend and seasonality, or exponential smoothing (ETS). | The results demonstrate the applicability of the models utilized in this research, as evidenced by the root mean square error and the mean absolute error. |
Water flow modeling and forecast in a water branch of Mexico City through ARIMA and transfer function models for anomaly detection; 2023; Mexico [13]. | Autoregressive integrated moving average models and transfer function models generated via the Box–Jenkins approach to model the water flow in water distribution systems for anomaly detection. | The two methods were employed to identify the optimal model type for each variable within the analyzed water branch. The results demonstrated that the seasonal ARIMA models exhibited a lower mean absolute percentage error than the fitted transfer function models. |
Application of data prediction models in a real water supply network: comparison between ARIMA and artificial neural networks; 2024; Brazil [14]. | ARIMA and multilayer perceptron artificial neural networks. | The ARIMA model exhibited the greatest predictive efficacy for the data set under consideration, with a mean absolute percentage error of 8.54%. |
Evaluating water-related health risks in East and Central Asian Islamic Nations using predictive models (2020–2030); 2024; Tajikistan, Armenia, Azerbaijan, Central Asia, Kazakhstan, Kyrgyzstan, Mongolia, Turkmenistan, and Uzbekistan [15]. | ARIMA, exponential smoothing method, support vector machine, and artificial neural networks. | The results indicate that support vector machines are the most accurate method for forecasting deaths and disability-adjusted life years, outperforming autoregressive integrated moving average, exponential smoothing, and neural networks. |
A combined model for water quality prediction based on VMD-TCN-ARIMA Optimized by WSWOA; 2023; China [16]. | Variational mode decomposition–temporal convolutional networks–autoregressive integrated moving average (VMD-TCN-ARIMA) optimized by weighted swarm whale search algorithm (WSWOA). | For the dissolved oxygen water quality data, the proposed model reduced the root mean square error and the computational time by 41.05% and 26.06%, respectively, improving both the accuracy and the efficiency of the prediction. |
Machine learning models for forecasting water demand for the metropolitan region of Salvador, Bahia; 2023; Brazil [17]. | Hybrid SVR-ANN model. | The results demonstrated the feasibility of employing the proposed model in comparison to other traditional models, including multilayer perceptron, support vector regression, long short-term memory, and autoregressive integrated moving average. |
Comparing ARIMA and various deep learning models for long-term water quality index forecasting in Dez River, Iran; 2024; Iran [18]. | ARIMA and five deep learning models including Simple_RNN, LSTM, CNN, GRU, and MLP. | The findings suggest that the ARIMA model exhibits inferior performance compared to the deep learning models. The deep learning models exhibit comparable results, as evidenced by their similar statistical index values. |
Estimation of groundwater river availability in Leang Lonrong Cave using ARIMA model and econophysics valuation approach; 2024; Indonesia [19]. | ARIMA. | The ARIMA(0,1,1) model was identified as the most suitable for predicting water discharge, with a mean absolute percentage error (MAPE) of 33.7%. |
Forecasting water demand with the long short-term memory deep learning mode; 2024; China [20]. | Integrated ARIMA-LSTM deep learning model, combining ARIMA’s proficiency in linear trend and seasonal modeling with LSTM’s strength in capturing nonlinear time dependencies. | The ARIMA-LSTM model exhibits favorable outcomes, exceeding the performance of standalone models in terms of accuracy. In the validation phase, the model exhibited a high coefficient of determination (R2) of 0.98 and a notably low root mean square error (RMSE) of 2.94. |
A comprehensive survey of machine learning methodologies with emphasis in water resources management; 2023; Greece [21]. | Comparative mapping of ML methodologies to specific water management tasks. | While ML methodologies offer promising solutions in water management, they are not without challenges. These include issues related to data quality and quantity, interpretability and explainability, generalization, and integration with domain knowledge. Incomplete or inaccurate data can result in unreliable predictions. |
Comparison of machine learning models in forecasting reservoir water level; 2023; Malaysia [22]. | Twelve algorithms were chosen and employed: (1) linear regression, (2) passive aggressive, (3) decision tree, (4) random forest, (5) extra tree, (6) Adaboost, (7) GradientBoost, (8) MVR, (9) LSTM encoder–decoder model, (10) BI-LSTM, (11) ARIMA, and (12) VARMAX. | The VARMAX model demonstrates the highest R-squared value. This suggests that the data set is a time series with a seasonal component. In contrast, the ARIMA model is unable to produce satisfactory results when a seasonal component is included. This is corroborated by the mean absolute error (MAE) and root mean square error (RMSE) values of both models. |
Enhancing water management: a comparative analysis of time series prediction models for distributed water flow in supply networks; 2024; Portugal [23]. | Holt–Winters, ARIMA, LSTM, and Prophet. | Classical models such as Holt–Winters and ARIMA demonstrate superior performance for medium-term predictions, whereas modern models, particularly LSTM, exhibit remarkable proficiency in long-term forecasting by effectively capturing seasonal patterns. |
Well type | Mean | SD | Max | Sum |
---|---|---|---|---|
Raw water | 20,629 | 19,767 | 70,667 | 3,713,294 |
Urban | 55,720 | 48,865 | 179,717 | 63,520,284 |
Package | Version | Reference |
---|---|---|
broom | 1.0.6 | [28] |
dials | 1.3.0 | [29] |
dplyr | 1.1.4 | [30] |
ggmap | 4.0.0 | [31] |
ggplot2 | 3.5.1 | [32] |
infer | 1.0.7 | [33] |
lubridate | 1.9.3 | [34] |
modeldata | 1.4.0 | [35] |
modeltime | 1.3.0 | [36] |
parsnip | 1.2.1 | [37] |
purrr | 1.0.2 | [38] |
readr | 2.1.5 | [39] |
recipes | 1.1.0 | [40] |
reshape2 | 1.4.4 | [41] |
rsample | 1.2.1 | [42] |
scales | 1.3.0 | [43] |
stringr | 1.5.1 | [44] |
tibble | 3.2.1 | [45] |
tidymodels | 1.2.0 | [46] |
tidyr | 1.3.1 | [47] |
tidyverse | 2.0.0 | [48] |
timetk | 2.9.0 | [49] |
tinylabels | 0.2.4 | [50] |
trend | 1.1.6 | [51] |
tune | 1.2.1 | [52] |
workflows | 1.1.4 | [53] |
workflowsets | 1.1.0 | [54] |
yardstick | 1.3.1 | [55] |
Well | Model | AICc |
---|---|---|
1 | SARIMA(1,1,1)(0,0,1)[12] | 570.83 |
4 | ARIMA(0,0,0) with zero mean | 233.81 |
5 | SARIMA(2,1,1)(0,0,1)[12] | 4665.09 |
6 | ARIMA(0,1,3) with drift | 3630.78 |
7 | ARIMA(1,0,2) with non-zero mean | 4609.78 |
F1 | SARIMA(3,0,0)(1,0,0)[12] with non-zero mean | 3488.10 |
F2 | ARIMA(0,1,0) | 3422.91 |
F4 | ARIMA(1,1,0) | 1283.47 |
F5 | SARIMA(0,1,0)(1,0,0)[12] | 967.17 |
F6 | SARIMA(0,0,2)(0,0,1)[12] with non-zero mean | 1282.99 |
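For reference when reading the AICc column above, the corrected Akaike information criterion is commonly defined as follows. This is the general textbook form; the symbols $k$, $n$, and $\hat{L}$ are introduced here for illustration, and the exact parameter count used by the fitting software is not stated in the source.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n - k - 1}
```

Here $k$ is the number of estimated parameters, $n$ is the number of observations, and $\hat{L}$ is the maximized likelihood of the model. Lower values indicate a better trade-off between fit and complexity, but AICc values are only comparable among candidate models fitted to the same series with the same order of differencing, so the values in the table should not be compared across wells.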
Model | MAE | RMSE | RSq |
---|---|---|---|
SARIMA | 71,853.85 | 74,886.56 | 0.02 |
Prophet | 101,261.53 | 106,831.02 | 0.01 |
Prophet Boost | 89,280.00 | 95,367.02 | 0.00 |
Month | Lower Confidence Interval | Prediction | Upper Confidence Interval |
---|---|---|---|
January | 205,093.2 | 355,025.0 | 504,956.7 |
February | 169,504.1 | 319,435.8 | 469,367.6 |
March | 189,629.4 | 339,561.2 | 489,492.9 |
April | 192,240.9 | 342,172.7 | 492,104.4 |
May | 199,216.3 | 349,148.1 | 499,079.9 |
June | 199,554.8 | 349,486.6 | 499,418.3 |
July | 202,196.7 | 352,128.5 | 502,060.3 |
August | 204,191.0 | 354,122.8 | 504,054.5 |
September | 210,553.8 | 360,485.6 | 510,417.3 |
October | 218,087.3 | 368,019.1 | 517,950.8 |
November | 211,911.7 | 361,843.5 | 511,775.3 |
December | 206,874.8 | 356,806.6 | 506,738.4 |
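A quick arithmetic check on the forecast table above shows how the interval relates to the prediction. The sketch below is in Python for illustration only; it copies three rows from the table and computes the half-width of each interval and the difference between its upper and lower arms.

```python
# (month, lower, prediction, upper) rows copied from the table above.
rows = [
    ("January", 205_093.2, 355_025.0, 504_956.7),
    ("February", 169_504.1, 319_435.8, 469_367.6),
    ("September", 210_553.8, 360_485.6, 510_417.3),
]

# Half of the total interval width for each month.
half_widths = {month: (upper - lower) / 2 for month, lower, _, upper in rows}

# Difference between the upper arm and the lower arm of each interval.
asymmetry = {
    month: abs((upper - pred) - (pred - lower))
    for month, lower, pred, upper in rows
}

for month in half_widths:
    print(f"{month}: half-width = {half_widths[month]:,.2f}, "
          f"asymmetry = {asymmetry[month]:.2f}")
```

For these rows the half-width is about 149,931.75 in every month and the two arms differ only by rounding, so the interval is symmetric around the prediction and does not widen with the forecast horizon. This would be consistent with an interval built from a single estimate of residual spread obtained at calibration, although the source does not state how the interval was computed.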
Model | MAE | RMSE | RSq |
---|---|---|---|
SARIMA | 71,853.85 | 74,886.56 | 0.02 |
Prophet | 41,177.48 | 42,475.08 | 0.54 |
Prophet Boost | 26,559.21 | 29,474.93 | 0.46 |
Month | Lower Confidence Interval | Prediction | Upper Confidence Interval |
---|---|---|---|
All months | 52,179.3 | 64,399.0 | 76,618.7 |
Model | MAE | RMSE | RSq |
---|---|---|---|
SARIMA | 61,454.38 | 63,890.99 | 0.00 |
Prophet | 78,664.64 | 85,045.98 | 0.00 |
Prophet Boost | 77,530.39 | 84,446.16 | 0.02 |
Month | Lower Confidence Interval | Prediction | Upper Confidence Interval |
---|---|---|---|
January | 157,043.2 | 284,960.5 | 412,877.9 |
February | 129,089.8 | 257,007.1 | 384,924.5 |
March | 153,353.5 | 281,270.8 | 409,188.2 |
April | 156,632.2 | 284,549.5 | 412,466.8 |
May | 168,341.4 | 296,258.8 | 424,176.1 |
June | 173,717.3 | 301,634.6 | 429,551.9 |
July | 176,261.4 | 304,178.7 | 432,096.1 |
August | 176,752.3 | 304,669.6 | 432,586.9 |
September | 176,658.7 | 304,576.0 | 432,493.3 |
October | 186,670.0 | 314,587.4 | 442,504.7 |
November | 181,468.6 | 309,385.9 | 437,303.3 |
December | 179,746.4 | 307,663.8 | 435,581.1 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Legarreta-González, M.A.; Meza-Herrera, C.A.; Rodríguez-Martínez, R.; Loya-González, D.; Chávez-Tiznado, C.S.; Contreras-Villarreal, V.; Véliz-Deras, F.G. Selecting a Time-Series Model to Predict Drinking Water Extraction in a Semi-Arid Region in Chihuahua, Mexico. Sustainability 2024, 16, 9722. https://doi.org/10.3390/su16229722