Wavelet-Enhanced Machine Learning for Seawater Alkalinity Prediction in the Arabian Gulf Using Monitored Water-Quality Variables
Abstract
1. Introduction
2. Methodology
2.1. Data Acquisition and Quality Control
2.2. Normalization of Input Variables
2.3. Feature Importance Analysis
2.4. Wavelet-Based Feature Enrichment
2.5. Machine Learning Models
2.5.1. Random Forest Regression
2.5.2. Gradient Boosting Regression
2.5.3. Extreme Gradient Boosting
2.5.4. Support Vector Regression
2.5.5. K-Nearest Neighbors
2.6. Model Performance Evaluation
2.7. Sensitivity Analysis
3. Results
3.1. Exploratory Data Analysis of Water Quality Variables and Alkalinity
3.2. Statistical Characteristics and Interrelationships of Water Quality Variables
3.3. Predictive Performance of Machine Learning Models
3.3.1. Baseline Model Performance Using Original Predictors
3.3.2. Effect of Wavelet-Based Feature Enrichment on Model Performance
3.3.3. Comparative Evaluation of Baseline and Wavelet-Enhanced Models
3.4. Hyperparameter Configuration for Baseline and Wavelet-Enriched Models
3.5. Error Diagnostics and Model Uncertainty Analysis
3.5.1. Diagnostic Evaluation of Baseline Model Performance
3.5.2. Diagnostic Evaluation of Wavelet-Enriched Model Performance
3.6. Sensitivity Analysis Results
4. Discussion
4.1. Model Performance and Algorithmic Behavior
4.2. Effect of Wavelet-Based Feature Enrichment
4.3. Diagnostic Evaluation and Model Reliability
4.4. Practical Implications for Coastal Seawater Monitoring
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References

| Variable | Unit | Minimum | Maximum | Mean | Standard Deviation | Median |
|---|---|---|---|---|---|---|
| Temperature | °C | 12.20 | 38.00 | 27.24 | 6.29 | 28.10 |
| pH | - | 7.95 | 8.28 | 8.04 | 0.04 | 8.02 |
| Electrical Conductivity | µS/cm | 59,000 | 66,100 | 62,162.70 | 1008.43 | 62,300 |
| Turbidity | NTU | 0.20 | 9.33 | 1.11 | 0.73 | 0.79 |
| Chloride | ppm | 24,184 | 27,233 | 25,642.32 | 449.83 | 25,725 |
| Alkalinity | ppm as CaCO3 | 112 | 151 | 129.62 | 5.42 | 128 |
| Residual Chlorine | ppm | 0.00 | 2.24 | 0.29 | 0.08 | 0.28 |

| Configuration | Metric | Kruskal–Wallis (H) | p-Value | Significant at (α) |
|---|---|---|---|---|
| Baseline | R² | 19.311 | 0.000513 | Yes |
| Baseline | RMSE | 20.817 | 0.000344 | Yes |
| Baseline | MAE | 21.397 | 0.000206 | Yes |
| Wavelet-enhanced | R² | 21.807 | 0.000190 | Yes |
| Wavelet-enhanced | RMSE | 21.630 | 0.000237 | Yes |
| Wavelet-enhanced | MAE | 21.692 | 0.000204 | Yes |
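
The Kruskal–Wallis table above reports an H statistic and a p-value for each metric. As a rough illustration of how such values can be obtained, the sketch below computes H (with the usual tie correction) and the chi-square p-value using only the Python standard library. The function names `chi2_sf` and `kruskal_wallis` and the per-fold RMSE values are hypothetical and are not taken from the article; the grouping of scores by model is likewise an assumption about how the test was set up.

```python
import math
from itertools import chain


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def kruskal_wallis(groups):
    """Return (H, p) for a list of samples; assumes not all values are equal."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    rank_of, tie_term = {}, 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_sum_term - 3.0 * (n + 1)
    h /= 1.0 - tie_term / (n ** 3 - n)          # tie correction
    return h, chi2_sf(h, len(groups) - 1)


# Hypothetical per-fold RMSE values for five models (illustration only).
rmse_by_model = {
    "RF": [2.1, 2.3, 2.0, 2.2, 2.4],
    "GB": [2.5, 2.6, 2.4, 2.7, 2.5],
    "XGB": [2.2, 2.1, 2.3, 2.0, 2.2],
    "SVR": [3.4, 3.6, 3.3, 3.5, 3.7],
    "KNN": [3.0, 3.1, 2.9, 3.2, 3.0],
}
h_stat, p_value = kruskal_wallis(list(rmse_by_model.values()))
print(f"H = {h_stat:.3f}, p = {p_value:.6f}")
```

In practice, `scipy.stats.kruskal` performs the same computation and is the more convenient choice when SciPy is available.
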

| Model | Configuration | Key Hyperparameters |
|---|---|---|
| RF | Baseline & Wavelet | Number of trees = 500; Maximum depth = None; Minimum samples per split = 2; Feature selection = maximum features (1.0); Bootstrap sampling = True |
| GB | Baseline & Wavelet | Number of boosting stages = 100; Learning rate = 0.1; Maximum tree depth = 3; Loss function = squared error |
| XGB | Baseline & Wavelet | Number of trees = 800; Learning rate = 0.05; Maximum depth = 4; Subsample ratio = 0.9; Column subsample ratio = 0.9; L2 regularization (λ) = 1.0 |
| SVR | Baseline & Wavelet | Kernel coefficient (γ) = scale |
| KNN | Baseline & Wavelet | Number of neighbors (k) = 15; Distance weighting = inverse distance; Distance metric = Euclidean |
| Wavelet Decomposition | Wavelet only | Wavelet type = Daubechies (db4); Decomposition level = 3; Feature construction = approximation and detail coefficients concatenated with original predictors |
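
The hyperparameter table can be read as a set of estimator configurations. The sketch below shows one plausible way to express them, assuming scikit-learn and (optionally) XGBoost; the article text available here does not name the libraries used, so the mapping from table labels to parameter names is an assumption. In particular, the regularization value of 1.0 is read as XGBoost's `reg_lambda`, the value 15 as KNN's `n_neighbors`, and `scale` as SVR's `gamma`. The helper `build_models` and the `random_state` seed are hypothetical.

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR


def build_models(random_state=0):
    """Return a dict of regressors configured from the hyperparameter table."""
    models = {
        "RF": RandomForestRegressor(
            n_estimators=500,
            max_depth=None,
            min_samples_split=2,
            max_features=1.0,
            bootstrap=True,
            random_state=random_state,  # seed is an assumption
        ),
        "GB": GradientBoostingRegressor(
            n_estimators=100,
            learning_rate=0.1,
            max_depth=3,
            loss="squared_error",
            random_state=random_state,
        ),
        # Only gamma is recoverable from the table; other settings are defaults.
        "SVR": SVR(gamma="scale"),
        "KNN": KNeighborsRegressor(
            n_neighbors=15,
            weights="distance",   # inverse-distance weighting
            metric="euclidean",
        ),
    }
    try:
        # XGBoost is optional here; skipped if the package is not installed.
        from xgboost import XGBRegressor

        models["XGB"] = XGBRegressor(
            n_estimators=800,
            learning_rate=0.05,
            max_depth=4,
            subsample=0.9,
            colsample_bytree=0.9,
            reg_lambda=1.0,       # assumed mapping of the L2 term
            random_state=random_state,
        )
    except ImportError:
        pass
    return models


models = build_models()
print(sorted(models))
```

Because the table lists the same settings for the baseline and wavelet configurations, the same dictionary could be fitted once on the original predictors and once on the wavelet-enriched feature matrix.
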

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Alhathloul, S.H.; Algurainy, Y. Wavelet-Enhanced Machine Learning for Seawater Alkalinity Prediction in the Arabian Gulf Using Monitored Water-Quality Variables. Water 2026, 18, 1578. https://doi.org/10.3390/w18131578

