Abstract
To address the limitations of traditional groundwater quality assessment and prediction methods, this study integrates game theory and machine learning to investigate the drinking quality of groundwater in the southwestern Qinghai–Tibet Plateau. The results show that the groundwater in the study area is generally weakly alkaline (mean pH: 8.08) and dominated by freshwater (mean TDS: 302.58 mg/L), with hardness levels mostly ranging from soft to medium. Major cations follow the concentration order Ca2+ > Na+ > Mg2+ > K+, and major anions follow the order HCO3− > SO42− > Cl−. The hydrochemical type is mainly Ca-HCO3. A few samples exceed the limit values specified in the Groundwater Quality Standard. Multivariate statistical analysis, ion ratio analysis, and saturation index calculations identify water–rock interaction as the primary factor influencing groundwater chemistry; this interaction consists of carbonate dissolution and silicate weathering, accompanied by cation exchange. The water quality index (WQI), improved using game theory to integrate subjective weights (from the analytic hierarchy process) and objective weights (from the entropy-weighted method), shows that the overall groundwater quality in the study area is good: 95.97% of the samples are high-quality water (WQI ≤ 50), and more than 99% of the samples have a WQI < 150 and are suitable as drinking water sources; only 0.81% of the samples are of extremely poor quality, presumably related to local pollution. Among the prediction models, linear regression achieved the best performance (R2 = 0.99, RMSE ≈ 0.00) with strong stability, followed by support vector machines (test R2 = 0.98), while the extreme gradient boosting model showed overfitting. This study provides a scientific basis for groundwater management in river basins.
1. Introduction
Groundwater is one of the most important sources of drinking water worldwide, characterized by its relative stability and cleanliness [,]. Groundwater quality assessment is a core link in water resources management, environmental protection, and sustainable social development. It plays a crucial role in safeguarding human health and public health security, supporting the sustainable development of industrial and agricultural production, and maintaining the hydrological cycle and biodiversity of ecosystems [,]. In addition, groundwater quality prediction utilizes scientific methods to forecast the evolution of groundwater quality and the degree of pollution risk over a specified future period, providing forward-looking guidance for groundwater management [,].
The primary assessment methods for groundwater quality can be divided into the single-factor method and the comprehensive index method. The single-factor method focuses on comparing the concentration of a single pollutant in water with the assessment standards []. It determines the overall water quality based on the principle of “grading by the worst-performing factor”. Its advantages lie in simple logic, convenient calculation, and intuitive results, which enable the quick identification of key pollution factors in water. However, it overlooks the synergistic effects between pollutants, resulting in conservative assessment results that fail to accurately reflect the comprehensive pollution level of water [,]. In contrast, the comprehensive index method integrates information from multiple pollutant indicators by assigning weights, effectively mitigating this defect and rendering the results more scientific, thereby being widely adopted [,,]. Nevertheless, the results of the index method depend on the weights of indicators. Traditional index methods determine weights based on expert knowledge and experience, which are highly subjective [,,,,]. To address this, the entropy-weight water quality index performs objective weighting of indicators based on the theory of information entropy, avoiding artificial subjective biases and making the assessment results more consistent with reality [,,]. However, abnormal data values may significantly affect indicator weights in this method, and if there is a high correlation between indicators, duplicate weighting may occur. In response to this, the concept of game theory can be applied to coordinate subjective and objective indicator weights and achieve a balance in water quality assessment [,]. 
In addition, the advancement of technology has enabled the integration of traditional assessment methods with big data, artificial intelligence, and other technologies, resulting in the development of more efficient and accurate assessment systems [,,].
Groundwater quality prediction methods can be categorized into three types: empirical statistical methods, numerical simulation methods, and machine learning methods [,]. The empirical statistical methods establish regression models for prediction based on the statistical relationship between historical water quality monitoring data and influencing factors. They are characterized by being data-driven and easy to operate. However, they fail to reflect the physical migration and chemical transformation processes of pollutants in groundwater, resulting in low accuracy under complex hydrogeological conditions []. The numerical simulation methods, based on the principles of groundwater dynamics and hydrochemistry, construct mathematical models that reflect the “groundwater flow field-pollutant migration and transformation”. They simulate the future evolution of water quality through numerical calculations and are currently the most mainstream and accurate prediction methods [,,]. Nevertheless, they have high requirements for hydrogeological data, involve complex parameter calibration, and incur long modeling cycles and high costs. In contrast, machine learning methods extract nonlinear correlations from monitoring data using algorithms, eliminating the need to elucidate physical and chemical mechanisms [,,]. While ensuring a certain level of accuracy, they also offer the advantages of cost-effectiveness and efficiency.
The study area is situated on the Qinghai–Tibet Plateau, where groundwater serves as a vital source of drinking water for local residents. Its hydrochemical characteristics and quality status remain poorly understood, and predictive models for it are lacking. Moreover, groundwater samples are limited in number and unevenly distributed owing to objective constraints. Under these conditions, current assessment methods are likely to underestimate or overestimate the weight of a given indicator, leading to incorrect conclusions. Therefore, this study collected groundwater samples from the study area, aiming to: (1) investigate groundwater formation mechanisms, (2) evaluate drinking water quality using an improved index, and (3) predict water quality through machine learning. Its novelty lies in using the concept of game theory to balance indicator weights when evaluating water quality, which reduces the deviations caused by the small sample size and poor data quality. In addition, machine learning algorithms were used to establish a water quality prediction model for the study area despite the scarcity of objective data. The findings will contribute to the scientific management of local groundwater resources.
2. Materials and Methods
2.1. Study Area
The study area is located in southwestern Tibet, between 27°13′ N and 31°49′ N, and 80°01′ E and 90°20′ E (Figure 1). Characterized by a cool and cold climate, it falls under the plateau temperate semi-arid climate zone. The annual average temperature ranges from approximately 0 °C to 6.5 °C, with an annual average precipitation of about 421.9 mm, primarily occurring during the rainy season (May to September). The area is rich in water resources, including major rivers such as the Yarlung Zangbo River (the largest river in Tibet) and the Nyangchu River. Most of the lakes here are inland lakes, mainly saltwater lakes or salt lakes, whose water is rich in mineral components such as salt, boron, calcium, and sodium. The types of land resources are diverse, including cultivated land, grasslands, woodlands, wastelands, lakes, and swamps. Cultivated land is mainly concentrated in the valley areas along the banks of the Yarlung Zangbo River and the Nyangchu River. Local human activities are primarily associated with agriculture and animal husbandry, including crop cultivation and livestock grazing. These practices may influence groundwater quality through fertilizer application, irrigation return flow, and surface runoff. Industrial activities are relatively limited and mainly concentrated around the urban area, where small-scale mining, food processing, and construction material industries are present. Moreover, domestic wastewater discharge from residential settlements may also alter the hydrochemical characteristics of the groundwater.
Figure 1.
(a) Location of the Qinghai–Tibet Plateau within China, (b) Location of the study area within the Qinghai–Tibet Plateau; (c) The distribution of groundwater samples within the study area.
The study area is situated between the middle sections of the Himalayan Mountain Range and the Gangdise-Nyainqêntanglha Mountain Range. With relatively high terrain in the north and south, its topography is complex and diverse, consisting essentially of high mountains, wide valleys, and lake basins, with an average altitude of over 4000 m. The geological characteristics are complex and varied, and their formation is closely related to the uplift of the Qinghai–Tibet Plateau and the tectonic movement of the Himalayan Mountains. Overall, it exhibits obvious signs of intense tectonic activity and significant north–south zonation differences. The southern part is dominated by ancient metamorphic rock series and weakly metamorphosed sedimentary rock series, which mainly include gneiss, marble, shale, slate, and phyllite. The central part is mainly composed of a mixed distribution of intermediate-acid intrusive rocks and sedimentary rocks from the late Mesozoic to the early Cenozoic, including granite, sandstone, and conglomerate. Cenozoic continental sedimentary rocks, including conglomerate, sandstone, siltstone, and mudstone, dominate the northern part. Quaternary unconsolidated sediments are also distributed in the valley areas. In addition, the primary recharge sources of groundwater in the study area include precipitation, glacial meltwater, and river recharge, while the discharge pathways comprise springs, wells, and rivers, among others.
2.2. Groundwater Sampling and Experimental Testing
A total of 129 groundwater samples were systematically collected in May 2024 from wells with depths of 5–20 m to ensure representative coverage of the study area. The geographic coordinates of each sampling site were determined and recorded using a handheld Global Positioning System (GPS) device. The spatial distribution of all sampling locations is illustrated in Figure 1. Prior to sampling, groundwater was pumped for more than 30 min to eliminate the influence of stagnant water. Field measurements of physicochemical parameters, including pH and total dissolved solids (TDS), were performed using a portable multi-meter (Multi 3400i, WTW, Munich, Germany). The 500 mL polyethylene bottles used for sample collection were first thoroughly rinsed with distilled water and subsequently rinsed more than three times with groundwater from the sampling point to ensure the representativeness and reliability of the collected samples. During sampling, the bottles were filled completely with groundwater to prevent air entrapment and minimize potential interference. Moreover, to evaluate sampling and analytical precision, field duplicate (co-located) samples and blanks were also collected at 10% of the sites. The blanks included field/equipment rinse blanks to check for contamination during sampling and handling. The collected bottles were stored at 4 °C and transported within a week to the Sichuan Geological Survey Institute for further hydrochemical examination.
Cation concentrations (K+, Na+, Ca2+, Mg2+) were analyzed using inductively coupled plasma mass spectrometry (ICPMS7500ce, Agilent, Shanghai, China), whereas anion concentrations (Cl−, SO42−, NO3−, F−) were determined via ion chromatography (LC-10Advp, Shimadzu, Jinan, China). HCO3− levels and total hardness (TH) were quantified by titration. For ion chromatography/ICP-MS determinations, two independent runs per sample were performed. If the relative percent difference (RPD) between runs exceeded 10% for major ions/parameters or 15% for trace elements, the sample was re-analyzed and the mean of the acceptable runs was reported. Moreover, to verify the reliability of the groundwater measurements, an anion–cation charge balance calculation was performed for all water samples using Equation (1), and the charge balance error was maintained within ±5%.
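The charge balance check described above can be sketched in a few lines. This is a minimal illustration, assuming the conventional form of the charge balance error, (Σcations − Σanions)/(Σcations + Σanions) × 100 with all concentrations in meq/L; Equation (1) is not reproduced in this text, so the exact form used in the study is an assumption. The molar masses are rounded, and the function and dictionary names are hypothetical.

```python
# Rounded molar mass (g/mol) and charge magnitude for each ion (assumed values).
IONS = {
    "K": (39.10, 1), "Na": (22.99, 1), "Ca": (40.08, 2), "Mg": (24.31, 2),
    "Cl": (35.45, 1), "SO4": (96.06, 2), "HCO3": (61.02, 1), "NO3": (62.00, 1),
}
CATIONS = ("K", "Na", "Ca", "Mg")
ANIONS = ("Cl", "SO4", "HCO3", "NO3")


def to_meq(ion, mg_per_l):
    """Convert a concentration from mg/L to meq/L."""
    molar_mass, charge = IONS[ion]
    return mg_per_l / molar_mass * charge


def charge_balance_error(sample):
    """Return the charge balance error (%) for a dict of ion -> mg/L."""
    cations = sum(to_meq(ion, sample.get(ion, 0.0)) for ion in CATIONS)
    anions = sum(to_meq(ion, sample.get(ion, 0.0)) for ion in ANIONS)
    return (cations - anions) / (cations + anions) * 100.0


# Illustrative input only: the mean concentrations reported in Section 3.1,
# treated as if they were one sample (NO3 is omitted because its mean is not given).
example = {"Ca": 60.19, "Na": 17.08, "Mg": 15.68, "K": 1.52,
           "HCO3": 170.43, "SO4": 82.29, "Cl": 10.46}
cbe = charge_balance_error(example)
print(f"CBE = {cbe:.2f}%, acceptable = {abs(cbe) <= 5.0}")
```

A sample whose absolute error exceeds 5% would be flagged for re-analysis under the criterion stated above.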
2.3. Improved Drinking Water Quality Evaluation Approach Based on Game Theory
Game theory, an important modern mathematical optimization strategy, is utilized to calculate the proportion between the subjective weights (W1) derived from the analytic hierarchy process and the objective weights (W2) calculated by the entropy-weighted method, thus computing the comprehensive parameter weights and evaluating the groundwater quality accordingly [,,]. The calculation process is given in Equations (2)–(6).
where d1* and d2* represent the weight coefficients for the subjective weights (W1) and objective weights (W2), respectively; W is the comprehensive parameter weight. The qi, Ci, and Si denote the quantitative grading scale, the measured concentration, and the corresponding hydrochemical parameter limit value specified in the Standard for Groundwater Quality (Class III) (GB/T 14848-2017) [], respectively.
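The weight combination can be sketched as follows. Equations (2)–(6) are not reproduced in this text, so the sketch assumes the commonly used game-theoretic combination scheme: the coefficients d1 and d2 solve the linear system that minimizes the deviation between the combined weights and each individual weight vector, and are then normalized to d1* and d2*. The sub-index qi = Ci/Si × 100 and all function names are likewise assumptions for illustration; indicators such as pH usually need a dedicated sub-index.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def game_theory_weights(w1, w2):
    """Combine subjective (w1) and objective (w2) weights.

    Returns (d1_star, d2_star, combined_weights).
    """
    a11, a12, a22 = dot(w1, w1), dot(w1, w2), dot(w2, w2)
    # Solve [[a11, a12], [a12, a22]] @ [d1, d2] = [a11, a22] by Cramer's rule.
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # w1 and w2 are (nearly) proportional; fall back to equal coefficients.
        d1 = d2 = 0.5
    else:
        d1 = (a11 * a22 - a12 * a22) / det
        d2 = (a11 * a22 - a12 * a11) / det
    total = abs(d1) + abs(d2)
    d1_star, d2_star = abs(d1) / total, abs(d2) / total
    combined = [d1_star * x + d2_star * y for x, y in zip(w1, w2)]
    return d1_star, d2_star, combined


def wqi(concentrations, limits, weights):
    """Weighted sum of sub-indices q_i = C_i / S_i * 100 (assumed form)."""
    return sum(w * c / s * 100.0
               for w, c, s in zip(weights, concentrations, limits))


# Hypothetical weights for three indicators, for illustration only.
d1_star, d2_star, w = game_theory_weights([0.5, 0.3, 0.2], [0.2, 0.3, 0.5])
print(d1_star, d2_star, w)
```

Because the combined weight is a convex combination of the two inputs, each combined weight lies between its AHP and entropy counterparts.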
2.3.1. Analytic Hierarchy Process
The subjective hydrochemical parameter weights are determined via the analytic hierarchy process (AHP), a multi-index subjective weighting method [,]. In this study, the target layer is the drinking water quality assessment. The criterion layer is divided into low-hazard indicators and medium-hazard indicators based on the hazard level of the indicators. In the solution layer, the low-hazard indicators include pH, TH, TDS, and seven major ions. These indicators pose minor hazards to human health and only cause harm when ingested at extremely high concentrations. The medium-hazard indicators include NO3− and F−, which can cause significant harm when their concentrations reach a certain level. The construction of the judgment matrix is carried out through the scaling method, and the parameter weights in each matrix can be calculated as Equations (7) and (8). Additionally, each matrix must pass the consistency test as specified in Equation (9) (CR < 0.1).
where the geometric-mean term represents the m-th root of the product of the elements in each row, and m is the dimension of the matrix; wi is the weight of each parameter. CI stands for the consistency index; λmax and n are the maximum eigenvalue of the matrix and the number of parameters, respectively. In addition, CR is the consistency ratio, and RI can be obtained based on the value of n (Table 1).
Table 1.
The reference value of RI.
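The AHP weighting and consistency test can be sketched as below. This is a minimal illustration, assuming the geometric-mean (row-product root) method for the weights, an estimate of λmax from the ratios (Aw)i/wi, and the random index values commonly attributed to Saaty; these RI values are assumed and are not copied from Table 1. The function name and the example judgment matrix are hypothetical.

```python
import math

# Random index values commonly attributed to Saaty (assumed, not from Table 1).
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(matrix):
    """Return (weights, lambda_max, CR) for a square judgment matrix."""
    m = len(matrix)
    # m-th root of the product of each row, then normalize to sum to 1.
    roots = [math.prod(row) ** (1.0 / m) for row in matrix]
    total = sum(roots)
    weights = [r / total for r in roots]
    # Estimate the maximum eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(v / w for v, w in zip(aw, weights)) / m
    ci = (lambda_max - m) / (m - 1) if m > 1 else 0.0
    cr = ci / RI[m] if RI[m] > 0 else 0.0
    return weights, lambda_max, cr


# Hypothetical, perfectly consistent 3 x 3 judgment matrix.
judgment = [[1.0, 2.0, 4.0],
            [0.5, 1.0, 2.0],
            [0.25, 0.5, 1.0]]
weights, lambda_max, cr = ahp_weights(judgment)
print(weights, lambda_max, cr, cr < 0.1)
```

For a 2 × 2 matrix the consistency ratio is zero by construction, which is why small judgment matrices always pass the CR < 0.1 test.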
2.3.2. Entropy–Weighted Method
The objective parameter weights are calculated by the entropy-weighted method, which proceeds as follows [,]: (1) construct the parameter matrix X (Equation (10)); (2) standardize matrix X according to Equation (11); (3) calculate the information entropy (ej) and weight (wj) of each parameter (Equations (12)–(14)). Here, xij represents the measured value of the jth parameter of the ith sample, and m and n are the numbers of samples and parameters, respectively. The constant 0.00001 is added to prevent yij from being 0.
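The three steps above can be sketched as follows. Equations (10)–(14) are not reproduced in this text, so the sketch assumes min–max standardization with the 0.00001 offset, followed by the usual Shannon entropy scaled by 1/ln(m); the function name and the sample matrix are hypothetical.

```python
import math


def entropy_weights(x, eps=0.00001):
    """Entropy weights for a matrix x with m samples (rows) and n parameters."""
    m = len(x)
    weights_raw = []
    for column in zip(*x):
        lo, hi = min(column), max(column)
        span = (hi - lo) or 1.0  # guard against a constant column
        # Min-max standardization (assumed form); eps keeps every y_ij above 0.
        y = [(v - lo) / span + eps for v in column]
        total = sum(y)
        p = [v / total for v in y]
        entropy = -sum(pi * math.log(pi) for pi in p) / math.log(m)
        weights_raw.append(1.0 - entropy)
    norm = sum(weights_raw)  # zero only if every column is constant
    return [d / norm for d in weights_raw]


# Hypothetical data: column 0 is evenly spread, column 1 has one outlier.
data = [[1.0, 10.0],
        [2.0, 10.1],
        [3.0, 50.0]]
print(entropy_weights(data))
```

In this toy matrix the second column receives the larger weight because a single extreme value lowers its entropy, which illustrates the sensitivity of entropy weighting to abnormal values noted in the Introduction.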
2.4. Machine Learning Algorithms for Water Quality Prediction
This study employed linear regression (LR), support vector machines (SVM), and extreme gradient boosting (XGB) to evaluate model performance and identify the optimal approach for WQI-based water quality prediction in the study area (Text S1). LR offers high computational efficiency and strong interpretability, but its prediction accuracy is limited and it may not adequately capture complex nonlinear relationships among hydrochemical parameters []. SVM exhibits robust resistance to overfitting and is relatively insensitive to missing values. However, it is slower when training on large-scale datasets and less interpretable, and its predictive performance can be sensitive to kernel selection and parameter tuning []. XGB can effectively capture nonlinear relationships between water quality indicators and environmental factors with high accuracy, although its interpretability is lower than that of LR, and it may exhibit overfitting when dealing with limited or heterogeneous samples [].
To comprehensively evaluate model accuracy, five performance indicators, Coefficient of Determination (R2), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Percentage of Non-Improved Absolute Scores (PNIAS), and Willmott Index (WI), were employed to assess the results. The calculation methods for these indicators are presented in Equations (15)–(17). R2 indicates the degree of fit and the consistency between predicted and observed values, ranging from 0 to 1, with values closer to 1 representing better agreement. RMSE and MAE quantify prediction errors, where smaller values correspond to higher prediction accuracy.
The dataset was split into a training set and a test set, comprising 82% and 18% of the data, respectively []. The test set was used to assess the model’s generalization performance. The dataset’s variables include pH, TDS, TH, K+, Na+, Ca2+, Mg2+, Cl−, SO42−, HCO3−, NO3−, and F− as inputs, representing key characteristics of groundwater that influence water quality, and the calculated WQI as the prediction target [,]. Further details on the principles of the three models can be found in the Supplementary Materials.
In addition, to gain a deeper understanding of the model’s behavior and to identify the key water quality parameters, this study employed a one-at-a-time (OAT) sensitivity analysis to assess the influence of each input variable on the prediction results. Detailed information on the model framework and the sensitivity analysis method can be found in the Supplementary Materials (Text S2).
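A one-at-a-time analysis of this kind can be sketched as below. The exact procedure is given in Text S2, which is not reproduced here, so this sketch assumes a simple scheme: each input is increased by a fixed fraction (10% here) while the others stay at their baseline values, and the relative change in the predicted WQI is recorded. The function name, the perturbation size, and the toy predictor are hypothetical.

```python
def oat_sensitivity(predict, baseline, delta=0.10):
    """Relative change (%) in the prediction when each input rises by delta.

    predict  -- callable taking a dict of input name -> value
    baseline -- dict of input name -> baseline value
    """
    base_output = predict(baseline)
    changes = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)          # copy so other inputs stay fixed
        perturbed[name] = value * (1.0 + delta)
        changes[name] = (predict(perturbed) - base_output) / base_output * 100.0
    return changes


# Toy stand-in for a trained model (hypothetical coefficients).
def toy_model(sample):
    return 2.0 * sample["NO3"] + 1.0 * sample["TDS"]


print(oat_sensitivity(toy_model, {"NO3": 10.0, "TDS": 5.0}))
```

Inputs whose perturbation produces the largest relative change are the ones the model is most sensitive to under this scheme; interactions between inputs are not captured by a one-at-a-time design.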
In Equations (15)–(17), WQIpredicted and WQIcalculated represent the ML-based predicted values and the calculated values of the WQI, respectively, and n is the number of samples.
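The main accuracy metrics can be sketched as follows. Equations (15)–(17) are not reproduced in this text, so the forms below are assumptions: R2 is taken as 1 − SSE/SST (which can fall below 0 for very poor fits), RMSE and MAE take their usual definitions, and WI follows Willmott's index of agreement. The function name and the sample values are hypothetical.

```python
import math


def regression_metrics(calculated, predicted):
    """Return R2, RMSE, MAE, and Willmott's index for paired WQI values."""
    n = len(calculated)
    mean_calc = sum(calculated) / n
    pairs = list(zip(calculated, predicted))
    sse = sum((c - p) ** 2 for c, p in pairs)
    sst = sum((c - mean_calc) ** 2 for c in calculated)
    wi_denominator = sum(
        (abs(p - mean_calc) + abs(c - mean_calc)) ** 2 for c, p in pairs
    )
    return {
        "R2": 1.0 - sse / sst,
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(c - p) for c, p in pairs) / n,
        "WI": 1.0 - sse / wi_denominator,
    }


# Hypothetical calculated and predicted WQI values.
print(regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0]))
```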
3. Results
3.1. Hydrochemical Characteristics of Groundwater
The statistical characteristics of hydrochemical parameters are listed in Table 2. The pH ranges from 7.00 to 9.10, with an average of 8.08, suggesting that the groundwater is slightly alkaline. A total of 12.90% of the samples exceeded the limit range of 6.5–8.5, possibly due to mineral dissolution, human activities, and other factors. The coefficient of variation (CV) is 0.05, indicating a relatively stable pH level in the groundwater of the study area. The total dissolved solids (TDS) vary from 69.00 mg/L to 1660.00 mg/L, with 16.13% of the samples exceeding the drinking water standard. Except for a few samples, all samples are fresh water (Figure 2a). Furthermore, the total hardness (TH) ranges from 15.00 mg/L to 1370.00 mg/L, with a mean value of 218.91 mg/L. A total of 37.10% of the samples are classified as soft water and moderately hard water, respectively. The CV (0.82) indicates high variability, reflecting significant differences in the contribution of rock weathering to calcium and magnesium in different regions, as well as the uneven distribution of total hardness.
Table 2.
Statistical characteristics of hydrochemical parameters of groundwater samples (mg/L).
Figure 2.
(a) Scatter diagram of TDS vs. TH; (b) Piper trilinear diagram of the groundwater.
Additionally, the ion concentrations (mg/L) range as follows: 0.15–20.30 (K+), 2.24–172.90 (Na+), 5.63–330.75 (Ca2+), 0.20–144.13 (Mg2+), 1.01–443.70 (Cl−), 1.45–951.66 (SO42−), 39.90–519.75 (HCO3−), 0.02–73.50 (NO3−), and 0.03–0.79 (F−). The average concentrations of major cations in groundwater are ranked as follows: Ca2+ (60.19 mg/L) > Na+ (17.08 mg/L) > Mg2+ (15.68 mg/L) > K+ (1.52 mg/L), while the major anions are ranked HCO3− (170.43 mg/L) > SO42− (82.29 mg/L) > Cl− (10.46 mg/L). The concentrations of both major cations and anions showed high variability (CV > 0.36), and only a few samples exceeded the drinking water limit. Therefore, the groundwater at these sites may have been affected by either locally intense water–rock interaction or human pollution. The sample points mainly fall within the Ca and HCO3− fields of the Piper diagram, supporting Ca-HCO3 as the dominant hydrochemical type (Figure 2b). The NO3− and F− in groundwater are usually related to human activities. Their CV values are higher than 0.5, and 4.03% of the samples showed NO3− levels that exceeded the drinking water standard, implying potential pollution.
3.2. Improved Assessment of Groundwater Drinking Quality
Subjective and objective parameter weights were calculated using the analytic hierarchy process (AHP) and the entropy-weighted method (EW), respectively. For AHP, judgment matrices were constructed separately for the criterion layer and the solution layer (Tables S1–S3), and the Consistency Ratio (CR) values of all matrices were less than 0.1 (0, 0.08, and 0), meeting the consistency requirement. Subsequently, improved parameter weights were obtained based on game theory (GM) and the hydrochemical parameter weights derived from the two aforementioned methods (Figure 3a). The weight of GM is generally located between that of AHP and EW, effectively avoiding the bias of a single method towards the water chemical indicators.
Figure 3.
(a) Weight of hydrochemical indicators by different methods; (b) Results of drinking water quality assessment.
The groundwater drinking quality index (WQI) was calculated using the improved parameter weights. As shown in Figure 3b, the WQI varied from 7.52 to 206.74, with an average value of 20.71. More than 95% of the samples have excellent water quality (WQI ≤ 50), and the WQI of over 99% of the samples is below 150, suggesting that the groundwater within the study area is generally suitable for use as a drinking water source. Only one sample reaches an extremely poor rating, which may be related to local pollution. In general, the groundwater quality in the study area is good, but human-induced pollution occurs in some areas, where the groundwater should not be used as drinking water.
4. Discussion
4.1. Hydrochemical Process and Controlling Factors
4.1.1. Multivariate Statistical Analysis
Correlation and principal component analysis are widely employed to explore the potential connections and common sources among different ion components, thereby analyzing the hydrochemical processes of groundwater [,]. The Pearson coefficient is used to measure the correlation between variables, with its range spanning from −1 to 1. The two ends represent a completely negative correlation and a completely positive correlation, respectively. Additionally, the sample data passed the Kaiser-Meyer-Olkin test (KMO = 0.51 > 0.50) and Bartlett’s test of sphericity (p < 0.05), suggesting that factor analysis can be conducted.
As shown in Figure 4, there is a significant positive correlation between Cl− and Na+ (0.70), which is most likely attributed to the dissolution of evaporite minerals (e.g., halite). In contrast, the correlation between K+ and Na+ may result from the weathering of silicate minerals, which releases both ions: potassium feldspar releases K+ upon weathering, while albite releases Na+. When these minerals occur in the same geological environment under similar weathering conditions, the two ions are correlated in groundwater. The correlation between K+ and Na+ may also indicate cation exchange reactions between groundwater and surrounding rocks. Furthermore, the correlations among Cl−, Na+, and K+ might imply the influence of anthropogenic factors; for example, domestic sewage, industrial wastewater, and landfill leachate may also contain these ions. Positive correlations are also observed between HCO3− and both Ca2+ and Mg2+ (0.60–0.73), which are characteristic of the weathering and dissolution of carbonate rocks (e.g., calcite (CaCO3) and dolomite (CaMg(CO3)2)). The correlation between Ca2+ and SO42− usually indicates the dissolution of gypsum (CaSO4·2H2O) or anhydrite (CaSO4). In addition, three principal components (PC1–PC3) with eigenvalues greater than 1 were extracted, explaining 43.85%, 21.16%, and 9.87% of the variance, respectively. PC1 is related to Ca2+, Mg2+, HCO3−, and SO42−, which may represent carbonate weathering and dissolution; PC2, associated with Na+, K+, and Cl−, is interpreted as silicate dissolution; PC3 is characterized by NO3− and F−, which might be related to human activities.
Figure 4.
PCA-CA diagram of hydrochemical indicators in the groundwater.
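The pairwise correlations discussed above rest on the Pearson coefficient, which can be sketched as below. This is a minimal illustration with hypothetical function names and made-up concentration values; it does not reproduce the study's data or its PCA step.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict of name -> list of values."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


# Made-up concentrations (mg/L) for three ions at four sites.
ions = {
    "Na": [5.0, 12.0, 20.0, 31.0],
    "Cl": [3.0, 9.0, 14.0, 25.0],
    "HCO3": [180.0, 150.0, 170.0, 160.0],
}
for pair, r in correlation_matrix(ions).items():
    print(pair, round(r, 2))
```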
4.1.2. Ion Source Analysis
The analysis of ion ratios is a core tool for identifying the formation process of the main ion components in groundwater, as the ion compositions formed by different interaction processes exhibit inherent differences [,]. It can reveal the interaction between groundwater and rocks, the degree of evaporation and concentration, and whether it is affected by human activities by calculating the concentration ratios of the main ions in groundwater.
The Gibbs diagram can effectively identify the main controlling factors of groundwater chemical components. As shown in Figure 5a,b, most of the sample points fall within the “Rock dominance” area, indicating that the groundwater chemical evolution in the study area is primarily controlled by the weathering of rocks [,]. Meanwhile, a few points have shifted to the right, suggesting that they are also influenced by both the evaporation concentration process and precipitation discharge. In addition, the Gaillardet end-member diagram serves to distinguish the dissolution effects of different rocks []. The sample points are mainly distributed in the “Carbonate rocks” and “Silicate rocks” areas, while a few points are in the “Evaporite rocks” area (Figure 5c,d). This indicates that the groundwater chemistry in the study area is primarily influenced by the weathering of carbonate and silicate rocks, with a negligible contribution from the weathering of evaporite rocks. This view is also supported by the ratio of SO42− to Ca2+. Only a few sample points are close to the line y = x (Figure 5e), suggesting that the SO42− and Ca2+ in the groundwater are only slightly affected by the dissolution of the evaporite mineral gypsum [,]. At the same time, most of the sample points deviate from this line, indicating that other sources, such as the oxidation of sulfide minerals and the dissolution of other calcium-containing minerals, also affect the concentrations of SO42− and Ca2+. In addition, the ratios of (HCO3− + SO42−) to (Ca2+ + Mg2+) for almost all the sample points are close to the 1:1 line (Figure 5f), indicating that the hydrochemical composition of groundwater is affected by the dissolution of both silicate and carbonate. Furthermore, the relationship between HCO3− and Ca2+ can be utilized to identify the dissolution of carbonate minerals [,].
As depicted in Figure 5g, a portion of the samples falls between the lines y = x and y = 2x, suggesting that they are controlled by the dissolution of both dolomite and calcite.
Figure 5.
Ion ratio diagrams of hydrochemical components. (a,b) Gibbs diagrams; (c) (Mg2+/Na+) vs. (Ca2+/Na+); (d) (HCO3−/Na+) vs. (Ca2+/Na+); (e) SO42− vs. Ca2+; (f) (Ca2+ + Mg2+) vs. (HCO3− + SO42−); (g) HCO3− vs. Ca2+; (h) CAI-II vs. CAI-I; (i) (NO3−/Na+) vs. (Cl−/Na+).
In addition, some samples fall below y = x, indicating the influence of other sources on Ca2+ and HCO3−. The chloro-alkaline index (CAI) is employed to indicate the cation exchange and adsorption between groundwater and the surrounding rocks (Equation (18)). Negative CAI-I and CAI-II values indicate the occurrence of the cation exchange process, while positive values support the reverse reaction [,]. As shown in Figure 5h, most of the samples exhibit negative CAI-I and CAI-II, suggesting exchange between Ca2+ and Mg2+ in the groundwater and Na+ in the surrounding rocks. Additionally, the sample points in the NO3−/Na+ vs. Cl−/Na+ diagram mainly fall between the “carbonates silicates” and “agricultural activities” end-members (Figure 5i), implying that the NO3− within the groundwater primarily comes from agricultural practice and the dissolution of carbonates and silicates.
(Ca2+/Mg2+)groundwater + 2(Na+)rocks ⇋ 2(Na+)groundwater + (Ca2+/Mg2+)rocks
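The chloro-alkaline indices can be sketched as below. Equation (18) is not reproduced in this text, so the sketch assumes the widely used Schoeller definitions, CAI-I = [Cl− − (Na+ + K+)]/Cl− and CAI-II = [Cl− − (Na+ + K+)]/(HCO3− + SO42− + CO32− + NO3−), with all concentrations in meq/L. The function name is hypothetical, and the example values are approximate meq/L conversions of the mean concentrations reported in Section 3.1, used only as a plausible input.

```python
def chloro_alkaline_indices(cl, na, k, hco3, so4, co3=0.0, no3=0.0):
    """Return (CAI-I, CAI-II) from concentrations in meq/L (assumed form)."""
    excess_chloride = cl - (na + k)
    cai_1 = excess_chloride / cl
    cai_2 = excess_chloride / (hco3 + so4 + co3 + no3)
    return cai_1, cai_2


# Approximate meq/L values derived from the reported mean concentrations.
cai_1, cai_2 = chloro_alkaline_indices(cl=0.295, na=0.743, k=0.039,
                                       hco3=2.793, so4=1.713)
print(cai_1 < 0 and cai_2 < 0)  # both negative: forward cation exchange
```

With these inputs both indices are negative, which matches the direction of exchange written in the reaction above.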
The saturation index (SI) serves as a key indicator for identifying the mineral dissolution equilibrium in groundwater and can be calculated using the PHREEQC 3.0 software [,]. As shown in Figure 6, the saturation indices of most of the carbonate minerals (calcite, dolomite, and aragonite) in the samples are close to or slightly greater than 0, indicating that the groundwater is close to saturation or weakly supersaturated with respect to these minerals. Carbonate minerals may therefore be at or near dissolution–precipitation equilibrium. In addition, the SI values of sulfate minerals such as anhydrite and gypsum, as well as evaporite minerals such as sylvite and halite, are mostly negative with large absolute values, suggesting that the solution is not saturated with respect to these minerals. These minerals tend to dissolve in water and are important sources of the relevant ions in groundwater, such as SO42−, Na+, and K+. The dissolved CO2(g) has not reached saturation either, implying that CO2 from the atmosphere or the soil may dissolve into the groundwater. The dissolution of CO2 will affect the acidity of the groundwater, thereby influencing the dissolution–precipitation equilibrium of carbonate and other minerals.
Figure 6.
Saturation index of different minerals within the groundwater.
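For reference, the saturation index is conventionally defined from the ion activity product (IAP) and the solubility product (Ksp) of a mineral. The expression below is the standard textbook definition, stated here as an assumption because the text does not give the formula; calcite is used as the example, with a denoting ion activity.

```latex
\mathrm{SI} = \log_{10}\!\left(\frac{\mathrm{IAP}}{K_{sp}}\right),
\qquad
\mathrm{IAP}_{\text{calcite}} = a_{\mathrm{Ca^{2+}}}\, a_{\mathrm{CO_3^{2-}}}
```

Under this definition, SI < 0 indicates undersaturation (the mineral tends to dissolve), SI = 0 indicates equilibrium, and SI > 0 indicates supersaturation (the mineral tends to precipitate).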
4.2. Water Quality Prediction Based on Machine Learning Approaches
To support effective water quality management in the study area, predictive models based on the WQI were developed using three machine learning algorithms: LR, SVM, and XGB. To ensure model robustness and mitigate overfitting, SVM and XGB underwent iterative parameter optimization, while LR was applied with its default settings. The final selected parameters for all models are presented in Table 3.
Table 3.
The optimal parameters used in the Machine-learning model.
Figure 7 presents the results of the training and test sets for the three water quality prediction models, and the corresponding evaluation metrics for each model are summarized in Table 4. The LR model achieved outstanding predictive accuracy, with R2 and WI values of 0.9999 for both the training and test datasets. Furthermore, the error metrics, including RMSE, MAE, and PBIAS (%), were all close to zero, indicating negligible deviation between the predicted and observed values. Because the LR model fitted both the training and test data so closely, a 5-fold cross-validation procedure was applied to examine the model’s stability []. The results demonstrated that the evaluation metrics of LR on the training set were almost identical to those from 5-fold cross-validation, with ΔR2, ΔRMSE, and ΔMAE values approaching zero. This consistency highlights the robustness and reliability of the model, supporting its applicability for predictive analysis.
Figure 7.
The relationship between calculated WQIs and predicted WQIs. (a) LR for test set; (b) SVM for test set; (c) XGB for test set; (d) LR for training set; (e) SVM for training set; (f) XGB for training set.
Table 4.
Model evaluation values of the training set and test set.
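For readers who want to reproduce the kind of metrics reported in Table 4, the sketch below computes R2, RMSE, MAE, PBIAS, and the Willmott index of agreement (WI) from paired observed and predicted values. Several points are assumptions rather than details confirmed by the text: R2 is taken as the coefficient of determination, PBIAS uses the sign convention 100 × Σ(observed − predicted) / Σ(observed), and the sample values are invented for illustration.

```python
import math

def r2(obs, pred):
    """Coefficient of determination (assumed definition of R2)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def pbias(obs, pred):
    """Percent bias; positive means underestimation under this sign convention."""
    return 100.0 * sum(o - p for o, p in zip(obs, pred)) / sum(obs)

def willmott_index(obs, pred):
    """Willmott index of agreement, bounded between 0 and 1."""
    mean_obs = sum(obs) / len(obs)
    numerator = sum((o - p) ** 2 for o, p in zip(obs, pred))
    denominator = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred)
    )
    return 1.0 - numerator / denominator

# Invented WQI values, used only to exercise the functions.
observed = [20.0, 35.0, 50.0, 80.0]
predicted = [22.0, 33.0, 51.0, 78.0]

for label, func in [("R2", r2), ("RMSE", rmse), ("MAE", mae),
                    ("PBIAS", pbias), ("WI", willmott_index)]:
    print(f"{label} = {func(observed, predicted):.4f}")
```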
In addition, the SVM model exhibited strong predictive performance, with R2 values of 0.98 and 0.97 for the test and training sets, respectively, corresponding RMSE values of 1.55 and 4.01, and MAE values of 0.56 and 0.63. Five-fold cross-validation further confirmed the absence of overfitting, with performance differences between the training and validation sets (ΔR2, ΔRMSE, and ΔMAE) of 0.02, 0.59, and 0.94, respectively.
In contrast, the XGB-based model showed larger discrepancies between the training and validation results (ΔRMSE = 7.22 and ΔMAE = 2.80), indicating a moderate tendency toward overfitting due to its higher model complexity. However, the ΔR2 value (0.20) remained within an acceptable range, and the model still demonstrated reasonable generalization performance after hyperparameter tuning and cross-validation. These differences were primarily attributed to the nonlinear and ensemble characteristics of the XGBoost algorithm, which can produce higher variance when dealing with limited or heterogeneous samples.
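The Δ values discussed above compare a training-set metric with its 5-fold cross-validated counterpart. The sketch below illustrates that comparison for RMSE with a one-variable least-squares model written in plain Python. It is an illustration under stated assumptions, not the study's pipeline: the data are synthetic, the Δ is taken as an absolute difference, and the helper names (`fit_line`, `k_fold_indices`, `cv_gap`) are hypothetical.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cv_gap(xs, ys, k=5):
    """Return (training RMSE, mean k-fold validation RMSE, absolute gap)."""
    a, b = fit_line(xs, ys)
    train_rmse = rmse(ys, [a + b * x for x in xs])
    fold_scores = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        a_k, b_k = fit_line(train_x, train_y)
        fold_scores.append(
            rmse([ys[i] for i in fold], [a_k + b_k * xs[i] for i in fold])
        )
    cv_rmse = sum(fold_scores) / k
    return train_rmse, cv_rmse, abs(cv_rmse - train_rmse)

# Synthetic data: a linear signal plus seeded Gaussian noise (not the study data).
rng = random.Random(42)
xs = [float(i) for i in range(40)]
ys = [2.0 + 1.5 * x + rng.gauss(0.0, 1.0) for x in xs]

train_rmse, cv_rmse, delta_rmse = cv_gap(xs, ys)
print(f"train RMSE = {train_rmse:.3f}, CV RMSE = {cv_rmse:.3f}, delta = {delta_rmse:.3f}")
```

A small gap suggests that the model behaves similarly on data it has not seen, whereas a large gap, as reported for XGB, is the usual symptom of overfitting.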
The Taylor diagram provides a comprehensive assessment of model performance by simultaneously representing three key statistical parameters: correlation, variability, and error [,]. As illustrated in Figure 8, the radial distance from the origin denotes the standard deviation, the azimuthal angle reflects the correlation coefficient between model predictions and the reference data, and the concentric contour lines indicate the centered RMSD. Model points located closer to the reference point on the x-axis therefore show higher overall agreement with the observations, reflecting both stronger correlation and smaller deviations. The results clearly indicate that LR outperformed the other two models.
Figure 8.
Taylor diagrams of different machine learning methods for: (a) Test set; (b) Training set.
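The three quantities plotted in a Taylor diagram are linked by the law of cosines: centered RMSD² = σ_model² + σ_ref² − 2·σ_model·σ_ref·R. The sketch below computes them for one model and checks that identity numerically. The function name and the sample values are illustrative assumptions, and population (divide-by-n) standard deviations are assumed so that the identity holds exactly.

```python
import math

def taylor_statistics(reference, model):
    """Return the standard deviations, correlation, and centered RMSD."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_m = sum(model) / n
    sd_r = math.sqrt(sum((r - mean_r) ** 2 for r in reference) / n)
    sd_m = math.sqrt(sum((m - mean_m) ** 2 for m in model) / n)
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(reference, model)) / n
    corr = cov / (sd_r * sd_m)
    crmsd = math.sqrt(
        sum(((m - mean_m) - (r - mean_r)) ** 2 for r, m in zip(reference, model)) / n
    )
    return {"sd_ref": sd_r, "sd_model": sd_m, "corr": corr, "crmsd": crmsd}

# Invented WQI values used only for illustration.
reference = [20.0, 35.0, 50.0, 80.0, 120.0]
model = [22.0, 33.0, 51.0, 78.0, 125.0]

stats = taylor_statistics(reference, model)

# Law-of-cosines relation that gives the Taylor diagram its geometry.
law_of_cosines = (
    stats["sd_model"] ** 2
    + stats["sd_ref"] ** 2
    - 2.0 * stats["sd_model"] * stats["sd_ref"] * stats["corr"]
)

# On the diagram, the azimuthal angle of the model point is arccos(correlation).
angle_deg = math.degrees(math.acos(min(1.0, stats["corr"])))
print(stats)
print(f"angle from the x-axis = {angle_deg:.2f} degrees")
```

Because the centered RMSD removes the mean of each series, a Taylor diagram does not show overall bias, which is one reason it is usually read alongside metrics such as PBIAS.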
To further evaluate the influence of each input parameter on the LR model, a sensitivity analysis was conducted. The results (Figure 9) indicated that NO3− was the most important input variable, with its variation exerting the greatest impact on the WQI prediction results. SO42− and TH were the next most influential parameters, ranking second and third, respectively.
Figure 9.
Sensitivity analysis of the LR prediction model.
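A one-at-a-time (OAT) sensitivity analysis perturbs a single input while holding the others fixed and records the change in the model output. The sketch below applies that idea to a linear predictor. The coefficients, the baseline sample, and the 10% perturbation are hypothetical values chosen only to show the mechanics; they do not reproduce the fitted LR model or the results in Figure 9.

```python
def predict_wqi(sample, coefficients, intercept=0.0):
    """Linear prediction: intercept plus the coefficient-weighted sum of inputs."""
    return intercept + sum(coefficients[name] * value for name, value in sample.items())

def oat_sensitivity(sample, coefficients, change=0.10):
    """Percent change in the prediction when each input is raised by `change`."""
    base = predict_wqi(sample, coefficients)
    result = {}
    for name in sample:
        perturbed = dict(sample)  # copy so that only one input varies at a time
        perturbed[name] = sample[name] * (1.0 + change)
        result[name] = (predict_wqi(perturbed, coefficients) - base) / base * 100.0
    return result

# Hypothetical baseline concentrations (mg/L) and regression coefficients.
sample = {"NO3": 10.0, "SO4": 60.0, "TH": 150.0}
coefficients = {"NO3": 1.0, "SO4": 0.08, "TH": 0.02}

sensitivity = oat_sensitivity(sample, coefficients)
ranking = sorted(sensitivity, key=sensitivity.get, reverse=True)

for name in ranking:
    print(f"{name}: {sensitivity[name]:+.2f}% change in predicted WQI")
```

For a linear model the OAT response of each input is proportional to its coefficient times its baseline value, so the ranking depends on both the fitted weights and the typical concentrations.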
5. Conclusions
This study comprehensively employed multiple approaches to investigate the hydrochemical characteristics and formation mechanisms of groundwater in the study area, and separately evaluated and predicted the drinking water quality based on game theory and machine learning methods. The main conclusions are as follows:
- (1) The groundwater in the study area is weakly alkaline fresh water, with hardness ranging from soft to medium. The dominant anion and cation are HCO3− and Ca2+, respectively, and the hydrochemical type is mainly Ca-HCO3.
- (2) Water-rock interaction is the dominant factor influencing the hydrochemical characteristics of groundwater. The hydrochemistry is jointly shaped by the weathering and dissolution of carbonate and silicate rocks, accompanied by cation exchange and adsorption. In addition, groundwater is subject to disturbances from certain human activities, such as agricultural practices and landfill disposal.
- (3) More than 95% of the groundwater samples are of excellent quality, indicating that groundwater in the study area is generally good and suitable for drinking. The few samples of extremely poor quality may reflect local pollution.
- (4) Among the three machine learning models (LR, SVM, XGB) constructed for groundwater quality prediction, the LR model performs best, with the highest prediction accuracy and strongest stability, followed by the SVM model. The XGB model, however, shows obvious overfitting and lower generalization ability. The LR model can provide reliable technical support for the dynamic monitoring and early warning of groundwater quality in the study area. The sensitivity analysis indicates that NO3− is the most influential variable affecting the performance of the LR model.
Since the current sampling is based on pre-existing monitoring wells, the spatial distribution of sampling points is inherently uneven, with more samples collected in Rikaze and Benbu than in Bailang. In addition, this study does not sufficiently account for temporal changes in groundwater quality, which can vary seasonally. To address these issues, future studies will expand the sampling scope to include domestic water sources used by local residents, thereby improving the representativeness and overall quality of the dataset. Temporal changes in groundwater quality can be predicted by combining machine learning with other methods, or time-series monitoring data can be collected continuously to analyze seasonal variation more accurately.
In addition, the three machine learning models (LR, SVM, and XGB) exhibited strong predictive performance and stability through 5-fold cross-validation; however, some limitations should still be acknowledged. Due to the lack of independent datasets, external validation could not be conducted in this study. Instead, 5-fold cross-validation was adopted to ensure model robustness and mitigate overfitting, which effectively enhanced the reliability of the results. Furthermore, the current analysis was based on groundwater samples collected primarily during a single hydrological season (dry period). As a result, temporal variations in groundwater quality, which may occur seasonally due to changes in recharge, temperature, and anthropogenic activities, were not fully captured. Future research should incorporate multi-seasonal or long-term monitoring data to better characterize the temporal dynamics of groundwater chemistry and their influence on model performance. The models demonstrated excellent predictive performance within the study area, but they cannot be directly applied to other regions without proper recalibration and adaptation. Differences in hydrogeological, climatic, and anthropogenic conditions may significantly affect groundwater quality patterns and model applicability. Therefore, future research should aim to integrate broader-scale and multi-source datasets, improve model transferability across diverse hydrogeological settings, and develop hybrid frameworks that combine data-driven and process-based approaches to enhance the global applicability and reliability of groundwater quality prediction.
Supplementary Materials
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/toxics13110985/s1, Text S1: Predicted models of water quality [,,,]; Text S2: One-at-a-Time (OAT)-Based Sensitivity Analysis; Table S1: The judgment matrix criterion layer; Table S2: The judgment matrix solution layer 1; Table S3: The judgment matrix solution layer 2.
Author Contributions
Conceptualization, W.W. and Y.Z.; methodology, X.W.; software, W.L.; validation, D.W. and Q.H.; formal analysis, D.W.; investigation, H.W.; resources, Y.W. (Ying Wang); data curation, Y.W. (Yangshuang Wang); writing—original draft preparation, X.H.; writing—review and editing, W.W.; visualization, B.Z.; supervision, Y.Z.; project administration, Q.H.; funding acquisition, Y.W. (Yangshuang Wang). All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by Sichuan Province Science and Technology program (2025YFHZ0269, 2025ZNSFSC0307), Sichuan Transportation Science and Technology program (2023-B-15), Yibin City Science and Technology program (YBSCXY2023020006, YBSCXY2023020007, 2024MZ001), and Fundamental Research Funds for the Central Universities (A0920502052501-23, 2682024CX068). We acknowledge the support from the Innovative Practice Bases of Geological Engineering and Surveying Engineering of Southwest Jiaotong University (YJG-2022-JD04).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The original contributions presented in this study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Appukuttan, A.; Aju, C.D.; Reghunath, R.; Srinivas, R.; Krishnan, K.A.; Arya, S. Exploring hydrochemical drivers of drinking water quality in a tropical river basin using self-organizing maps and explainable AI. Water Res. 2025, 284, 123884.
- Zhang, Y.; He, Z.; Tian, H.; Huang, X.; Zhang, Z.; Liu, Y.; Xiao, Y.; Li, R. Hydrochemistry appraisal, quality assessment and health risk evaluation of shallow groundwater in the Mianyang area of Sichuan Basin, southwestern China. Environ. Earth Sci. 2021, 80, 576.
- Zhang, Y.; Dai, Y.; Wang, Y.; Huang, X.; Xiao, Y.; Pei, Q. Hydrochemistry, quality and potential health risk appraisal of nitrate enriched groundwater in the Nanchong area, southwestern China. Sci. Total Environ. 2021, 784, 147186.
- Zhong, C.; Wang, H.; Yang, Q. Hydrochemical interpretation of groundwater in Yinchuan basin using self-organizing maps and hierarchical clustering. Chemosphere 2022, 309, 136787.
- El Yousfi, Y.; Himi, M.; El Ouarghi, H.; Aqnouy, M.; Benyoussef, S.; Gueddari, H.; Ait Hmeid, H.; Alitane, A.; Chaibi, M.; Zahid, M.; et al. Assessment and prediction of the water quality index for the groundwater of the Ghiss-Nekkor (Al Hoceima, Northeastern Morocco). Sustainability 2023, 15, 402.
- Zhang, J.; Xiao, C.; Liang, X.; Yang, W.; Zhang, L.; Dai, R.; Li, W.; Ni, H. A Bayesian maximum entropy fusion model for enhanced prediction and risk assessment of fluoride and arsenic contamination in groundwater. J. Contam. Hydrol. 2025, 274, 104664.
- Yao, R.; Zhang, Y.; Sun, Z.; Zhao, X.; Zhang, H.; Zhang, X.; Luo, M.; Uddin, M.G.; Wang, Y.; Wang, Y. A novel grid-based technique for quantifying groundwater quality under land use/land cover changes to support improved groundwater management. J. Hydrol. 2025, 662, 133955.
- Patel, N.; Bhatt, D. Insights of ground water quality assessment methods—A review. Mater. Today Proc. 2024.
- Asma, B.; Şener, Ş. Appraisal of groundwater suitability and hydrochemical characteristics by using various water quality indices and statistical analyses in the Wadi Righ area, Algeria. Water Supply 2024, 24, 1938–1957.
- Ali, S.; Verma, S.; Agarwal, M.B.; Islam, R.; Mehrotra, M.; Deolia, R.K.; Kumar, J.; Singh, S.; Mohammadi, A.A.; Raj, D.; et al. Groundwater quality assessment using water quality index and principal component analysis in the Achnera block, Agra district, Uttar Pradesh, northern India. Sci. Rep. 2024, 14, 5381.
- Li, Z.; Wang, G.; Wang, X.; Wan, L.; Shi, Z.; Wanke, H.; Uugulu, S.; Uahengo, C.-I. Groundwater quality and associated hydrogeochemical processes in Northwest Namibia. J. Geochem. Explor. 2018, 186, 202–214.
- Zhang, Y.; Jia, R.; Wu, J.; Wang, H.; Luo, Z. Uncertain in WQI-based groundwater quality assessment methods: A case study in east of Beijing, China. Environ. Earth Sci. 2022, 81, 202.
- Şener, E.; Şener, Ş.; Varol, S. Appraisal of groundwater quality with WQI and human health risk assessment in Karamık wetland and surroundings (Afyonkarahisar/Turkey). Environ. Geochem. Health 2023, 45, 1499–1523.
- Swain, P.K.; Biswal, T. Assessment of groundwater quality in terms of water quality index (WQI) and fluoride contamination of Nuapada district, Odisha, India. Appl. Water Sci. 2023, 13, 218.
- Hamma, B.; Alodah, A.; Bouaicha, F.; Bekkouche, M.F.; Barkat, A.; Hussein, E.E. Hydrochemical assessment of groundwater using multivariate statistical methods and water quality indices (WQIs). Appl. Water Sci. 2024, 14, 33.
- Zhang, Y.; Wei, D.; Xie, Z.; Li, H.; Yang, S.; Luo, M.; Vesković, J.; Onjia, A.; Wang, Y.; Wang, Y.; et al. Spatial source-oriented analysis and probabilistic health risk assessment of potentially toxic elements in soils integrating the geo-detector, APCS-MLR, and Monte-Carlo models. J. Environ. Chem. Eng. 2025, 13, 117983.
- Sutradhar, S.; Mondal, P. Groundwater suitability assessment based on water quality index and hydrochemical characterization of Suri Sadar sub-division, West Bengal. Ecol. Inform. 2021, 64, 101335.
- Dashora, M.; Kumar, A.; Kumar, S.; Kumar, P.; Kumar, A.; Singh, C.K. Geochemical assessment of groundwater in a desertic region of India using chemometric analysis and entropy water quality index (EWQI). Nat. Hazards 2022, 112, 747–782.
- Raheja, H.; Goel, A.; Pal, M. Assessment of groundwater quality and its vulnerability for safe drinking purpose. J. Hydroinform. 2024, 26, 2302–2324.
- Yang, Y.; Li, P.; Elumalai, V.; Ning, J.; Xu, F.; Mu, D. Groundwater quality assessment using EWQI with updated water quality classification criteria: A case study in and around Zhouzhi county, Guanzhong basin (China). Expo. Health 2023, 15, 825–840.
- Ding, F.; Chen, L.; Sun, C.; Zhang, W.; Yue, H.; Na, S. An upgraded groundwater quality evaluation based on Hasse diagram technique & game theory. Ecol. Indic. 2022, 140, 109024.
- Tian, R.; Wu, J. Groundwater quality appraisal by improved set pair analysis with game theory weightage and health risk estimation of contaminants for Xuecha drinking water source in a loess area in northwest China. Hum. Ecol. Risk Assess. Int. J. 2019, 25, 132–157.
- Niazkar, M.; Piraei, R.; Goodarzi, M.R.; Abedi, M.J. Comparative assessment of machine learning models for groundwater quality prediction using various parameters. Environ. Process. 2025, 12, 10.
- Zhang, Y.; Xie, Z.; Liu, W.; Huang, J.; Chen, S.; Zhang, X.; Yang, C.; Li, J.; Kang, W.; Wang, Y. Source identification and health risk assessment of urban groundwater nitrate contamination in Chongqing, southwestern China. J. Hydrol. Reg. Stud. 2025, 62, 102792.
- Raheja, H.; Goel, A.; Pal, M. Groundwater quality prediction using proximal hyperspectral sensing, GIS, and machine learning algorithms. Water Air Soil Pollut. 2025, 236, 375.
- Thanh, N.N.; Chotpantarat, S.; Ngu, N.H.; Thunyawatcharakul, P.; Kaewdum, N. Integrating machine learning models with cross-validation and bootstrapping for evaluating groundwater quality in Kanchanaburi province, Thailand. Environ. Res. 2024, 252, 118952.
- Moeinzadeh, H.; Yong, K.-T.; Withana, A. A critical analysis of parameter choices in water quality assessment. Water Res. 2024, 258, 121777.
- Deb, S. Enhancing groundwater quality index prediction in data-scarce regions: Application of advanced artificial intelligence models in Nagaland, India. Dyn. Atmos. Ocean. 2025, 111, 101579.
- Banaei, S.M.A.; Javid, A.H.; Hassani, A.H. Numerical simulation of groundwater contaminant transport in porous media. Int. J. Environ. Sci. Technol. 2021, 18, 151–162.
- Chen, C.; He, W.; Zhou, H.; Xue, Y.; Zhu, M. A comparative study among machine learning and numerical models for simulating groundwater dynamics in the Heihe river basin, northwestern China. Sci. Rep. 2020, 10, 3904.
- Zhang, J.; Xiao, C.; Yang, W.; Liang, X.; Zhang, L.; Wang, X.; Dai, R. Improving prediction of groundwater quality in situations of limited monitoring data based on virtual sample generation and Gaussian process regression. Water Res. 2024, 267, 122498.
- Bui, D.T.; Khosravi, K.; Tiefenbacher, J.; Nguyen, H.; Kazakis, N. Improving prediction of water quality indices using novel hybrid machine-learning algorithms. Sci. Total Environ. 2020, 721, 137612.
- Chan Kujiek, D.; Sahile, Z.A. Water quality assessment of Elgo river in Ethiopia using CCME, WQI and IWQI for domestic and agricultural usage. Heliyon 2024, 10, e23234.
- Huang, Y.; Wang, C.; Wang, Y.; Lyu, G.; Lin, S.; Liu, W.; Niu, H.; Hu, Q. Application of machine learning models in groundwater quality assessment and prediction: Progress and challenges. Front. Environ. Sci. Eng. 2023, 18, 29.
- Yan, Y.; Zhang, Y.; Yang, S.; Wei, D.; Zhang, J.; Li, Q.; Yao, R.; Wu, X.; Wang, Y. Optimized groundwater quality evaluation using unsupervised machine learning, game theory and Monte-Carlo simulation. J. Environ. Manag. 2024, 371, 122902.
- Khiavi, A.N.; Tavoosi, M.; Kuriqi, A. Conjunct application of machine learning and game theory in groundwater quality mapping. Environ. Earth Sci. 2023, 82, 395.
- Xu, Z.; Kong, F.; Cao, C.; Zhang, Z. Prediction and analysis of tunnel water inrush disasters in Chinese karst area based on variable weight-weighted Bayesian network model. Carbonates Evaporites 2024, 40, 2.
- GB/T 14848-2017; Natural Resources and Territory Spatial Planning (SAC/TC 93). State General Administration of the People’s Republic of China for Quality Supervision and Inspection and Quarantine (AQSIQ). Standardization Administration of the People’s Republic of China: Beijing, China, 2017.
- Gao, Y.; Qian, H.; Ren, W.; Wang, H.; Liu, F.; Yang, F. Hydrogeochemical characterization and quality assessment of groundwater based on integrated-weight water quality index in a concentrated urban area. J. Clean. Prod. 2020, 260, 121006.
- Kumar, A.; Tripathi, M.P.; Khalkho, D.; Dewangan, R.; Baghel, S.; Kuriqi, A. Groundwater quality evaluation for drinking and irrigation using analytical hierarchy process with GIS in semi critical block of Chhattisgarh, India. Environ. Earth Sci. 2024, 83, 334.
- Hosseininia, M.; Hassanzadeh, R. Groundwater quality assessment for domestic and agricultural purposes using GIS, hydrochemical facies and water quality indices: Case study of Rafsanjan plain, Kerman province, Iran. Appl. Water Sci. 2023, 13, 84.
- Nayak, A.; Matta, G.; Uniyal, D.P. Hydrochemical characterization of groundwater quality using chemometric analysis and water quality indices in the foothills of Himalayas. Environ. Dev. Sustain. 2023, 25, 14229–14260.
- Masteali, S.H.; Bayat, M.; Bettinger, P.; Ghorbanpour, M. Uncertainty analysis of linear and non-linear regression models in the modeling of water quality in the Caspian Sea basin: Application of Monte-Carlo method. Ecol. Indic. 2025, 170, 112979.
- Rastgou, M.; He, Y.; Lou, R.; Jiang, Q. A comparison of metaheuristic optimizations with automated hyperparameter tuning methods in support vector machines algorithm for predicting soil water characteristic curve. Eng. Geol. 2025, 353, 108121.
- Seo, J.; Kim, Y. Assessing the likelihood of drought impact occurrence with extreme gradient boosting: A case study on the public water supply in South Korea. J. Hydroinform. 2023, 25, 191–207.
- Eid, M.H.; Mikita, V.; Eissa, M.; Ramadan, H.S.; Mohamed, E.A.; Abukhadra, M.R.; El-Sherbeeny, A.M.; Kovács, A.; Szűcs, P. An advanced approach for drinking water quality indexing and health risk assessment supported by machine learning modelling in Siwa oasis, Egypt. J. Hydrol. Reg. Stud. 2024, 56, 101967.
- Prasun, A.; Singh, A. Assessment of groundwater contamination in Aurangabad, Bihar using WQI and geostatistical modeling. Stoch. Environ. Res. Risk Assess. 2025, 39, 789–811.
- Sunitha, V.; Reddy, B.M. Geochemical characterization, deciphering groundwater quality using pollution index of groundwater (PIG), water quality index (WQI) and geographical information system (GIS) in hard rock aquifer, south India. Appl. Water Sci. 2022, 12, 41.
- Stevenazzi, S.; Voudouris, K.; Ducci, D. Combination of hydrochemical graphical methods and multivariate statistical analysis to delineate groundwater bodies. J. Environ. Manag. 2025, 381, 125266.
- Traore, A.; Mao, X.; Traore, A.; Yakubu, Y.; Sidibe, A.M. Multivariate statistical analysis of dominating groundwater mineralization and hydrochemical evolution in Gao, northern Mali. J. Earth Sci. 2024, 35, 1692–1703.
- Zhang, Y.; Li, X.; Huang, J.; Zhang, J.; Chen, P.; Wang, Y.; Wang, Y.; Li, Q.; Pu, W.; Yuan, X. Multi-isotopes (H, O, Sr, and S) and element geochemistry driving the genesis of geothermal waters in a sedimentary basin, southwestern China. Geothermics 2025, 133, 103480.
- Nagarajan, R.; Rajmohan, N.; Mahendran, U.; Senthamilkumar, S. Evaluation of groundwater quality and its suitability for drinking and agricultural use in Thanjavur city, Tamil Nadu, India. Environ. Monit. Assess. 2010, 171, 289–308.
- Wang, Y.; Liu, Z.; Yang, J.; Wang, J.; Zhang, L.; Tan, Y.; Xiang, D. Hydrochemical characteristics and genetic analysis of groundwater in Zhanjiang city, Guangdong province, south China. Water 2025, 17, 698.
- Hussein, H.; El Maghraby, M.M.M.S.; Abu Salem, H.S. Application of water quality index and statistical-hydrochemical techniques in groundwater assessment of the Quaternary aquifer, southwest Nile Delta of Egypt. Appl. Water Sci. 2024, 14, 143.
- Omonona, O.V.; Okogbue, C.O. Hydrochemical evolution, geospatial groundwater quality and potential health risks associated with intake of nitrate via drinking water: Case of Gboko agricultural district, central Nigeria. Environ. Earth Sci. 2021, 80, 126.
- Tanwer, N.; Deswal, M.; Khyalia, P.; Laura, J.S.; Khosla, B. Assessment of groundwater potability and health risk due to fluoride and nitrate in groundwater of Churu District of Rajasthan, India. Environ. Geochem. Health 2023, 45, 4219–4241.
- Deng, C.; Zhang, Y.; Yuan, X.; Wang, Y.; Lv, G.; Li, X. Genesis of high-sulfate geothermal water in the Namcha Barwa syntaxis, eastern Xizang. Hydrogeol. Eng. Geol. 2025, 52, 173–189.
- Qu, S.; Liao, F.; Wang, G.; Wang, X.; Shi, Z.; Liang, X.; Duan, L.; Liu, T. Hydrochemical evolution of groundwater in overburden aquifers under the influence of mining activity: Combining hydrochemistry and groundwater dynamics analysis. Environ. Earth Sci. 2023, 82, 135.
- Subba Rao, N.; Das, R.; Sahoo, H.K.; Gugulothu, S. Hydrochemical characterization and water quality perspectives for groundwater management for urban development. Groundw. Sustain. Dev. 2024, 24, 101071.
- Singh, A.; Raju, A.; Chandniha, S.K.; Singh, L.; Tyagi, I.; Karri, R.R.; Kumar, A. Hydrogeochemical characterization of groundwater and their associated potential health risks. Environ. Sci. Pollut. Res. 2022, 30, 14993–15008.
- Zhang, Y.; Gao, S.; Hu, C.; Zhao, Z.; Gao, Z.; Liu, J. Hydrochemical assessment of groundwater utilizing statistical analysis, integrated geochemical methods, and EWQI: A case study of Laiwu region, north China. Environ. Monit. Assess. 2024, 196, 1222.
- Dong, Z.; Zhang, L.; Wang, C.; Zou, Y. Hydrochemical evolution and nitrate contamination sources in Laiwu groundwater: Insights from hydrochemistry and dual-isotope analysis. Phys. Chem. Earth Parts A/B/C 2025, 141, 104083.
- Sakthi Priya, R.; Antony Ravindran, A.; Richard Abishek, S. Spatial assessment of submarine groundwater discharge influence on aquifer water quality in the coastal region of Chettikulam to Kolachel, southern India: Using SMI and HFE-D techniques. Environ. Geochem. Health 2025, 47, 112.
- Yan, Y.; Zhang, Y.; Sun, Z.; Xie, Z.; Yao, R.; Chen, S.; Uddin, M.G.; Pu, Y.; Yang, C.; Wang, Y.; et al. Using unsupervised machine learning and positive matrix factorization models to drive groundwater chemistry and associated health risks in a coal-mining rural region. J. Hydrol. 2025, 661, 133691.
- Zhang, Y.; Yan, Y.; Yao, R.; Wei, D.; Huang, X.; Luo, M.; Wei, C.; Chen, S.; Yang, C. Natural background levels, source apportionment and health risks of potentially toxic elements in groundwater of highly urbanized area. Sci. Total Environ. 2024, 935, 173276.
- Dimple; Singh, P.K.; Rajput, J.; Kumar, D.; Gaddikeri, V.; Elbeltagi, A. Combination of discretization regression with data-driven algorithms for modeling irrigation water quality indices. Ecol. Inform. 2023, 75, 102093.
- Gholami, V.; Khaleghi, M.R.; Salimi, E.T. Comparison of extreme gradient boosting, deep learning, and self-organizing map methods in predicting groundwater depth. Environ. Earth Sci. 2025, 84, 179.
- Nafouanti, M.B.; Li, J.; Mustapha, N.A.; Uwamungu, P.; AL-Alimi, D. Prediction on the fluoride contamination in groundwater at the Datong basin, northern China: Comparison of random forest, logistic regression and artificial neural network. Appl. Geochem. 2021, 132, 105054.
- Tsuchihara, T.; Yoshimoto, S.; Shirahata, K.; Nakazato, H.; Ishida, S. Analysis of groundwater-level fluctuation and linear regression modeling for prediction of initial groundwater level during irrigation of rice paddies in the Nasunogahara alluvial fan, central Japan. Environ. Earth Sci. 2023, 82, 473.
- Zhang, K.; Wang, X.; Liu, T.; Wei, W.; Zhang, F.; Huang, M.; Liu, H. Enhancing water quality prediction with advanced machine learning techniques: An extreme gradient boosting model based on long short-term memory and autoencoder. J. Hydrol. 2024, 644, 132115.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).