Article

Extraction of Major Groundwater Ions from Total Dissolved Solids and Mineralization Using Artificial Neural Networks: A Case Study of the Aflou Syncline Region, Algeria

1 Department of Natural and Life Sciences, Faculty of Science, University Center of Aflou El Cherif Bouchoucha, Aflou 03001, Algeria
2 Biological Systems and Geomatics Laboratory, Faculty of Natural and Life Sciences, University Mustapha Stambouli of Mascara, Mascara 29000, Algeria
3 Laboratory of Water Science and Technology, Department of Hydraulics, Faculty of Science and Technology, University Mustapha Stambouli of Mascara, Mascara 29000, Algeria
4 Department of Hydraulics, Faculty of Technology, University of Tlemcen, Tlemcen 13000, Algeria
5 Department of Hydro Science and Engineering Research, Korea Institute of Civil Engineering and Building Technology, Goyang 10223, Republic of Korea
6 Department of Railroad Construction and Safety Engineering, Dongyang University, Yeongju 36040, Republic of Korea
* Author to whom correspondence should be addressed.
Hydrology 2025, 12(5), 103; https://doi.org/10.3390/hydrology12050103
Submission received: 19 March 2025 / Revised: 21 April 2025 / Accepted: 22 April 2025 / Published: 25 April 2025
(This article belongs to the Section Hydrological and Hydrodynamic Processes and Modelling)

Abstract

Rising global water demand, driven by population growth and agricultural development, has led to widespread overexploitation of groundwater, particularly in semi-arid regions. Conventional hydrochemical monitoring still suffers from limited laboratory accessibility and high costs. This study aims to predict the major ions of groundwater (Ca2+, Mg2+, Na+, SO42−, Cl−, K+, HCO3−, and NO3−) from two field-measurable parameters, total dissolved solids (TDS) and mineralization (MIN), in the Aflou syncline region, Algeria. A multilayer perceptron (MLP) model optimized with Levenberg–Marquardt backpropagation (LMBP) provided the greatest predictive accuracy for SO42−, Mg2+, Na+, Ca2+, and Cl−, with R2 = (0.842, 0.980, 0.759, 0.945, 0.895), RMSE = (53.660, 12.840, 14.960, 36.460, 30.530) mg/L, and NSE = (0.840, 0.978, 0.754, 0.941, 0.892), respectively, in the testing phase. However, predictive accuracy was poor for the remaining ions (K+, HCO3−, and NO3−), with R2 = (0.045, 0.366, 0.004), RMSE = (6.480, 41.720, 40.460) mg/L, and NSE = (0.003, 0.361, −0.933), respectively. The model (LMBP-MLP) was also validated in adjacent, geologically similar locations (Aflou, Madna, and Ain Madhi), where it showed promising results, with performance similar to that in the original study region.

1. Introduction

Global demand for water is expected to surge by 2050 due to population growth, economic growth, and changing consumption patterns. Estimates suggest that as many as 6 billion people could face water shortages if demand increases by 20 to 30 percent from the current levels [1]. Agriculture, which accounts for 70% of global freshwater withdrawals, will increase competition for resources, especially in dry areas [2,3]. These challenges were clearly illustrated in Algeria, where groundwater quality, a major source of irrigation in arid regions, is declining due to salinity and agricultural pollution [4,5]. For this reason, groundwater quality monitoring is very important for sustainable water resource management [6,7]. Extensive sampling campaigns and extensive water chemistry scans are required to monitor the amount of degradation. These scans should include the important chemical ions in water such as Ca2+, Mg2+, Na+, K+, HCO3, Cl, and SO42− [8,9]. Effective monitoring, however, still requires significant resources and ongoing sampling and analysis. This highlights the urgency of innovative solutions such as artificial neural networks to streamline water quality assessment [10].
Artificial intelligence (AI) has emerged as an innovative tool to simplify water quality assessment. An example of the early application of AI includes [11], who predicted the drinking water quality index (WQI) of Baghdad using artificial neural networks (ANNs) and identified pH and chloride as the main factors involved (R2 = 0.973). Subsequent work by [12] optimized the ANN architecture for prediction, showing that a simpler MLP-4-5-4 model displayed better accuracy (R2 = 0.989) than a deeper network. Based on these foundations, ref. [13] carried out nitrate concentration predictions by integrating land use data with pH, conductivity, and temperature, highlighting the adaptability of ANNs to multivariable systems. More recently, ref. [9] developed an ANN to predict ion concentrations (Ca2+, Mg2+, Na+, K+, HCO3, Cl, and SO42−) directly from electrical conductivity (EC), achieving high accuracy within the trained EC range. These developments are consistent with a broader trend in ANN-based environmental modeling, with hybrid approaches combining physical and data-driven models gaining popularity.
These methodological innovations have been applied to address regional challenges. Ref. [14] compared radial basis function neural networks (RBF-NNs) and probabilistic neural networks (PNNs) in Iraq’s Alnekheeb Basin and found that the PNN was superior in assessing irrigation suitability through salinity and sodium adsorption ratios. Similarly, ref. [15] utilized ANNs to predict groundwater salinity in Spain’s Campo de Cartagena, outperforming conventional regression models and enabling tailored irrigation strategies for salinity-sensitive crops. Furthermore, ref. [16] demonstrated the scalability of ANNs in stressed groundwater layers, achieving highly accurate TDS prediction (R2 = 0.984) in the Babylonian region of Iraq. However, there are still gaps in applying these technologies to regions with complex evaporite geology, such as the semi-arid regions of North Africa.
In this research, the authors focused on the application of ANN techniques in the Aflou syncline region of Algeria, a region with distinct geological and climatic features which is dependent on groundwater stored in sandstone strata influenced by Aptian gypsum and Triassic evaporite [17]. Here, the increase in the number of wells and the intensive exploitation of groundwater resources have accelerated evaporation and dissolution, increasing the risk of salinity [4]. The novelty of the current research is that it can predict major ions of groundwater including SO42−, Mg2+, Na+, Ca2+, Cl, K+, HCO3, and NO3, employing ANNs optimized with various learning algorithms based on two field-measured parameters (i.e., total dissolved solids (TDS) and mineralization (MIN) values). Model performance was evaluated by utilizing statistical measures of accuracy and visual comparisons. The materials, including the study area and data collection, are provided in the next section. Also, artificial neural networks, optimization algorithms, model development, and measures of accuracy are explained one by one in Section 3. The results and discussion are presented in Section 4. Finally, the main conclusions are presented in Section 5.

2. Materials

2.1. Study Area and Data Collection

The Aflou syncline region is located north of Djebel Amour, in the central Saharan Atlas about 300 km southwest of Algiers at 1400 m above sea level (Figure 1). At 34.11° N and 2.10° E, it lies in a mountainous area that acts as a natural barrier between the Saharan Atlas and the Sahara Plateau. This high terrain further exacerbates climatic contrasts, shielding the region from Mediterranean influences and creating a semi-arid climate with relatively cool temperatures and limited rainfall [17]. Geologically, the area is part of the Saharan Atlas Fold Belt, composed of Mesozoic sediments that date from the Triassic to the Cretaceous. These deposits reflect alternating marine and continental deposition, with limestone, limestone-rich beds, and sandstone-dominated strata.
The groundwater in this area generally flows in a southwest/northeast direction, and the piezometric water level varies between 1440 and 1320 m. The water level in the piezometers decreases unexpectedly, suggesting that the sandstone syncline channels the aquifer. Groundwater flow occurs on both sides of the syncline, and a cone of depression related to well operation is observed in the Aflou area. The hydraulic gradient ranges from 0.0005 upstream to more than 0.003 downstream, which is related to the permeability of the area [18].
In arid and semi-arid regions, understanding the geochemical processes is important for analyzing the water quality of the aquifer system. This aquifer is located on a slope oriented in the SSW-NNE direction, and is over 80 km long and 10 km wide. To limit the multiple sources of error that could affect the results, the analysts retained only wells with a depth of less than 45 m, which are the majority, and excluded water points with ion balance errors exceeding 5%. Accordingly, 153 groundwater samples were collected from wells distributed throughout the study area and analyzed at the National Office of Water Resources (NAWR) Hydrology Laboratory. These datasets form the basis for modeling the relationships among total dissolved solids (TDS), mineralization (MIN), and major ion concentrations employing ANNs. For this purpose, the dataset was split into three subsets: training (75%), validation (15%), and test (10%). The training subset was utilized to adjust the model parameters, while the validation subset was utilized to fine-tune hyperparameters and mitigate overfitting. Finally, the test subset was utilized to assess the model’s generalization capability on new data.
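As an illustration of the 75/15/10 partition described above, the following minimal Python sketch shuffles sample indices and splits them into three subsets. The function name `split_indices`, the fixed seed, and the rounding rule (training and validation sizes rounded to the nearest integer, remainder assigned to the test subset) are assumptions made for illustration; the text does not state how the 153 samples were actually assigned.

```python
import random


def split_indices(n_samples, train_frac=0.75, val_frac=0.15, seed=0):
    """Shuffle sample indices and split them into train/validation/test lists.

    Assumption: training and validation sizes are rounded to the nearest
    integer, and all remaining samples form the test subset.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(153)
print(len(train_idx), len(val_idx), len(test_idx))
```

Under this rounding assumption, 153 samples give subsets of 115, 23, and 15 indices; a different rounding rule would shift one or two samples between subsets.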

2.2. Mineralization

Among the datasets, mineralization (MIN), a term from French nomenclature, denotes the process by which water absorbs dissolved minerals as it moves through the Earth’s crust, interacting with rocks, sediments, and other geological structures. As a measured quantity, it refers to the inorganic fraction of total dissolved solids (TDS) in a water sample that remains after evaporation at 180 °C followed by ignition at 550 °C. It represents the non-volatile, thermally stable dissolved solids, primarily composed of inorganic salts and minerals. These solids mainly consist of cations (positive ions such as calcium (Ca2+), magnesium (Mg2+), sodium (Na+), and potassium (K+)) and anions (negative ions such as chloride (Cl−), sulfate (SO42−), carbonate (CO32−), bicarbonate (HCO3−), and nitrate (NO3−)). MIN can be measured by a direct laboratory procedure or by an indirect conversion factor method.

2.2.1. Laboratory Procedure

The equipment required for the laboratory procedure consists of a drying oven (180 °C); a muffle furnace (550 °C); an analytical balance (0.1 mg precision); a platinum or porcelain crucible; a 0.45 µm membrane filter; and a beaker, a pipette, deionized/distilled water, and a desiccator. The step-by-step laboratory procedure involves the following: (1) transferring the filtered sample into a pre-weighed crucible; (2) evaporating and drying it in an oven at 180 °C until it reaches complete dryness; and (3) cooling the crucible in a desiccator, and then weighing it. The residue weight, divided by the sample volume, corresponds to the TDS (in mg/L).
MIN is then determined by ignition of the TDS residue. The process can be outlined as follows: (1) take the crucible containing the TDS residue; (2) place it in a muffle furnace at 550 °C for at least 1 h; and (3) cool it in a desiccator, and then weigh it again. The remaining mass represents the MIN (mg/L), i.e., MIN = (weight after ignition − empty crucible weight)/(sample volume in liters), with weights in mg. It is important to note that crucibles should always be cooled in a desiccator to avoid moisture absorption, and that all weights must be recorded with high precision (±0.1 mg).
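The gravimetric arithmetic above can be sketched in a few lines of Python. The helper name `residue_mg_per_l` and all weighings below are hypothetical values chosen for illustration, not data from the study; the only relationship taken from the text is residue mass divided by sample volume.

```python
def residue_mg_per_l(crucible_empty_g, crucible_with_residue_g, sample_volume_ml):
    """Convert a weighed residue to a concentration in mg/L."""
    residue_mg = (crucible_with_residue_g - crucible_empty_g) * 1000.0  # g -> mg
    volume_l = sample_volume_ml / 1000.0                                # mL -> L
    return residue_mg / volume_l


# Hypothetical weighings for a 100 mL filtered sample (illustrative only).
empty_g = 25.0000
after_drying_g = 25.0850    # crucible + residue after drying at 180 °C
after_ignition_g = 25.0700  # crucible + residue after ignition at 550 °C

tds = residue_mg_per_l(empty_g, after_drying_g, 100.0)
mineralization = residue_mg_per_l(empty_g, after_ignition_g, 100.0)
print(round(tds, 1), round(mineralization, 1))
```

With these made-up weighings the sketch gives a TDS of about 850 mg/L and a MIN of about 700 mg/L; the difference corresponds to the fraction lost on ignition.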

2.2.2. The Conversion Factor Method

The conversion factor method is one of the simplest and fastest field methods for estimating mineralization in water. It relies on measuring the electrical conductivity (EC) of the water and applying a conversion factor previously calculated from chemical analysis data of similar water samples in the same geographical area or geological layer. The procedure for estimating MIN with the conversion factor method can be outlined as follows:
Step 1. Measure the EC. The EC of the water is measured with a reliable, properly calibrated device. EC is expressed in microsiemens per centimeter (µS/cm) and reflects the water’s ability to conduct electricity, which is directly related to the concentration of dissolved ions.
Step 2. Apply the previously calculated conversion factor. The conversion factor is a constant derived from the relationship between EC and MIN established by earlier chemical analyses of water samples in the same region or geological layer. It reflects the relative distribution of mineral ions such as Ca2+, Mg2+, Na+, K+, Cl−, SO42−, HCO3−, and NO3− in the water.
Step 3. Calculate MIN. The measured EC is multiplied by the conversion factor to obtain the MIN value in mg/L or ppm (i.e., MIN = EC (µS/cm) × conversion factor).
Note that the accuracy of the conversion factor method depends heavily on the accuracy of the conversion factor obtained from previous chemical data, which should therefore be representative of the area where MIN is being measured. It is also advisable to periodically verify the calculated values by performing laboratory analyses on some samples.
The advantage of the conversion factor method is that it is quick and effective in the field, allowing data collection from multiple locations without complex laboratory analyses. Its disadvantage is that the conversion factor may not be representative of all conditions. Furthermore, organic pollution or the presence of non-mineral components may affect the measurement results.
Therefore, the conversion factor method is an effective and field-friendly method for measuring MIN, especially in locations where laboratory access is difficult. However, to ensure accurate measurements, it is crucial that the conversion factor is derived from precise data, and periodic verification of results is required.
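The three steps of the conversion factor method can be sketched as follows. The text does not say how the factor is derived from earlier analyses, so the least-squares fit through the origin in `fit_conversion_factor` is an assumption, and the paired EC/MIN values are hypothetical.

```python
def fit_conversion_factor(ec_values, min_values):
    """Estimate k in MIN = k * EC by least squares through the origin.

    Assumption: a single proportional factor describes the area; the
    source only states that the factor comes from earlier analyses.
    """
    numerator = sum(ec * m for ec, m in zip(ec_values, min_values))
    denominator = sum(ec * ec for ec in ec_values)
    return numerator / denominator


def estimate_min(ec_us_cm, conversion_factor):
    """Step 3: estimate MIN (mg/L) from a field EC reading (µS/cm)."""
    return ec_us_cm * conversion_factor


# Hypothetical paired laboratory data from the same area (illustrative only).
ec_lab = [500.0, 1000.0, 1500.0]   # µS/cm
min_lab = [350.0, 700.0, 1050.0]   # mg/L

k = fit_conversion_factor(ec_lab, min_lab)
field_estimate = estimate_min(1200.0, k)
print(round(k, 3), round(field_estimate, 1))
```

Refitting `k` whenever new laboratory analyses become available is one simple way to carry out the periodic verification recommended above.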

3. Methodology

3.1. Artificial Neural Networks and Optimization Algorithms

A multilayer perceptron (MLP), also known as a fully connected neural network (FCNN), is a fundamental architecture in deep learning in which every neuron in one layer is connected to all neurons in the next layer, allowing the network to learn nonlinear and complicated relationships in the data [19].
The training concept of MLP is the process of optimizing weights and biases to minimize the loss function, and it is usually accomplished utilizing the backpropagation method, a gradient-based optimization algorithm [20]. Backpropagation applies the chain rule to compute the gradient of the loss function for each weight, allowing the network to iteratively adjust its parameters [21]. However, standard backpropagation can be slow to converge or unstable, which has led to the development of advanced optimization algorithms.
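To make the fully connected structure concrete, the sketch below computes the forward pass of an MLP with one hidden layer. It assumes a sigmoid hidden layer and a linear output, as specified in Section 3.4; the function names and all weights are hypothetical, untrained values, and training (the backpropagation step) is deliberately left out.

```python
import math


def sigmoid(x):
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: every input feeds every hidden neuron, then a linear output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical, untrained parameters: two inputs (e.g., scaled TDS and MIN)
# and two hidden neurons.
prediction = mlp_forward(
    inputs=[0.4, 0.6],
    hidden_weights=[[0.5, -0.2], [0.1, 0.3]],
    hidden_biases=[0.0, 0.1],
    output_weights=[1.2, -0.7],
    output_bias=0.05,
)
print(prediction)
```

Training then amounts to adjusting `hidden_weights`, `hidden_biases`, `output_weights`, and `output_bias` so that such predictions match the measured concentrations.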
These optimizing algorithms include (1) Levenberg–Marquardt (trainlm), which combines gradient descent and Gauss–Newton methods for fast convergence, but requires significant memory; (2) conjugate gradient with Polak–Ribière updates (traincgp), which is memory efficient and suitable for large networks; (3) gradient descent with momentum and adaptive learning rate (traingdx), which utilizes momentum to accelerate convergence and adapts the learning rate dynamically; (4) one-step secant (trainoss), which approximates the Hessian matrix to reduce computational complexity; (5) BFGS quasi-Newton (trainbfg), a second-order optimization method that approximates the inverse Hessian for faster convergence; (6) conjugate gradient with Powell–Beale restarts (traincgb), which periodically resets the search direction to avoid stagnation; (7) gradient descent with adaptive learning rate (traingda), which adjusts the learning rate based on gradient behavior; (8) resilient backpropagation (trainrp), which updates weights based on the sign of the gradient rather than its magnitude, making it robust to gradient vanishing; and (9) conjugate gradient with Fletcher–Reeves updates (traincgf), another conjugate gradient method which ensures efficient optimization [22,23,24,25,26].
These optimization algorithms are implemented in various machine learning frameworks, such as the neural network toolbox of MATLAB (R2021a), and are chosen based on problem requirements, including network size, data complexity, and computational constraints. For example, the Levenberg–Marquardt (trainlm) algorithm is often utilized for small-to-medium-sized networks because of its speed, whereas the conjugate gradient algorithms with Polak–Ribière updates (traincgp) and Powell–Beale restarts (traincgb) are preferred for large networks because of their memory efficiency. The choice of optimization algorithm also depends on the characteristics of the loss surface: BFGS quasi-Newton (trainbfg), a second-order method, is effective on smooth, convex surfaces, whereas gradient descent with momentum and adaptive learning rate (traingdx), a first-order method, is more versatile on non-convex terrain [24,27,28].
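For reference, the standard Levenberg–Marquardt weight update can be written as below. This is the textbook form of the algorithm rather than a formula given in this article; the symbols $\mathbf{w}$ (weight vector), $\mathbf{J}$ (Jacobian of the network errors with respect to the weights), $\mathbf{e}$ (error vector), and $\mu$ (damping parameter) are introduced here for illustration.

```latex
\mathbf{w}_{k+1} = \mathbf{w}_k - \left( \mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I} \right)^{-1} \mathbf{J}^{\top}\mathbf{e}
```

When $\mu$ is small the step approaches the Gauss–Newton step, and when $\mu$ is large it approaches gradient descent with a small step size, which is the sense in which the method combines the two. Forming and inverting $\mathbf{J}^{\top}\mathbf{J}$, whose size grows with the square of the number of weights, is also why the method needs significant memory for large networks.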

3.2. Model Development

In this research, a Levenberg–Marquardt backpropagation multilayer perceptron (LMBP-MLP) was trained to predict the concentrations of the major ions (i.e., Ca2+, Mg2+, Na+, SO42−, Cl−, K+, HCO3−, and NO3−) in water utilizing MIN, TDS, and the concentrations of some ions (i.e., Mg2+, Na+, and SO42−) as inputs.
The selection of appropriate hyperparameters, including the number of neurons in the hidden layer, the type of activation function, the type of learning function, and the learning rate of ANN model, played a critical role in the model development.
Principal component analysis (PCA), which was employed for data preprocessing of LMBP-MLP, can select the most important features from large quantities of data while retaining the most relevant information from the initial data [29]. The selection of important features for each ANN model (LMBP-MLP) was guided by a heatmap of Pearson correlation coefficients, ensuring that the most relevant variables were utilized for each ion prediction (see Figure 2). The Pearson correlation coefficient quantifies the linear relationship between two variables, ranging from −1 (perfectly negative) to +1 (perfectly positive), with 0 indicating no linear correlation [30]; a correlation matrix therefore helps to identify factors that are statistically associated. Figure 2 shows that MIN and TDS correlated strongly with most ions, especially SO42−, Mg2+, Na+, Ca2+, and Cl−, with correlation coefficients exceeding 0.85. This suggests that these elements originate primarily from evaporite deposits, such as gypsum and saltpeter, found in the Triassic and Aptian formations, and from limestone layers embedded within the Barremian–Aptian sandstone formations, which constitute the most important aquifer system in the region. In contrast, the remaining factors showed weak correlations, with coefficients less than 0.53. In particular, NO3− and HCO3− displayed low correlation values, reflecting their different origins: nitrates (NO3−) mainly come from agricultural fertilizers, while bicarbonates (HCO3−) are produced by dissolution of calcite, the main mineral matrix of the Barremian–Aptian sandstone. Potassium ions (K+), derived from the dissolution of the rare mineral sylvite, are present in low concentrations and contribute minimally to MIN and TDS.
The heatmap also highlights a very strong correlation for (MIN, SO42−) and (TDS, SO42−) (PCC > 0.95), followed by Na+ and Mg2+, both of which exhibit significant correlations (PCC > 0.92) with MIN and TDS. Both Na+ and Mg2+ displayed important correlations with SO42− (PCC > 0.86). Also, a notable correlation was observed between Ca2+ and Cl− with MIN and TDS (PCC > 0.85). A moderate correlation was found between SO42− and Cl− (PCC > 0.71). In contrast, the remaining ions (i.e., K+, HCO3−, and NO3−) did not display any significant correlations with the other studied elements. Based on the results of this analysis, feature variables suitable for each ANN model were selected as shown in Table 1.
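The correlation screening described above can be sketched in plain Python. The text says only that the selection was guided by the heatmap, so the explicit threshold rule in `select_features` is an assumption, and the four-sample dataset is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict of pairwise coefficients, i.e., the data behind a heatmap."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def select_features(columns, target, threshold):
    """Keep variables whose |PCC| with the target exceeds the threshold (assumed rule)."""
    return [
        name for name in columns
        if name != target and abs(pearson(columns[name], columns[target])) > threshold
    ]


# Hypothetical concentrations in mg/L (illustrative only, not study data).
data = {
    "TDS": [500.0, 800.0, 1200.0, 1600.0],
    "SO4": [100.0, 210.0, 390.0, 520.0],
    "NO3": [30.0, 12.0, 25.0, 18.0],
}
matrix = correlation_matrix(data)
print(select_features(data, target="SO4", threshold=0.85))
```

In this toy dataset only TDS passes the 0.85 screen for sulfate, mirroring the pattern reported for the heatmap, where nitrate shows no significant correlation with the other variables.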

3.3. Measures of Accuracy

To evaluate the performance of the developed models, the authors employed three main statistical measures of accuracy, namely the coefficient of determination (R2) (Equation (1)), root mean square error (RMSE) (Equation (2)), and Nash–Sutcliffe efficiency (NSE) (Equation (3)). R2 quantifies the proportion of variance in the dependent variable that can be predicted by the independent variables, providing insight into the explanatory power of the developed model. It assesses the predictive accuracy by comparing the model’s performance to the mean of the measured data, with values closer to one indicating a better fit [7,31]. RMSE offers a simple interpretation of predictive accuracy by measuring the average magnitude of the error between the predicted and measured values [32,33]. NSE, a dimensionless statistical measure, can be interpreted as the coefficient of efficiency and indicates the relative performance of the developed model [34,35].
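$$R^2 = \frac{\left[\sum_{i=1}^{n} (Z_i - \bar{Z})(\hat{Z}_i - \bar{\hat{Z}})\right]^2}{\sum_{i=1}^{n} (Z_i - \bar{Z})^2 \, \sum_{i=1}^{n} (\hat{Z}_i - \bar{\hat{Z}})^2} \quad (1)$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{Z}_i - Z_i)^2} \quad (2)$$
$$NSE = 1 - \frac{\sum_{i=1}^{n} (Z_i - \hat{Z}_i)^2}{\sum_{i=1}^{n} (Z_i - \bar{Z})^2} \quad (3)$$
where $\hat{Z}_i$ = the predicted values, $Z_i$ = the measured values, $\bar{Z}$ = the mean of the measured values, $\bar{\hat{Z}}$ = the mean of the predicted values, and $n$ = the number of data available.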
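The three measures can be computed as in the following sketch. It assumes that R2 is the squared Pearson correlation between measured and predicted values, which is one reading of Equation (1); the function names and the toy series are illustrative only.

```python
import math


def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_m * var_p)


def rmse(measured, predicted):
    """Root mean square error, in the units of the data."""
    squared = sum((p - m) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))


def nse(measured, predicted):
    """Nash–Sutcliffe efficiency: 1 is perfect, 0 matches the measured mean."""
    mean_m = sum(measured) / len(measured)
    residual = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    total = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - residual / total


# Toy series: predictions follow the trend exactly but carry a constant bias.
measured = [1.0, 2.0, 3.0]
shifted = [11.0, 12.0, 13.0]
print(r_squared(measured, shifted), rmse(measured, shifted), nse(measured, shifted))
```

The biased toy series yields R2 = 1 but a strongly negative NSE, which shows why the two measures are reported together: a correlation-based R2 ignores systematic offsets, whereas NSE penalizes them.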
In addition, the authors incorporated ion balance (a chemical index) to assess the model’s predictive ability to maintain chemical balance, which is especially important for applications involving water quality or environmental chemistry. Also, the addressed measurements provide a comprehensive assessment of the model’s predictive accuracy and reliability.
Ionic equilibrium is often evaluated via the charge balance (CB) index (Equation (4)), which is an important metric for assessing chemical consistency of a solution, especially in water quality research [36]. The interpretation of charge balance values depends on specific thresholds defined for the analysis context, such as in Equations (5) and (6).
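$$CB = \frac{\sum C - \sum A}{\sum C + \sum A} \times 100 \quad (4)$$
$$\sum C = \frac{\mathrm{Mg^{2+}}}{12.15} + \frac{\mathrm{Ca^{2+}}}{20.04} + \frac{\mathrm{K^{+}}}{39.01} + \frac{\mathrm{Na^{+}}}{22.99} \quad (5)$$
$$\sum A = \frac{\mathrm{Cl^{-}}}{35.45} + \frac{\mathrm{SO_4^{2-}}}{48.03} + \frac{\mathrm{HCO_3^{-}}}{61.02} + \frac{\mathrm{NO_3^{-}}}{62} \quad (6)$$
where ΣC = sum of cations (meq/L) and ΣA = sum of anions (meq/L), with ion concentrations expressed in mg/L.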
For instance, a |CB| value of less than 5 indicates good ionic balance, reflecting a high degree of chemical consistency. A |CB| between 5 and 8 suggests moderate ionic balance, while a |CB| greater than 8 signifies poor ionic balance, indicating potential issues with the chemical composition. However, these thresholds can vary depending on the study’s requirements. In other cases, |CB| < 6 may be considered good, 6 ≤ |CB| ≤ 12 is moderate, and |CB| > 12 is poor. For more lenient assessments, thresholds such as |CB| < 10 (good), 10 ≤ |CB| ≤ 20 (moderate), and |CB| > 20 (poor) might be applied. These ranges help to classify the reliability of ion balances, ensuring the accuracy and validity of chemical data in environmental or analytical studies [37].
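A minimal sketch of the charge balance check follows. The divisors are copied as printed in Equations (5) and (6), the default limits are the first set of thresholds above (5 and 8), and the function names and sample concentrations are hypothetical.

```python
# Equivalent weights (mg per meq) as printed in Equations (5) and (6).
CATION_EQ_WEIGHTS = {"Mg": 12.15, "Ca": 20.04, "K": 39.01, "Na": 22.99}
ANION_EQ_WEIGHTS = {"Cl": 35.45, "SO4": 48.03, "HCO3": 61.02, "NO3": 62.0}


def charge_balance(sample_mg_per_l):
    """Return CB (%) for a dict of ion concentrations in mg/L (missing ions count as 0)."""
    cations = sum(sample_mg_per_l.get(ion, 0.0) / w for ion, w in CATION_EQ_WEIGHTS.items())
    anions = sum(sample_mg_per_l.get(ion, 0.0) / w for ion, w in ANION_EQ_WEIGHTS.items())
    return (cations - anions) / (cations + anions) * 100.0


def classify_balance(cb, good_limit=5.0, moderate_limit=8.0):
    """Classify |CB| as Good (< 5), Moderate (5 to 8), or Poor (> 8) by default."""
    magnitude = abs(cb)
    if magnitude < good_limit:
        return "Good"
    if magnitude <= moderate_limit:
        return "Moderate"
    return "Poor"


# Hypothetical samples (mg/L), chosen so the arithmetic is easy to follow.
balanced = {"Ca": 20.04, "Cl": 35.45}      # 1 meq/L of each charge
unbalanced = {"Ca": 40.08, "Cl": 35.45}    # 2 meq/L cations vs 1 meq/L anions

for sample in (balanced, unbalanced):
    cb = charge_balance(sample)
    print(round(cb, 2), classify_balance(cb))
```

Passing `good_limit=6.0, moderate_limit=12.0` or `good_limit=10.0, moderate_limit=20.0` reproduces the alternative, more lenient classifications mentioned above.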

3.4. Hyperparameters Selection

The activation functions utilized in all developed models were the sigmoid function in the hidden layer and the linear transfer function in the output layer. The learning rate was set to 0.001 to ensure stable and efficient training. To select the most suitable training algorithm, the authors trained a model with two neurons utilizing the first three variables (i.e., SO42−, Na+, and Mg2+) listed in Table 1. The training algorithm with the highest performance was selected for the optimization process. The evaluated training algorithms were trainlm, traincgp, traingdx, trainoss, trainbfg, traincgb, traingda, trainrp, and traincgf. Figure 3 shows their performance. The Levenberg–Marquardt (trainlm) algorithm outperformed the remaining algorithms and was therefore adopted for this study.
Choosing the number of neurons in the hidden layer is a critical factor in determining the accuracy of model training process. An excessive number of neurons can lead to overfitting, where the model memorizes noise instead of learning meaningful patterns. Conversely, if there are too few neurons, the model’s capacity to capture complex relationships is limited, which can lead to underfitting [38,39,40].
To enhance the performance of each model, the optimal number of neurons in the hidden layer was determined using a trial-and-error approach, with the number of neurons ranging from 1 to 30. Figure 4 illustrates how changing the number of hidden neurons affected model accuracy in the validation phase. Based on the results, the optimal number of hidden neurons for each model was identified as follows: five neurons for LMBP-MLP1, LMBP-MLP2, and LMBP-MLP3; three neurons for LMBP-MLP4; nine neurons for LMBP-MLP5 and LMBP-MLP7; ten neurons for LMBP-MLP6; and eighteen neurons for LMBP-MLP8.
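The trial-and-error search over 1 to 30 hidden neurons can be expressed as a simple loop. In this sketch, `validation_score` is a hypothetical callable standing in for "train a model with this many hidden neurons and return its validation accuracy"; the tie-breaking rule (prefer the smaller network) is an assumption, and `toy_score` is an invented curve, not a result from the study.

```python
def select_hidden_neurons(validation_score, candidates=range(1, 31)):
    """Return (best_n, best_score) over the candidate hidden-layer sizes.

    validation_score: callable mapping a neuron count to a validation
    metric where higher is better (e.g., R2 on the validation subset).
    """
    best_n, best_score = None, float("-inf")
    for n_hidden in candidates:
        score = validation_score(n_hidden)
        if score > best_score:  # strict '>' keeps the smaller network on ties
            best_n, best_score = n_hidden, score
    return best_n, best_score


def toy_score(n_hidden):
    """Invented validation curve peaking at five neurons (illustrative only)."""
    return 0.9 - 0.01 * (n_hidden - 5) ** 2


best_n, best_score = select_hidden_neurons(toy_score)
print(best_n, round(best_score, 3))
```

Scoring on the validation subset rather than the training subset is what lets this loop penalize oversized networks that merely memorize noise.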
Figure 5 illustrates the flowchart of the modeling process. Initially, SO42− concentrations were predicted from measured TDS and MIN values. Next, Na+ and Mg2+ concentrations were computed from measured (i.e., TDS and MIN) and predicted (SO42−) values. The prediction of Ca2+ concentrations incorporated measured (i.e., TDS and MIN) and predicted (i.e., SO42−, Na+, and Mg2+) values. Similarly, Cl− and K+ concentrations were predicted utilizing measured TDS and MIN, along with predicted SO42− and Na+ values. Finally, NO3− and HCO3− concentrations were calculated from measured TDS and MIN together with the predicted Mg2+ values.
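The cascade in this paragraph, in which earlier predictions become inputs to later models, can be sketched as an ordered list of (ion, input features) pairs. The input lists follow the description above; the names `CASCADE` and `predict_cascade` and the placeholder models are hypothetical and stand in for the trained networks.

```python
# Prediction order and inputs, following the description of the flowchart.
CASCADE = [
    ("SO4", ["TDS", "MIN"]),
    ("Na", ["TDS", "MIN", "SO4"]),
    ("Mg", ["TDS", "MIN", "SO4"]),
    ("Ca", ["TDS", "MIN", "SO4", "Na", "Mg"]),
    ("Cl", ["TDS", "MIN", "SO4", "Na"]),
    ("K", ["TDS", "MIN", "SO4", "Na"]),
    ("NO3", ["TDS", "MIN", "Mg"]),
    ("HCO3", ["TDS", "MIN", "Mg"]),
]


def predict_cascade(models, tds, mineralization):
    """Run the models in order, feeding earlier predictions to later ones.

    models: dict mapping each ion name to a callable that takes a list of
    input values (in the order given in CASCADE) and returns a float.
    """
    values = {"TDS": tds, "MIN": mineralization}
    for ion, features in CASCADE:
        values[ion] = models[ion]([values[name] for name in features])
    return {ion: values[ion] for ion, _ in CASCADE}


# Placeholder models that just sum their inputs (illustrative only).
dummy_models = {ion: sum for ion, _ in CASCADE}
result = predict_cascade(dummy_models, tds=1.0, mineralization=2.0)
print(result)
```

One consequence of this design is that any error in the sulfate prediction propagates to every downstream ion, so the accuracy of the first model bounds the accuracy of the rest.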

4. Results and Discussion

Table 2 presents the results of the developed ANN models for all major ions in the Aflou syncline region of Algeria, utilizing the coefficient of determination (R2), root mean square error (RMSE), and Nash–Sutcliffe efficiency (NSE) as evaluation metrics for the training, validation, test, and full datasets. Predictive accuracy varied greatly among the ANN models.
Five of the ANN models (LMBP-MLP1, LMBP-MLP2, LMBP-MLP3, LMBP-MLP4, and LMBP-MLP5) performed accurately, as indicated by high R2 and NSE values and relatively low RMSE values across all subsets. This implies that these models can accurately predict the concentrations of the corresponding ions (SO42−, Mg2+, Na+, Ca2+, and Cl−) in the Aflou syncline region. However, the remaining models (LMBP-MLP6, LMBP-MLP7, and LMBP-MLP8) performed poorly: their low R2 and NSE values and high RMSE values indicate that they have difficulty capturing the variability of the K+, HCO3−, and NO3− ions. In addition, among all of the developed models (i.e., from LMBP-MLP1 to LMBP-MLP8), it can be inferred from Table 2 that LMBP-MLP1 provided the strongest correlation between the predicted and measured variables (SO42−), whereas LMBP-MLP6, LMBP-MLP7, and LMBP-MLP8 did not yield a meaningful relationship between the predicted and measured variables (K+, HCO3−, and NO3−) based on various measures of accuracy in all of the datasets.
Figure 6 illustrates the comparison between the predicted and measured values of all major ions in the testing phase utilizing line plots and scatter plots. Line plots are effective for reviewing how data change over time or along another ordered, non-time scale, whereas scatter plots use dots to relate the values of two numeric variables [41]. Together, they give a comparative assessment of the measured and predicted ion concentrations, providing important insights into the hydrochemical dynamics of the region.
For sulfate (SO42−), magnesium (Mg2+), and sodium (Na+), utilizing all of the datasets, the line plots for individual ANN models (i.e., LMBP-MLP1, LMBP-MLP2, and LMBP-MLP3) showed a strong visual agreement between the observed and predicted values, highlighting the reliability of LMBP-MLP1, LMBP-MLP2, and LMBP-MLP3. The scatter plots further quantified this alignment, showing strong predictive accuracy with high coefficients of determination (R2 = 0.936 for SO42−, R2 = 0.924 for Mg2+, and R2 = 0.916 for Na+).
For calcium (Ca2+) and chloride (Cl−), the results revealed only partial model efficacy. The Ca2+ line plot was broadly consistent with the measured trend, but the deviations around samples 70–80 showed a limited ability to capture regional differences. The Ca2+ scatter plot (R2 = 0.892) confirmed this, suggesting that high-concentration outliers lowered accuracy in the higher ranges. Similarly, the Cl− line plot tracked the trend adequately but underestimated the sharp peak around samples 90–100, as evidenced by the clustering of scatter plot outliers (R2 = 0.872) above the regression line. These discrepancies might arise from extreme values or incomplete representation of location-specific geochemical interactions, highlighting the need for targeted fine-tuning to improve prediction performance for these ions.
In contrast to the strong correlations for SO42−, Mg2+, and Na+, the models for potassium (K+), bicarbonate (HCO3−), and nitrate (NO3−) (i.e., LMBP-MLP6, LMBP-MLP7, and LMBP-MLP8) performed poorly in the scatter plots. The K+ scatter plot (R2 = 0.441) showed minimal agreement with the observed data and failed to reproduce the variability. The HCO3− scatter plot (R2 = 0.330) showed a nearly random dispersion, indicating a fundamental flaw in either variable selection or mechanistic assumptions. Although the NO3− scatter plot (R2 = 0.523) displayed a marginal improvement, LMBP-MLP8 systematically underestimated maximum concentrations, probably due to unaccounted-for anthropogenic or biogeochemical influences.
It can be judged from the line plots and scatter plots (Figure 6) that LMBP-MLP1 displayed the highest R2 value between the predicted and measured variables (SO42−), whereas LMBP-MLP6, LMBP-MLP7, and LMBP-MLP8 gave the lowest R2 values between the predicted and measured variables (K+, HCO3−, and NO3−) in all of the datasets.
Figure 7 presents the sorted charge balance (CB) values of all 153 water samples from the Aflou syncline region, computed from the predicted ion concentrations. The charge balance serves as an indicator of data quality and of the predictive accuracy for the ions, with values closer to 0 indicating a better match between positive and negative charges. Samples are classified into three groups, “Good” (green), “Moderate” (yellow), and “Poor” (red), based on their deviation from the ion balance. Most of the samples (84%) were classified as “Good”, suggesting the overall reliability of the data and the satisfactory performance of the developed models. However, 11% of the samples fell into the “Moderate” category, indicating potential problems, such as measurement errors, the presence of unaccounted-for ions, or sample degradation. A small number of samples (5%) were classified as “Poor”, indicating serious errors that required reevaluation.
Testing the Developed Model in Adjacent Locations
To evaluate the generalization ability of the developed models, their performance was tested utilizing 20 water samples collected from three external locations within the research area. These locations share the same geological structure as the main research area, but show some petrological differences. The selected locations include Madna (six samples); Aflou (four samples), situated southwest of the Aflou syncline; and Ain Madhi (ten samples), situated further south. The predictive accuracy of the developed models was assessed utilizing the coefficient of determination (R2), and the results are presented in Figure 8.
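For readers who wish to reproduce this kind of evaluation, the three scores reported in this study (R2, RMSE, and NSE) can be sketched as below. The sketch assumes that R2 is the squared Pearson correlation coefficient, which is one common definition and is consistent with Table 2, where R2 stays non-negative while NSE can be negative; the function names and the sample numbers are hypothetical.

```python
import math


def r2_pearson(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


def rmse(obs, pred):
    """Root mean square error, in the units of the data (mg/L here)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus error variance over data variance."""
    mean_o = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - sse / sst


# Hypothetical concentrations (mg/L), for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
shifted = [o + 50.0 for o in obs]  # perfectly correlated but strongly biased

print(r2_pearson(obs, shifted), rmse(obs, shifted), nse(obs, shifted))
```

The biased example shows why the two scores are reported together: a systematic offset leaves the squared correlation at 1 while driving NSE below zero.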
Our results show significant differences in model performance across ions and locations, highlighting that both ion-specific behavior and location-specific characteristics have an impact. The applied models achieved high prediction accuracy for SO42−, Mg2+, and Na+, with consistently high R2 values at all locations, indicating that the relationships between these ions and the feature variables were stable.
For Ca2+ and Cl, the applied models performed well in Aflou and Madna, but showed poor accuracy in Ain Madhi, suggesting that there are location-specific hydrogeochemical factors that influence ion concentrations. In contrast, the applied models performed poorly for NO3 and K+, with R2 values close to 0 at three locations. This is likely due to external influences such as agricultural activities (NO3) and local mineral dissolution (K+).
The prediction of HCO3 varied significantly, displaying moderate performance at Madna but low accuracy at Aflou and Ain Madhi, which suggests that the hydrogeochemical controls on HCO3 depend on the location. The differences in performance among the applied models can be attributed to differences in rock composition, groundwater flow dynamics, and local environmental factors affecting ion concentrations.
The geological characteristics of Ain Madhi may provide more pronounced variations than those of Aflou and Madna, which may lead to inconsistencies that the applied models may not fully capture. In addition, location-specific geochemical processes, anthropogenic influences (e.g., fertilizer use affecting NO3), and varying mineral dissolution rates (e.g., sylvite dissolution for K+) may contribute to the observed discrepancies.
Overall, the applied models demonstrated strong predictive ability for SO42−, Mg2+, and Na+, whereas they performed poorly for Ca2+, Cl, HCO3, NO3, and K+, especially at Ain Madhi. These results highlight the need for additional location-specific calibrations to improve model accuracy for specific ions and account for local hydrogeochemical variations.
To estimate the charge balance (CB) in the adjacent locations of Aflou, Madna, and Ain Madhi, the authors utilized Figure 8 to identify ions with high predictive performance (R2 > 0.600), and replaced the predictions of the low-performing ions with measured data to ensure accuracy.
The values of the selected ions utilized for ionic balance calculations varied depending on location. In Aflou, the selected ions were SO42−, Cl, Ca2+, Mg2+, Na+, and K+. In Ain Madhi, only SO42−, Ca2+, Mg2+, and Na+ satisfied the selection criteria, while in Madna, the chosen ions included SO42−, HCO3, Cl, Ca2+, Mg2+, and Na+. The results of the ionic balance and their evaluation are presented in Table 3, Table 4 and Table 5.
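The selection rule described above can be sketched as a small helper that keeps the predicted value only where the location-specific R2 exceeds 0.600 and falls back to the measured value otherwise. The function name `build_hybrid_sample` and all numbers below are hypothetical; the per-location R2 values behind Figure 8 are not tabulated in this section.

```python
R2_THRESHOLD = 0.600  # selection criterion stated in the text


def build_hybrid_sample(predicted, measured, r2_by_ion, threshold=R2_THRESHOLD):
    """Return (values, source) where each ion uses its prediction only if
    the model's R2 at this location exceeds the threshold."""
    values, source = {}, {}
    for ion in measured:
        use_prediction = r2_by_ion.get(ion, 0.0) > threshold
        values[ion] = predicted[ion] if use_prediction else measured[ion]
        source[ion] = "Pred." if use_prediction else "Meas."
    return values, source


# Hypothetical concentrations (mg/L) and R2 values, for illustration only.
predicted = {"SO4": 430.0, "Cl": 150.0, "NO3": 20.0}
measured = {"SO4": 410.0, "Cl": 145.0, "NO3": 9.0}
r2_by_ion = {"SO4": 0.90, "Cl": 0.40, "NO3": 0.05}

values, source = build_hybrid_sample(predicted, measured, r2_by_ion)
print(values)
print(source)
```

The `source` dictionary mirrors the “Pred.” and “Meas.” labels used in the column headers of Table 3, Table 4 and Table 5.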
Each table (Table 3, Table 4 and Table 5) provides the predicted and measured ion concentrations, the calculated charge balance, and the evaluation of samples based on these values. In Aflou, 75% of the samples were rated as “Good”, indicating reliable data and accurate ion estimates, 25% were rated as “Moderate”, and none were rated as “Poor”. In contrast, Ain Madhi showed a wider range of sample quality, with 60% “Good”, 30% “Poor”, and 10% “Moderate” ratings (Table 4). This suggests potential data quality issues specific to Ain Madhi, which could arise from sample contamination or measurement errors. Madna, similar to Aflou, gave mostly “Good” ratings (66.67%), with 33.33% being “Moderate” samples, suggesting generally reliable data with localized discrepancies. The variability in charge balance across all regions highlights the importance of ion balance analysis as a tool to assess data quality and validate ion estimates. The presence of “Poor” and “Moderate” samples highlights the need for further investigation to identify and correct potential problems and so ensure the accuracy and reliability of hydrochemical data.

5. Conclusions

In Algeria, groundwater remains critical for irrigation, but groundwater quality in locations such as the Aflou syncline is increasingly compromised by salinity and agricultural contamination, threatening agricultural sustainability. Conventional monitoring methods that rely on expensive sampling campaigns and laboratory analyses highlight the urgent need for innovative and cost-effective solutions to ensure water security. Artificial intelligence (AI), especially artificial neural networks (ANNs), has emerged as a revolutionary tool in the field of hydrochemistry.
In this study, the authors introduced a novel algorithm to predict eight major ions (Ca2+, Mg2+, Na+, K+, HCO3, SO42−, Cl, and NO3) utilizing only two accessible parameters (i.e., total dissolved solids (TDS) and mineralization (MIN)). The Levenberg–Marquardt backpropagation multilayer perceptron (LMBP-MLP) model with ion-specific customized architecture achieved robust predictive accuracies for SO42−, Mg2+, Na+, Ca2+, and Cl (R2 and NSE ≥ 0.87), demonstrating its usefulness for real-time monitoring. However, predictive accuracies for K+, HCO3, and NO3 were less reliable (R2 ≤ 0.50), most likely due to complex environmental interactions and low concentrations leading to low statistical significance. Specifically, LMBP-MLP2 (R2 = 0.980, RMSE = 12.840 mg/L, and NSE = 0.978) provided the best accuracy (for Mg2+) in the testing phase, whereas LMBP-MLP6, LMBP-MLP7, and LMBP-MLP8 gave the worst predictions (for K+, HCO3, and NO3) in the testing phase. If the number of measured groundwater ion samples increases significantly, the predictive accuracy for the poorly predicted ions (K+, HCO3, and NO3) may improve in the testing phase. Also, the charge balance (CB) analysis classified 95% of the predictions as “Good” or “Moderate”, whereas 5% were classified as “Poor” and require improvement.
Spatial tests across three locations (i.e., Aflou, Madna, and Ain Madhi) showed consistent accuracy for SO42−, Mg2+, and Na+; moderate performance for Ca2+ and Cl; variable results for K+ and HCO3; and overall poor prediction of NO3. These results highlight the model’s adaptability to regions such as Aflou and Madna, while also emphasizing the need for expanded geographic data to improve generalization. Despite these limitations, this algorithm represents a meaningful advance for water resource management in salinity-affected areas. Direct TDS and MIN measurements enable early detection of important ions (Ca2+, Mg2+, Na+, SO42−, and Cl), providing three key benefits: cost savings, adaptability, and efficiency.
This study is limited by the small number of ANN models, optimized with different learning algorithms, that were used to predict the concentrations of major ions in groundwater from a restricted number of samples. Consequently, the suggested model cannot yet be regarded as a general solution for predicting the concentrations of major ions in groundwater. Further studies should address this gap by combining artificial neural networks, evolutionary optimization approaches, and data preprocessing tools to obtain the best possible predictions. In addition, overfitting during training can be mitigated by using more high-quality data that cover the maximum and minimum values [38,40]. K-fold cross-validation [39], which is often applied during training, can also reduce overfitting across the different models.
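As a minimal sketch of the K-fold cross-validation idea mentioned above, the generator below partitions sample indices into K folds so that every sample is used exactly once for validation. It is not part of the workflow reported in this study; the function name `k_fold_indices`, the choice of K = 5, and the fixed seed are assumptions, and only the sample count of 153 is taken from the text.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 153 samples, as in the Aflou syncline dataset; K = 5 is an assumed choice.
splits = list(k_fold_indices(153, k=5))
for train, validation in splits:
    print(len(train), len(validation))
```

A model would be retrained on each `train` subset and scored on the matching `validation` subset, and the K scores would then be averaged.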

Author Contributions

Conceptualization, M.E.S., A.H. (Abderrahmane Hamimed) and M.Z.; methodology, M.Z.; validation, M.E.S., A.H. (Abderrahmane Hamimed) and S.K.; formal analysis, A.H. (Azzaz Habib), A.H. (Abderrahmane Hamimed) and M.Z.; investigation, A.H. (Azzaz Habib), I.-M.C. and S.K.; data curation, M.E.S. and A.H. (Abderrahmane Hamimed); writing (original draft preparation), M.E.S., A.H. (Azzaz Habib), A.H. (Abderrahmane Hamimed), M.Z., I.-M.C. and S.K.; writing (review and editing), M.Z., I.-M.C. and S.K.; visualization, M.E.S., A.H. (Azzaz Habib) and A.H. (Abderrahmane Hamimed); supervision, S.K.; funding acquisition, I.-M.C. All authors have read and agreed to the published version of the manuscript.

Funding

Research for this paper was carried out under the KICT Research Program (Project No. 20250108-001, Development of IWRM-Korea Technical Convergence Platform Based on Digital New Deal) funded by the Ministry of Science and ICT. This work was also supported by the Korea Environmental Industry & Technology Institute’s Drought Response Water Management Innovation Technology Development Project funded by the Ministry of Environment (2020361002).

Data Availability Statement

The data presented in this study will be made available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
TDS: Total dissolved solids
MIN: Mineralization
MLP: Multilayer perceptron
LMBP: Levenberg–Marquardt backpropagation
AI: Artificial intelligence
WQI: Water quality index
ANN: Artificial neural networks
EC: Electrical conductivity
RBF-NN: Radial basis function neural networks
PNN: Probabilistic neural networks
FCNN: Feedforward connected neural networks
R2: Coefficient of determination
RMSE: Root mean square error
CB: Charge balance
LMBP-MLP: Levenberg–Marquardt backpropagation multilayer perceptron

References

1. UNESCO. The United Nations World Water Development Report 2018-Nature-Based Solutions for Water; UN: New York, NY, USA, 2019.
2. Boretti, A.; Rosa, L. Reassessing the projections of the world water development report. NPJ Clean Water 2019, 2, 15.
3. Canton, H. Food and agriculture organization of the United Nations - FAO. In The Europa Directory of International Organizations 2021; Routledge: Oxfordshire, UK, 2021; pp. 297–305.
4. Hamed, Y.; Hadji, R.; Redhaounia, B.; Zighmi, K.; Bâali, F.; El Gayar, A. Climate impact on surface and groundwater in North Africa: A global synthesis of findings and recommendations. Euro-Mediterr. J. Environ. Integr. 2018, 3, 25.
5. Bioud, I.; Semar, A.; Laribi, A.; Douaibia, S.; Chabaca, M.N. Assessment of groundwater quality and its suitability for irrigation: The case of Souf Valley phreatic aquifer. Alger. J. Environ. Sci. Technol. 2023, 9, 1429–1441.
6. Shiri, N.; Shiri, J.; Yaseen, Z.M.; Kim, S.; Chung, I.M.; Nourani, V.; Zounemat-Kermani, M. Development of artificial intelligence models for well groundwater quality simulation: Different modeling scenarios. PLoS ONE 2021, 16, e0251510.
7. Alizamir, M.; Ahmed, K.O.; Kim, S.; Heddam, S.; Gorgij, A.D.; Chang, S.W. Development of a robust daily soil temperature estimation in semi-arid continental climate using meteorological predictors based on computational intelligent paradigms. PLoS ONE 2023, 18, e0293751.
8. Lopes, M.B.S. The 2017 World Health Organization classification of tumors of the pituitary gland: A summary. Acta Neuropathol. 2017, 134, 521–535.
9. Khadra, F.W.; El Sibai, R.; Khadra, W.M. Deriving groundwater major ions from electrical conductivity using artificial neural networks supported by analytical hydrochemical solutions. Groundw. Sustain. Dev. 2024, 24, 101056.
10. Tao, H.; Hameed, M.M.; Marhoon, H.A.; Zounemat-Kermani, M.; Heddam, S.; Kim, S.; Sulaiman, S.O.; Tan, M.L.; Sa’adi, Z.; Mehr, A.D.; et al. Groundwater level prediction using machine learning models: A comprehensive review. Neurocomputing 2022, 489, 271–308.
11. Khudair, B.H.; Jasim, M.M.; Alsaqqar, A.S. Artificial neural network model for the prediction of groundwater quality. Civ. Eng. J. 2018, 4, 2959–2970.
12. Setshedi, K.J.; Mutingwende, N.; Ngqwala, N.P. The use of artificial neural networks to predict the physicochemical characteristics of water quality in three district municipalities, eastern cape province, South Africa. Int. J. Environ. Res. Public Health 2021, 18, 5248.
13. Stylianoudaki, C.; Trichakis, I.; Karatzas, G.P. Modeling groundwater nitrate contamination using artificial neural networks. Water 2022, 14, 1173.
14. Allawi, M.F.; Al-Ani, Y.; Jalal, A.D.; Ismael, Z.M.; Sherif, M.; El-Shafie, A. Groundwater quality parameters prediction based on data-driven models. Eng. Appl. Comput. Fluid Mech. 2024, 18, 2364749.
15. Mateo, L.F.; Más-López, M.I.; García-del-Toro, E.M.; García-Salgado, S.; Quijano, M.Á. Artificial Neural Networks to Predict Electrical Conductivity of Groundwater for Irrigation Management: Case of Campo de Cartagena (Murcia, Spain). Agronomy 2024, 14, 524.
16. Al-Sulttani, A.O.; Ali, S.K.; Abdulhameed, A.A.; Jassim, D.T. Artificial Neural Network Assessment of Groundwater Quality for Agricultural Use in Babylon City: An Evaluation of Salinity and Ionic Composition. Int. J. Des. Nat. Ecodyn. 2024, 19, 329–336.
17. Sekkoum, M.; Safa, A.; Stamboul, M. Groundwater hydrochemistry of Aflou syncline, Central Saharan Atlas of Algeria. Desalin. Water Treat. 2020, 190, 424–439.
18. Cerlini, P.B.; Silvestri, L.; Meniconi, S.; Brunone, B. Simulation of the water table elevation in shallow unconfined aquifers by means of the ERA5 soil moisture dataset: The Umbria region case study. Earth Interact. 2021, 25, 15–32.
19. Kim, S.; Cho, J.S.; Park, J.K. Hydrological analysis using the neural networks in the parallel reservoir groups, South Korea. In World Water & Environmental Resources Congress; American Society of Civil Engineers: Reston, VA, USA, 2003.
20. Kim, S.; Seo, Y.; Lee, C.J. Modeling of rainfall by combining neural computation and wavelet technique. Procedia Eng. 2016, 154, 1231–1236.
21. Zakhrouf, M.; Bouchelkia, H.; Stamboul, M.; Kim, S.; Heddam, S. Time series forecasting of river flow using an integrated approach of wavelet multi-resolution analysis and evolutionary data-driven models. A Case Study: Sebaou River (Algeria). Phys. Geogr. 2018, 39, 506–522.
22. Hagan, M.T.; Demuth, H.B.; Beale, M. Neural Network Design; PWS Publishing Co., Ltd.: Worcester, UK, 1997.
23. Haykin, S. Neural Networks: A Comprehensive Foundation; Prentice-Hall Inc.: Upper Saddle River, NJ, USA, 1999.
24. Kim, S.; Lee, S. Forecasting of flood stage using neural networks in the Nakdong river, South Korea. In Watershed Management and Operations Management; American Society of Civil Engineers: Reston, VA, USA, 2000.
25. Bishop, C.M.; Nasrabadi, N.M. Pattern Recognition and Machine Learning; Springer: New York, NY, USA, 2006.
26. Zakhrouf, M.; Bouchelkia, H.; Stamboul, M.; Kim, S.; Singh, V.P. Implementation on the evolutionary machine learning approaches for streamflow forecasting: Case study in the Seybous River, Algeria. J. Korea Water Resour. Assoc. 2020, 53, 395–408.
27. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536.
28. Nocedal, J.; Wright, S.J. Numerical Optimization; Springer: New York, NY, USA, 1999.
29. Elmeddahi, Y.; Ragab, R. Prediction of the groundwater quality index through machine learning in Western Middle Cheliff plain in North Algeria. Acta Geophys. 2022, 70, 1797–1814.
30. Ahlgren, P.; Jarneving, B.; Rousseau, R. Requirements for a cocitation similarity measure, with special reference to Pearson’s correlation coefficient. JASIST 2003, 54, 550–560.
31. Kim, S.; Seo, Y.; Malik, A.; Kim, S.; Heddam, S.; Yaseen, Z.M.; Kisi, O.; Singh, V.P. Quantification of river total phosphorus using integrative artificial intelligence models. Ecol. Indic. 2023, 153, 110437.
32. Seo, Y.; Kim, S.; Singh, V.P. Physical interpretation of river stage forecasting using soft computing and optimization algorithms. In Harmony Search Algorithm: Proceedings of the 2nd International Conference on Harmony Search Algorithm (ICHSA2015); Springer: Berlin/Heidelberg, Germany, 2016; pp. 259–266.
33. Alizamir, M.; Gholampour, A.; Kim, S.; Keshtegar, B.; Jung, W.T. Designing a reliable machine learning system for accurately estimating the ultimate condition of FRP-confined concrete. Sci. Rep. 2024, 14, 20466.
34. Nash, J.E.; Sutcliffe, J.V. River flow forecasting through conceptual models part I - A discussion of principles. J. Hydrol. 1970, 10, 282–290.
35. Kim, S.; Singh, V.P.; Lee, C.J.; Seo, Y. Modeling the physical dynamics of daily dew point temperature using soft computing techniques. KSCE J. Civ. Eng. 2015, 19, 1930–1940.
36. Reed, M.H. Calculation of multicomponent chemical equilibria and reaction processes in systems involving minerals, gases and an aqueous phase. Geochim. Cosmochim. Acta 1982, 46, 513–528.
37. Stuyfzand, P.J. Hydrogeochemcal (HGC 2.1), for Storage, Management, Control, Correction and Interpretation of Water Quality Data in Excel® Spread Sheet; KWR-Rapport B111698-002; KWR: Nieuwegein, The Netherlands, 2012.
38. Kim, S.; Kim, H.S. Uncertainty reduction of the flood stage forecasting using neural networks model. JAWRA J. Am. Water Resour. Assoc. 2008, 44, 148–165.
39. Fushiki, T. Estimation of prediction error by using K-fold cross-validation. Stat. Comput. 2011, 21, 137–146.
40. Gu, Y.; Wylie, B.K.; Boyte, S.P.; Picotte, J.; Howard, D.M.; Smith, K.; Nelson, K.J. An optimal sample data usage strategy to minimize overfitting and underfitting effects in regression tree models based on remotely-sensed data. Remote Sens. 2016, 8, 943.
41. Kisi, O.; Alizamir, M.; Trajkovic, S.; Shiri, J.; Kim, S. Solar radiation estimation in Mediterranean climate by weather variables using a novel Bayesian model averaging and machine learning methods. Neural Process. Lett. 2020, 52, 2297–2318.
Figure 1. Geographic map of study area.
Figure 2. Heatmap of statistical associations based on Pearson correlation coefficient (PCC).
Figure 3. The performance of training algorithms.
Figure 4. Effect of difference in number of hidden neurons on model accuracy in validation phase (a) LMBP-MLP1 (SO42−), (b) LMBP-MLP2 (Mg2+), (c) LMBP-MLP3 (Na+), (d) LMBP-MLP4 (Ca2+), (e) LMBP-MLP5 (Cl), (f) LMBP-MLP6 (K+), (g) LMBP-MLP7 (HCO3), and (h) LMBP-MLP8 (NO3).
Figure 5. Flowchart of modeling process.
Figure 6. Comparison between predicted and measured values of all major ions for all datasets. (a) LMBP-MLP1 (SO42−), (b) LMBP-MLP2 (Mg2+), (c) LMBP-MLP3 (Na+), (d) LMBP-MLP4 (Ca2+), (e) LMBP-MLP5 (Cl), (f) LMBP-MLP6 (K+), (g) LMBP-MLP7 (HCO3), and (h) LMBP-MLP8 (NO3).
Figure 7. The sorted charge balance (CB) values of all of the samples utilizing the predicted ions.
Figure 8. Comparison of developed models’ performance in adjacent locations (Aflou, Madna, and Ain Madhi).
Table 1. Features and output variables for developed ANN models of all major ions.

| ANN Model | Features | Output |
|---|---|---|
| LMBP-MLP1 | TDS, MIN | SO42− |
| LMBP-MLP2 | TDS, MIN, SO42− | Mg2+ |
| LMBP-MLP3 | TDS, MIN, SO42− | Na+ |
| LMBP-MLP4 | TDS, MIN, SO42−, Na+, Mg2+ | Ca2+ |
| LMBP-MLP5 | TDS, MIN, SO42−, Na+ | Cl |
| LMBP-MLP6 | TDS, MIN, SO42−, Na+ | K+ |
| LMBP-MLP7 | TDS, MIN, Mg2+ | HCO3 |
| LMBP-MLP8 | TDS, MIN, Mg2+ | NO3 |
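The dependency structure in Table 1 can be sketched as a small ordering routine. The sketch assumes that, when only TDS and MIN are measured, each ion used as a feature is itself supplied by an earlier model in the chain; this reading and the names `MODEL_FEATURES` and `prediction_order` are assumptions made for illustration, not a description of the authors' implementation.

```python
# Features of each model as listed in Table 1, keyed by the output ion.
MODEL_FEATURES = {
    "SO4": ["TDS", "MIN"],
    "Mg": ["TDS", "MIN", "SO4"],
    "Na": ["TDS", "MIN", "SO4"],
    "Ca": ["TDS", "MIN", "SO4", "Na", "Mg"],
    "Cl": ["TDS", "MIN", "SO4", "Na"],
    "K": ["TDS", "MIN", "SO4", "Na"],
    "HCO3": ["TDS", "MIN", "Mg"],
    "NO3": ["TDS", "MIN", "Mg"],
}


def prediction_order(model_features, measured=("TDS", "MIN")):
    """Return an order in which ions can be predicted so that every
    feature is either measured or already predicted."""
    available = set(measured)
    pending = dict(model_features)
    order = []
    while pending:
        ready = [ion for ion, feats in pending.items()
                 if set(feats) <= available]
        if not ready:
            raise ValueError("unresolvable feature dependency")
        for ion in ready:
            order.append(ion)
            available.add(ion)
            del pending[ion]
    return order


order = prediction_order(MODEL_FEATURES)
print(order)
```

Under this assumption the resulting order matches the model numbering LMBP-MLP1 to LMBP-MLP8, with SO42− predicted first because it depends only on the two measured inputs.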
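Table 2. Results of developed ANN models for all major ions.

| ANN Model | Output | Training R2 | Training RMSE (mg/L) | Training NSE | Validation R2 | Validation RMSE (mg/L) | Validation NSE | Test R2 | Test RMSE (mg/L) | Test NSE | All R2 | All RMSE (mg/L) | All NSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LMBP-MLP1 | SO42− | 0.923 | 65.730 | 0.920 | 0.964 | 56.970 | 0.962 | 0.842 | 53.660 | 0.840 | 0.936 | 63.368 | 0.930 |
| LMBP-MLP2 | Mg2+ | 0.921 | 14.890 | 0.918 | 0.943 | 11.800 | 0.936 | 0.980 | 12.840 | 0.978 | 0.924 | 14.274 | 0.910 |
| LMBP-MLP3 | Na+ | 0.916 | 20.230 | 0.915 | 0.927 | 17.270 | 0.926 | 0.759 | 14.960 | 0.754 | 0.916 | 19.346 | 0.910 |
| LMBP-MLP4 | Ca2+ | 0.867 | 21.990 | 0.864 | 0.887 | 23.510 | 0.878 | 0.945 | 36.460 | 0.941 | 0.892 | 24.034 | 0.889 |
| LMBP-MLP5 | Cl | 0.865 | 44.640 | 0.857 | 0.902 | 43.600 | 0.898 | 0.895 | 30.530 | 0.892 | 0.872 | 43.296 | 0.870 |
| LMBP-MLP6 | K+ | 0.533 | 2.990 | 0.535 | 0.601 | 2.850 | 0.531 | 0.045 | 6.480 | 0.003 | 0.441 | 3.482 | 0.440 |
| LMBP-MLP7 | HCO3 | 0.300 | 64.250 | 0.301 | 0.630 | 37.760 | 0.540 | 0.366 | 41.720 | 0.361 | 0.330 | 59.029 | 0.320 |
| LMBP-MLP8 | NO3 | 0.325 | 43.400 | 0.325 | 0.865 | 40.870 | 0.823 | 0.004 | 40.460 | −0.933 | 0.523 | 41.886 | 0.510 |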
Table 3. The values of charge balance (CB) and the evaluation of samples (Aflou).

| Location | SO42− Pred. | NO3 Meas. | HCO3 Meas. | Cl Pred. | Ca2+ Pred. | Mg2+ Pred. | Na+ Pred. | K+ Pred. | CB % | Evaluation |
|---|---|---|---|---|---|---|---|---|---|---|
| Aflou | 105 | 5 | 240 | 45 | 88 | 23 | 23 | 7 | 0.03 | Good |
| Aflou | 410 | 30 | 326 | 135 | 152 | 68 | 82 | 7 | 3.54 | Good |
| Aflou | 906 | 15 | 273 | 210 | 292 | 146 | 168 | 14 | 7.45 | Moderate |
| Aflou | 393 | 14 | 239 | 190 | 153 | 61 | 86 | 7 | 3.23 | Good |
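Table 4. The values of charge balance (CB) and the evaluation of samples (Ain Madhi).

| Location | SO42− Pred. | NO3 Meas. | HCO3 Meas. | Cl Meas. | Ca2+ Pred. | Mg2+ Pred. | Na+ Pred. | K+ Meas. | CB % | Evaluation |
|---|---|---|---|---|---|---|---|---|---|---|
| Ain Madhi | 434 | 9 | 237 | 145 | 206 | 83 | 102 | 5 | 11.60 | Poor |
| Ain Madhi | 426 | 10 | 232 | 145 | 206 | 80 | 104 | 5 | 11.90 | Poor |
| Ain Madhi | 123 | 13 | 212 | 70 | 95 | 27 | 28 | 2 | 0.07 | Good |
| Ain Madhi | 124 | 16 | 185 | 93 | 95 | 27 | 28 | 2 | 1.57 | Good |
| Ain Madhi | 1677 | 4 | 237 | 400 | 177 | 298 | 270 | 15 | 4.87 | Good |
| Ain Madhi | 1227 | 10 | 217 | 370 | 576 | 173 | 247 | 12 | 15.28 | Poor |
| Ain Madhi | 291 | 13 | 247 | 240 | 187 | 38 | 140 | 6 | 4.51 | Good |
| Ain Madhi | 281 | 2 | 241 | 205 | 183 | 38 | 131 | 6 | 7.40 | Moderate |
| Ain Madhi | 352 | 5 | 162 | 220 | 163 | 50 | 104 | 15 | 2.65 | Good |
| Ain Madhi | 257 | 34 | 144 | 155 | 128 | 41 | 58 | 6 | 0.77 | Good |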
Table 5. The values of charge balance (CB) and the evaluation of samples (Madna).

| Location | SO42− Pred. | NO3 Meas. | HCO3 Pred. | Cl Meas. | Ca2+ Pred. | Mg2+ Pred. | Na+ Pred. | K+ Meas. | CB % | Evaluation |
|---|---|---|---|---|---|---|---|---|---|---|
| Madna | 888 | 7 | 230 | 354 | 199 | 142 | 221 | 12 | 1.28 | Good |
| Madna | 903 | 84 | 281 | 250 | 236 | 148 | 209 | 14 | 2.44 | Good |
| Madna | 631 | 54 | 263 | 257 | 185 | 106 | 155 | 12 | 1.11 | Good |
| Madna | 437 | 17 | 247 | 198 | 213 | 82 | 107 | 7 | 7.77 | Moderate |
| Madna | 629 | 71 | 266 | 250 | 181 | 104 | 158 | 14 | 1.64 | Good |
| Madna | 65 | 3 | 167 | 29 | 73 | 16 | 14 | 4 | 6.72 | Moderate |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Stamboul, M.E.; Habib, A.; Hamimed, A.; Zakhrouf, M.; Chung, I.-M.; Kim, S. Extraction of Major Groundwater Ions from Total Dissolved Solids and Mineralization Using Artificial Neural Networks: A Case Study of the Aflou Syncline Region, Algeria. Hydrology 2025, 12, 103. https://doi.org/10.3390/hydrology12050103
