1. Introduction
The properties of electrolyte solutions, such as osmotic coefficients, deviate from ideal models due to electrostatic forces and ion–water interactions. Accurately quantifying these properties is not merely a fundamental scientific exercise; it is a prerequisite for optimizing industrial processes with significant sustainability implications [1]. For instance, in water desalination and ion separation, a precise understanding of these properties enables the design of more energy-efficient systems, directly contributing to monitoring and reducing the socio-economic costs associated with water and energy use.
In recent decades, many researchers have focused on the osmotic coefficient because of its importance in various aqueous solutions. Ibrahim et al. [2] studied the thermodynamic properties of various aqueous sugar solutions. They used both a Perturbed Hard Sphere Chain equation of state and an Artificial Neural Network (ANN) model, with a specific focus on accurately predicting the osmotic coefficient. Abedi et al. [3] studied drug–biomolecule interactions in water by measuring osmotic coefficients in mixed solutions of certain drugs and amino acids. Their findings, obtained by vapor pressure osmometry at body temperature (310.15 K), showed that osmotic behavior depends on the specific amino acid and suggested ion-pair formation between drug molecules. Patil et al. [4] measured the osmotic coefficients of two bio-ionic liquid solutions in water. Their analysis showed that these liquids act as electrolytes, with their thermodynamic behavior significantly influenced by hydrophobic hydration effects, similar to other ionic liquids.
Given its critical role in understanding electrolyte equilibrium, the osmotic coefficient has been the subject of considerable study. Xin et al. [5] measured vapor pressure lowering to determine the osmotic coefficients of lithium salts in organic solvents at 298.15 K. The data were modeled to calculate the salts' activity coefficients, which is relevant to lithium-ion battery electrolyte research. Grundl et al. [6] investigated the osmotic coefficients and water activity of binary water/5-(hydroxymethyl)furfural and ternary water/5-(hydroxymethyl)furfural/salt solutions using vapor pressure osmometry at 298.15 K. They used a Pitzer-type model for the binary systems and the Zdanovskii–Stokes–Robinson (ZSR) mixing rule for the ternary systems to calculate the activity coefficients of the components. Meng et al. [7] used molecular dynamics simulations and Raman spectroscopy to study NH4Cl solutions, finding that contact ion pairs form at higher concentrations, altering the hydrogen-bond structure. Their work also established a preliminary link between the solution's osmotic coefficient and the specific configuration of its hydrogen bonds. Wu et al. [8] modeled the activity and osmotic coefficients of rubidium-containing electrolyte solutions using two computational models, the Electrolyte Molecular Interaction Volume Model (eMIVM) and the Electrolyte Molecular Interaction Volume Model–Energy Term (eMIVM-ET). Their results show that eMIVM-ET performed better for predicting the properties of mixed electrolyte solutions, while both models effectively described single-electrolyte systems. Rudakov et al. [9] developed a thermodynamic model focusing on the osmotic coefficient of 2-1 electrolyte solutions. The model, which accounts for hydration and ion pairing, successfully described the osmotic behavior of CaCl2 solutions across a wide temperature range.
Improving prediction accuracy in chemical systems, especially for thermodynamic properties, has received increasing attention through machine learning algorithms [10,11]. Therefore, in this study, focusing on the osmotic coefficient in chemical processes, particularly desalination, the aim is to predict the osmotic coefficient of aqueous electrolyte systems for various chloride, sulfate, and phosphate mineral salts. The approach combines an optimization algorithm with machine learning models. Specifically, the hyperparameters of two machine learning algorithms, Decision Tree (DT) and Gradient Boosting Machine (GBM), are optimized using the Gazelle Optimization Algorithm (GOA), a combination that, to the best of our knowledge, has not been evaluated in previous studies for predicting osmotic coefficients in electrolyte systems.
2. Data and Methods
2.1. GOA-DT Hybrid Approach
The Gazelle Optimization Algorithm (GOA) is a contemporary metaheuristic inspired by the natural behaviors of gazelles, which alternate between escaping predators (promoting exploration) and grazing in safe zones (promoting exploitation) when addressing optimization problems [12]. In this study, the GOA is employed to autonomously identify the optimal hyperparameters of a Decision Tree (DT) model intended to predict osmotic coefficients. The methodology follows a systematic, sequential procedure.
Initially, the dataset is prepared for analysis. It comprises 27 features and 893 samples, which are organized into a feature matrix and a target vector. The data are subsequently partitioned into two subsets, with 70 percent allocated for model training and the remaining 30 percent reserved for final testing. This division ensures that the model is ultimately assessed on unseen data, thereby providing a reliable evaluation of its generalization capability. The GOA is then configured for the optimization task. The algorithm begins by generating a random population of 50 candidate solutions, each representing a potential combination of three Decision Tree hyperparameters: the minimum leaf size, the maximum number of splits, and the minimum parent size. The search space is constrained by predefined minimum and maximum allowable values for each parameter. The core optimization loop subsequently commences. During each iteration, every candidate updates its position according to one of two simple rules selected at random. The first rule governs exploitation, whereby the candidate moves toward the best solution identified thus far and another randomly selected candidate, thereby concentrating the search in promising regions. The second rule governs exploration, whereby the candidate undertakes a randomized step to investigate new areas of the search space and avoid premature convergence. The magnitude of these steps diminishes progressively over time, allowing the search to begin broadly and become increasingly refined as the algorithm progresses. Following the positional update, the quality of each candidate is assessed using a dedicated objective function. This function takes the proposed hyperparameters, trains a Decision Tree on the training data using five-fold cross-validation, and returns the R2 value derived from this validation process.
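The two position-update rules described above can be condensed into a short sketch. This is an illustrative simplification, not the authors' implementation: the function name goa_step, the even 0.5 probability of choosing each rule, and the linear decay of the step weight are assumptions made for clarity.

```python
import random

def goa_step(candidates, best, bounds, t, t_max):
    """One illustrative GOA iteration. Each candidate either exploits
    (moves toward the best-so-far solution and a random peer) or explores
    (takes a randomized step); the step magnitude shrinks over time."""
    w = 1.0 - t / t_max  # decreasing weight: broad search early, refined late
    updated = []
    for x in candidates:
        peer = random.choice(candidates)
        if random.random() < 0.5:
            # Exploitation: drift toward the best solution and a random peer.
            y = [xi + w * random.random() * (bi - xi)
                 + w * random.random() * (pi - xi)
                 for xi, bi, pi in zip(x, best, peer)]
        else:
            # Exploration: randomized step inside the search space.
            y = [xi + w * (random.uniform(lo, hi) - xi)
                 for xi, (lo, hi) in zip(x, bounds)]
        # Keep every hyperparameter within its allowed range.
        updated.append([min(max(yi, lo), hi)
                        for yi, (lo, hi) in zip(y, bounds)])
    return updated
```

In the actual procedure, each updated candidate would then be scored by the cross-validated objective function before selection.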
The incorporation of cross-validation is critical, as it evaluates the performance of each hyperparameter configuration across different subsets of the training data, thereby mitigating the risk of overfitting and steering the GOA toward a robust solution. Selection is then performed: if a candidate's new position yields a superior fitness score compared to its previous position, the updated configuration is retained. Throughout this process, the best solution discovered by any candidate in the entire population is continuously tracked. This iterative procedure continues over numerous cycles, during which the population of candidates gradually converges toward the optimal set of Decision Tree hyperparameters. The best fitness score from each iteration is recorded to illustrate the algorithm's progressive improvement over time.
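The five-fold partitioning at the heart of this cross-validation can be illustrated with a minimal index-splitting sketch; kfold_indices is a hypothetical helper, not the authors' code, and a production version would shuffle the indices first.

```python
def kfold_indices(n_samples, k=5):
    """Split sample indices into k near-equal folds; each fold serves once
    as the validation set while the remaining folds form the training set."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    # Return (train, validation) index pairs for each of the k rounds.
    return [(sorted(set(range(n_samples)) - set(f)), f) for f in folds]
```

The candidate's fitness would then be the mean R2 over the k validation folds.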
Upon completion of the GOA, the single best set of hyperparameters is extracted. A final Decision Tree model is trained on the entirety of the training dataset using these optimal settings. This model is then applied once to the separate test set, which was not involved in the tuning process, to generate the final predictions. Its performance is quantified using evaluation metrics including R2 and Root Mean Squared Error (RMSE).
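Both evaluation metrics follow directly from their definitions, R2 = 1 - SS_res/SS_tot and RMSE = sqrt(mean squared residual); the sketch below is a plain-Python illustration, not the code used in the study.

```python
import math

def r2_and_rmse(y_true, y_pred):
    """Coefficient of determination (R2) and root mean squared error."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    # Residual sum of squares and total sum of squares around the mean.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)
```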
Finally, all results are presented comprehensively. This includes the values of the three optimal hyperparameters, the performance metrics for both the training and test datasets, and several graphical plots. These visualizations depict the GOA convergence trajectory over successive iterations, a comparison between actual and predicted values, and a bar chart illustrating the relative importance of each of the 27 input features as determined by the Decision Tree model.
2.2. GOA-GBM Hybrid Approach
In this hybrid methodology, the Gazelle Optimization Algorithm is applied to automatically determine the optimal hyperparameters for a Gradient Boosting Machine (GBM) model designed to predict the osmotic coefficient. The entire procedure adheres to a clear and structured workflow.
The process commences with data preparation. The feature data and target labels are loaded and transposed into the appropriate format, with the target variable being formatted as a column vector. This yields a feature matrix and a target vector. Subsequently, the data are divided into training and testing subsets. To ensure reproducibility of results, a fixed random seed is established. The total samples are randomly shuffled, with 70 percent allocated to the training set and the remaining 30 percent reserved as an independent test set. This separation creates distinct data partitions for model development and for the unbiased evaluation of final model performance. The GOA optimization framework is then established. Key algorithmic parameters are defined, including a population of 30 candidate solutions and search boundaries for five GBM hyperparameters. These parameters are the number of trees, the learning rate, the minimum leaf size, the maximum number of splits, and the minimum parent size. The GOA population is initialized by randomly assigning each candidate a value for each hyperparameter within the specified ranges, with integer parameters subsequently rounded. The fitness of each candidate is evaluated through a dedicated objective function, and the best solution within the initial population is identified. The principal GOA loop executes for a predetermined number of iterations. During each iteration, every candidate updates its position according to one of two straightforward strategies selected randomly. The first strategy involves exploitation, wherein the candidate moves toward the current best solution and a randomly selected peer. The second strategy involves exploration, wherein the candidate undertakes a randomized step. A decreasing weight factor ensures that exploration is more pronounced in early iterations and gradually gives way to exploitation in later stages. 
Following each positional update, the new configuration is verified to remain within the established bounds. The quality of the updated position is evaluated using the objective function, which accepts the proposed hyperparameters, performs five-fold cross-validation on the training set, and returns the average R2 value. The use of cross-validation in this context prevents overfitting and provides reliable guidance for the search process. If the new position yields an improved fitness score, it replaces the previous configuration. The overall best solution is updated as necessary, and the best fitness score from each iteration is recorded to monitor progress. Upon completion of the GOA, the optimal hyperparameters are extracted from the best solution identified. These five values represent the optimal number of trees, learning rate, minimum leaf size, maximum number of splits, and minimum parent size. A final GBM model is then trained on the complete training set using these optimal settings, employing the least-squares boosting (LSBoost) method. Importantly, this final training phase does not involve cross-validation but instead utilizes all training data to construct the most robust model possible for subsequent testing. This final model is employed to generate predictions on both the training set and the held-out test set. Performance metrics, including R2 and Root Mean Squared Error (RMSE), are calculated for both datasets. These metrics illuminate the model's proficiency in learning the training data and, more importantly, its accuracy in generalizing to new, unseen test data. The analysis also computes feature importance, revealing which of the 27 input features the GBM model found most valuable for prediction. Additionally, learning curves are generated by tracking the reduction in model error (MSE) on both the training and test sets as the number of trees in the ensemble increases.
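The seeded 70/30 shuffle-and-split step used in both hybrid approaches can be sketched as follows. This is a minimal illustration; the specific seed value (42) and helper name split_indices are assumptions, not taken from the study.

```python
import random

def split_indices(n_samples, train_frac=0.7, seed=42):
    """Shuffle sample indices with a fixed seed, then cut them into
    reproducible train and test partitions."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # fixed seed -> identical split every run
    cut = int(n_samples * train_frac)
    return idx[:cut], idx[cut:]
```

Because the seed is fixed, every optimization experiment evaluates its candidates against exactly the same partitions, which is what makes the model comparisons fair.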
2.3. Data Collection
In this study, 893 samples were collected to evaluate and predict the osmotic coefficient in the equilibrium systems of inorganic materials. The dataset includes 27 parameters: HCl, LiCl, NaCl, KCl, NH4Cl, CsCl, MgCl2, CaCl2, BaCl2, Li2SO4, Na2SO4, K2SO4, (NH4)2SO4, MgSO4, MnSO4, NiSO4, CuSO4, ZnSO4, NaH2PO4, KH2PO4, (NH4)H2PO4, Na2HPO4, K2HPO4, (NH4)2HPO4, Na3PO4, K3PO4, and (NH4)3PO4. The experimental osmotic coefficient data were obtained from published scientific studies and were carefully processed to ensure reliable and consistent results. This preprocessing included unit normalization to maintain consistent measurement scales across all parameters, as well as outlier removal to eliminate unusual data points that could distort the results. For an accurate evaluation of model performance, the dataset was randomly split into training (70%) and testing (30%) subsets using a fixed random seed, ensuring reproducibility and fair comparison across different optimization experiments. Performance metrics such as R2 were calculated exclusively on the test set.
Table 1 provides detailed information about the collected dataset, including the target minerals. The dataset used for training and evaluating the performance of the algorithms in this study is provided in the Supplementary Materials.
3. Results and Discussion
In Figure 1, the complete correlation matrix is presented to evaluate the effect of dissolved mineral components on the osmotic coefficient in an equilibrium system. The figure shows that most features have low correlations with each other, indicating that each mineral contributes relatively independently to the osmotic behavior of the solution. Therefore, the osmotic coefficient is controlled by the combined influence of multiple ions rather than by a single dominant component. Chloride-based salts show moderate positive correlations among themselves. This suggests that minerals such as NaCl, KCl, and NH4Cl affect the osmotic coefficient in a similar way because they have comparable ion–water interactions. These salts mainly control osmotic pressure through ionic strength and the hydration of monovalent and divalent cations. Sulfate salts form another clearly correlated group. The strong correlation between different sulfate minerals indicates that the sulfate anion plays an important role in determining the osmotic coefficient. Sulfate ions carry a higher charge and exhibit stronger electrostatic interactions, which increase non-ideal behavior in the solution and significantly affect osmotic properties. In contrast, phosphate-based salts show very weak correlations with other minerals. This means their effect on the osmotic coefficient is different and more complex. Phosphate ions have higher valence and stronger ion pairing, which leads to nonlinear and system-specific effects under equilibrium conditions. Based on these results, the osmotic coefficient in a mineral equilibrium system is mainly influenced by ion type, ionic charge, and anion group. Multivalent ions generally have a stronger impact than monovalent ions. The low correlation between most features shows that using all mineral components is important for accurate modeling and prediction of the osmotic coefficient.
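Each entry of such a correlation matrix is the Pearson coefficient between two concentration columns; a minimal plain-Python sketch (a hypothetical helper, shown only to make the computation concrete):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Covariance numerator and the two standard-deviation factors.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Applying this function to every pair of the 27 mineral columns reproduces the structure of the matrix: values near zero for independent features and values near one within a strongly co-varying group.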
In Figure 2, the feature importance for predicting the osmotic coefficient is presented using the Out-of-Bag (OOB) method. This chart shows which chemical features have the strongest influence on the osmotic coefficient. The most important feature is LiCl, with the highest importance score, meaning it has the greatest effect on the osmotic coefficient. It is followed by CaCl2 and MgCl2, which also show strong influence. Features such as NiSO4 and HCl have moderate importance, while chemicals such as NaCl, MnSO4, and ZnSO4 have a lower impact. At the bottom of the ranking, features such as K3PO4, Na3PO4, and (NH4)3PO4 have considerably lower effects on the osmotic coefficient. This means that changes in these low-importance features do not significantly change the predicted osmotic coefficient value.
Figure 2 helps identify which chemical factors should be prioritized when modeling or controlling the osmotic coefficient. As shown in Figure 2, LiCl exhibits the highest feature importance score, followed by NaCl and KCl. This dominance of LiCl can be explained by its unique physicochemical properties: the small ionic radius and high charge density of Li+ lead to a strong hydration shell and pronounced ion–dipole interactions with water molecules, which significantly alter the hydrogen bonding network and thus the osmotic coefficient of the solution. In contrast, larger alkali ions (Na+, K+, Cs+) have lower charge densities and weaker hydration effects, resulting in relatively lower importance scores. Among the divalent cations (Mg2+, Ca2+, Ba2+), despite their stronger electrostatic interactions, the overall influence on the osmotic coefficient in the present dataset is moderate, likely due to different ion pairing behavior and solubility constraints.
In Figure 3, the training and test results for the hybrid GOA-DT approach are shown. This approach was used to predict the osmotic coefficient in an equilibrium system containing inorganic materials. The optimal hyperparameters identified were a minimum leaf size of 2, a maximum of 199 splits, and a minimum parent size of 2. The model demonstrated high performance on both datasets. On the training data, the model achieved an excellent R2 score of 0.9670 and a very low Root Mean Squared Error (RMSE) of 0.0604. When evaluated on the independent test set, the model maintained strong predictive accuracy with an R2 of 0.9260 and an RMSE of 0.0947. These results indicate that the GOA-optimized Decision Tree model is highly effective and shows good generalization to unseen data.
In Figure 4, the training and test results for the hybrid GOA-GBM approach are presented. The Gradient Boosting Machine (GBM) model was optimized using the Gazelle Optimization Algorithm (GOA). The optimal hyperparameters identified were 427 trees, a learning rate of 0.3680, a minimum leaf size of 1, and a maximum of 3 splits per tree. The model demonstrated exceptional performance on both datasets. On the training data, the model achieved a near-perfect R2 score of 0.9974 and a very low Root Mean Squared Error (RMSE) of 0.0171. When evaluated on the independent test set, the model maintained outstanding predictive accuracy with an R2 of 0.9734 and an RMSE of 0.0568. These results indicate that the GOA-optimized GBM model is highly effective and shows excellent generalization capability for predicting the osmotic coefficient in equilibrium systems containing inorganic materials.
The results reveal distinct strengths and weaknesses for each of the two hybrid optimization methods. The GOA-GBM model demonstrated superior predictive accuracy, achieving the highest test R2 (0.9734) and lowest test RMSE (0.0568). Its primary strength lies in its exceptional learning capability and generalization power, making it the most reliable model for this application. However, its weakness is increased model complexity, which can make it computationally more expensive and less interpretable than simpler models.
The GOA-DT model showed very strong performance with a test R2 of 0.9260, offering an excellent balance between accuracy and simplicity. Its strength is providing near-state-of-the-art results while maintaining model transparency and faster execution. The noticeable gap between its training R2 (0.9670) and test R2 suggests a potential weakness: slight overfitting to the training data compared to the more stable GBM approach.
In Table 2, the performance results of the two machine learning algorithms without optimization are presented. The comparison between the baseline models (without optimization) and the hybrid GOA-based models clearly highlights the impact of hyperparameter optimization on predictive performance. Among the baseline models, the GBM algorithm exhibited the best performance, achieving a test R2 of 0.9292 and a test RMSE of 0.0927. This indicates that boosting-based methods inherently possess strong learning and generalization capabilities even without optimization, while the Decision Tree showed the lowest accuracy, confirming its sensitivity to model configuration and tendency toward suboptimal generalization. When compared to the optimized models presented earlier (GOA-DT and GOA-GBM), a substantial improvement in performance is observed. The GOA-GBM model significantly outperformed its baseline counterpart, improving the test R2 from 0.9292 to 0.9734 and reducing the RMSE from 0.0927 to 0.0568. This demonstrates the effectiveness of GOA in fine-tuning critical hyperparameters such as the learning rate and tree structure. Similarly, the GOA-DT model showed a remarkable improvement over the standard Decision Tree, increasing the test R2 from 0.8115 to 0.9260, which indicates that optimization plays a crucial role in enhancing simpler models.