Machine Learning Prediction of CO2 Diffusion in Brine: Model Development and Salinity Influence Under Reservoir Conditions

Khan, Qaiser; Pourafshary, Peyman; Hadavimoghaddam, Fahimeh; Khoramian, Reza

doi:10.3390/app15158536


¹ School of Mining and Geosciences, Nazarbayev University, 010000 Astana, Kazakhstan

² Chemical Engineering Department, Ufa State Petroleum Technological University, 450000 Ufa, Russia

³ Institute of Unconventional Oil & Gas, Northeast Petroleum University, Daqing 163318, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(15), 8536; https://doi.org/10.3390/app15158536

Submission received: 22 May 2025 / Revised: 27 June 2025 / Accepted: 3 July 2025 / Published: 31 July 2025


Abstract

The diffusion coefficient (DC) of CO₂ in brine is a key parameter in geological carbon sequestration and CO₂-Enhanced Oil Recovery (EOR), as it governs mass transfer efficiency and storage capacity. This study employs three machine learning (ML) models, Random Forest (RF), Gradient Boosting Regressor (GBR), and Extreme Gradient Boosting (XGBoost), to predict DC based on pressure, temperature, and salinity. The dataset, comprising 176 data points, spans pressures from 0.10 to 30.00 MPa, temperatures from 286.15 to 398.00 K, salinities from 0.00 to 6.76 mol/L, and DC values from 0.13 to 4.50 × 10⁻⁹ m²/s. The data were split into 80% for training and 20% for testing to ensure reliable model evaluation. Model performance was assessed using R², RMSE, and MAE. The RF model demonstrated the best performance, with an R² of 0.95, an RMSE of 0.03, and an MAE of 0.11 on the test set, indicating high predictive accuracy and generalization capability. In comparison, GBR achieved an R² of 0.925, and XGBoost achieved an R² of 0.91 on the test set. Feature importance analysis consistently identified temperature as the most influential factor, followed by salinity and pressure. This study highlights the potential of ML models for predicting CO₂ diffusion in brine, providing a robust, data-driven framework for optimizing CO₂-EOR processes and carbon storage strategies. The findings underscore the critical role of temperature in diffusion behavior, offering valuable insights for future modeling and operational applications.

Keywords:

diffusion; machine learning; random forest; gradient boosting regressor; extreme gradient boosting; enhanced oil recovery

1. Introduction

Anthropogenic greenhouse gas (GHG) emissions have recently reached critically high levels, significantly contributing to global warming and causing long-term changes in Earth’s climate system. To mitigate these effects, reducing atmospheric CO₂ concentrations has become an urgent priority [1]. One effective solution is carbon capture and storage (CCS), where CO₂ is either utilized for enhanced oil recovery (EOR) or permanently stored in deep saline aquifers. During this process, CO₂ diffuses into the surrounding brine due to concentration gradients, following the natural drive toward thermodynamic equilibrium [2]. This dissolution reduces the buoyancy-driven vertical migration of CO₂, leading to greater plume stability and more secure sequestration. A detailed understanding of the diffusion process enhances the accuracy of CO₂ transport models, supports optimized storage design, and reduces the potential for long-term leakage—ultimately improving the environmental and operational safety of CCS and EOR initiatives [3].

The diffusion coefficient (DC) is a crucial parameter in determining the rate of CO₂ movement into the brine. Higher DC values correspond to faster and more efficient diffusion, contributing to improved storage stability [4]. Since DC governs the transition of CO₂ from a free to a dissolved phase, its accurate determination is essential for evaluating sequestration efficiency. Precise measurement under varying reservoir conditions, particularly temperature, pressure, and salinity, is vital for predicting CO₂ behavior in saline aquifers. Three main approaches are used to determine DC: (a) experimental methods [5], (b) empirical or semi-empirical correlations [6], and (c) molecular simulations [7]. In addition, artificial intelligence (AI) techniques offer a powerful alternative by integrating data from all three sources to build robust predictive models capable of handling diverse and non-linear conditions.

Several experimental methods have been developed to measure the DC of gases, as reviewed in multiple studies. These methods, illustrated in Figure 1, include Taylor dispersion [8], laser-induced fluorescence [9], and pressure decay techniques [10]. Among them, the pressure decay method is widely used due to its reliability, simplicity, and suitability for controlled environments. Since the 1930s, the PVT method has also been extensively applied, often in combination with pressure decay, to assess CO₂ diffusivity [11]. Experimental results show that DC is sensitive to pressure, temperature, and salinity. For example, Zhang et al. (2015) [12] reported values between 1.3 × 10⁻⁹ and 2.7 × 10⁻⁹ m²/s in NaCl solutions over 0.5–2.0 MPa and 25–70 °C. Yang and Gu (2006) [13] observed much higher DC values (170.7–269.8 × 10⁻⁹ m²/s) in brines at 2.6–7.5 MPa and 27–58 °C, likely due to natural convection. Lu et al. (2013) [14], using Raman spectroscopy at 10–45 MPa, found nearly pressure-independent behavior. Zarghami et al. (2017) [15] showed that increased salinity (20–80 ppm NaCl) at 68 °C significantly reduces diffusivity.

The CO₂ diffusion coefficient in brine and pure water has been widely measured and reported. For example, Cadogan et al. (2014) [16] used the Taylor dispersion method and reported diffusivity values approximately 16% higher than those of Ratcliff and Holdcroft (1963) [17], who applied the wetted sphere absorber technique, particularly under low salinity (1 mol·L⁻¹) and similar temperature and pressure conditions. Zhang et al. (2015) [12] studied CO₂ diffusion in 3 wt% brine under offshore conditions (0.1–5 MPa, 286.15–303.15 K), reporting values from 1.3 to 2.7 × 10⁻⁹ m²/s and observing a linear increase with both pressure and temperature. Sell et al. (2013) [18], using microfluidics, showed that higher salinity inhibits diffusion due to stronger ion–water interactions. The Taylor dispersion method has been especially useful across a broad range of temperatures and pressures, as shown by Cadogan et al. (2014) [16], who highlighted the positive effect of temperature on diffusivity due to enhanced molecular motion. Tewes and Boury (2005) [19] employed the pendant drop technique to investigate gas–liquid interfacial dynamics, revealing key diffusion behaviors. Hirai et al. (1997) [20] used laser-induced fluorescence (LIF) to validate theoretical predictions under high-pressure conditions. Pressure decay methods have also shown significant sensitivity to ionic strength: Azin et al. (2013) [21] and Yang and Gu (2006) [22] found that increased salinity and ionic strength reduce CO₂ diffusivity, with the dissolved ions acting as physical barriers to molecular transport.

Although experimental methods provide accurate CO₂ DC values, they are time-consuming, costly, and require specialized instrumentation. As a result, empirical correlations have become a practical alternative, particularly when direct measurements are unavailable. For example, the Stokes–Einstein relation, as applied by Cadogan et al. (2015) [23], uses viscosity, temperature, and solute radius but performs poorly in high-salinity systems (>5 M NaCl) due to ion pairing and viscosity effects. While it works for simple NaCl brines, it fails in multicomponent systems with divalent ions like Mg²⁺ and Ca²⁺. Similarly, the Wilke and Chang (1955) correlation [24] incorporates temperature, viscosity, solute size, and an association parameter but is limited to dilute, non-electrolytic solutions. It neglects ionic strength and solute hydration, leading to significant errors in concentrated brines (>1 M salinity) [24,25,26]. Overall, these correlations struggle with complex reservoir conditions and cannot reliably predict DC in realistic, high-salinity brine systems.

Beyond experiments and empirical correlations, molecular dynamics (MD) simulations offer atomic-scale insights into CO₂ diffusion in brine. Studies by Garcia-Rates et al. (2012) [25] and Omrani et al. (2022) [27] highlight the role of ion hydration and salinity in controlling diffusivity. While powerful, MD simulations require validation under extreme reservoir conditions and are often combined with experimental methods like Raman spectroscopy and pressure decay to improve reliability. These hybrid approaches help capture complex interactions involving multivalent ions and ionic strength effects.
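For orientation, the Stokes–Einstein and Wilke–Chang correlations discussed in this section can be evaluated for CO₂ in pure water. The sketch below is illustrative only: the hydrodynamic radius, molar volume, and viscosity are assumed round values, not parameters taken from the cited studies, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def stokes_einstein(temp_k, viscosity_pa_s, radius_m):
    """D = k_B * T / (6 * pi * eta * r), returned in m^2/s."""
    return K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


def wilke_chang(temp_k, viscosity_cp, solvent_molar_mass, solute_molar_volume,
                association=2.6):
    """Wilke-Chang estimate for a dilute solute, returned in m^2/s.

    The correlation yields cm^2/s with viscosity in cP and the solute molar
    volume in cm^3/mol; association=2.6 is the usual value for water.
    """
    d_cm2_s = (7.4e-8 * math.sqrt(association * solvent_molar_mass) * temp_k
               / (viscosity_cp * solute_molar_volume ** 0.6))
    return d_cm2_s * 1e-4  # convert cm^2/s to m^2/s


# Assumed inputs for CO2 in pure water near 298 K (illustrative values).
T = 298.15        # temperature, K
ETA_CP = 0.89     # water viscosity, cP (0.89e-3 Pa*s)
R_CO2 = 1.7e-10   # assumed hydrodynamic radius of CO2, m
V_CO2 = 34.0      # assumed molar volume of CO2, cm^3/mol

d_se = stokes_einstein(T, ETA_CP * 1e-3, R_CO2)
d_wc = wilke_chang(T, ETA_CP, 18.02, V_CO2)
print(f"Stokes-Einstein: {d_se:.2e} m^2/s")
print(f"Wilke-Chang:     {d_wc:.2e} m^2/s")
```

Both estimates land in the 10⁻⁹ m²/s range, but salinity enters only indirectly through the viscosity, which is consistent with the limitation described above for concentrated brines.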

Traditional methods, including correlations and MD simulations, provide useful insights but face challenges with large datasets and non-linear interactions. Experimental techniques [28,29,30] are also time- and resource-intensive. Machine learning (ML) models address these issues by integrating experimental and simulation data to predict CO₂ diffusivity with high accuracy and efficiency [31]. Feng et al. (2019) [32] and Bemani et al. (2020) [33] developed ML models—MKSVM-GA and PSO-ANFIS—for predicting CO₂ diffusivity in brines. However, both used limited datasets with narrow salinity and pressure ranges. While high accuracy was reported, the small sample sizes risk overfitting and limit generalization. Their model parameters and performance metrics are summarized in Table 1.

The present study addresses limitations of previous works by developing a more robust and diverse predictive framework tailored for CCS applications. It aims to predict CO₂ diffusion coefficients in brine using advanced machine learning models, namely Random Forest (RF), Gradient Boosting Regressor (GBR), and XGBoost, trained on a dataset of 176 experimental and simulation data points. The data span pressures of 0.1–30 MPa, temperatures of 286.15–398 K, and salinities up to 6.76 mol/L. RF was chosen for its robustness against overfitting, while GBR and XGBoost employ sequential learning to capture complex patterns. These models, applied here for the first time in this context, use temperature, pressure, and salinity as inputs. Results confirm salinity as the second most influential factor after temperature due to its effect on solvent viscosity, density, and molecular mobility. The workflow is summarized in Figure 2, and the findings provide a robust, data-driven approach to support CCS and EOR design.

2. Data Description

In this study, a dataset of 176 data points was gathered from studies reported in the literature, including experimental and MD simulation data points [3,16,18,21,25,30,36,37,38,39,40,41]. While the experimental dataset was selected due to its high quality, public availability, and strong relevance to our study objectives, simulation data were also incorporated to address the limitations posed by the scarcity of experimental measurements. Simulation offers a controlled environment in which variables can be systematically manipulated, enabling exploration across a wider parameter space. This approach allows us to generate meaningful findings that support our research objectives, particularly in areas where experimental data are difficult, costly, or time-consuming to obtain. Three input parameters, pressure (MPa), temperature (K), and salinity (mol/L), were considered in the data modeling process. The descriptive statistics for these parameters, presented in Table 2, highlight the diversity of the collected dataset. Notably, the mean values of pressure, temperature, salinity, and DC are 12.15 MPa, 320.06 K, 2.35 mol/L, and 2.07 × 10⁻⁹ m²/s, respectively, with corresponding standard deviations reflecting significant variability, especially for salinity (2.42 mol/L) and pressure (7.99 MPa). The dataset covers pressures from 0.10 MPa to 30.00 MPa, temperatures from 286.15 K to 398.00 K, and salinities from 0 mol/L to 6.76 mol/L. Quartile analysis provides additional insights into the data distribution, with Q1, Q2 (median), and Q3 values indicating lower, central, and upper thresholds, as detailed in Table 2. Together, these descriptive statistics indicate that the dataset is suitable for ML modeling and analysis.

To support the performance comparison of the developed models, a radar chart was created using R², RMSE, and MAE values for each model. These metrics were normalized to a 0–1 scale to enable fair and consistent visual comparison; the chart was constructed in Python 3.9 with the pandas and matplotlib libraries. In addition, model interpretability was enhanced using Shapley Additive Explanations (SHAP). SHAP values were calculated for the RF model using the SHAP library (version 0.45.0), allowing for the quantitative assessment of each input feature’s impact on the predicted diffusion coefficient. The SHAP plots, presented in Section 3, help to identify and visualize the influence of temperature, pressure, and salinity on model output. All ML models were implemented using Python 3.9. The RF and GBR models were developed using the scikit-learn library (version 1.2.2), while the XGBoost model was implemented using the XGBoost library (version 1.7.5).
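As a rough illustration of how the kind of summary reported in Table 2 (mean, standard deviation, range, quartiles) can be obtained, the sketch below uses only the Python standard library. The salinity values are placeholders chosen to span the reported range, not the study's data, and `describe` is a hypothetical helper name.

```python
import statistics


def describe(values):
    """Return mean, sample standard deviation, min, quartiles, and max."""
    # method="inclusive" interpolates linearly between observed data points.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "Q1": q1,
        "Q2": q2,
        "Q3": q3,
        "max": max(values),
    }


# Placeholder salinity values in mol/L (not the study's dataset).
salinity = [0.0, 0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 6.0, 6.76]

for name, value in describe(salinity).items():
    print(f"{name:>4}: {value:.2f}")
```

With pandas, which the study reports using for the radar chart, the same summary would typically come from `DataFrame.describe()`.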

The Pearson correlation coefficients among pressure, temperature, salinity, and DC are visualized in Figure 3 as a heatmap. The correlation coefficients range between +1 and −1, where values near +1 or −1 indicate strong positive or negative linear relationships, respectively. The DC shows a moderate positive correlation with temperature (0.30), while salinity has a weak negative correlation (−0.21), and pressure (P) exhibits an almost negligible correlation (0.02) with DC. These weak correlations suggest that simple linear relationships may not fully explain the interactions between these input variables and output DC. Therefore, advanced ML models such as RF, GBR, and XGBoost are needed to capture complex, non-linear relationships and improve the prediction of the DC.
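The point that a weak Pearson coefficient does not rule out a strong dependence can be shown with a small example. The sketch below computes the coefficient from its definition for two made-up series, one exactly linear and one exactly quadratic. The data are illustrative and unrelated to the study's dataset.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * v + 1 for v in x]    # exact linear dependence on x
quadratic = [v ** 2 for v in x]    # exact but non-linear dependence on x

print("linear:   ", round(pearson_r(x, linear), 3))
print("quadratic:", round(pearson_r(x, quadratic), 3))
```

The quadratic series is fully determined by `x`, yet its linear correlation with `x` vanishes, which is why non-linear models are considered for the weakly correlated inputs here.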

Figure 4 displays violin plots that reveal the distribution of each variable, highlighting potential outliers. To make the dataset suitable for developing ML models, the parameter values were normalized because their scales differ widely. Figure 5 presents the normalized data as histograms, enabling a detailed examination of each variable’s distribution. The distribution of DC shows a pronounced central peak, suggesting limited variability and a strong clustering of values around the mean. Conversely, the salinity parameter exhibits a multimodal distribution with broader variability, indicative of distinct subsets or measurement inconsistencies within the dataset. The pressure and temperature parameters reveal asymmetrical distributions, pointing to potential skewness or non-uniform sampling across experimental conditions. The boxplots embedded within each violin highlight the interquartile range, median, and potential outliers, offering complementary insights into central tendency and spread. Such detailed exploratory data analysis is indispensable for understanding the dataset’s underlying structure and ensuring informed decisions in subsequent analytical or modeling workflows.
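The text does not state which normalization was applied. Assuming a min-max rescaling to the interval [0, 1], the step could look like the sketch below. The end points match the reported pressure and temperature ranges, while the interior values are placeholders and `min_max_scale` is a hypothetical helper.

```python
def min_max_scale(values):
    """Linearly rescale values to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant column")
    return [(v - lo) / (hi - lo) for v in values]


# Range end points from the dataset description; interior values are placeholders.
pressure_mpa = [0.10, 5.0, 12.15, 20.0, 30.00]
temperature_k = [286.15, 300.0, 320.06, 350.0, 398.00]

print([round(v, 3) for v in min_max_scale(pressure_mpa)])
print([round(v, 3) for v in min_max_scale(temperature_k)])
```

If scaling is used during model development, the minimum and maximum are usually taken from the training subset only, so that no information from the test subset leaks into training.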

2.1. Modeling with RF, GBR, and XGBoost

In this study, RF, GBR, and XGBoost models were selected due to their ensemble learning capabilities and ability to handle complex data patterns, non-linear relationships, and diverse input features, which are essential for predicting CO₂ DC in brine systems. RF requires no feature normalization and can handle both numerical and categorical data [42]. RF reduces overfitting by averaging multiple decision trees trained on random data subsets, providing robust predictions [43]. GBR and XGBoost build models sequentially by adding multiple weak learners, each correcting errors from the previous iteration. This approach captures complex patterns while reducing bias, as each new weak learner incrementally improves model performance. XGBoost further improves efficiency and accuracy through advanced regularization and optimization techniques [44]. The main aim of this study is to compare the performance of RF, GBR, and XGBoost on the given dataset and to determine which algorithm yields the highest predictive performance for CO₂ DC in brine systems. The dataset was randomly split into two subsets: a training set (80%) for developing the models and a testing set (20%) for model validation. Hyperparameter tuning was performed for each model to optimize performance. The independent testing set was used to evaluate each model’s accuracy on unseen data, providing a reliable measure of its generalization capability.
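The sequential idea behind boosting, together with the random 80/20 split, can be sketched in a few lines. The example below is a deliberately simplified squared-loss booster built from one-split trees (stumps) on a single synthetic feature. It is not the scikit-learn or XGBoost implementation, and the data-generating formula is invented for illustration.

```python
import random


def fit_stump(x, residuals):
    """Best single-split regressor on one feature (least squares)."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for v, r in zip(x, residuals) if v <= threshold]
        right = [r for v, r in zip(x, residuals) if v > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda v: left_mean if v <= threshold else right_mean


def boost(x, y, n_rounds=150, learning_rate=0.1):
    """Each new stump is fitted to the residuals left by the previous ones."""
    base = sum(y) / len(y)
    stumps = []
    pred = [base] * len(y)
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump(v) for p, v in zip(pred, x)]
    return lambda v: base + learning_rate * sum(s(v) for s in stumps)


def mse(model, x, y):
    """Mean squared error of a callable model on paired data."""
    return sum((model(v) - t) ** 2 for v, t in zip(x, y)) / len(y)


# Synthetic one-feature data (hypothetical; not the study's dataset).
rng = random.Random(0)
xs = [rng.uniform(0.0, 6.0) for _ in range(176)]
ys = [2.7 - 0.1 * v - 0.03 * v ** 2 + rng.gauss(0.0, 0.05) for v in xs]

# Random 80/20 split into training and testing subsets.
idx = list(range(len(xs)))
rng.shuffle(idx)
cut = int(0.8 * len(idx))
train, test = idx[:cut], idx[cut:]
x_train, y_train = [xs[i] for i in train], [ys[i] for i in train]
x_test, y_test = [xs[i] for i in test], [ys[i] for i in test]

baseline = boost(x_train, y_train, n_rounds=0)  # constant mean predictor
model = boost(x_train, y_train)
print("test MSE, mean only:", round(mse(baseline, x_test, y_test), 4))
print("test MSE, boosted:  ", round(mse(model, x_test, y_test), 4))
```

Production libraries add deeper trees, row subsampling, and regularization on top of this loop. A random forest differs in that its trees are trained independently on resampled data and then averaged, rather than fitted one after another to residuals.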

2.2. Model Evaluation Metrics

To assess the performance of the developed models, several statistical indices were used. These metrics measure the performance of each model and facilitate quantitative comparison and ranking of the models. The indices used in this study are presented in Table 3, together with their general equations.
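For reference, the three indices can be written directly from their standard definitions: R² = 1 − SS_res/SS_tot, RMSE is the square root of the mean squared residual, and MAE is the mean absolute residual. The sketch below uses placeholder values rather than the study's predictions.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Placeholder DC values in units of 1e-9 m^2/s (not the study's data).
measured = [0.9, 1.5, 2.1, 2.8, 3.6]
predicted = [1.0, 1.4, 2.3, 2.7, 3.3]

print("R2  :", round(r2_score(measured, predicted), 3))
print("RMSE:", round(rmse(measured, predicted), 3))
print("MAE :", round(mae(measured, predicted), 3))
```

When both errors are computed on the same residuals in the same units, RMSE is never smaller than MAE, and the two coincide only if all absolute errors are equal. This gives a quick consistency check when comparing reported metrics.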

3. Results and Discussion

3.1. Model Development and Evaluation

In the present study, three regression models, namely RF, GBR, and XGBoost, were used to predict the DC of CO₂ in brine systems. The models were evaluated for their effectiveness and accuracy in predicting DC from three input parameters: pressure, temperature, and salinity. To evaluate the performance of the developed models, the statistical metrics R², RMSE, and MAE were calculated and reported in Table 4 for both the test and train datasets. To obtain reliable results, the dataset was divided into 80% for training and 20% for testing, and all models were optimized through hyperparameter tuning. Figure 6 shows the performance of the developed models using clustered column charts for all three performance evaluation metrics (R², RMSE, and MAE), highlighting their predictive capabilities and reliability for both the test and train datasets.

The hyperparameters for each ML model were tuned to enhance accuracy and generalization. The optimized parameters are summarized in Table 5. The RF model employed 500 estimators, a maximum depth of 10, and “auto” for feature selection. XGBoost utilized 500 estimators, a learning rate of 0.05, and a maximum depth of 3 to capture non-linear patterns in the dataset. GBR used 1000 estimators, a learning rate of 0.03, and a subsample ratio of 0.8, balancing flexibility and generalization. All models demonstrated high predictive accuracy, with RF achieving the highest R² of 0.96 and 0.95 for the train and test datasets, respectively.
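The reported settings can be collected in one place for reuse. In the sketch below the keyword names are an assumption: they follow the constructor arguments of scikit-learn and XGBoost, whereas the text only describes the parameters in words. The helper `for_recent_sklearn` is hypothetical. It reflects that, for forest regressors, the string "auto" meant "use all features" and was deprecated in scikit-learn 1.1 and removed in 1.3.

```python
# Hyperparameters as reported in the text; keyword names are assumed to match
# the scikit-learn / XGBoost constructor arguments.
HYPERPARAMETERS = {
    "RF": {"n_estimators": 500, "max_depth": 10, "max_features": "auto"},
    "XGBoost": {"n_estimators": 500, "learning_rate": 0.05, "max_depth": 3},
    "GBR": {"n_estimators": 1000, "learning_rate": 0.03, "subsample": 0.8},
}


def for_recent_sklearn(params):
    """Return a copy with max_features="auto" replaced by 1.0 (all features)."""
    updated = dict(params)
    if updated.get("max_features") == "auto":
        updated["max_features"] = 1.0
    return updated


rf_params = for_recent_sklearn(HYPERPARAMETERS["RF"])
print(rf_params)
# Intended use (not executed here):
#   RandomForestRegressor(**rf_params)
```

With scikit-learn 1.2.2, the version stated in Section 2, the original "auto" setting is still accepted but raises a deprecation warning.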

Among all ML models developed in this study, the RF model emerged as the most effective overall in terms of R² and RMSE, as represented in Figure 6c and Table 5. The RF model outperformed the XGBoost and GBR models, delivering better evaluation metrics for both the test and train datasets.

3.2. Visual Validation and Trend Analysis

To assess the predictive fidelity of the developed models, scatter plots were constructed to compare actual and predicted DC values (Figure 7). A model’s accuracy is reflected by the proximity of data points to the 1:1 diagonal line; the closer the alignment, the stronger the prediction. Among the models, RF exhibited the tightest clustering around the diagonal for both training and testing datasets, indicating minimal prediction bias and high generalization capability. The GBR model also demonstrated strong performance, though with slightly greater deviation in the test data. XGBoost, while performing adequately in the training set, exhibited a more pronounced spread in the test set, suggesting reduced robustness compared to RF and GBR.

To further evaluate model performance across multiple error and accuracy dimensions, radar charts were employed, as shown in Figure 8. These charts facilitate a comprehensive comparison by representing each performance metric—R², RMSE, and MAE—on a separate axis. In the radar plots, metrics with lower error values (MAE, RMSE) are positioned closer to the center, while higher R² values are located further outward, indicating superior predictive strength. The RF model consistently outperformed both GBR and XGBoost across all metrics for both training and test sets, reaffirming its robustness and superior generalization. GBR followed closely, while XGBoost displayed relatively lower predictive consistency.

Molecular dynamics studies, such as Omrani et al. (2022) [27], highlight the effect of salinity on the DC of CO₂ in brine at different pressures and temperatures. At a temperature of 323 K and a pressure of 100 MPa, the CO₂ DC decreased from 3.8327 × 10⁻⁹ m²/s in pure water to 3.1553 × 10⁻⁹ m²/s (17.68% reduction) at 1 mol/L NaCl, 2.49 × 10⁻⁹ m²/s (35.08% reduction) at 3 mol/L, and 1.34 × 10⁻⁹ m²/s (64.92% reduction) at 6 mol/L, due to increased salinity and solute–ion interactions. Similar trends were observed in our study: increasing the salinity at constant pressure (10 MPa) and temperature (323 K) resulted in a decrease in the DC of CO₂, as shown in Table 6. Figure 9 shows the trends of the DC predicted by the ML algorithms at different salinities and at constant temperature and pressure. For example, at a temperature of 310 K and a pressure of 10 MPa, the CO₂ DC decreased from 2.69 × 10⁻⁹ m²/s in pure water to 2.53 × 10⁻⁹ m²/s (5.94% reduction) at 1 mol/L NaCl, 1.48 × 10⁻⁹ m²/s (44.98% reduction) at 4 mol/L, and 1.00 × 10⁻⁹ m²/s (62.82% reduction) at 6 mol/L, due to increased salinity and solute–ion interactions.
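The percentage reductions quoted above follow from a one-line relative-change formula. The sketch below recomputes them for the predicted values at 310 K and 10 MPa, using the rounded DC values as printed in the text.

```python
def percent_reduction(reference, value):
    """Relative decrease of value with respect to reference, in percent."""
    return (reference - value) / reference * 100.0


# Predicted DC (in 1e-9 m^2/s) at 310 K and 10 MPa, as quoted in the text.
dc_pure_water = 2.69
dc_by_salinity = {1: 2.53, 4: 1.48, 6: 1.00}  # salinity in mol/L -> DC

for salinity, dc in dc_by_salinity.items():
    reduction = percent_reduction(dc_pure_water, dc)
    print(f"{salinity} mol/L: {reduction:.2f}% reduction")
```

Because the inputs are rounded to two decimals, the recomputed percentages can differ from the quoted ones in the last digit.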

3.3. Performance Comparison with Previous Studies

Previous research has demonstrated the application of hybrid ML models such as PSO-ANFIS and MKSVM-GA. Feng et al. (2019) [32] and Bemani et al. (2020) [33] made important contributions to predicting CO₂ diffusion in brine using MKSVM-GA and PSO-ANFIS, achieving high training R² values of 0.9960 and 0.9993, respectively. While their test results were valid, the studies relied on small datasets (92 and 86 data points, respectively), which increases the risk of overfitting. Hybrid models require large datasets to perform reliably; when applied to small datasets, they are prone to overfitting [45]. In contrast, our study employed RF, GBR, and XGBoost using 176 data points. Although the best R² achieved in our study is 0.96 and 0.95 for the train and test datasets, respectively, our models are expected to be more reliable than those of Feng et al. (2019) [32] and Bemani et al. (2020) [33] because they rely on a larger dataset that covers wider parameter ranges with closely spaced data points, which helps the ML model identify the relationship between the input parameters (pressure, temperature, and salinity) and the output (DC). Additionally, most studies selected viscosity and density as input parameters, which are important, but salinity has a more direct relationship with the DC [3]. Therefore, our study prioritized salinity as an input parameter alongside pressure and temperature to capture its relationship with the DC. The RF, GBR, and XGBoost models displayed high predictive accuracy, as shown by their high R² values (e.g., 0.95 for RF testing) along with low errors (0.11 MAE and 0.03 RMSE for RF) on both training and testing data.

For the evaluation of any ML model, it is essential to analyze several performance metrics, such as R², RMSE, and MAE. Relying on a single metric may lead to misleading conclusions. Kouhi et al. (2025) [35] also predicted CO₂ DC in brine with models including MLP, CFNN, RNN, and GEP, which gave high RMSE values of 3.5452, 5.2872, 4.9287, and 5.5611, respectively, indicating significant prediction errors although their R² is very high, as shown in Table 1 [35]. Their input parameters were temperature, pressure, and density. In contrast, the present study achieved considerably lower RMSE values, as shown in Table 7 and Figure 10, indicating improved accuracy and reliability in the predictions.

Another important advantage of our models is their interpretability and computational efficiency. Tree-based ensemble models like RF and GBR offer valuable insights into feature importance through SHAP values, which illustrate how each input parameter (temperature, salinity, and pressure) affects the DC. In contrast, hybrid models like PSO-ANFIS and MKSVM-GA are mostly difficult to interpret, resource-intensive, and time-consuming. These combined strengths demonstrate that our models provide a balance of accuracy, interpretability, and efficiency for reliable predictions of DC across varied conditions.

3.4. Input Variables Significance

To determine the effect of input parameters on DC, SHAP (Shapley Additive Explanations) values were utilized, as shown in Figure 11a. The RF model was chosen for this analysis due to its high accuracy, interpretability, and ability to manage complex relationships within data. RF effectively ranks feature importance by aggregating results across multiple decision trees [46]. When combined with SHAP values, it provides a clear explanation of each parameter’s (e.g., pressure, temperature, and salinity) contribution to the target variable (e.g., DC), making it an ideal choice for accurately evaluating feature influence. As shown in Figure 11a, temperature has the most significant impact on predicting the DC, followed by salinity and pressure. Notably, salinity, a fluid-related parameter (brine), demonstrates the second-highest influence on estimating the DC.

The “bee swarm” plot illustrated by SHAP values in Figure 11b ranks variables by their mean absolute SHAP values in descending order, with the most important parameters appearing at the top. Each point on the plot represents a data instance, plotted against its impact on the predicted DC value. The color of the points indicates the relative magnitude of the feature values, ranging from low (blue) to high (red). For instance, higher salinity values (red points with negative SHAP values) correspond to lower predicted DC values, demonstrating a negative relationship. In contrast, temperature, the most significant feature, shows a broader range of SHAP values, with predominantly positive impacts on DC as its value increases. Similarly, pressure exhibits a moderate influence, contributing less significantly than temperature and salinity.
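To make the meaning of a SHAP value concrete, the sketch below computes exact Shapley values for a three-input toy function by enumerating all feature coalitions. Everything in it is illustrative. `toy_model` is an invented surrogate, not the trained RF model. "Absent" features are simply set to a single baseline (here the dataset means reported in Section 2). The SHAP library uses faster, tree-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def toy_model(features):
    """Hypothetical DC surrogate (1e-9 m^2/s); NOT the trained RF model."""
    t = features["temperature"]
    s = features["salinity"]
    p = features["pressure"]
    return 2.0 + 0.02 * (t - 320.0) - 0.25 * s + 0.002 * p - 0.001 * (t - 320.0) * s


def shapley_values(model, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = value(set(subset) | {name})
                without_feature = value(set(subset))
                total += weight * (with_feature - without_feature)
        phi[name] = total
    return phi


# Baseline: dataset means from Section 2. Instance: an illustrative hot, saline case.
baseline = {"temperature": 320.06, "salinity": 2.35, "pressure": 12.15}
instance = {"temperature": 350.0, "salinity": 6.0, "pressure": 10.0}

phi = shapley_values(toy_model, instance, baseline)
for name, contribution in phi.items():
    print(f"{name:>11}: {contribution:+.3f}")
print("sum of contributions:", round(sum(phi.values()), 6))
print("f(instance) - f(baseline):",
      round(toy_model(instance) - toy_model(baseline), 6))
```

The contributions add up to the difference between the prediction for the instance and the prediction at the baseline. This additivity is what allows per-feature impacts to be plotted on a common axis in a bee swarm chart.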

To better understand the impact of each parameter on DC and to validate the results obtained from the RF model using Tornado and SHAP charts, dependency plots for the three key parameters were also generated using RF, which was chosen for its high accuracy. Figure 12c shows that higher salinity values have a negative impact on the DC. This aligns with the physical understanding that an increase in salinity leads to a reduction in DC [3,27]. Additionally, since salinity inherently reflects the interaction between the dissolved ions and CO₂, this further underscores the importance of salinity as a significant factor influencing the diffusion behavior of CO₂ in brine.

3.5. Future Research Directions

To build on the findings of this study and further advance the understanding of CO₂ diffusion in brine systems, several important research directions are recommended. First, a deeper investigation into long-term CO₂–brine–rock interactions is necessary. These include geochemical and mineralogical changes, wettability alterations, and pore-scale structural evolution, all of which directly influence multiphase flow dynamics and storage integrity. Future studies should also focus on improving diffusion coefficient measurement techniques under reservoir conditions, especially by incorporating density-driven convection effects and exploring the role of advanced materials such as green nanofluids on interfacial behavior. Machine learning models capable of predicting CO₂–brine interfacial tension would significantly enhance the reliability of diffusion estimations in heterogeneous systems.

Moreover, addressing subsurface heterogeneity and associated uncertainties remains a key priority. This can be achieved through stochastic modeling that incorporates spatial variability in porosity and permeability, as well as improved formation characterization techniques that reduce prediction errors in CO₂ injection scenarios. Operational variables such as CO₂–brine co-injection strategies, salinity variations, and salt precipitation effects also require further study, particularly in how they influence injectivity and storage efficiency. In parallel, the development of advanced monitoring tools for early detection of CO₂ leakage and assessment of its impact on groundwater quality is critical for long-term risk management.

Finally, special attention should be paid to hydrate formation and stability in depleted gas reservoirs under CO₂ injection, especially near the wellbore region where pressure and thermal gradients are prominent. Addressing these multifaceted challenges will not only close existing knowledge gaps but also enhance the accuracy, scalability, and field relevance of predictive models. These efforts will collectively support the safe and efficient deployment of carbon capture and storage technologies, contributing to global climate mitigation strategies.

4. Conclusions

In this study, 176 data points were collected from the literature and used to develop ML models for predicting the DC of CO₂ in brine and for evaluating the effects of salinity, temperature, and pressure on it. Three ML models (RF, GBR, and XGBoost) were selected for their ability to reduce the risk of overfitting. The data were split into 80% for training and 20% for testing. Among these models, RF demonstrated the highest accuracy in estimating CO₂ DC from the input parameters, and its predictions require far less time than laboratory experiments or molecular simulations. The results revealed that temperature is the most influential factor and correlates positively with DC, followed by salinity and pressure, as determined through three techniques: a Tornado chart, SHAP value analysis, and dependency plots. Salinity exhibited a negative correlation with DC across all models, with RF showing the highest test accuracy (R² of 0.95 and RMSE of 0.03 on the test dataset) in predicting DC values and their sensitivity to salinity changes. While the proposed models effectively captured diffusion behavior within the studied parameter range (pressures of 0.1–30 MPa, temperatures of 286.15–398 K, and salinities of 0–6.76 mol/L), their applicability is limited to these conditions. Future work could incorporate the molecular structure of the salt as an input and use ML models capable of capturing molecular-level features, which would help elucidate diffusion mechanisms at the molecular scale. Moreover, most ML models still lack information on the composition of the injected CO₂; including it as an input could further improve predictions of the diffusion coefficient. Most importantly, expanding the dataset with additional experiments and input features such as permeability or porosity could further enhance model performance and robustness. These findings underscore the potential of ML models to provide a robust, data-driven framework for optimizing CO₂ capture, storage, and enhanced oil recovery operations by accurately predicting diffusion behavior in brine systems.
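
The 80/20 partition of the 176 data points can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `split_indices` is hypothetical, the seed of 42 is borrowed from the random state listed in Table 5, and the shuffle-then-cut procedure is an assumption, since the paper does not state how the split was drawn.

```python
import random


def split_indices(n_samples, train_fraction=0.8, seed=42):
    """Shuffle row indices and split them into train and test index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle (assumed procedure)
    n_train = int(n_samples * train_fraction)  # floor; the test set gets the remainder
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(176)
print(len(train_idx), len(test_idx))  # 140 36
```

With 176 samples this yields 140 training points and 36 test points, so each test metric is averaged over only 36 measurements.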

Author Contributions

Conceptualization, Q.K., P.P. and F.H.; methodology, Q.K.; software, Q.K.; validation, Q.K., P.P., R.K. and F.H.; formal analysis, Q.K.; investigation, Q.K.; resources, Q.K., P.P. and F.H.; data curation, Q.K.; writing—original draft preparation, Q.K.; writing—review and editing, P.P., F.H. and R.K.; visualization, Q.K. and R.K.; supervision, P.P. and F.H.; project administration, P.P.; funding acquisition, P.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Nazarbayev University under the Faculty Development Competitive Research Grant (Grant No. 201223FD2608).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors would like to express their sincere gratitude to Nazarbayev University for supporting this research through the Faculty Development Competitive Research Grant. We also acknowledge the contributions of previous researchers whose published datasets enabled this study. Special thanks go to the anonymous reviewers for their comments and suggestions, which significantly improved the quality of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Holtz, M.H.; Nance, P.K.; Finley, R.J. Reduction of Greenhouse Gas Emissions through CO₂ EOR in Texas. Environ. Geosci. 2001, 8, 187–199.
2. Mosavat, N.; Abedini, A.; Torabi, F. Phase Behaviour of CO₂–Brine and CO₂–Oil Systems for CO₂ Storage and Enhanced Oil Recovery: Experimental Studies. Energy Procedia 2014, 63, 5631–5645.
3. Wang, H.; Li, Y.; Li, C.; Zhu, H.; Li, Z.; Wang, L.; Medina-Rodriguez, B.X. Unveil the controls on CO₂ diffusivity in saline brines for geological carbon storage. Geoenergy Sci. Eng. 2025, 244, 213483.
4. Zhang, Y.; Geng, W.; Chen, M.; Xu, X.; Jiang, L.; Song, Y. Experimental Measurements of the Diffusion Coefficient and Effective Diffusion Coefficient of CO₂-Brine under Offshore CO₂ Storage Conditions. Energy Fuels 2023, 37, 19695–19703.
5. Rezk, M.G.; Foroozesh, J.; Abdulrahman, A.; Gholinezhad, J. CO₂ Diffusion and Dispersion in Porous Media: Review of Advances in Experimental Measurements and Mathematical Models. Energy Fuels 2022, 36, 133–155.
6. Renner, T.A. Measurement and Correlation of Diffusion Coefficients for CO₂ and Rich-Gas Applications. SPE Reserv. Eng. 1988, 3, 517–523.
7. Feng, Q.; Xing, X.; Wang, S.; Liu, G.; Qin, Y.; Zhang, J. CO₂ diffusion in shale oil based on molecular simulation and pore network model. Fuel 2024, 359, 130332.
8. Secuianu, C.; Maitland, G.C.; Trusler, J.P.M.; Wakeham, W.A. Mutual diffusion coefficients of aqueous KCl at high pressures measured by the Taylor dispersion method. J. Chem. Eng. Data 2011, 56, 4840–4848.
9. Jimenez, M.; Dietrich, N.; Cockx, A.; Hébrard, G. Experimental study of CO₂ diffusion coefficient measurement at a planar gas–liquid interface by planar laser-induced fluorescence with inhibition. AIChE J. 2013, 59, 325–333.
10. Riazi, M.R.; Whitson, C.H. Estimating Diffusion Coefficients of Dense Fluids. Ind. Eng. Chem. Res. 1993, 32, 3081–3088.
11. Kumar, N.; Sampaio, M.A.; Ojha, K.; Hoteit, H.; Mandal, A. Fundamental aspects, mechanisms and emerging possibilities of CO₂ miscible flooding in enhanced oil recovery: A review. Fuel 2022, 330, 125633.
12. Zhang, W.; Wu, S.; Ren, S.; Zhang, L.; Li, J. The modeling and experimental studies on the diffusion coefficient of CO₂ in saline water. J. CO2 Util. 2015, 11, 49–53.
13. Yang, C.; Gu, Y. Accelerated mass transfer of CO₂ in reservoir brine due to density-driven natural convection at high pressures and elevated temperatures. Ind. Eng. Chem. Res. 2006, 45, 2430–2436.
14. Lu, W.; Guo, H.; Chou, I.M.; Burruss, R.C.; Li, L. Determination of diffusion coefficients of carbon dioxide in water between 268 and 473 K in a high-pressure capillary optical cell with in situ Raman spectroscopic measurements. Geochim. Cosmochim. Acta 2013, 115, 183–204.
15. Zarghami, S.; Boukadi, F.; Al-Wahaibi, Y. Diffusion of carbon dioxide in formation water as a result of CO₂ enhanced oil recovery and CO₂ sequestration. J. Pet. Explor. Prod. Technol. 2017, 7, 161–168.
16. Cadogan, S.P.; Maitland, G.C.; Trusler, J.P.M. Diffusion coefficients of CO₂ and N₂ in water at temperatures between 298.15 K and 423.15 K at pressures up to 45 MPa. J. Chem. Eng. Data 2014, 59, 519–525.
17. Ratcliff, G.A.; Holdcroft, J.G. Diffusivities of gases in aqueous electrolyte solutions. Trans. Inst. Chem. Eng. 1963, 41, 315–319.
18. Sell, A.; Fadaei, H.; Kim, M.; Sinton, D. Microfluidic Approach for Reservoir-Specific Analysis. Environ. Sci. Technol. 2013, 47, 71–78.
19. Tewes, F.; Boury, F. Formation and rheological properties of the supercritical CO₂–water pure interface. J. Phys. Chem. B 2005, 109, 3990–3997.
20. Hirai, S.; Okazaki, K.; Yazawa, H.; Ito, H.; Tabe, Y.; Hijikata, K. Measurement of CO₂ diffusion coefficient and application of LIF in pressurized water. Energy 1997, 22, 363–367.
21. Azin, R.; Mahmoudy, M.; Raad, S.M.J.; Osfouri, S. Measurement and modeling of CO₂ diffusion coefficient in saline aquifer at reservoir conditions. Cent. Eur. J. Eng. 2013, 3, 585–594.
22. Yang, C.; Gu, Y. A New Method for Measuring Solvent Diffusivity in Heavy Oil by Dynamic Pendant Drop Shape Analysis (DPDSA). 2006. Available online: https://onepetro.org/SPEATCE/proceedings-abstract/03ATCE/03ATCE/SPE-84202-MS/137562 (accessed on 8 February 2025).
23. Cadogan, S.P.; Hallett, J.P.; Maitland, G.C.; Trusler, J.P.M. Diffusion coefficients of carbon dioxide in brines measured using ¹³C pulsed-field gradient nuclear magnetic resonance. J. Chem. Eng. Data 2015, 60, 181–184.
24. Wilke, C.R.; Chang, P. Correlation of diffusion coefficients in dilute solutions. AIChE J. 1955, 1, 264–270.
25. Garcia-Ratés, M.; De Hemptinne, J.C.; Avalos, J.B.; Nieto-Draghi, C. Molecular modeling of diffusion coefficient and ionic conductivity of CO₂ in aqueous ionic solutions. J. Phys. Chem. B 2012, 116, 2787–2800.
26. Numbere, D.; Brigham, W.E.; Standing, M.B. Correlations for Physical Properties of Petroleum Reservoir Brines. Master’s Thesis, Stanford University, Stanford, CA, USA, 1977.
27. Omrani, S.; Ghasemi, M.; Mahmoodpour, S.; Shafiei, A.; Rostami, B. Insights from molecular dynamics on CO₂ diffusion coefficient in saline water over a wide range of temperatures, pressures, and salinity: CO₂ geological storage implications. J. Mol. Liq. 2022, 345, 117868.
28. Tewes, F.; Boury, F. Dynamic and rheological properties of classic and macromolecular surfactant at the supercritical CO₂–H₂O interface. J. Supercrit. Fluids 2006, 37, 375–383.
29. Moghaddam, R.N.; Rostami, B.; Pourafshary, P. A method for dissolution rate quantification of convection-diffusion mechanism during CO₂ storage in saline aquifers. Spec. Top. Rev. Porous Media 2013, 4, 13–21.
30. Belgodere, C.; Dubessy, J.; Vautrin, D.; Caumon, M.-C.; Sterpenich, J.; Pironon, J.; Robert, P.; Randi, A.; Birat, J.-P. Experimental determination of CO₂ diffusion coefficient in aqueous solutions under pressure at room temperature via Raman spectroscopy: Impact of salinity (NaCl). J. Raman Spectrosc. 2015, 46, 1025–1032.
31. Helmy, T.; Al-Azani, S.; Bin-Obaidellah, O. A machine learning-based approach to estimate the CPU-burst time for processes in the computational grids. In Proceedings of the AIMS 2015, 3rd International Conference on Artificial Intelligence, Modelling and Simulation, Kota Kinabalu, Malaysia, 2–4 December 2015; pp. 3–8.
32. Feng, Q.; Cui, R.; Wang, S.; Zhang, J.; Jiang, Z. Estimation of CO₂ diffusivity in brine by use of the genetic algorithm and mixed kernels-based support vector machine model. J. Energy Resour. Technol. 2019, 141, 041001.
33. Bemani, A.; Baghban, A.; Mosavi, A.; S, S. Estimating CO₂-Brine diffusivity using hybrid models of ANFIS and evolutionary algorithms. Eng. Appl. Comput. Fluid Mech. 2020, 14, 818–834.
34. Amar, M.N.; Ghahfarokhi, A.J. Prediction of CO₂ diffusivity in brine using white-box machine learning. J. Pet. Sci. Eng. 2020, 190, 107037.
35. Kouhi, M.M.; Kahzadvand, K.; Shahin, M.; Shafiei, A. New connectionist tools for prediction of CO₂ diffusion coefficient in brine at high pressure and temperature - implications for CO₂ sequestration in deep saline aquifers. Fuel 2025, 384, 134000.
36. Raad, S.M.J.; Azin, R.; Osfouri, S. Measurement of CO₂ diffusivity in synthetic and saline aquifer solutions at reservoir conditions: The role of ion interactions. Heat Mass Transf. 2015, 51, 1587–1595.
37. Ahmadi, H.; Jamialahmadi, M.; Soulgani, B.S.; Dinarvand, N.; Sharafi, M.S. Experimental study and modelling on diffusion coefficient of CO₂ in water. Fluid Phase Equilib. 2020, 523, 112584.
38. Tamimi, A.; Rinker, E.B.; Sandall, O.C. Diffusion Coefficients for Hydrogen Sulfide, Carbon Dioxide, and Nitrous Oxide in Water over the Temperature Range 293–368 K. J. Chem. Eng. Data 1994, 39, 330–332.
39. Polat, H.M.; Coelho, F.M.; Vlugt, T.J.H.; Franco, L.F.M.; Tsimpanogiannis, I.N.; Moultos, O.A. Diffusivity of CO₂ in H₂O: A Review of Experimental Studies and Molecular Simulations in the Bulk and in Confinement. J. Chem. Eng. Data 2023, 69, 3329.
40. Basilio, E.; Addassi, M.; Al-Juaied, M.; Hassanizadeh, S.M.; Hoteit, H. Improved pressure decay method for measuring CO₂-water diffusion coefficient without convection interference. Adv. Water Resour. 2024, 183, 104608.
41. Mutoru, J.W.; Leahy-Dios, A.; Firoozabadi, A. Modeling infinite dilution and Fickian diffusion coefficients of carbon dioxide in water. AIChE J. 2011, 57, 1617–1627.
42. Caiola, G.; Reiter, J.P. Random Forests for Generating Partially Synthetic, Categorical Data. Trans. Data Priv. 2010, 3, 27–42.
43. Probst, P.; Boulesteix, A.-L. To Tune or Not to Tune the Number of Trees in Random Forest. J. Mach. Learn. Res. 2018, 18, 1–18.
44. Rathakrishnan, V.; Beddu, S.B.; Ahmed, A.N. Predicting compressive strength of high-performance concrete with high volume ground granulated blast-furnace slag replacement using boosting machine learning algorithms. Sci. Rep. 2022, 12, 9539.
45. Rather, I.H.; Kumar, S.; Gandomi, A.H. Breaking the data barrier: A review of deep learning techniques for democratizing AI with small datasets. Artif. Intell. Rev. 2024, 57, 226.
46. Fang, Y.; Gao, S.; Tai, D.; Middaugh, C.R.; Fang, J. Identification of properties important to protein aggregation using feature selection. BMC Bioinform. 2013, 14, 314.

Figure 1. Overview of direct and indirect methods used to measure CO₂ DC in aqueous systems.

Figure 2. Workflow of the machine learning framework for CO₂ DC prediction.

Figure 3. Correlation heatmap illustrating relationships among P, T, salinity, and Diffusion Coefficient (DC).

Figure 4. Violin plots depicting the distributions of (a) pressure (P) in MPa, (b) temperature (T) in K, (c) salinity in mol/L, and (d) diffusion coefficient.

Figure 5. Histograms showing the frequency distributions of the input parameters, pressure (MPa), temperature (K), and salinity (mol/L), and of the output parameter DC (10⁻⁹ m²/s).

Figure 6. Comparison of the RF, GBR, and XGBoost models in terms of (a) RMSE, (b) MAE, and (c) R² on the training and testing datasets.

Figure 7. Predicted vs. actual DC for (a) RF, (b) GBR, and (c) XGBoost models.

Figure 8. Radar chart comparing RF, GBR, and XGBoost models on RMSE, MAE, and R² metrics for training and testing datasets, highlighting their prediction accuracy.

Figure 9. Trends of predicted DC (10⁻⁹ m²/s) with salinity (mol/L) at constant temperature (310 K) and pressure (10 MPa).

Figure 10. Comparison of previous ML models based on test RMSE value with present study.

Figure 11. (a) Tornado chart shows the importance of each parameter obtained from the RF model during hyperparameter tuning and (b) SHAP summary plot illustrating parameter importance.

Figure 12. Dependency plots show the impact of (a) pressure (MPa), (b) temperature (K), and (c) salinity (mol/L) on the diffusion coefficient, as predicted by the RF model.

Table 1. Comparison of models and evaluation metrics for predicting CO₂ diffusion performance.

Source	Model	Data Points (Train/Test)	Parameters (Ranges)	Train Metrics	Test Metrics
Feng et al. (2019) [32]	MKSVM-GA	92 (72/20)	T: 273–473.15 K P: 0.1–49.3 MPa µ: 0.139–1.950 mPa·s	R²: 0.9975 MAE: 0.1112 × 10⁻⁹ m²/s RMSE: 0.1527 × 10⁻⁹ m²/s MARE: 7.17%	R²: 0.9910 MAE: 0.2028 × 10⁻⁹ m²/s RMSE: 0.3028 × 10⁻⁹ m²/s MARE: 10.55%
Bemani et al. (2020) [33]	PSO-ANFIS	86 (N/A)	T: 273–473.15 K P: 0.1–49.3 MPa µ: 0.139–1.950 mPa·s	R²: 0.9993 MARE: 2.0945% RMSE: 0.0869	R²: 0.9978 MARE: 2.7188% RMSE: 0.113
	GA-ANFIS			R²: 0.9957 MARE: 4.2591% RMSE: 0.2156	R²: 0.9932 MARE: 4.9245% RMSE: 0.1976
	ACO-ANFIS			R²: 0.9924 MARE: 5.9726% RMSE: 0.2877	R²: 0.9854 MARE: 6.6933% RMSE: 0.3161
	BP-ANFIS			R²: 0.9862 MARE: 12.2787% RMSE: 0.3905	R²: 0.9738 MARE: 12.787% RMSE: 0.398
	DE-ANFIS			R²: 0.9708 MARE: 14.545% RMSE: 0.633	R²: 0.9514 MARE: 15.965% RMSE: 0.633
Amar et al. (2020) [34]	GEP	92 (72/20)	T: 273–473.15 K P: 0.1–49.3 MPa µ: 0.139–1.950 mPa·s	R²: 0.9980 AARD: 3.8584% RMSE: 0.1427 × 10⁻⁹ m²/s	R²: 0.9978 AARD: 6.0035% RMSE: 0.1245 × 10⁻⁹ m²/s
Amar et al. (2020) [34]	GMDH	92 (72/20)	T: 273–473.15 K P: 0.1–49.3 MPa µ: 0.139–1.950 mPa·s	R²: 0.9943 AARD: 8.6269% RMSE: 0.2479 × 10⁻⁹ m²/s	R²: 0.9874 AARD: 5.6292% RMSE: 0.2271 × 10⁻⁹ m²/s
Kouhi et al. (2025) [35]	MLP	191 (80/20)	P: 0.10–100 MPa T: 210–673 K Brine Density: 98.38–1400 kg/m³ DC: 0.0007–285 × 10⁻⁹ m²/s	R²: 0.9979 RMSE: 2.7521 MAE: 1.6421	R²: 0.9965 RMSE: 3.4812 MAE: 2.3647
	CFNN			R²: 0.9968 RMSE: 3.6024 MAE: 2.4597	R²: 0.9949 RMSE: 5.2113 MAE: 3.9210
	RNN			R²: 0.9974 RMSE: 2.9021 MAE: 1.8890	R²: 0.9958 RMSE: 4.8735 MAE: 3.2241
	GEP			R²: 0.9938 RMSE: 5.1432 MAE: 4.0023	R²: 0.9918 RMSE: 5.4981 MAE: 4.3184

Table 2. Descriptive statistics of the dataset used for all models.

Statistic	P (MPa)	T (K)	Salinity (mol/L)	DC (10⁻⁹ m²/s)
Count	176	176	176	176
Mean	12.15	320.06	2.35	2.07
Std Dev	7.99	26.35	2.42	0.97
Min	0.10	286.15	0.00	0.13
25%	5.66	300.15	0.51	1.47
Median	10.00	313.00	1.00	1.81
75%	19.79	341.15	4.00	2.73
Max	30.00	398.00	6.76	4.50
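
Because the models are only applicable within the ranges covered by the dataset, the minimum and maximum values in Table 2 can serve as a simple pre-prediction check. The sketch below is illustrative only: the dictionary keys and the function name `out_of_range_inputs` are hypothetical and not part of the study.

```python
# Minimum and maximum of each input, taken from Table 2
# (units: MPa, K, mol/L). Key names are hypothetical.
TRAINING_RANGES = {
    "pressure_mpa": (0.10, 30.00),
    "temperature_k": (286.15, 398.00),
    "salinity_mol_l": (0.00, 6.76),
}


def out_of_range_inputs(sample):
    """Return the names of inputs that fall outside the training ranges."""
    return [
        name
        for name, (low, high) in TRAINING_RANGES.items()
        if not low <= sample[name] <= high
    ]


inside = {"pressure_mpa": 10.0, "temperature_k": 310.0, "salinity_mol_l": 2.0}
outside = {"pressure_mpa": 45.0, "temperature_k": 310.0, "salinity_mol_l": 7.5}
print(out_of_range_inputs(inside))   # []
print(out_of_range_inputs(outside))  # ['pressure_mpa', 'salinity_mol_l']
```

Passing this check is a necessary condition only: a point can lie inside every individual range and still fall in a region of the pressure, temperature, and salinity space where few measurements exist.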

Table 3. Summary of evaluation metrics used to assess model performance.

Metric	Expression	Description	Good Range
Coefficient of Determination (R²)	$R^{2} = 1 - \frac{\sum_{i = 1}^{N} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{N} {(y_{i} - \bar{y})}^{2}}$	Measures the proportion of variance in observed data explained by the model. Higher values (closer to 1) indicate a better fit. R² = 1 represents perfect fit, while R² = 0 indicates no explanatory power.	R² > 0.75 (Very Good)
Root Mean Square Error (RMSE)	$RMSE = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(y_{i} - {\hat{y}}_{i})}^{2}}$	Reflects the average magnitude of prediction errors, penalizing larger deviations more heavily. Lower values indicate better accuracy.	RMSE → 0.15 (Lower is Better)
Mean Absolute Error (MAE)	$M A E = \frac{1}{N} \sum_{i = 1}^{N} \|y_{i} - {\hat{y}}_{i}\|$	Represents the average absolute difference between predicted and observed values. Less sensitive to outliers than RMSE. Lower values indicate better performance.	MAE → 0.15 (Lower is Better)
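
The three expressions in Table 3 can be evaluated directly from their definitions. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and the sample values are made up for illustration, not taken from the study's dataset.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n


# Made-up observed and predicted DC values (1e-9 m^2/s), for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(y_true, y_pred), 3))  # 0.98
print(round(rmse(y_true, y_pred), 3))       # 0.158
print(round(mae(y_true, y_pred), 3))        # 0.15
```

RMSE and MAE carry the units of the target (here 10⁻⁹ m²/s), whereas R² is dimensionless.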

Table 4. Performance metrics for RF, GBR, and XGBoost models.

Model	Set	MAE	RMSE	R²
RF	Train	0.10	0.02	0.96
RF	Test	0.11	0.03	0.95
GBR	Train	0.18	0.16	0.973
GBR	Test	0.19	0.026	0.925
XGBoost	Train	0.12	0.184	0.964
XGBoost	Test	0.13	0.389	0.91

Table 5. Optimized hyperparameters for RF, XGBoost, and GBR.

Hyperparameters	RF	XGBoost	GBR
Number of Estimators (n_estimators)	500	500	1000
Learning Rate (learning_rate)	-	0.05	0.03
Maximum Depth (max_depth)	10	3	4
Minimum Samples Split (min_samples_split)	2	-	8
Minimum Samples Leaf (min_samples_leaf)	1	-	4
Maximum Features (max_features)	auto	-	-
Subsample (subsample)	-	-	0.8
Random State (random_state)	42	42	42
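
The settings in Table 5 can be written as keyword-argument dictionaries. This is a sketch under stated assumptions, not the authors' code: it assumes the parenthesised names in Table 5 correspond to the scikit-learn and XGBoost keyword arguments `n_estimators`, `learning_rate`, `max_depth`, `min_samples_split`, `min_samples_leaf`, `max_features`, `subsample`, and `random_state`, and the helper `build_models` is hypothetical.

```python
# Hyperparameters transcribed from Table 5. Entries shown as "-" in the
# table are omitted so that each library's default applies.
RF_PARAMS = {
    "n_estimators": 500,
    "max_depth": 10,
    "min_samples_split": 2,
    "min_samples_leaf": 1,
    "max_features": 1.0,  # Table 5 lists "auto"; 1.0 is an assumed equivalent
    "random_state": 42,
}
XGB_PARAMS = {
    "n_estimators": 500,
    "learning_rate": 0.05,
    "max_depth": 3,
    "random_state": 42,
}
GBR_PARAMS = {
    "n_estimators": 1000,
    "learning_rate": 0.03,
    "max_depth": 4,
    "min_samples_split": 8,
    "min_samples_leaf": 4,
    "subsample": 0.8,
    "random_state": 42,
}


def build_models():
    """Instantiate the three regressors (requires scikit-learn and xgboost)."""
    from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
    from xgboost import XGBRegressor

    return {
        "RF": RandomForestRegressor(**RF_PARAMS),
        "GBR": GradientBoostingRegressor(**GBR_PARAMS),
        "XGBoost": XGBRegressor(**XGB_PARAMS),
    }
```

Newer scikit-learn releases no longer accept the string "auto" for `max_features`. For `RandomForestRegressor` that setting previously meant using all features at each split, which `max_features=1.0` expresses; on an older release, "auto" can be passed as listed in the table.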

Table 6. Comparison of the DC predicted by the ML models at different salinities and at constant pressure and temperature.

Salinity (mol/L)	RF DC (10⁻⁹ m²/s)	GBR DC (10⁻⁹ m²/s)	XGBoost DC (10⁻⁹ m²/s)
0	2.69	2.63	2.41
1	2.53	2.39	2.61
2	2.19	2.10	2.00
4	1.48	1.37	1.29
6	1.00	0.91	1.07
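
The salinity effect in Table 6 can be quantified as the relative drop in predicted DC between 0 and 6 mol/L. The short calculation below is an illustrative sketch: the values are copied from Table 6, while the function names are hypothetical.

```python
# Predicted DC (1e-9 m^2/s) copied from Table 6, keyed by salinity in mol/L.
PREDICTED_DC = {
    "RF": {0: 2.69, 1: 2.53, 2: 2.19, 4: 1.48, 6: 1.00},
    "GBR": {0: 2.63, 1: 2.39, 2: 2.10, 4: 1.37, 6: 0.91},
    "XGBoost": {0: 2.41, 1: 2.61, 2: 2.00, 4: 1.29, 6: 1.07},
}


def relative_drop_percent(dc_by_salinity, low=0, high=6):
    """Percentage decrease in DC between the low and high salinity."""
    start, end = dc_by_salinity[low], dc_by_salinity[high]
    return 100.0 * (start - end) / start


def is_strictly_decreasing(dc_by_salinity):
    """True if DC falls at every step of increasing salinity."""
    values = [dc_by_salinity[s] for s in sorted(dc_by_salinity)]
    return all(a > b for a, b in zip(values, values[1:]))


for model, table in PREDICTED_DC.items():
    print(model, round(relative_drop_percent(table), 1), is_strictly_decreasing(table))
# RF 62.8 True
# GBR 65.4 True
# XGBoost 55.6 False
```

All three models predict a drop of more than half over this salinity range. The RF and GBR predictions fall at every step, while the XGBoost prediction rises between 0 and 1 mol/L before decreasing.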

Table 7. Comparison of previously published ML models with the current study based on test RMSE.

Author	Model	Data Points	RMSE (Test)
Kouhi et al. (2025) [35]	MLP	191	3.5452
	CFNN		5.2872
	RNN		4.9287
	GEP		5.5611
Bemani et al. (2020) [33]	PSO-ANFIS	86	0.113
Amar and Jahanbani Ghahfarokhi (2020) [34]	GMDH	92	0.2271
Amar and Jahanbani Ghahfarokhi (2020) [34]	GEP		0.1245
Feng et al. (2019) [32]	MKSVM-GA		0.3028
Current Study	RF	176	0.03
	GBR		0.026
	XGBoost		0.389

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
