Carbon Capture Using Metal Organic Frameworks (MOFs): Novel Custom Ensemble Learning Models for Prediction of CO2 Adsorption
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Collection and Description
2.2. Model Development
2.2.1. Data Preprocessing
2.2.2. Feature Scaling
2.2.3. ML Algorithms
Random Forest Regressor
Extreme Gradient Boosting (XGBoost)
Light Gradient-Boosting Machine (LightGBM)
Support Vector Regression (SVR)
Multi-Layer Perceptron (MLP) Regressor
2.2.4. Ensemble Learning
Equal-Weighted Voting Ensemble
Weighted Voting Ensemble
Stacking Ensemble
Manual Blending Ensemble Learning
2.3. Model Evaluation
3. Results
3.1. Tuning of Hyperparameters
3.2. Performance of Ensemble Models
4. Discussion
4.1. Analysis of Residual Error
4.2. Cumulative-Frequency Analysis
4.3. Ablation Study
4.4. Comparative Analysis with Published Studies
4.5. Permutation Feature Importance
4.6. Partial Dependence Plot
4.7. Leverage Analysis
5. Conclusions
6. Limitations and Recommendations
- First, the models were trained and validated on data covering only five metal centers, which does not capture the variety of base metals used in MOF design. In addition, the input parameters included only physical conditions and textural properties. Future work can extend this framework by incorporating deep learning architectures and transfer-learning strategies. Such models could capture complex, non-linear interactions between the variables more effectively and may improve predictive accuracy for CO2 uptake in MOFs under diverse carbon-capture conditions.
- Another limitation is the selection of machine learning models. While the base models demonstrated strong predictive performance, they may not fully capture the non-linear interactions among pressure, temperature, SBET, VT, and CO2 adsorption. Hybrid descriptors that combine experimental and structural data would improve model scalability and applicability across wider MOF classes and gas species. Incorporating geometric and energy-based properties of MOFs, which represent their structural and adsorption characteristics more fully, is recommended to improve the accuracy and generalizability of CO2-uptake prediction models.
- In this study, we applied label encoding to the “metal center” feature for simplicity and computational efficiency. Because label encoding assigns each metal an arbitrary integer, this choice may not capture the nuanced chemical effects of different metal sites. Future work should compare alternative schemes such as one-hot encoding, target (mean) encoding, or learned embeddings to determine which best represents the metal’s influence on CO2 uptake.
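The encoding concern in the last point can be illustrated with a small sketch. The metal symbols below are placeholders chosen for illustration, not necessarily the five metal centers in the dataset, and both helper functions are hypothetical. Label encoding maps each category to an integer, which implies an order and spacing between metals; one-hot encoding avoids that implication at the cost of one column per category.

```python
# Illustrative sketch only: the metal symbols are placeholders,
# not necessarily the metal centers used in the study.

def label_encode(values):
    """Map each distinct category to an integer (alphabetical order)."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Map each category to a 0/1 indicator row, one column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return rows, categories


metal_centers = ["Zn", "Cu", "Mg", "Zn", "Cu"]

labels, mapping = label_encode(metal_centers)
one_hot, columns = one_hot_encode(metal_centers)

print(mapping)     # {'Cu': 0, 'Mg': 1, 'Zn': 2}
print(labels)      # [2, 0, 1, 2, 0]
print(columns)     # ['Cu', 'Mg', 'Zn']
print(one_hot[0])  # [0, 0, 1]
```

With label encoding, a distance-based or linear learner would treat "Zn" as twice as far from "Cu" as "Mg" is, purely because of alphabetical order. Tree-based models are less sensitive to this, which is one reason the comparison suggested above is worth running rather than assuming.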
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
CO2 | Carbon dioxide |
MOF | Metal–Organic Framework |
SBET | BET surface area (m2/g) |
SL | Langmuir surface area (m2/g) |
VT | Pore volume (cm3/g) |
P | Pressure (bar) |
T | Temperature (K) |
MC | Metal center |
CO2 Uptake | Amount of carbon dioxide adsorbed (mmol/g) |
RMSE | Root Mean Squared Error |
R2 | Coefficient of Determination |
MAE | Mean Absolute Error |
SVR | Support Vector Regression |
SR | Standard Residual |
GNN | Graph Neural Network |
MLP | Multi-Layer Perceptron |
MLPNN | Multi-Layer Perceptron Neural Network |
LSSVM | Least Squares Support Vector Machine |
PSO | Particle Swarm Optimization |
GO | Growth Optimization |
CatBoost | Categorical Boosting |
XGBoost | Extreme Gradient Boosting |
XGB | Extreme Gradient Boosting |
ET | Extra Trees |
LightGBM | Light Gradient Boosting Machine |
RF | Random Forest |
PI | Prediction Interval |
IQR | Interquartile Range |
CI | Confidence Interval |
EWE | Equal-weighted Ensemble |
PWE | Performance-weighted Ensemble |
Stack | Meta-model ensemble using predictions from multiple base learners |
MB | Manual Blending |
EWV | Equal Weighted Voting |
WV | Weighted Voting |
References
- Li, X.; Zhang, X.; Zhang, J.; Gu, J.; Zhang, S.; Li, G.; Shao, J.; He, Y.; Yang, H.; Zhang, S.; et al. Applied machine learning to analyze and predict CO2 adsorption behavior of metal-organic frameworks. Carbon Capture Sci. Technol. 2023, 9, 100146.
- Longe, P.O.; Danso, D.K.; Gyamfi, G.; Tsau, J.S.; Alhajeri, M.M.; Rasoulzadeh, M.; Li, X.; Barati, R.G. Predicting CO2 and H2 Solubility in Pure Water and Various Aqueous Systems: Implication for CO2–EOR, Carbon Capture and Sequestration, Natural Hydrogen Production and Underground Hydrogen Storage. Energies 2024, 17, 5723.
- Prabowo, W.A.E.; Akrom, M.; Rustad, S.; Sutojo, T.; Dipojono, H.K.; Maezono, R.; Rusydi, F. Predicting CO2 adsorption in metal-organic frameworks: Integrating machine learning with virtual sample generation. Results Surf. Interfaces 2025, 19, 100505.
- Chao, C.; Deng, Y.; Dewil, R.; Baeyens, J.; Fan, X. Post-combustion carbon capture. Renew. Sustain. Energy Rev. 2021, 138, 110490.
- Achour, S.; Hosni, Z. ML-driven models for predicting CO2 uptake in metal–organic frameworks (MOFs). Can. J. Chem. Eng. 2025, 103, 2161–2173.
- Longe, P.O.; Davoodi, S.; Mehrad, M.; Wood, D.A. Robust machine-learning model for prediction of carbon dioxide adsorption on metal-organic frameworks. J. Alloys Compd. 2025, 1010, 177890.
- Amar, M.N.; Ouaer, H.; Ghriga, M.A. Robust smart schemes for modeling carbon dioxide uptake in metal–organic frameworks. Fuel 2022, 311, 122545.
- Moosavi, S.M.; Jablonka, K.M.; Smit, B. The role of machine learning in the understanding and design of materials. J. Am. Chem. Soc. 2020, 142, 20273–20287.
- Burner, J.; Schwiedrzik, L.; Krykunov, M.; Luo, J.; Boyd, P.G.; Woo, T.K. High-performing deep learning regression models for predicting low-pressure CO2 adsorption properties of metal–organic frameworks. J. Phys. Chem. C 2020, 124, 27996–28005.
- Longe, P.; Molomjav, S.; Tsau, J.-S.; Musgrove, S.; Villalobos, J.; D’Erasmo, J.; Alhajeri, M.M.; Barati, R. Techno-economic evaluation of CO2-EOR and carbon storage in a shallow incised fluvial reservoir using captured-CO2 from an ethanol plant. Geoenergy Sci. Eng. 2025, 246, 213559.
- Moosavi, S.M.; Nandy, A.; Jablonka, K.M.; Ongari, D.; Janet, J.P.; Boyd, P.G.; Lee, Y.; Smit, B.; Kulik, H.J. Understanding the diversity of the metal-organic framework ecosystem. Nat. Commun. 2020, 11, 4068.
- Wang, J.; Liu, J.; Wang, H.; Zhou, M.; Ke, G.; Zhang, L.; Wu, J.; Gao, Z.; Lu, D. A comprehensive transformer-based approach for high-accuracy gas adsorption predictions in metal-organic frameworks. Nat. Commun. 2024, 15, 1904.
- Herm, Z.R.; Wiers, B.M.; Mason, J.A.; van Baten, J.M.; Hudson, M.R.; Zajdel, P.; Brown, C.M.; Masciocchi, N.; Krishna, R.; Long, J.R. Separation of hexane isomers in a metal-organic framework with triangular channels. Science 2013, 340, 960–964.
- Millward, A.R.; Yaghi, O.M. Metal–organic frameworks with exceptionally high capacity for storage of carbon dioxide at room temperature. J. Am. Chem. Soc. 2005, 127, 17998–17999.
- Wang, M.; Zeng, Q.; Chen, D.; Zhang, Y.; Liu, J.; Ma, C.; Jia, P. A machine learning feature descriptor approach: Revealing potential adsorption mechanisms for SF6 decomposition product gas-sensitive materials. J. Hazard. Mater. 2025, 481, 136567.
- Zhang, Z.; Cao, X.; Geng, C.; Sun, Y.; He, Y.; Qiao, Z.; Zhong, C. Machine learning aided high-throughput prediction of ionic liquid@MOF composites for membrane-based CO2 capture. J. Membr. Sci. 2022, 650, 120399.
- Gulbalkan, H.C.; Aksu, G.O.; Ercakir, G.; Keskin, S. Accelerated Discovery of Metal–Organic Frameworks for CO2 Capture by Artificial Intelligence. Ind. Eng. Chem. Res. 2023, 63, 37–48.
- Park, H.; Yan, X.; Zhu, R.; Huerta, E.A.; Chaudhuri, S.; Cooper, D.; Foster, I.; Tajkhorshid, E. A generative artificial intelligence framework based on a molecular diffusion model for the design of metal-organic frameworks for carbon capture. Commun. Chem. 2024, 7, 21.
- Choudhary, K.; Yildirim, T.; Siderius, D.W.; Kusne, A.G.; McDannald, A.; Ortiz-Montalvo, D.L. Graph neural network predictions of metal organic framework CO2 adsorption properties. Comput. Mater. Sci. 2022, 210, 111388.
- Abdi, J.; Hadavimoghaddam, F.; Hadipoor, M.; Hemmati-Sarapardeh, A. Modeling of CO2 adsorption capacity by porous metal organic frameworks using advanced decision tree-based models. Sci. Rep. 2021, 11, 24468.
- Lu, C.; Wan, X.; Ma, X.; Guan, X.; Zhu, A. Deep-Learning-Based End-to-End Predictions of CO2 Capture in Metal–Organic Frameworks. J. Chem. Inf. Model. 2022, 62, 3281–3290.
- Orhan, I.B.; Zhao, Y.; Babarao, R.; Thornton, A.W.; Le, T.C. Machine learning descriptors for CO2 capture materials. Molecules 2025, 30, 650.
- Wan, H.; Fang, Y.; Hu, M.; Guo, S.; Sui, Z.; Huang, X.; Liu, Z.; Zhao, Y.; Liang, H.; Wu, Y.; et al. Interpretable Machine-Learning and Big Data Mining to Predict the CO2 Separation in Polymer-MOF Mixed Matrix Membranes. Adv. Sci. 2025, 12, 2405905.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
- Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A highly efficient gradient boosting decision tree. 31st Conference on Neural Information Processing Systems. In Advances in Neural Information Processing Systems 30; Curran Associates, Inc.: Long Beach, CA, USA, 2017.
- Dietterich, T.G. Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems; Springer: Berlin/Heidelberg, Germany, 2000; pp. 1–15.
- Rokach, L. Ensemble-based classifiers. Artif. Intell. Rev. 2010, 33, 1–39.
- Zhou, Z.-H. Ensemble methods. Combining Pattern Classifiers; Wiley: Hoboken, NJ, USA, 2014; pp. 186–229.
- Warner, B.; Ratner, E.; Carlous-Khan, K.; Douglas, C.; Lendasse, A. Ensemble Learning with Highly Variable Class-Based Performance. Mach. Learn. Knowl. Extr. 2024, 6, 2149–2160.
- Aggarwal, S.; Gupta, S.; Gupta, D.; Gulzar, Y.; Juneja, S.; Alwan, A.A.; Nauman, A. An artificial intelligence-based stacked ensemble approach for prediction of protein subcellular localization in confocal microscopy images. Sustainability 2023, 15, 1695.
- Kuncheva, L.I. Combining Pattern Classifiers: Methods and Algorithms; John Wiley & Sons: Hoboken, NJ, USA, 2014.
- Silva, P.; Vilela, S.M.; Tomé, J.P.; Paz, F.A.A. Multifunctional metal–organic frameworks: From academia to industrial applications. Chem. Soc. Rev. 2015, 44, 6774–6803.
- Naimi, A.I.; Balzer, L.B. Stacked generalization: An introduction to super learning. Eur. J. Epidemiol. 2018, 33, 459–464.
- Opitz, D.; Maclin, R. Popular ensemble methods: An empirical study. J. Artif. Intell. Res. 1999, 11, 169–198.
- Kalule, R.; Abderrahmane, H.A.; Alameri, W.; Sassi, M. Stacked ensemble machine learning for porosity and absolute permeability prediction of carbonate rock plugs. Sci. Rep. 2023, 13, 9855.
- Hoerl, A.E.; Kennard, R.W. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 1970, 12, 55–67.
- Guan, J.; Huang, T.; Liu, W.; Feng, F.; Japip, S.; Li, J.; Wu, J.; Wang, X.; Zhang, S. Design and prediction of metal organic framework-based mixed matrix membranes for CO2 capture via machine learning. Cell Rep. Phys. Sci. 2022, 3, 100864.
- Hyndman, R.J.; Athanasopoulos, G. Forecasting: Principles and Practice, 2nd ed.; OTexts.com/fpp2; OTexts: Melbourne, Australia, 2018.
- Biecek, P.; Burzykowski, T. Explanatory Model Analysis: Explore, Explain, and Examine Predictive Models; Chapman and Hall/CRC: Boca Raton, FL, USA, 2021.
- Williams, D. Generalized linear model diagnostics using the deviance and single case deletions. J. R. Stat. Soc. Ser. C (Appl. Stat.) 1987, 36, 181–191.
Statistic | SBET (m2/g) | VT (cm3/g) | T (K) | P (bar) | CO2 Uptake (mmol/g) |
---|---|---|---|---|---|
Mean | 2221.10 | 0.8986 | 284.9109 | 10.9987 | 11.6579 |
Std | 996.7416 | 0.4132 | 29.4018 | 11.7384 | 7.2239 |
Min | 345 | 0.18 | 220 | 0.0005 | 0.0456 |
25% | 1690 | 0.61 | 260 | 1.2780 | 6.10 |
50% | 1980 | 0.91 | 298 | 5.8270 | 10.36 |
75% | 2516 | 1.14 | 310 | 18.60 | 16.5512 |
Max | 4508 | 1.83 | 313 | 42.50 | 33.90 |
Algorithm | Hyperparameter | Range | Optimal Value |
---|---|---|---|
RF | max_depth | None, 4, 6, 8, 10 | 8 |
RF | min_samples_split | 2, 4, 6, 8, 10 | 4 |
RF | min_samples_leaf | 2, 4, 6, 8, 10 | 2 |
XGB | n_estimators | 100, 250, 500, 750, 1000 | 1000 |
XGB | max_depth | 3, 4, 5, 6, 8, 10 | 6 |
XGB | learning_rate | 0.01, 0.05, 0.1, 0.2 | 0.1 |
XGB | subsample | 0.5, 0.7, 0.9, 1.0 | 0.9 |
XGB | colsample_bytree | 0.5, 0.7, 0.9, 1.0 | 0.7 |
XGB | gamma | 0, 0.1, 0.5, 1 | 0.1 |
LightGBM | n_estimators | 100, 250, 500, 750, 1000 | 1000 |
LightGBM | max_depth | 2, 4, 6, 8 | 6 |
LightGBM | min_child_weight | 2, 4, 6, 8 | 2 |
SVR | C | 1, 10, 100 | 10 |
SVR | epsilon | 0.01, 0.1, 0.5 | 0.1 |
SVR | gamma | scale, auto | scale |
MLP | hidden_layer_sizes | (100,), (100,100) | (100,) |
MLP | activation | relu, tanh | relu |
MLP | alpha | 0.0001, 0.001, 0.01 | 0.001 |
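To give a sense of the tuning cost implied by the hyperparameter table above, the sketch below counts how many combinations an exhaustive grid search would evaluate per algorithm. The candidate values are copied from the table, with hyperparameters grouped by algorithm following the table's row order. Whether the tuning was actually exhaustive (rather than randomized or Bayesian) is an assumption, and `grid_size` is a hypothetical helper.

```python
from math import prod

# Candidate values copied from the hyperparameter table above.
# Assumption: an exhaustive grid search over every combination.
search_space = {
    "RF": {
        "max_depth": [None, 4, 6, 8, 10],
        "min_samples_split": [2, 4, 6, 8, 10],
        "min_samples_leaf": [2, 4, 6, 8, 10],
    },
    "XGB": {
        "n_estimators": [100, 250, 500, 750, 1000],
        "max_depth": [3, 4, 5, 6, 8, 10],
        "learning_rate": [0.01, 0.05, 0.1, 0.2],
        "subsample": [0.5, 0.7, 0.9, 1.0],
        "colsample_bytree": [0.5, 0.7, 0.9, 1.0],
        "gamma": [0, 0.1, 0.5, 1],
    },
    "LightGBM": {
        "n_estimators": [100, 250, 500, 750, 1000],
        "max_depth": [2, 4, 6, 8],
        "min_child_weight": [2, 4, 6, 8],
    },
    "SVR": {
        "C": [1, 10, 100],
        "epsilon": [0.01, 0.1, 0.5],
        "gamma": ["scale", "auto"],
    },
    "MLP": {
        "hidden_layer_sizes": [(100,), (100, 100)],
        "activation": ["relu", "tanh"],
        "alpha": [0.0001, 0.001, 0.01],
    },
}


def grid_size(space):
    """Number of combinations in a full Cartesian grid."""
    return prod(len(values) for values in space.values())


sizes = {name: grid_size(space) for name, space in search_space.items()}
for name, size in sizes.items():
    print(f"{name}: {size} combinations")
# RF: 125, XGB: 7680, LightGBM: 80, SVR: 18, MLP: 12
```

Under k-fold cross-validation each combination is fitted k times, so the XGB grid dominates the tuning budget by a wide margin.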
Group | Model | Train R2 | Train RMSE | Train MAE | Test R2 | Test RMSE | Test MAE |
---|---|---|---|---|---|---|---|
Base | RF | 0.9767 | 1.0793 | 0.5903 | 0.9639 | 1.4736 | 0.9719 |
Base | XGB | 0.9835 | 0.9080 | 0.3571 | 0.9747 | 1.2339 | 0.7628 |
Base | LightGBM | 0.9661 | 1.3019 | 0.7784 | 0.9562 | 1.6222 | 1.0809 |
Base | SVR | 0.9093 | 2.1281 | 1.2703 | 0.9316 | 2.0270 | 1.2552 |
Base | MLP | 0.8717 | 2.5310 | 1.8102 | 0.8916 | 2.5525 | 1.8967 |
Ensemble | EWV | 0.9861 | 0.8328 | 0.4376 | 0.9792 | 1.1174 | 0.6943 |
Ensemble | WV | 0.9848 | 0.8726 | 0.4971 | 0.9779 | 1.1527 | 0.7558 |
Ensemble | Stack | 0.9931 | 0.5872 | 0.3032 | 0.9833 | 1.0016 | 0.6630 |
Ensemble | MB | 0.9828 | 0.9269 | 0.5120 | 0.9765 | 1.1889 | 0.7429 |
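The difference between the equal-weighted (EWV) and weighted (WV) voting ensembles in the table above can be sketched in a few lines. The predictions and validation errors below are toy numbers, not values from the study, and weighting each model by the inverse of its validation RMSE is an assumption made for illustration; this section does not state how the study derived its weights.

```python
# Toy sketch of voting ensembles for regression.
# All numbers are illustrative; inverse-RMSE weighting is an assumption.

def equal_weighted_vote(predictions):
    """Average the per-sample predictions of all models."""
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]


def weighted_vote(predictions, weights):
    """Weighted average of per-sample predictions; weights are normalized."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*predictions)
    ]


# Rows are models, columns are samples (predicted CO2 uptake in mmol/g).
predictions = [
    [10.0, 4.0],  # model A
    [12.0, 6.0],  # model B
    [14.0, 8.0],  # model C
]

# Hypothetical validation RMSE per model; lower error gives higher weight.
validation_rmse = [1.0, 2.0, 4.0]
weights = [1.0 / rmse for rmse in validation_rmse]

ewv = equal_weighted_vote(predictions)
wv = weighted_vote(predictions, weights)

print(ewv)  # [12.0, 6.0]
print([round(value, 3) for value in wv])
```

The weighted result is pulled toward model A, the model with the lowest validation error. Stacking goes one step further by fitting a meta-model on the base predictions instead of fixing the weights by rule.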
Model | R2 (Mean) | R2 (Std) | RMSE (Mean) | RMSE (Std) | MAE (Mean) | MAE (Std) |
---|---|---|---|---|---|---|
Random Forest | 0.9571 | 0.0064 | 1.4576 | 0.1821 | 0.9116 | 0.0950 |
XGBoost | 0.9674 | 0.0081 | 1.2590 | 0.1603 | 0.7570 | 0.0744 |
LightGBM | 0.9509 | 0.0081 | 1.5556 | 0.2180 | 0.9312 | 0.0786 |
SVR | 0.7288 | 0.0365 | 3.6673 | 0.4434 | 2.7115 | 0.3088 |
MLP | 0.8641 | 0.0238 | 2.5892 | 0.3187 | 1.8793 | 0.1785 |
Stacking | 0.9740 | 0.0062 | 1.1584 | 0.5530 | 0.6994 | 0.0369 |
WV | 0.9728 | 0.0064 | 1.1859 | 0.5635 | 0.7449 | 0.0451 |
EWV | 0.9258 | 0.0171 | 1.8215 | 0.9437 | 1.4667 | 0.1316 |
Blending | 0.9694 | 0.0059 | 1.2579 | 0.5326 | 0.7916 | 0.0465 |
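For reference, the three metrics reported in the tables above have the following standard definitions, where $y_i$ is the measured CO2 uptake, $\hat{y}_i$ the predicted uptake, $\bar{y}$ the mean of the measured values, and $n$ the number of samples. That the study uses exactly these forms is an assumption, since the evaluation section is not reproduced here.

```latex
\begin{align}
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \\
\mathrm{MAE}  &= \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \\
R^2 &= 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}
\end{align}
```

RMSE and MAE carry the units of the target (mmol/g), while $R^2$ is dimensionless. Because RMSE squares the residuals, it penalizes large errors more heavily than MAE does.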
Studies | Dataset Size | Features Used | Best Model | R2 Score | Interpretability |
---|---|---|---|---|---|
Abdi et al. [20] | 1191 | T, P, SBET, and VT | CatBoost | 0.9733 | Statistical evaluation, Williams plot |
Amar et al. [7] | 1212 | SBET, P, Void Fraction, Topological and Geometrical Descriptors | Stacked Ensemble (Meta-regressor with RF, GBoost, Lasso, Ridge) | 0.9693 | Partial correlation plot and ensemble diagrams |
Li et al. [1] | 475 | SBET, Void Fraction, VT, and T | ET | 0.9625 | SHAP feature explanation and summary plots |
Longe et al. [6] | 475 | SBET, P, T, SL, VT, and MC | LSSVM, MLPNN + PSO, GO (hybrid ML-optimization) | 0.9798 | Williams plot, feature importance, and partial dependence plot |
This Study | 1212 | SBET, P, T, VT, and MC | Stacking Model | 0.9833 | Residual plots, permutation importance, prediction intervals, ablation, and Williams plot |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Iyiola, Z.; Brantson, E.T.; Okeke, N.J.; Sanni, K.; Longe, P. Carbon Capture Using Metal Organic Frameworks (MOFs): Novel Custom Ensemble Learning Models for Prediction of CO2 Adsorption. Processes 2025, 13, 2199. https://doi.org/10.3390/pr13072199