Proceeding Paper

Analysis of the Descriptors for the Oxidative Coupling of Methane Reaction, Using Varying Machine Learning Approaches †

1
Clean Energy Technologies Research Institute (CETRI), Process Systems Engineering, Faculty of Engineering and Applied Science, University of Regina, 3737 Wascana Parkway, Regina, SK S4S 0A2, Canada
2
Department of Systems and Enterprises, Stevens Institute of Technology, 1 Castle Point, Hoboken, NJ 07030, USA
*
Author to whom correspondence should be addressed.
† Presented at the 1st International Conference on Industrial, Manufacturing, and Process Engineering (ICIMP-2024), Regina, Canada, 27–29 June 2024.
Eng. Proc. 2024, 76(1), 100; https://doi.org/10.3390/engproc2024076100
Published: 5 December 2024

Abstract

Combining catalytic and electronic properties with empirical data gives a richer basis for evaluating and designing catalysts for heterogeneous catalytic reactions, including the oxidative coupling of methane (OCM) reaction. A comparative assessment of several machine learning methods on OCM reaction datasets shows that the Random Forest regression (RFR) model gives the most accurate predictions of the combined C2H4 and C2H6 yield (C2y), with an average R2 value of 0.98. Modeling performance ranks as follows: RFR > XGBR > SVR > DNN. The MSE and MAE of the RFR models were lower than those of the alternative models, ranging from 0.12 to 9.03 for MSE and from 0.21 to 2.02 for MAE. Model accuracy across the target labels follows the order C2H6y > C2H4y > C2y > CO2y > CH4_conv (methane conversion). Regarding the influence of model features, C2y increases proportionally with increases in several dataset attributes, including the number of moles of alkali/alkaline-earth metal in the catalyst (13.69%), the atomic number of the catalyst promoter (6.24%), and the Fermi energy of the promoter metal, although each has a less pronounced impact than temperature (33.70%). This suggests a highly nonlinear correlation between the combined ethylene and ethane yield and temperature. Other factors, such as the bandgaps of the active metal oxide and the support and the Fermi energy of the catalyst support, had a relatively modest effect on the predictive models for the combined ethylene and ethane yield and for methane conversion.

1. Introduction

In the effort to develop efficient and novel catalysts for the OCM reaction, computational approaches have recently emerged as a strategy for studying the reaction. Fang et al. [1] identified the Mn/K2MoO4/Al2O3 catalyst as a leading OCM catalyst, achieving a notable 67% C2 selectivity together with 38% CH4 conversion. New OCM catalysts could significantly improve the feasibility of the process, providing a viable route for converting CH4, an abundant greenhouse gas, into the essential petrochemical ethylene.
Artificial intelligence (AI) in the form of machine learning (ML) continues to support OCM-related studies. ML facilitates the investigation of complex OCM reaction pathways by training algorithms on diverse datasets that contain both experimental and computational information, such as quantum mechanical (QM) data from density functional theory (DFT) calculations.
To improve OCM reaction efficiency at lower temperatures, ML and data reconstruction have supported the targeted development of catalysts. Ohyama et al. [2] synthesized and evaluated 63 OCM catalysts, using unsupervised ML to categorize the datasets. The analysis revealed a subset of catalysts that are effective at low temperatures, which was subsequently confirmed by experiment. The discovery and validation of three previously unreported low-temperature OCM catalysts demonstrated the capacity of AI to assist in catalyst synthesis.
This study combines catalyst electronic characteristics with extensive experimental data to construct and compare predictive models and to identify the predominant catalytic and electronic descriptors for efficient catalyst design.

2. Methodology

The data for this study were assembled by merging high-throughput (HTP) experimental data retrieved from the Catalyst Acquisition by Data Science (CADS) repository [3] with electronic properties computed by Ugwu et al. [4,5], namely the Fermi energy, bandgap energy, and magnetic moment of the catalyst constituents (the catalyst promoter, the active metallic/bimetallic oxide, and the catalyst support). Alongside the computed electronic properties, the HTP OCM dataset pairs the experimental conditions with their corresponding reaction results.
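The merge described above can be pictured as a lookup that attaches computed properties to each experimental record according to the catalyst component it contains. The following is a minimal sketch under stated assumptions: the field names (`promoter`, `support`, `fermi_energy_eV`, and so on) and all numeric values are hypothetical placeholders, not the CADS schema or the DFT results of [4,5].

```python
# Hypothetical HTP records; field names and values are placeholders for illustration.
htp_rows = [
    {"promoter": "Mn", "support": "Al2O3", "temperature_C": 800, "c2_yield": 10.0},
    {"promoter": "Na", "support": "SiO2", "temperature_C": 750, "c2_yield": 7.5},
]

# Hypothetical computed properties keyed by component name (placeholder numbers).
electronic_props = {
    "Mn": {"fermi_energy_eV": 1.0, "magnetic_moment": 2.0},
    "Na": {"fermi_energy_eV": 3.0, "magnetic_moment": 0.0},
    "Al2O3": {"bandgap_eV": 5.0},
    "SiO2": {"bandgap_eV": 6.0},
}


def merge_rows(rows, props, roles=("promoter", "support")):
    """Attach the properties of each named component to the experimental row."""
    merged = []
    for row in rows:
        out = dict(row)  # copy so the original record is left untouched
        for role in roles:
            component = row[role]
            # Prefix each property with the role so promoter and support columns stay distinct.
            for name, value in props.get(component, {}).items():
                out[f"{role}_{name}"] = value
        merged.append(out)
    return merged


merged = merge_rows(htp_rows, electronic_props)
```

Each merged row then carries both the reaction conditions and the electronic descriptors, which is the shape of dataset the models below are trained on.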
The dataset covers 12,708 reactions across 59 distinct catalyst compositions and a control sample, and contains the electronic characteristics of the catalyst components. It incorporates 34 attributes, comprising 8 DFT-computed electronic properties for the catalyst promoter, active metal oxide, and the support, and 26 features derived from the HTP experimental data. Several machine learning (ML) methodologies were assessed for analyzing the dataset:
  • Deep Neural Networks (DNN), specifically deep feed-forward neural networks
  • Random Forest Regression (RFR)
  • Support Vector Regression (SVR)
  • Extreme Gradient Boosting Regression (XGBR)
The ML models were built to predict specific targets (labels): CO2 yield, C2H4 yield, C2H6 yield, C2 yield (the combined C2H4 and C2H6 yields), and CH4 conversion. They were trained and validated on 80% of the dataset, with hyperparameter tuning and cross-validation used for model optimization. The trained models were then assessed on the remaining 20% of the dataset, which was held out as test data and not used during training. The coefficient of determination (R2), Mean Squared Error (MSE), and Mean Absolute Error (MAE) were calculated to evaluate the models, and the models were compared on the basis of their respective error rates.
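The split and the three metrics can be written out explicitly. The sketch below uses only the Python standard library and toy numbers; the function names, the fixed seed, and the sample values are assumptions for illustration and are not taken from the study's code.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle once with a fixed seed, then hold out the first `test_fraction` as test data."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mse(y_true, y_pred):
    """Mean Squared Error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# 80/20 split of 100 toy record indices.
train, test = train_test_split(range(100))

# Toy targets and predictions (not data from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
scores = {"R2": r2(y_true, y_pred), "MSE": mse(y_true, y_pred), "MAE": mae(y_true, y_pred)}
```

Note that R2 is dimensionless, whereas MSE and MAE carry the (squared) units of the label, so error values for different labels are not directly comparable.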

3. Results and Discussion

3.1. Analysis of the SVR Models

An SVR model was developed to predict the C2 yield (C2y) using only the reaction conditions as dataset features (D1). Fitted with both a linear kernel and a radial basis function (rbf) kernel, the model agreed with the findings of Ohyama et al. [6] in terms of R2, confirming the nonlinear correlation between C2y and the dataset features. Polynomial kernels of various degrees were also examined alongside the radial basis function kernel. The model score improved from 0.77 with reaction conditions only (D1) to 0.93 with reaction conditions plus catalyst electronic properties (D2), confirming the benefit of including the catalysts' electronic properties in the dataset. The residual plot (Figure 1a) shows an approximately normal distribution of the residuals for D2 across the training and test data, and the parity plot (Figure 1b) shows the agreement between the real and predicted data. R2 for the SVR (rbf) models ranged from 0.76 (C2H4y) to 0.93 (C2y), with MSE and MAE varying from 2.83 to 2.44 and from 1.11 to 0.95, respectively, in the following order: C2y > CH4_conv > C2H6y > CO2y > C2H4y.
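For readers less familiar with the kernels named above, the following sketch writes out their standard textbook forms. The parameter values (`gamma`, `degree`, `coef0`) are illustrative assumptions; the paper does not report the settings it used.

```python
import math


def linear_kernel(x, z):
    """k(x, z) = x . z"""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """k(x, z) = (gamma * x . z + coef0) ** degree"""
    return (gamma * linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """k(x, z) = exp(-gamma * ||x - z||^2); equals 1 when x == z and decays with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


x, z = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(x, z)
k_poly = polynomial_kernel(x, z, degree=2)
k_rbf_same = rbf_kernel(x, x)
k_rbf_far = rbf_kernel([0.0, 0.0], [1.0, 1.0])
```

Because the rbf kernel depends only on the distance between two points, it can represent curved response surfaces that a linear kernel cannot, which is in line with the better fit reported for the rbf models.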

3.2. Analysis of the RFR Models

The RFR model was applied to the different datasets, with the number of trees optimized to 300 for the entire dataset. Figure 2a,b shows the residuals against the predicted values for the C2y predictive model, together with the parity plot. As with SVR, R2 for D2 (0.97) exceeded that for D1 (0.85), consistent with the results of Ohyama et al. [6] (0.78), supporting the model's validity and the positive impact of the catalyst electronic properties. The accuracy of the RFR models is clearly higher than that of SVR, as Ohyama et al. [6] also noted. The RFR predictive models for C2H4y, C2H6y, CO2y, and CH4_conv using D2 exhibited high R2 values. The order of R2 for the labels was CH4_conv > C2H6y > CO2y > C2H4y > C2y. MSE and MAE for the RFR models were lower than those for SVR, ranging from 0.12 to 9.03 for MSE and from 0.21 to 2.02 for MAE.

3.3. Analysis of the DNN Models

Various architectures were evaluated to optimize the neural network configuration. A configuration with four layers, normalized input data, and a ReLU activation function was found to be effective. The Adam optimizer was used to train the network, and dropout layers were incorporated to counter overfitting. R2 for the C2y predictive model was 0.88, with the other labels ranging from 0.84 to 0.92. MAE and MSE varied across labels, as they did for RFR and XGBR. Figure 3a,b compares the loss and validation loss curves of the overfitted model with those of the model with dropout layers.
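The roles of ReLU and dropout in such a network can be illustrated with a forward pass written in plain Python. This is a sketch under stated assumptions: the layer widths, weights, and dropout rate are made up, "four layers" is read here as three hidden layers plus one output layer, and training with Adam is not shown.

```python
import random


def relu(values):
    """ReLU activation: negative pre-activations are set to zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer; `weights` holds one row of coefficients per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def dropout(values, rate, rng, training):
    """Inverted dropout: zero units with probability `rate` during training, rescale the rest."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def forward(x, layers, rate=0.2, rng=None, training=False):
    """Hidden layers use ReLU followed by dropout; the last layer is a linear regression output."""
    if rng is None:
        rng = random.Random(0)
    h = x
    for weights, biases in layers[:-1]:
        h = dropout(relu(dense(h, weights, biases)), rate, rng, training)
    weights, biases = layers[-1]
    return dense(h, weights, biases)[0]


# Toy weights (illustrative only): three hidden layers of width 2 and one output unit.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.5]),
]
prediction = forward([2.0, 1.0], layers)  # inference mode: dropout is inactive
dropped = dropout([1.0] * 10, 0.5, random.Random(1), training=True)
```

Dropout is active only during training; at inference every unit contributes, which is why validation loss is evaluated with dropout switched off.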

3.4. Analysis of the XGBR Models

The XGBR model's performance for C2y prediction was compared with the results of Ohyama et al. [6]. R2 was 0.83 for D1 and 0.92 for D2, indicating an improvement when the catalyst electronic properties are included. Model performance for the other labels was also noteworthy. The model comparison suggested similar performance for XGBR and RFR on both D1 and D2, while the SVR and DNN models were comparable to each other, particularly in terms of R2. The overall performance order was RFR > XGBR > SVR > DNN. Figure 4a,b shows the residuals against the predicted values for the C2y predictive model, along with a parity plot.

3.5. Model Assessment

An evaluation of model performance on datasets D1 and D2 indicates comparable results for XGBR and RFR on both datasets across the five target labels. Likewise, the SVR and DNN models on D2 perform similarly to each other, particularly in terms of R2. For predicting C2y from D2, the model ranking based on R2, then MSE, then MAE is RFR > XGBR > SVR > DNN, as detailed in Figure 5. Regarding data fitting, the order for the labels is C2H6y > C2H4y > C2y > CO2y > CH4_conv, as depicted in Figure 6. The overall model performance ranking is RFR > XGBR > SVR > DNN.
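As a worked check of the effect of adding electronic properties, the short sketch below collects the C2y R2 values quoted in Sections 3.1, 3.2, and 3.4 and computes the gain from D1 to D2 in absolute R2 units. The DNN model is omitted because no D1 score is reported for it; the variable names are illustrative.

```python
# C2y R2 values as quoted in the text (D1: reaction conditions only;
# D2: reaction conditions plus catalyst electronic properties).
r2_scores = {
    "SVR": {"D1": 0.77, "D2": 0.93},
    "RFR": {"D1": 0.85, "D2": 0.97},
    "XGBR": {"D1": 0.83, "D2": 0.92},
}

# Absolute gain in R2 for each model, rounded to two decimals to hide float noise.
gains = {model: round(s["D2"] - s["D1"], 2) for model, s in r2_scores.items()}

# Average gain over the three models with both scores available.
mean_gain = round(sum(gains.values()) / len(gains), 2)

for model, gain in gains.items():
    print(f"{model}: +{gain:.2f}")
print(f"mean gain: +{mean_gain:.2f}")
```

The gains are 0.16 (SVR), 0.12 (RFR), and 0.09 (XGBR), averaging about 0.12 in R2.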

3.6. Feature Impact Assessment

The impactful features common to the RFR and XGBR models include temperature, the CH4/O2 ratio, the number of moles of the alkali/alkaline-earth metal in the catalyst, the bandgap of the active metal oxide, the atomic number of the catalyst promoter, and the Fermi energy of the promoter metal, with near-linear relationships to C2y (Figure 7a,b). Figure 8a,b shows bar charts of the relative impact of the dataset features on the C2y RFR and XGBR predictive models, respectively.
Comparative 3D surface plots (Figure 9a–c) show the impact of the number of moles of M2 and of the atomic number and Fermi energy of the promoter (M1) on C2y at various temperatures. Higher M2 moles amplify the effect of the atomic number and Fermi energy of the promoter, indicating increased reactivity with more alkali/alkaline-earth metal.
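The text does not state how the relative feature impacts in Figure 8 were computed; tree ensembles commonly expose built-in importance scores. As a model-agnostic alternative, and only as an illustration, the sketch below uses permutation importance: each feature column is shuffled in turn, and the resulting increase in MSE is expressed as a percentage share. The toy model and data are assumptions and stand in for a trained regressor.

```python
import random


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, seed=0):
    """Return each feature's share (%) of the total MSE increase caused by shuffling it."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    increases = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)  # break the link between feature j and the target
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        error = mse(y, [predict(row) for row in X_perm])
        increases.append(max(0.0, error - baseline))
    total = sum(increases)
    return [100.0 * inc / total if total else 0.0 for inc in increases]


def toy_model(row):
    """Stand-in for a trained model: strong on feature 0, weak on feature 1, blind to feature 2."""
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random(), data_rng.random()] for _ in range(200)]
y = [toy_model(row) for row in X]
shares = permutation_importance(toy_model, X, y)
```

Percentage shares of this kind sum to 100 across all features, so a value such as 33.70% for temperature is read relative to the combined impact of every feature in the dataset.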

4. Conclusions

Combining catalyst electronic characteristics with reaction parameters in the dataset used to predict the reaction outcomes (C2H6y, C2H4y, C2y, CO2y, and CH4_conv) improves model performance by roughly 10%, as shown by comparing the R2 values of the C2y predictive models built with SVR, RFR, DNN, and XGBR. Comparative analysis across the ML methods indicates that the RFR models, with an average R2 of 0.98 over the predictive models for the five reaction outcomes, are more efficient and accurate than XGBR, SVR, and DNN, in that order. In general, the order of data fit for the labels across the modeling techniques was C2H6y > C2H4y > C2y > CO2y > CH4_conv. The MSE and MAE of the RFR models tend to be lower than those of the other modeling techniques, ranging from 0.12 to 9.03 for MSE and from 0.21 to 2.02 for MAE. Several reaction conditions and catalyst electronic properties had a considerable influence on the C2y predictive model, including temperature (33.70%), the number of moles of alkali/alkaline-earth metal (13.69%), the atomic number of the catalyst promoter (6.24%), the Fermi energy of the promoter (4.31%), the bandgap of the active metal oxide, and the methane-to-oxygen ratio.

Author Contributions

Conceptualization, L.U. and H.I.; methodology, L.U., Y.M. and H.I.; software, L.U.; validation, L.U. and H.I.; formal analysis, L.U.; investigation, L.U.; resources, L.U., H.I. and Y.M.; data curation, L.U.; writing—original draft preparation, L.U.; writing—review and editing, L.U. and H.I.; visualization, L.U.; supervision, H.I. and Y.M.; project administration, H.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC DG: RGPIN-2024-04760), the Canada Foundation for Innovation (CFI JELF: 37758), and the VPR Discretionary Fund at the University of Regina, which are gratefully acknowledged.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data used in this study are open-source data available for reuse at the Catalyst Acquisition by Data Science (CADS) repository, https://cads.eng.hokudai.ac.jp/datamanagement/datasources/21010bbe-0a5c-4d12-a5fa-84eea540e4be/ (accessed on 10 February 2024), and in [4].

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Geerts, J.W.M.H.; Chen, Q.; van Kasteren, J.M.N.; van der Wiele, K. Thermodynamics and Kinetic Modeling of the Homogeneous Gas Phase Reactions of the Oxidative Coupling of Methane. Catal. Today 1990, 6, 519–526.
  2. Fallah, B.; Falamaki, C. A New Nano-(2Li2O/MgO) Catalyst/Porous Alpha-Alumina Composite for the Oxidative Coupling of Methane Reaction. AIChE J. 2010, 56, 717–728.
  3. Fujima, J.; Tanaka, Y.; Miyazato, I.; Takahashi, L.; Takahashi, K. Catalyst Acquisition by Data Science (CADS): A Web-Based Catalyst Informatics Platform for Discovering Catalysts. React. Chem. Eng. 2020, 5, 903–911.
  4. Ugwu, L.; Morgan, Y.; Ibrahim, H. Enhancing Ethene Production through Low-Temperature Oxidative Coupling of Methane: Leveraging DFT and Data Analysis for Crafting Innovative and Efficient Catalyst Compositions. Ind. Eng. Chem. Res. 2023, 62, 19658–19673.
  5. Ugwu, L.I.; Morgan, Y.; Ibrahim, H. Increasing Ethene Yield via Oxidative Coupling of Methane at Low Temperature: An Application of Machine Learning and DFT in the Design and Innovation of Effective Catalyst Compositions. In Proceedings of the 2023 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), Regina, SK, Canada, 24–27 September 2023; pp. 342–347.
  6. Ohyama, J.; Nishimura, S.; Takahashi, K. Data Driven Determination of Reaction Conditions in Oxidative Coupling of Methane via Machine Learning. ChemCatChem 2019, 11, 4307–4313.
Figure 1. (a) Residual plot of C2y SVR (rbf) model (D2); (b) parity plot of C2y SVR (rbf) model (D2).
Figure 2. (a) a visual illustration of the residuals against the predicted values from the model for both the training and test data for label C2y; (b) a parity plot that compares the predictions to real values for D2.
Figure 3. (a) loss and validation loss for the C2y predictive model over 200 epochs, and (b) the plot of the loss and validation loss of the overfitted C2y predictive model.
Figure 4. (a) Residuals against the predicted values of the XGBR C2y predictive model using reaction conditions and DFT-computed catalyst electronic properties as dataset features (D2); (b) a parity plot comparing the real data to the predicted data from the model.
Figure 5. C2y SVR (rbf) model (D2) residual plot.
Figure 6. Comparison of predictive models based on the order of data fit.
Figure 7. (a) Fermi energy of the catalyst promoter metal vs. C2y; (b) bandgap of the active metal oxide vs. C2y.
Figure 8. The feature impact of the C2y predictive (a) RFR model, as well as that of (b) the XGBR.
Figure 9. Three-dimensional surface plots of C2y against number of moles of M2, atomic number of M1, and Fermi energy of M1; (a) 700 °C, (b) 800 °C, and (c) 900 °C.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Ugwu, L.; Morgan, Y.; Ibrahim, H. Analysis of the Descriptors for the Oxidative Coupling of Methane Reaction, Using Varying Machine Learning Approaches. Eng. Proc. 2024, 76, 100. https://doi.org/10.3390/engproc2024076100