1. Introduction
Methane dry reforming (DRM) is a pivotal process in sustainable energy research, enabling the simultaneous utilisation of methane (CH4) and carbon dioxide (CO2) to produce synthesis gas suitable for downstream fuels and chemical manufacture. Beyond greenhouse gas mitigation, DRM offers strategic flexibility in syngas composition. However, practical deployment remains constrained by high operating temperatures, catalyst deactivation, and strong coupling between reaction variables, which complicates systematic optimisation and interpretation of experimental trends [1,2,3,4].
From a thermodynamic perspective, DRM is highly endothermic (ΔH° = +247.3 kJ mol−1), requiring elevated temperatures to achieve appreciable conversion. Catalytic systems, particularly Ni-based formulations, enable lower operating temperatures while maintaining reasonable activity. Numerous studies have demonstrated the influence of metal loading, dispersion, and support composition on DRM performance and stability [5]. The physicochemical behaviour of DRM systems has been extensively studied, including thermodynamic constraints, carbon formation mechanisms, and syngas composition optimisation. In our previous work (Devasahayam et al., 2025) [6], the influence of process conditions on carbon intensity and energy efficiency was quantified, highlighting the dominant role of temperature and feed composition. These established insights provide a mechanistic foundation for interpreting the machine learning outputs in the present study.
Despite these advances, catalyst deactivation and sensitivity to operating conditions remain persistent challenges [4,7,8,9,10]. The mechanisms of catalyst deactivation have been discussed extensively in the authors’ prior work [11]; accordingly, catalyst stability and lifetime are beyond the scope of the present study.
Accurate prediction of CH4 and CO2 conversions and H2 and CO yields is essential for analysing DRM behaviour and narrowing experimental design spaces. Conventional optimisation approaches (including response surface methodology, kinetic modelling, and computational simulations) provide valuable insight but are often constrained by extensive data requirements, simplifying assumptions, or high computational cost, particularly when applied to small experimental datasets typical of catalytic DRM studies [8,12]. As a result, conditional and regime-dependent effects of secondary variables such as feed ratio or metal loading are frequently masked by global trend analysis.
In this context, machine learning (ML) methods have emerged as complementary analytical tools for complex reaction systems. Tree-based ensemble models, such as CatBoost, are well suited to small, heterogeneous datasets and nonlinear relationships. However, predictive accuracy alone is insufficient for scientific insight; transparent interpretation is necessary to ensure that ML-derived trends remain physically consistent and comparable with established DRM understanding. Interpretable machine learning, therefore, offers potential value not in rediscovering known global effects, such as the dominance of temperature, but in revealing how variable importance and interactions evolve across operating regimes [6,12]. A similar SHAP-based interpretation workflow has been reported for a small gold flotation dataset [13].
Comparisons of ML-identified trends with known DRM behaviour have been reported in prior publications [14,15,16,17,18,19]. To support interpretability, SHapley Additive exPlanations (SHAP) is employed as a post hoc attribution framework. Unlike conventional regression or ANN approaches, SHAP enables decomposition of model predictions into additive feature contributions, allowing identification of conditional and interaction effects that are not directly observable through traditional statistical analysis. In this work, SHAP is used explicitly as an analytical lens rather than as a surrogate for mechanistic or kinetic modelling. The objective is not to infer new catalytic mechanisms, but to examine whether interpretable ML can identify regime-specific and conditional variable influences in a data-limited DRM dataset that are obscured by global averages or conventional regression analysis. Specifically, the study addresses the question: can interpretable machine learning identify conditional interaction patterns (such as temperature-dependent roles of feed ratio and Ni loading) that are not evident from global trend analysis alone?
The analysis is conducted using a previously published Ni/CaFe2O4 DRM dataset reported by Hossain et al. (2016), comprising 27 experimental runs spanning 700–800 °C, CH4/CO2 ratios of 0.4–1.0, and Ni loadings of 5–15 wt% [20,21]. This dataset was selected due to its systematic variation in key operating parameters under realistic experimental constraints. Ni is widely employed for C–H activation in DRM, while spinel supports such as CaFe2O4 are known to enhance thermal stability and mitigate sintering and coking at elevated temperatures [8]. All experimental data originate entirely from prior published studies; no new experiments, catalysts, or materials are introduced.
Within these constraints, the contribution of this work is methodological. By combining cross-validated CatBoost models with SHAP-based global and local attribution, the study demonstrates a structured, reproducible workflow for extracting variable importance hierarchies and conditional interaction patterns from small DRM datasets. The analysis focuses on how the relative influence of secondary variables evolves across temperature and feed-ratio regimes, rather than on reaffirming established thermodynamic dominance. All conclusions are dataset-specific and intended to support hypothesis generation and experimental prioritisation under data-limited conditions, complementing—rather than replacing—experimental and mechanistic DRM studies.
2. Methodology
2.1. Data Collection
All analyses in this study are performed on the same published dataset reported by Hossain et al. (2016) for Ni/CaFe2O4 catalysts [8]; no new experiments were conducted (Table 1). The workflow diagram and detailed predictive performance and parity-plot validation for DRM machine learning models using this dataset have already been reported and are therefore not repeated here [21]. The dataset lacks time-on-stream (TOS) and post-DRM carbon/graphitisation metrics; deactivation analysis is therefore out of scope. Side reactions such as the reverse water–gas shift were not explicitly analysed, as the source dataset does not report water formation or reaction selectivity. However, these aspects have been addressed in our earlier work [11,12,22].
The catalytic performance of Ni/CaFe2O4 was evaluated at reaction temperatures of 700 °C, 750 °C, and 800 °C, with feed ratios (CH4:CO2) of 0.4, 0.7, and 1.0 under atmospheric conditions. The gas hourly space velocity (GHSV) was maintained at 30,000 h−1 STP. The composition of the outlet gases (CO2, CH4, CO, and H2) was analysed using a gas chromatography (GC) system (Agilent Technologies, Santa Clara, California, USA) equipped with a thermal conductivity detector (TCD). Helium (He) was used as the carrier gas at a flow rate of 20 mL/min, with the column operating at 120 °C and the detector at 150 °C (column pressure < 90 psi). Gas separation and quantification for H2, CH4, and CO2 were performed using a Hayesep DB column, while CO was analysed with a Molecular Sieve 13X column [8].
The conversions of CH4 and CO2 and the yields of H2 and CO are calculated using Equations (1)–(4), where F_CO2,in, F_CH4,in, F_CO2,out, and F_CH4,out are the inlet and outlet molar flow rates of CO2 and CH4, respectively, and F_H2 and F_CO are the outlet molar flow rates of H2 and CO [8]. The reported conversions and yields follow the definitions in the source publication; time-on-stream data are not reported therein and are therefore outside the scope of the present analysis.
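The display forms of the conversion and yield expressions are not reproduced in this text. As a hedged sketch only, the forms most commonly used in DRM studies are written below with the flow-rate symbols defined above. The conversion expressions follow directly from the inlet and outlet flow rates; the yield normalisations (2 F_CH4,in for H2 and F_CH4,in + F_CO2,in for CO) are assumptions, and the definitions in the source publication [8] take precedence wherever they differ.

```latex
\begin{align*}
X_{\mathrm{CH_4}}\,(\%) &= \frac{F_{\mathrm{CH_4,in}} - F_{\mathrm{CH_4,out}}}{F_{\mathrm{CH_4,in}}} \times 100 \\
X_{\mathrm{CO_2}}\,(\%) &= \frac{F_{\mathrm{CO_2,in}} - F_{\mathrm{CO_2,out}}}{F_{\mathrm{CO_2,in}}} \times 100 \\
Y_{\mathrm{H_2}}\,(\%)  &= \frac{F_{\mathrm{H_2}}}{2\,F_{\mathrm{CH_4,in}}} \times 100
  && \text{(assumed normalisation)} \\
Y_{\mathrm{CO}}\,(\%)   &= \frac{F_{\mathrm{CO}}}{F_{\mathrm{CH_4,in}} + F_{\mathrm{CO_2,in}}} \times 100
  && \text{(assumed normalisation)}
\end{align*}
```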
2.2. Preprocessing the Data
The dataset was standardised feature-wise using z-score normalisation (Equation (5)):
z = (X − μ)/σ (5)
where:
X is the original data point;
μ is the mean of the feature;
σ is the standard deviation of the feature.
A z-score indicates how many standard deviations a value lies from the mean of its feature; no assumption of normality is required. After standardisation, all features share a uniform scale (mean of 0 and standard deviation of 1), facilitating comparison and analysis across variables.
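As a minimal sketch of this standardisation step, the snippet below applies the z-score to the three temperature levels of the dataset. The function name `zscore` is illustrative, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardise one feature column to zero mean and unit standard deviation."""
    mu = mean(values)
    # Population standard deviation (divide by n); this choice is an assumption.
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("constant feature: z-score is undefined")
    return [(x - mu) / sigma for x in values]


# The three reaction temperatures (degrees C) used in the dataset.
temperatures_c = [700, 750, 800]
z = zscore(temperatures_c)
print(z)
```

The middle level maps to zero and the outer levels to symmetric values of opposite sign, which is the behaviour expected for an evenly spaced three-level factor.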
2.3. Data Preparation and Model Training
The previously published dataset (Section 2.1) was used for all machine learning analyses. The three input features (feed ratio, reaction temperature, and metal loading) were normalised prior to model training. The four measured reaction outcomes (CH4 conversion, CO2 conversion, H2 yield, and CO yield) served as target variables (Table 1). No additional preprocessing was required beyond sorting the data by feed ratio for clarity.
2.3.1. Feature and Target Variables
As described in Section 2.1, the dataset comprises three process variables used as model features and four reaction metrics used as targets. These variables were used directly for training CatBoost regression models.
2.3.2. Feature Transformation
Although ensemble methods such as Random Forests and Gradient Boosting can capture nonlinear relationships directly, polynomial features were generated to explicitly represent interaction effects among process variables [23]. These engineered features provide additional mechanistic insight by isolating how combined changes in feed ratio, reaction temperature, and metal loading influence DRM performance. The following polynomial terms were therefore introduced:
Feed ratio² captures nonlinear effects of the CH4/CO2 ratio, reflecting how extreme feed compositions amplify or suppress conversion and yield responses.
Reaction Temp² represents curvature in temperature dependence, allowing the model to distinguish between linear heating effects and accelerated kinetic responses at higher temperatures.
Metal loading² captures diminishing or accelerating effects of increasing Ni content, which may influence active-site density and catalytic behaviour nonlinearly.
Feed ratio × Reaction Temp identifies coupled effects between feed composition and temperature, such as how higher temperatures may compensate for suboptimal CH4:CO2 ratios.
Feed ratio × Metal loading reflects how catalyst formulation interacts with inlet gas composition, potentially revealing sensitivities between methane-rich feeds and Ni dispersion.
Reaction Temp × Metal loading captures synergistic effects between temperature and Ni content, where metal loading may enhance activity only above certain thermal thresholds.
Together, these features create explicit mathematical representations of interactions known to influence DRM kinetics. The inclusion of polynomial features enables representation of second-order interactions (e.g., temperature–feed ratio coupling), which are expected in DRM due to the interplay between reaction thermodynamics, kinetics, and carbon formation mechanisms. Including them allows the model to quantify how variable combinations shape the four reaction outcomes, providing deeper interpretability beyond the original linear inputs.
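The six second-order terms listed above can be generated mechanically from the three base features. The sketch below is illustrative only: the helper `degree2_terms` and the sample values are hypothetical, and the values are left unscaled for readability, whereas the study normalises features before training.

```python
from itertools import combinations_with_replacement


def degree2_terms(row, names):
    """Return the squared and pairwise-product terms for one sample."""
    terms = {}
    # Index pairs (i, j) with i <= j give three squares and three cross-products.
    for i, j in combinations_with_replacement(range(len(row)), 2):
        label = f"{names[i]}^2" if i == j else f"{names[i]} x {names[j]}"
        terms[label] = row[i] * row[j]
    return terms


names = ["Feed ratio", "Reaction Temp", "Metal loading"]
# Illustrative operating point inside the experimental ranges (not a reported run).
sample = [1.0, 800.0, 10.0]
expanded = degree2_terms(sample, names)
for label, value in expanded.items():
    print(f"{label}: {value}")
```

Because the cross-products of unscaled variables differ by orders of magnitude, applying the expansion to standardised inputs keeps the engineered terms on comparable scales.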
2.4. Model Training, Validation, and Selection
Machine learning models were trained, validated, and selected using a workflow consistent with our previously published DRM modelling study based on the same 27-run dataset [21]. Hyperparameter optimisation was performed using GridSearchCV under cross-validation to balance model flexibility and generalisation while avoiding overfitting in the data-limited regime.
Given the small dataset size, a multi-stage validation strategy was employed. The initial model evaluation used a 90:10 train–test split, followed by k-fold cross-validation (3-fold and 5-fold). In addition, leave-one-out cross-validation (LOOCV) was applied to maximise data utilisation and obtain low-bias performance estimates. Model performance was assessed using R², RMSE, MAE, and MAPE. Strict separation between training and validation data was maintained throughout hyperparameter tuning and evaluation to prevent data leakage.
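The leave-one-out procedure and the four metrics can be sketched in a few lines. The example below is not the authors' implementation: the nearest-neighbour predictor stands in for CatBoost, the data are hypothetical, and all function names are illustrative. The metrics are computed on predictions pooled over all held-out samples, because R² is undefined for a fold containing a single observation.

```python
from math import sqrt


def loocv_predictions(X, y, fit_predict):
    """Hold out each sample once and return the pooled out-of-sample predictions."""
    preds = []
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        preds.append(fit_predict(X_train, y_train, X[i]))
    return preds


def regression_metrics(y_true, y_pred):
    """R2, RMSE, MAE and MAPE (%) computed on pooled predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    y_bar = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


def nearest_neighbour(X_train, y_train, x_new):
    """Placeholder model: return the target of the closest training point."""
    distances = [sum((a - b) ** 2 for a, b in zip(row, x_new)) for row in X_train]
    return y_train[distances.index(min(distances))]


# Hypothetical toy data (NOT values from the DRM dataset).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [10.0, 20.0, 30.0, 40.0]
pooled = loocv_predictions(X, y, nearest_neighbour)
metrics = regression_metrics(y, pooled)
print(pooled)
print(metrics)
```

Any regressor exposing the same `fit_predict(X_train, y_train, x_new)` signature could be substituted for the placeholder without changing the validation loop.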
Multiple regression models, including Random Forest, Gradient Boosting, Support Vector Regression, and CatBoost, were evaluated under this common validation framework. Among these, CatBoost consistently provided the most stable and accurate performance across all four target variables (CH4 conversion, CO2 conversion, H2 yield, and CO yield) under both k-fold cross-validation and LOOCV.
The CatBoost model was implemented with optimised hyperparameters, including tree depth, learning rate, and number of iterations. Consistency of the performance metrics across 3-fold, 5-fold, and leave-one-out cross-validation was used to assess model stability under data-limited conditions.
CatBoost was therefore selected as the primary model for subsequent interpretability analysis. Its ordered boosting strategy and inherent regularisation reduce overfitting risk on small tabular datasets, and its ability to capture nonlinear interactions efficiently supports stable SHAP attributions, consistent with findings reported in our previous work [21]. All SHAP analyses reported in this study are based on CatBoost models trained under the above validation protocol.
2.5. SHAP Analysis and Visualisation
SHapley Additive exPlanations (SHAP) was employed as a post hoc attribution framework to interpret the predictions of the trained CatBoost models. SHAP values were computed using the CatBoost-integrated TreeExplainer, which provides exact SHAP values for tree-based models. No sampling-based or approximate SHAP methods were used.
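The additive property that underlies SHAP can be illustrated with a brute-force calculation on three features, where all eight coalitions can be enumerated directly. This is a swapped-in teaching device, not the tree-specific algorithm used in the study: `toy_model`, its coefficients, the single all-zero baseline, and the instance are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating every feature coalition."""
    m = len(x)

    def value(coalition):
        # Features in the coalition take the instance value; the rest stay at baseline.
        mixed = [x[k] if k in coalition else baseline[k] for k in range(m)]
        return predict(mixed)

    phi = []
    for i in range(m):
        others = [k for k in range(m) if k != i]
        total = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical response with a temperature x feed-ratio interaction."""
    feed, temp, ni = z
    return 1.0 * feed + 2.0 * temp + 0.5 * temp * feed + 0.0 * ni


x = [1.0, 1.0, 1.0]          # instance in standardised units (illustrative)
baseline = [0.0, 0.0, 0.0]   # reference point, here the feature means
phi = shapley_values(toy_model, x, baseline)
print(phi)
print(sum(phi), toy_model(x) - toy_model(baseline))
```

The contributions sum to the difference between the prediction and the baseline, and the interaction term is split equally between the two features involved, which is why an inert feature (here Ni loading) receives zero attribution.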
SHAP values were calculated only after model hyperparameters were stabilised via GridSearchCV and validated using leave-one-out cross-validation (LOOCV). Fixed random seeds were applied throughout to ensure reproducibility. While LOOCV maximises data utilisation for small datasets, additional validation using 3-fold and 5-fold cross-validation was performed to confirm the consistency of model performance and reduce variance in performance estimation. SHAP analysis was conducted on the full validated dataset without extrapolation beyond the experimental domain.
Global feature importance was assessed by aggregating absolute SHAP values across all samples for each target variable (CH4 conversion, CO2 conversion, H2 yield, and CO yield). To examine conditional and nonlinear dependencies, SHAP dependence plots were constructed by relating feature values to their SHAP contributions, with colour gradients indicating interacting variables. This representation enables identification of regime-specific changes in variable influence that are not captured by global metrics or linear regression coefficients.
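The aggregation step described above amounts to averaging absolute SHAP values per feature. The sketch below uses a hypothetical SHAP matrix (the numbers are not results from this study) and an illustrative helper name, `mean_abs_shap`.

```python
def mean_abs_shap(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value across samples."""
    n = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values (rows = samples, columns = features); NOT study results.
shap_matrix = [
    [0.5, -3.0, 0.1],
    [-0.5, 0.0, -0.3],
    [1.5, 3.0, 0.2],
]
names = ["Feed ratio", "Reaction Temp", "Metal loading"]
ranking = mean_abs_shap(shap_matrix, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Absolute values are used because signed contributions can cancel: in this toy matrix the signed mean for temperature is zero even though it is the most influential feature.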
To further characterise interaction effects, SHAP interaction values were computed, with particular attention to the temperature–feed ratio and temperature–metal loading interactions. These interactions were visualised using complementary representations, including two-dimensional scatter plots, contour maps, three-dimensional surface plots, and triangulated surface (trisurf) plots. Consistency of observed trends across these visualisation modes was used as an internal robustness check.
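For a pair of features, the SHAP interaction value averages the non-additive part of the model response over all coalitions of the remaining features. The brute-force sketch below illustrates that definition on a hypothetical three-feature function; `toy_model`, the baseline, and the instance are assumptions made for illustration and do not represent the trained CatBoost models.

```python
from itertools import combinations
from math import factorial


def shap_interaction(predict, x, baseline, i, j):
    """Pairwise Shapley interaction value for features i and j (i != j)."""
    m = len(x)

    def value(coalition):
        # Features in the coalition take the instance value; the rest stay at baseline.
        mixed = [x[k] if k in coalition else baseline[k] for k in range(m)]
        return predict(mixed)

    others = [k for k in range(m) if k not in (i, j)]
    total = 0.0
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(m - size - 2) / (2 * factorial(m - 1))
        for subset in combinations(others, size):
            s = set(subset)
            # Second-order difference: joint effect minus the two separate effects.
            total += weight * (
                value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            )
    return total


def toy_model(z):
    """Hypothetical response with a temperature x feed-ratio interaction only."""
    feed, temp, ni = z
    return 1.0 * feed + 2.0 * temp + 0.5 * temp * feed + 0.0 * ni


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
feed_temp = shap_interaction(toy_model, x, baseline, 0, 1)
temp_ni = shap_interaction(toy_model, x, baseline, 1, 2)
print(feed_temp, temp_ni)
```

Each member of an interacting pair receives half of the joint effect, so the temperature–feed value here is half of the 0.5 product term, while the temperature–Ni value is zero because the toy function contains no such coupling.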
Instance-level attribution was examined using SHAP waterfall plots, which decompose individual predictions into additive feature contributions relative to the model baseline. This analysis was used to contextualise global and conditional trends under specific operating conditions, without implying universal operating optima.
All SHAP analyses are dataset-specific and analytical in nature. They are intended to examine how variable importance and interaction structure evolve across the experimental parameter space, rather than to infer new catalytic mechanisms or establish generalised DRM operating rules.
3. Results
Parity and validation plots are not the focus of the present study and have been extensively reported in the authors’ earlier DRM machine learning analyses [6,21]. In the current work, model validation serves only to confirm that predictive performance is sufficient to support SHAP-based interpretation, rather than to benchmark predictive superiority.
The SHAP results are presented with emphasis on conditional interaction patterns and relative-effect hierarchies, rather than rediscovery of established thermodynamic trends.
3.1. Overview of Key Findings
This section addresses a single overarching question: how do reaction temperature, CH4/CO2 feed ratio, and Ni loading jointly influence DRM performance within a data-limited literature dataset?
The primary finding is that reaction temperature is the dominant global driver across all outputs, but its influence is strongly conditional, with feed ratio and Ni loading exerting their effects mainly through interaction with temperature rather than as independent variables. This observation is consistent with the thermodynamic and kinetic characteristics of DRM, an endothermic reaction (ΔH° ≈ +247 kJ mol−1), where increasing temperature shifts the equilibrium toward syngas formation and enhances reaction rates [24,25].
Evidence for this hierarchy is presented across multiple complementary analyses. SHAP dependence and impact analyses (Section 3.2 and Section 3.6) establish the global dominance of temperature. Scatter and 3D visualisations (Section 3.3 and Section 3.4) reveal nonlinear modulation and regime-dependent behaviour. Instance-level SHAP waterfall plots (Section 3.5) confirm that deviations from average trends arise primarily from interaction terms rather than single-feature effects.
The following subsections present these analyses in detail, with each visualisation serving as supporting evidence for the same underlying variable hierarchy rather than as independent findings.
3.2. Global Variable Hierarchy from SHAP Analysis
While H2 and CO yields in DRM systems may be influenced by competing reactions such as the reverse water–gas shift, explicit attribution is not possible in the present study due to the absence of water or selectivity data in the source dataset. Accordingly, the analysis is restricted to observed variable-importance rankings and interaction patterns derived from SHAP-based interpretation.
Across all SHAP-based visualisations (including dependence, scatter, and surface representations), reaction temperature consistently emerges as the dominant contributor to CH4 conversion (Figure 1a,b), CO2 conversion (Figure A1a,b), H2 yield (Figure A2a,b), and CO yield (Figure A3a,b). The CH4/CO2 feed ratio exerts a secondary but consistently positive influence across all outputs, with its contribution strongly conditioned by temperature. This behaviour reflects thermodynamic limitations at low temperature, where insufficient energy restricts conversion irrespective of feed composition. Ni loading does not act as an independent primary driver; instead, its influence manifests predominantly through interaction effects at elevated temperatures.
Importantly, these variable hierarchies are reproduced consistently across multiple visualisation modalities, indicating that the identified trends reflect dataset-intrinsic structure rather than artefacts of a particular plotting approach. While temperature dominates globally, the principal analytical value of this section lies in demonstrating how the effects of feed ratio and Ni loading become condition-dependent across temperature regimes—behaviour not evident from global trend inspection alone.
3.2.1. CH4 Conversion
For CH4 conversion, SHAP dependence analysis indicates a consistently positive contribution from the CH4/CO2 feed ratio (Figure 1a,b). Reaction temperature is the dominant controlling variable, governing conversion across the entire operating space. At lower temperatures, limited thermodynamic driving force suppresses conversion irrespective of feed ratio or metal loading, consistent with established DRM kinetic and equilibrium limitations [6,24,25].
At higher temperatures, interaction effects between temperature, feed ratio, and Ni loading become increasingly influential, with elevated metal loading enhancing the positive contribution of temperature to CH4 conversion (Figure 1a,b). This indicates that catalyst loading becomes relevant only once kinetic limitations are overcome, consistent with catalytic reaction principles [6].
The observed interaction between feed ratio and temperature reflects the known influence of CO2 availability on carbon gasification reactions and syngas composition in DRM systems [24,25]. At higher temperatures, the increasing influence of the feed ratio highlights the role of CO2 in promoting carbon removal and controlling product distribution.
3.2.2. CO2 Conversion
CO2 conversion exhibits a positive dependence on the CH4/CO2 feed ratio, with the magnitude of this effect increasing at higher reaction temperatures (Figure A1a). Reaction temperature remains the dominant variable; however, its positive impact on CO2 conversion is strengthened under higher-Ni-loading conditions, indicating a temperature–metal interaction effect rather than an independent metal loading contribution (Figure A1b).
3.2.3. H2 Yield
For H2 yield, higher CH4/CO2 feed ratios are associated with positive SHAP contributions, with this effect amplified at elevated reaction temperatures (Figure A2a). Reaction temperature again dominates the response, while higher Ni loading enhances the positive temperature contribution, particularly in high-temperature regimes (Figure A2b). These patterns indicate that Ni loading modulates hydrogen production conditionally rather than exerting a uniform global influence.
3.2.4. CO Yield
CO yield shows a positive correlation with increasing CH4/CO2 feed ratio, an effect that becomes more pronounced at higher reaction temperatures (Figure A3a). Reaction temperature exhibits both positive and negative contributions depending on operating conditions, reflecting nonlinear system behaviour. Higher Ni loading enhances the positive temperature contribution to CO yield, again highlighting interaction-driven rather than primary effects of metal loading (Figure A3b).
3.3. Conditional Temperature Effects and Nonlinear Modulation
SHAP scatter plots are used to examine how reaction temperature contributes to model predictions for CH4 conversion, CO2 conversion, H2 yield, and CO yield, relative to the model baseline. Across all four outputs (Figure 2a–d), reaction temperature exhibits a strong, predominantly monotonic influence, with higher temperatures associated with increasingly positive SHAP contributions.
The vertical spread of SHAP values observed at fixed temperatures indicates that temperature effects are modulated by secondary variables rather than acting uniformly across the dataset. This dispersion reflects interaction-driven behaviour, primarily involving CH4/CO2 feed ratio and Ni loading, consistent with the interaction patterns identified in the dependence and surface analyses.
Notably, the scatter distributions reveal that temperature influence is nonlinear and regime-dependent. Certain temperature ranges exert a disproportionately strong effect on predicted performance, whereas other ranges show diminished sensitivity. These patterns confirm that while temperature is the dominant global driver of DRM performance, its local influence varies across the operating space.
Overall, the SHAP scatter plots reinforce the central finding that reaction temperature governs DRM outcomes, while simultaneously highlighting conditional modulation by feed ratio and metal loading. These observations are specific to the analysed dataset and provide complementary, instance-level evidence supporting the interaction hierarchies identified in preceding sections.
3.4. Interaction-Driven Regimes in Temperature–Feed Space
This section examines the joint influence of the reaction temperature and CH4/CO2 feed ratio on DRM performance using three-dimensional scatter, contour, and trisurf visualisations. These figures provide complementary perspectives on interaction-driven behaviour for CH4 conversion, CO2 conversion, H2 yield, and CO yield, enabling assessment of nonlinear trends and conditional sensitivities across the analysed parameter space.
3.4.1. 3D Scatter, Contour and Trisurf Plots of SHAP Values for CH4 Conversion
The 3D scatter plot (Figure 3a) shows that higher reaction temperatures are consistently associated with positive SHAP contributions to CH4 conversion, with point dispersion indicating secondary modulation by feed ratio. This granularity reveals variability that is not captured by smoothed representations.
The contour plot (Figure 3b) maps SHAP values continuously across the temperature–feed ratio domain, identifying a broad region where elevated temperatures exert a uniformly positive influence on CH4 conversion. The smooth gradients suggest gradual transitions rather than abrupt regime changes.
The trisurf plot (Figure 3c) highlights nonlinear interaction effects, with sharp surface gradients indicating that CH4 conversion benefits most when high temperatures coincide with elevated feed ratios. Together, Figure 3a–c confirm temperature as the dominant driver, while demonstrating that feed ratio primarily influences conversion through interaction effects rather than as an independent control variable.
3.4.2. 3D Scatter, Contour and Trisurf Plots of SHAP Values for CO2 Conversion
For CO2 conversion, the 3D scatter plot (Figure A4a) reveals a nonlinear response surface shaped predominantly by reaction temperature, with feed ratio influencing the magnitude of SHAP contributions under specific conditions. Data clustering at higher temperatures indicates enhanced conversion within restricted feed ratio ranges.
The contour plot (Figure A4b) identifies temperature-controlled regions where SHAP values transition from positive to neutral or negative, suggesting the presence of threshold behaviour rather than monotonic dependence.
The trisurf plot (Figure A4c) further illustrates localised reversals in SHAP values, reinforcing the interpretation that temperature governs CO2 conversion, while the feed ratio acts as a secondary modifier within defined temperature regimes.
3.4.3. 3D Scatter, Contour and Trisurf Plots of SHAP Values for H2 Yield
The 3D scatter plot (Figure A5a) shows that positive SHAP contributions to H2 yield are concentrated at higher reaction temperatures and feed ratios, confirming temperature dominance with conditional feed ratio enhancement.
The contour plot (Figure A5b) reveals nonlinear response regions, particularly at lower feed ratios, where suppressed SHAP values indicate unfavourable conditions for H2 formation.
The trisurf plot (Figure A5c) highlights synergistic temperature–feed interactions, with steep gradients at high temperatures and feed ratios indicating threshold effects that enhance H2 yield. These figures demonstrate that the feed ratio plays a more pronounced role for H2 yield than for conversion metrics, but primarily through interaction with temperature.
3.4.4. 3D Scatter, Contour and Trisurf Plots of SHAP Values for CO Yield
For CO yield, the 3D scatter plot (Figure A6a) illustrates a strong temperature dependence, while increased dispersion at lower feed ratios reflects added complexity in response behaviour.
The contour plot (Figure A6b) identifies regions of maximised CO yield at elevated temperatures and feed ratios, alongside zones of reduced SHAP contribution at intermediate conditions, indicating non-uniform sensitivity.
The trisurf plot (Figure A6c) shows sharp increases in SHAP values at high temperatures and feed ratios, confirming a distinctly nonlinear response surface consistent with temperature-driven CO formation under DRM conditions.
3.4.5. Summary
Across all performance metrics, reaction temperature consistently emerges as the dominant contributor to DRM behaviour, with the feed ratio exerting a secondary, condition-dependent influence. The combined use of 3D scatter (Figure 3a, Figure A4a, Figure A5a and Figure A6a), contour (Figure 3b, Figure A4b, Figure A5b and Figure A6b), and trisurf plots (Figure 3c, Figure A4c, Figure A5c and Figure A6c) provides a coherent depiction of nonlinear interactions and threshold effects. These visualisations reinforce the conclusion that secondary variables primarily shape outcomes through interaction with temperature rather than through independent global control.
3.5. Instance-Level Attribution of Deviations from Global Trends
Instance-level SHAP waterfall analyses are used to illustrate how individual input variables cause individual model predictions to deviate from the global trends identified in Section 3.2, Section 3.3 and Section 3.4, enabling assessment of how reaction temperature, CH4/CO2 feed ratio, and Ni loading combine to produce specific conversion and yield outcomes.
Across all target variables, reaction temperature consistently dominates the SHAP decompositions, contributing the largest positive shifts from the model baseline. This behaviour is observed for models trained using both original input features and polynomial feature expansions, confirming the robustness of temperature dominance within the analysed dataset.
Inclusion of polynomial features primarily affects the relative contribution of secondary variables. In particular, interaction terms involving reaction temperature and Ni loading become more prominent, indicating that catalyst loading influences predictions predominantly through temperature-conditioned effects rather than as an independent driver. These patterns reflect nonlinear dependencies present in the data without implying new mechanistic interpretations.
For CH4 conversion, waterfall plots based on original and polynomial features (Figure 4a,b) show reaction temperature as the primary contributor, with feed ratio and Ni loading providing smaller, instance-dependent adjustments. Similar behaviour is observed for CO2 conversion (Figure A7a,b), where temperature dominance is maintained while polynomial features highlight interaction effects with feed ratio.
For H2 yield (Figure A7c,d), reaction temperature again contributes most strongly to the predicted response, while polynomial representations reveal additional modulation by feed ratio and Ni loading under specific operating conditions. In the case of CO yield (Figure A7e,f), temperature remains influential, but the polynomial models indicate a more distributed contribution across variables, reflecting increased nonlinearity in the response surface.
Overall, the waterfall analyses demonstrate that polynomial feature expansion enhances representation of nonlinear interactions and conditional effects, while preserving the dominant role of reaction temperature across all DRM performance metrics. These findings reinforce the dataset-specific, interpretative nature of the analysis and its consistency with established temperature-driven behaviour in DRM systems.
A comparative summary of SHAP waterfall results across all outputs and feature representations is provided in Table 2. Waterfall plots are shown for models trained with original features (Figure 4a and Figure A7a,c,e) and polynomial features (Figure 4b and Figure A7b,d,f), with features ranked by their contribution to individual predictions.
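The logic behind a waterfall plot can be illustrated with a small, self-contained sketch: a single prediction is written as a baseline value plus one additive contribution per feature. The example computes exact Shapley values by enumerating feature subsets against a single reference point. The response function `toy_model`, its coefficients, and the two operating points are hypothetical and stand in for the fitted models of the study; the SHAP library itself typically averages over a background dataset and uses model-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    """Hypothetical response surface; NOT the fitted model from the study."""
    temperature, feed_ratio, ni_loading = x
    return 0.1 * temperature + 5.0 * feed_ratio + 0.001 * temperature * ni_loading


def shapley_values(model, x, reference):
    """Exact Shapley values of model(x) relative to model(reference)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their values from x, the rest from the reference.
        return model([x[i] if i in subset else reference[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


names = ["temperature", "feed_ratio", "ni_loading"]
x = [800.0, 1.0, 10.0]         # instance being explained (illustrative values)
reference = [700.0, 2.0, 5.0]  # baseline point (illustrative values)

base_value = toy_model(reference)
phi = shapley_values(toy_model, x, reference)

# Waterfall: start at the baseline and add contributions, largest magnitude first.
running = base_value
print(f"{'baseline':<12} {'':>8}    {running:8.3f}")
for name, contribution in sorted(zip(names, phi), key=lambda p: -abs(p[1])):
    running += contribution
    print(f"{name:<12} {contribution:+8.3f} -> {running:8.3f}")
```

The final running total equals the model output for the explained instance, which is the additivity property that makes waterfall plots readable as instance-level decompositions.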
3.6. Quantitative Impact Ranking and Interaction Strengths
SHAP impact analysis was used to quantify the relative influence of reaction temperature, CH4/CO2 feed ratio, and Ni loading, together with selected interaction terms, on CH4 conversion, CO2 conversion, H2 yield, and CO yield. The resulting SHAP impact values (Table 3) provide a comparative ranking of variable influence within the analysed dataset and are interpreted as descriptive indicators rather than operational recommendations.
Across all outputs, reaction temperature exhibits consistently high SHAP impact values, with temperatures around 800 °C corresponding to the strongest positive contributions. This confirms temperature as the dominant driver of conversion and yield under the reported experimental conditions. A CH4/CO2 feed ratio of 1.0 is likewise associated with elevated SHAP impacts, reflecting the strong sensitivity of DRM performance to feed composition. In contrast, the influence of Ni loading is more variable and output-dependent: it contributes more strongly to CH4 conversion, H2 yield, and CO yield, whereas for CO2 conversion its relative importance is greater at lower loadings.
Comparison of individual parameter impacts shows that feed ratio exerts the strongest influence on CH4 and CO2 conversions, with SHAP impact values of 4.62 and 4.14, respectively. Reaction temperature displays a similarly strong and uniform influence across all target variables, with SHAP impacts in the range of approximately 3.56–3.70. Metal loading has a more moderate effect overall, with CO yield exhibiting the lowest sensitivity to changes in catalyst loading (SHAP impact of 1.27).
Interaction effects further highlight the coupled nature of DRM operating variables. Temperature–feed interactions exhibit the highest SHAP impacts, particularly for CO2 conversion (6.58) and CH4 conversion (5.10), indicating strong conditional dependence between thermal conditions and feed composition. Temperature–metal interactions also contribute significantly, with SHAP values in the range of approximately 3.07–3.80, suggesting that the effect of catalyst loading is amplified at higher temperatures. In contrast, feed–metal interactions show lower SHAP impact values (approximately 1.90–2.29), indicating a secondary but non-negligible role.
Overall, the SHAP impact hierarchy identifies temperature–feed interactions as the dominant contributors to system behaviour within the analysed dataset, followed by temperature–metal interactions, with feed–metal interactions playing a lesser role. These trends are consistent with established temperature-driven behaviour in DRM systems and are presented as dataset-specific analytical observations rather than generalised operating guidelines.
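One way to rank pairwise interaction strengths is the Shapley interaction index, on which SHAP interaction values are based. The sketch below evaluates it by subset enumeration for a hypothetical response function. The function `toy_model`, its arbitrary coefficients (chosen only so that the three pairwise terms differ in size), and the two operating points are assumptions; how the SHAP impact values in Table 3 were actually computed is not specified here, so this is an illustration of the idea rather than a reproduction of that table.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    """Hypothetical response with three pairwise terms; NOT the study's model."""
    t, r, n = x
    return 0.1 * t + 5.0 * r + 0.05 * t * r + 0.001 * t * n + 0.05 * r * n


def interaction_index(model, x, reference, i, j):
    """Shapley interaction index for features i != j, relative to a reference point."""
    n = len(x)

    def value(subset):
        # Features in `subset` come from x, all others from the reference.
        return model([x[k] if k in subset else reference[k] for k in range(n)])

    others = [k for k in range(n) if k not in (i, j)]
    total = 0.0
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(n - size - 2) / (2 * factorial(n - 1))
        for subset in combinations(others, size):
            s = set(subset)
            # Joint effect of i and j beyond the sum of their separate effects.
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total


names = ["temperature", "feed_ratio", "ni_loading"]
x = [800.0, 1.0, 10.0]         # illustrative instance
reference = [700.0, 2.0, 5.0]  # illustrative baseline

# The index is split equally between (i, j) and (j, i), so the total pairwise
# strength is twice the absolute value of one entry.
strengths = {
    (names[i], names[j]): 2 * abs(interaction_index(toy_model, x, reference, i, j))
    for i, j in combinations(range(len(names)), 2)
}
ranking = sorted(strengths, key=strengths.get, reverse=True)

for pair in ranking:
    print(f"{pair[0]} x {pair[1]}: {strengths[pair]:.3f}")
```

For a single instance this yields a local ranking; a dataset-level ranking would average the absolute values over all instances.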
3.7. Methodological Contribution and Scope
Understanding the coupled behaviour of CH4 conversion, CO2 conversion, H2 yield, and CO yield in DRM requires analytical approaches that go beyond global model performance metrics. While conventional regression statistics (e.g., RMSE, R²) assess predictive accuracy, they do not reveal how individual variables and their interactions conditionally influence model outputs, particularly in data-limited experimental studies.
In this work, interpretable machine learning is used to extract dataset-specific variable hierarchies and interaction patterns from a previously published DRM dataset. By combining SHAP-based feature attribution with complementary visualisation approaches, the analysis quantifies how secondary variables (CH4/CO2 feed ratio and Ni loading) modulate performance conditionally across operating regimes, rather than globally. This integrated interpretation provides structured insight into nonlinear dependencies that are not readily accessible through simple correlation analysis or conventional regression models.
The contribution of this section is therefore methodological rather than mechanistic. No new reaction pathways or operating optima are proposed. Instead, the study demonstrates how interpretable ML can support transparent, non-prescriptive analysis of published catalytic datasets under realistic data constraints, complementing experimental and mechanistic DRM studies without replacing them.
4. Scope and Limitations
This study is restricted to variables reported in the source publication, and all limitations associated with dataset size, catalyst specificity, and absence of time-on-stream or deactivation metrics have been defined in the Introduction.
Robustness of SHAP Interpretations
Although the dataset is limited to 27 experimental runs and lacks mechanistic descriptors such as time-on-stream behaviour, surface area, metal dispersion, or coking indices, SHAP analysis remains meaningful within this constrained context. The SHAP interpretations presented here are explicitly conditional: they apply only within the reported temperature, feed ratio, and metal loading ranges of the published dataset and are not intended as universal mechanistic generalisations. SHAP stability was reinforced through four measures: (i) models were validated using 3-fold and 5-fold cross-validation and leave-one-out cross-validation (LOOCV) to reduce variance; (ii) SHAP patterns were confirmed to be consistent across CatBoost, Random Forest, and Gradient Boosting models; (iii) polynomial-feature models were used to capture nonlinear interactions; and (iv) SHAP dependence, scatter, surface, and waterfall plots all produced mutually consistent variable hierarchies. Together, these checks indicate that temperature dominance and secondary interaction effects are dataset-robust rather than artefacts of any single modelling approach.
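The three validation schemes in measure (i) differ only in how the 27 runs are partitioned. The following sketch generates the corresponding train/test index splits with the standard library. The shuffling seed and the round-robin fold assignment are assumptions made for illustration; the source does not state how folds were drawn.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return a list of (train_indices, test_indices) pairs for k-fold CV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Round-robin assignment keeps fold sizes within one sample of each other.
    folds = [indices[i::k] for i in range(k)]
    all_indices = set(indices)
    return [(sorted(all_indices - set(fold)), sorted(fold)) for fold in folds]


N_RUNS = 27  # dataset size reported in the text

# LOOCV is the special case k = number of samples.
schemes = {"3-fold": 3, "5-fold": 5, "LOOCV": N_RUNS}
splits = {name: k_fold_indices(N_RUNS, k) for name, k in schemes.items()}

for name, folds in splits.items():
    test_sizes = sorted({len(test) for _, test in folds})
    print(f"{name}: {len(folds)} folds, test-set sizes {test_sizes}")
```

With only 27 runs, each 5-fold test set holds five or six points, which is one reason to compare SHAP patterns across several schemes rather than relying on a single split.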
Accordingly, SHAP interpretations in this work should be understood as dataset-specific, conditional explanations that remain robust within the validated operating window of the experimental DRM dataset.
5. Conclusions
The contribution of this study lies in quantitatively resolving conditional and interaction effects using interpretable machine learning, rather than rediscovering established thermodynamic trends. Applied to a data-limited DRM dataset, interpretable machine learning provides structured, quantitative insight beyond the confirmation of global trends. While reaction temperature is confirmed as the dominant controlling variable, the most significant scientific outcome is the quantification of conditional, regime-dependent influences of CH4/CO2 feed ratio and Ni loading, which become non-negligible only within specific temperature ranges. These effects are not readily apparent from conventional regression metrics or qualitative data inspection.
From a methodological perspective, this work highlights that SHAP-based analysis remains informative even for small experimental datasets, provided that interpretations are treated as local and conditional rather than globally predictive. In particular, SHAP interaction values enable systematic ranking of secondary variables and their coupled effects, offering a transparent alternative to black-box modelling in catalysis research.
Future work will focus on extending this validated interpretability workflow to larger, multi-source DRM datasets drawn from the literature, where comparative analysis across catalyst formulations and operating windows may enable the identification of more generalisable design hypotheses and experimental priorities.