Application of Meta-Analysis and Machine Learning Methods to the Prediction of Methane Production from In Vitro Mixed Ruminal Micro-Organism Fermentation.

Simple Summary
In vitro gas production systems are regularly utilized to screen feed ingredients for inclusion in ruminant diets. However, not all in vitro systems are set up to measure methane (CH4) production, nor do all papers report in vitro CH4. Therefore, the objective of this study was to develop models to predict in vitro production of CH4, a greenhouse gas produced by ruminants, from in vitro gas and volatile fatty acid (VFA) production data, and to identify the major drivers of CH4 production in these systems. Meta-analysis and machine learning (ML) methodologies were applied to predict CH4 production from in vitro gas parameters. Meta-analysis results indicate that equations containing apparent dry matter (DM) digestibility, total VFA production, propionate, valerate and feed type (forage vs. concentrate) resulted in best prediction of CH4. The ML models far exceeded the predictability achieved using meta-analysis, but further evaluation on an external database would be required to assess their generalization capacity. The models developed can be utilized to estimate CH4 emissions in vitro.

Abstract
In vitro gas production systems are utilized to screen feed ingredients for inclusion in ruminant diets. However, not all in vitro systems are set up to measure methane (CH4) production, nor do all publications report in vitro CH4. Therefore, the objective of this study was to develop models to predict in vitro CH4 production from total gas and volatile fatty acid (VFA) production data and to identify the major drivers of CH4 production in these systems. Meta-analysis and machine learning (ML) methodologies were applied to a database of 354 data points from 11 studies to predict CH4 production from total gas production, apparent DM digestibility (DMD), final pH, feed type (forage or concentrate), and acetate, propionate, butyrate and valerate production. Model evaluation was performed on an internal dataset of 107 data points. Meta-analysis results indicate that equations containing DMD, total VFA production, propionate, feed type and valerate resulted in best predictability of CH4 on the internal evaluation dataset. The ML models far exceeded the predictability achieved using meta-analysis, but further evaluation on an external database would be required to assess generalization ability on unrelated data. Between the ML methodologies assessed, artificial neural networks and support vector regression resulted in very similar predictability, but differed in fitting, as assessed by behaviour analysis. The models developed can be utilized to estimate CH4 emissions in vitro.


Introduction
Globally, greenhouse gas (GHG) emissions from the agriculture, forestry and other land use (AFOLU) sector account for ~23% of the global anthropogenic GHG total emissions [1], with enteric methane (CH4) from fermentation in the forestomach of ruminants representing 32%-40% of that total [1] (thereby 7.4%-9.2% of the global anthropogenic total). From the farmer's perspective, CH4 also represents an energy loss and an inefficiency of production, ranging from approximately 3.0 (feedlot cattle) to 7.0 (forage fed cattle) percent of gross energy intake, with a ±20% uncertainty [2]. As a result, and to meet public expectation for sustainably produced food products, the agriculture sector has mobilized to examine a large array of potential CH4 (as well as N and P excretion) mitigation strategies [3,4,5], to reduce the environmental impact of livestock and food production.
At the animal level, CH4 is produced as a byproduct of anaerobic fermentation in the rumen and hindgut of ruminants, whereby methanogens utilize H2 to obtain ATP by reducing CO2 to CH4 [6]. The removal of H2 through methanogenesis, the main H-sink in the rumen [6], prevents the inhibitory effect of H2 on ruminal fermentation and allows for the degradation and fermentation of feed to proceed. When methanogenesis is reduced, other pathways must be promoted to utilize H2 or otherwise fermentation, digestibility and intake may be negatively affected [6].
As animal experiments to evaluate feedstuffs and feed additives are costly, time consuming and do not guarantee conclusive outcomes, the in vitro gas production technique represents a viable option for prescreening or screening of feedstuffs/additives for potential inclusion in the ration of modern dairy cows, beef cattle and other ruminants. However, CH4 is often, but not always, included in the gases measured during in vitro incubation (particularly in developing countries where equipment may be unaffordable, unavailable or limited, for example). A reliable measure of CH4 from in vitro cultures of mixed ruminal micro-organisms would be a useful tool to assess the potential dietary effects on methanogenesis. Estimation of CH4 from the output of other fermentation end-products commonly measured in vitro could be a suitable alternative, and Jayanegara [7] proposed the use of the stoichiometric equations of Hegarty and Nolan [8] and of Moss et al. [9] to predict in vitro CH4. However, using this approach CH4 was generally overpredicted, presumably because in vitro H2 recovery observed in practice was substantially less than that assumed by the stoichiometric models. The objectives of this study were therefore: (1) to develop empirical models to predict in vitro CH4 production from in vitro gas production measures, via meta-analysis (multiple linear regression) and machine learning (ML) methods (artificial neural networks, ANN, and support vector regression, SVR), and (2) to identify the fermentation parameters most closely related to CH4 production in vitro.

Database
The database compiled for this study consisted of 397 in vitro rumen fermentation bottle means (each the average of 3-5 replicate measurements), taken after 24 h of incubation, from 13 experiments reported in 10 publications [10-19] (experiments 1-3 were from publication [10]), plus 1 unpublished study [20]. As a result, experimental animals were not directly employed in this study. In accordance with the National Centre for the Replacement Refinement and Reduction of Animals in Research (NC3Rs), per Directive 2010/63/EU, all study data used were publicly available (with the exception of the one unpublished study) as reported in the aforementioned articles. Studies evaluated the in vitro gas and CH4 production from oven-dried feedstuffs, including ryegrass, forbs, grass silages, clover, maize silage and other whole-crop cereal silages and concentrate feeds (no feed additives or rumen modifiers were included in the database). Feed type (FT) was categorized as either forage (FT = 1) or concentrate (FT = 2). The database included in vitro measurements of CH4 gas production (CH4i, mL/g DM incubated, and CH4d, mL/g DM apparently digested), total gas production (TGP, mL/g DM incubated), apparent DM digestibility (DMD, g/g), volatile fatty acid production (VFA, mmol/g DM incubated), molar proportions of acetic acid (AC, mmol/mol VFA), propionic acid (PR, mmol/mol VFA), butyric acid (BT, mmol/mol VFA) and valeric acid (VL, mmol/mol VFA), the acetate to propionate ratio (C2C3) and the incubation medium final pH (pH). Daily production of each volatile fatty acid (ACp, PRp, BTp or VLp for mmol AC, PR, BT or VL produced per g DM incubated, respectively) was calculated from total VFA and the corresponding molar proportions. Variable abbreviations, units and descriptions are also summarized in Appendix A (Table A1).
When digestibility is measured during in vitro batch cultures of mixed ruminal micro-organisms, it is assumed that DM disappearance after the incubation time (in this particular case 24 h) is an acceptable metric of apparent DM digestibility. Due to missing data, two studies [11,12] were removed from the database, leaving 354 observations from 11 experiments.
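The derivation of ACp, PRp, BTp and VLp from total VFA and the molar proportions can be illustrated with a short sketch. The function name and the input values below are hypothetical and chosen only to show the unit conversion (mmol/mol VFA divided by 1000 gives a molar fraction); they are not taken from the database.

```python
def vfa_production(total_vfa, molar_proportions):
    """Return production of each acid (mmol/g DM incubated).

    total_vfa: total VFA production, mmol/g DM incubated.
    molar_proportions: dict of acid -> molar proportion, mmol/mol VFA.
    """
    return {
        f"{acid}p": total_vfa * proportion / 1000.0
        for acid, proportion in molar_proportions.items()
    }


# Hypothetical bottle mean, for illustration only.
proportions = {"AC": 650.0, "PR": 200.0, "BT": 120.0, "VL": 30.0}
production = vfa_production(8.0, proportions)

# Acetate to propionate ratio (C2C3) from the same molar proportions.
c2c3 = proportions["AC"] / proportions["PR"]

print(production)
print(c2c3)
```

With these illustrative inputs, the individual productions sum to the share of total VFA covered by the four acids listed.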
For model development and evaluation purposes, the dataset (n = 354) was divided into two subsets, the first one for training and model development purposes (70% of data, n = 247, with 4 outlier data points removed for meta-analysis), and the second one for model testing and evaluation (internal evaluation) purposes (30% of data, n = 107). Aside from 4 data points which were removed for the meta-analysis (statistical outliers), the 'two' developmental datasets were identical. Division of data points into the training or evaluation datasets was via random assignment, but each contained a proportional number of observations relative to the FT variable. Descriptive statistics for the training and evaluation datasets are provided in Table 1. Independent testing of the model gives a measure of the model's 'generalization ability' ('test error'), or the ability to make predictions on unseen data. This is particularly important for some ML approaches, which may achieve very accurate predictions, but essentially model the noise in the data.
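A random 70/30 division that keeps the forage and concentrate proportions similar in both subsets could be sketched as follows. This is a minimal illustration, not the procedure actually used; the record layout, the seed and the number of forage versus concentrate observations are assumptions made for the example.

```python
import random


def stratified_split(records, key, train_fraction=0.7, seed=0):
    """Randomly split records so each level of `key` keeps ~train_fraction in training."""
    rng = random.Random(seed)
    groups = {}
    for record in records:
        groups.setdefault(record[key], []).append(record)

    train, test = [], []
    for level in sorted(groups):
        members = groups[level][:]
        rng.shuffle(members)  # random assignment within each feed type
        n_train = round(train_fraction * len(members))
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return train, test


# Hypothetical database: 354 observations, FT = 1 (forage) or 2 (concentrate).
# The 250/104 split between feed types is invented for the example.
records = [{"id": i, "FT": 1 if i < 250 else 2} for i in range(354)]
train, test = stratified_split(records, key="FT")

print(len(train), len(test))
```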

Model Fitting-Meta-Analysis
The main effects of in vitro fermentation variables (TGP, DMD, VFA, AC, ACp, PR, PRp, BT, BTp, VL, VLp, C2C3 and final pH) were analyzed for inclusion in predictive models using the PROC MIXED procedure of SAS [21] to predict CH4i (mL/g incubated DM), or CH4d (mL/g DM apparently digested). Equations were fitted to the training dataset (Table 1, meta-analysis).
The mixed model analysis was chosen because the data were compiled from multiple studies, and thus the experiment was considered as a random effect [22]. If, when running the model, the random covariance or the random slope was not significant, they were removed from the model or simplified [22], though the random intercept term was always retained. The dual quasi-Newton technique was used for optimization with an adaptive Gaussian quadrature as the integration method. Normal distribution of the random study effect was assessed via Q-Q distribution plot, and normality of residuals via examination of the residual plots (PROC MIXED).
Three approaches were taken to fitting mixed models to this dataset: (1) univariate analysis of each dependent-independent variable combination (explanatory variable in linear, quadratic or cubic form); (2) multivariate analysis, preceded by examination in PROC REG (MaxR) and assessment for collinearity between driving variables in PROC CORR/visual plotting; and (3) multivariate analysis based on known biological principles. Approaches (2) and (3) are not distinguished/presented separately in the results, as both are considered 'multivariate'. A fourth approach was also included for comparison with the ML models (described below): (4) where all driving variables were included, irrespective of significance or collinearity (linear equation form). With the exception of approach (4), only equations with significant slope parameters (p < 0.05) and normally distributed residuals/random effects were retained and evaluated.

Model Fitting-Machine Learning
The in vitro fermentation variables (TGP, DMD, VFA, AC, PR, BT, VL, C2C3 and final pH) were retained as potential driving variables for development of ML-based predictive models for CH4i and CH4d. Predictive models were fitted on the training dataset (Table 1, machine learning). The raw dataset was subjected to a preprocessing normalization process (standard scaler) [23] according to:

$$Z = \frac{X - u}{S}$$

where Z is the normalized value, X is the raw value, u is the mean of the training samples and S is the standard deviation of the training samples. The objective of this normalization step was to improve the convergence of the training process in the regression methods utilized [24]. Subsequently, two ML techniques (support vector regression and artificial neural network) were implemented using the Scikit-learn software library [25] for the Python programming language [26]. For both ML approaches, a 10-fold cross-validation procedure was used to fit the predictive models to the training dataset (n = 247; the evaluation database, n = 107, was therefore not included in this analysis). The training dataset was randomly split into 10 subgroups of approximately equal size, and the model was trained using nine of the subsets and validated on the remaining subset to compute a performance measure. This holdout process was repeated for each of the 10 folds, such that each subset was utilized in turn for validation while the other nine subsets were pooled for training. The error estimation was averaged over the 10 iterations to assess the fit performance.
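The normalization and the 10-fold partition described above can be sketched in plain Python. This is an illustrative reimplementation rather than the Scikit-learn code used in the study; the helper names, the seed and the example TGP values are hypothetical, and the population standard deviation is assumed because that is what Scikit-learn's StandardScaler computes.

```python
import random
from statistics import mean, pstdev


def fit_scaler(column):
    """Return (u, S): mean and population SD of the training samples."""
    return mean(column), pstdev(column)


def transform(column, u, s):
    """Apply Z = (X - u) / S using statistics fitted on the training data."""
    return [(x - u) / s for x in column]


def k_fold_indices(n, k=10, seed=0):
    """Yield (train_idx, val_idx) for k folds of approximately equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, folds[i]


# Hypothetical TGP values (mL/g DM incubated), for illustration only.
tgp_train = [100.0, 150.0, 200.0]
u, s = fit_scaler(tgp_train)
z = transform(tgp_train, u, s)

# Each of the 247 training observations is used for validation exactly once.
folds = list(k_fold_indices(247, k=10))
print(z)
print([len(val) for _, val in folds])
```

Evaluation data would be transformed with the same u and S fitted on the training samples, so that no information from the evaluation subset enters the scaling.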

Support Vector Regression
Support Vector Machine (SVM) is a ML technique based on supervised learning with a modality oriented for regression problems, namely Support Vector Regression (SVR), able to forecast continuous variables [27] (in this case, CH4 from in vitro cultures of mixed ruminal micro-organisms). The SVR method transforms the input data (previously normalized) into a multidimensional space by using nonlinear mapping, and a linear regression procedure is applied to each hyperplane obtained to calculate the desired output. The SVR method is developed by changing the kernel function and tuning the parameters C (the regularization parameter), γ (the kernel coefficient), Tol (tolerance for stopping criterion) and degree of the polynomial. Three kernel functions were considered: linear, radial basis function and polynomial. The ranges of values used for the parameter optimization were C ∈ {1, 10, 100, 1000}, Tol ∈ {0.1, 0.01}, γ ∈ {1, 0.1, 0.01, 0.001} and degree (only for the polynomial function) ∈ {2, 3, 4, 5, 6}. Grid search combined with cross-validation [27] was used to achieve the best combination of parameters resulting in the optimal and most robust SVR model solution, on the basis of the ε-insensitive loss function. The best SVR models for both variables to be predicted (CH4i or CH4d) were obtained using the radial basis function as kernel, with the parameter values C = 1000, Tol = 0.1 and γ = 1.
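A grid search of this kind can be sketched with Scikit-learn as below. The parameter values in `paper_grid` are those listed above; everything else is an assumption made for the example, including the synthetic data, the scoring metric, the random seeds and the decision to run only the radial basis function part of the grid so that the sketch finishes quickly.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

C = [1, 10, 100, 1000]
TOL = [0.1, 0.01]
GAMMA = [1, 0.1, 0.01, 0.001]

# Search space as described in the text (kernel-specific parameters kept separate).
paper_grid = [
    {"svr__kernel": ["linear"], "svr__C": C, "svr__tol": TOL},
    {"svr__kernel": ["rbf"], "svr__C": C, "svr__tol": TOL, "svr__gamma": GAMMA},
    {"svr__kernel": ["poly"], "svr__C": C, "svr__tol": TOL,
     "svr__gamma": GAMMA, "svr__degree": [2, 3, 4, 5, 6]},
]
demo_grid = paper_grid[1]  # assumption: RBF subset only, to keep the demo fast

# Synthetic stand-in for the training data (not the study database).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 4))
y = 10 + 3 * X[:, 0] + np.sin(2 * X[:, 1]) + rng.normal(scale=0.3, size=120)

# Scaling sits inside the pipeline so it is refitted on each training fold.
pipeline = make_pipeline(StandardScaler(), SVR())
search = GridSearchCV(
    pipeline,
    demo_grid,
    scoring="neg_mean_squared_error",  # assumed metric, not stated in the text
    cv=KFold(n_splits=10, shuffle=True, random_state=0),
)
search.fit(X, y)
print(search.best_params_)
```

Passing `paper_grid` instead of `demo_grid` would explore all three kernels, at a considerably higher computational cost for the polynomial kernel.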

Artificial Neural Network-Multilayer Perceptron
A multilayer perceptron (MLP) is a machine learning method based on supervised learning, and is a specific topology of a feedforward artificial neural network (ANN) [28]. The MLP network used for the current study was composed of three layers of nodes: the input layer, one hidden layer and an output layer. Achieving the optimal MLP architecture can require tuning a number of hyperparameters such as the number of hidden layers, neurons or iterations. For the current study, one hidden layer was applied and the rectified linear unit (ReLU) nonlinear activation function was implemented in each node (neuron) of this hidden layer (except the input nodes). On the other hand, the single neuron of the output layer utilized the linear activation function [28]. The training procedure was based on the backpropagation technique, using grid search combined with cross-validation [28] to derive the best combination of parameters resulting in the optimal and most robust MLP model solution.
The square-error loss function and the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) numerical method were used for optimization. The number of hidden neurons in the best MLP models was 26 and 27 for CH4i and CH4d predictions, respectively (the range tested was from 12 to 30 neurons in the hidden layer). Other hyperparameters were included in the grid for tuning, aiming to optimize the training of the ANN. The best results were observed when early stopping was activated (to prevent overfitting), any prior attributes stored on the estimator were cleared ("warm start" disabled), initial learning rate was set at 10^-7 and kept constant, and the batch size for each iteration was equal to 1.
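A one-hidden-layer MLP tuned by grid search could be set up in Scikit-learn roughly as follows. The sketch is an assumption-laden illustration: the data are synthetic, only three hidden-layer sizes from the 12 to 30 range are tried, and the scoring metric, iteration limit and seeds are invented. The early stopping, learning rate and batch size settings are omitted because, per the Scikit-learn documentation, they only take effect with the `sgd` and `adam` solvers and not with `lbfgs`.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training data (not the study database).
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 4))
y = 10 + 3 * X[:, 0] - 2 * np.abs(X[:, 1]) + rng.normal(scale=0.3, size=120)

# One hidden layer with ReLU; MLPRegressor uses an identity (linear) output unit
# and a squared-error loss.
mlp = MLPRegressor(activation="relu", solver="lbfgs",
                   warm_start=False, max_iter=500, random_state=0)
pipeline = make_pipeline(StandardScaler(), mlp)

# Assumption: a coarse subset of the 12-30 neuron range, to keep the demo fast.
hidden_sizes = [(12,), (21,), (30,)]
search = GridSearchCV(
    pipeline,
    {"mlpregressor__hidden_layer_sizes": hidden_sizes},
    scoring="neg_mean_squared_error",  # assumed metric
    cv=KFold(n_splits=10, shuffle=True, random_state=0),
)
search.fit(X, y)
print(search.best_params_)
```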

Model Evaluation
Model predictions developed in the current study (via meta-analysis, SVR or ANN) were evaluated using an independent data subset (internal evaluation, as the data are independent but related to the training dataset), described in Section 2.1 and in Table 1. Models were evaluated for their predictability using mean square prediction error (MSPE), calculated as:

$$\mathrm{MSPE} = \frac{1}{n}\sum_{i=1}^{n}\left(O_i - P_i\right)^2$$

where $O_i$ is the observed value, $P_i$ is the predicted value and n is the number of observations. Square root of the MSPE (RMSPE), expressed as a proportion of the observed mean (RMSPE, %), gives an estimate of the overall prediction error. The MSPE was decomposed into random (disturbance) error (ED), error due to deviation of the regression slope from unity (ER), and error in central tendency due to overall bias (EB) [29]. The EB, ER and ED fractions of MSPE were calculated as:

$$\mathrm{EB} = \left(\bar{P} - \bar{O}\right)^2, \qquad \mathrm{ER} = \left(S_p - R \times S_o\right)^2, \qquad \mathrm{ED} = \left(1 - R^2\right) \times S_o^2$$

where $\bar{P}$ and $\bar{O}$ are the predicted and observed means, $S_p$ is the standard deviation of predicted values, $S_o$ is the standard deviation of observed values and R is the Pearson correlation coefficient. Correspondence between predicted and observed values was also assessed by the concordance correlation coefficient (CCC) [30], which was calculated as:

$$\mathrm{CCC} = R \times C_b$$

where $C_b$ is a bias correction factor (a measure of accuracy), and R is the Pearson correlation coefficient (a measure of precision). The $C_b$ variable is calculated as:

$$C_b = \frac{2}{\nu + 1/\nu + \mu^2}, \qquad \nu = \frac{S_o}{S_p}, \qquad \mu = \frac{\bar{O} - \bar{P}}{\sqrt{S_o \times S_p}}$$

so that ν provides a measure of scale shift, while µ provides a measure of location shift. The ν value indicates the change in standard deviation, if any, between predicted and observed values. A positive µ value indicates underprediction, while a negative µ indicates overprediction. Predictions were further evaluated visually against observations (via predicted vs. observed plots) as well as against residuals (residual vs. predicted, not shown).
As one criticism of many ML methodologies remains their lack of transparency (i.e., no predictive equation is produced), models developed via ANN and SVR were further evaluated using behaviour analysis, where model inputs were systematically altered ±10% (in isolation) and the model's 'behavioural' response (% change in output prediction) was assessed (direction and magnitude).
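The behaviour analysis amounts to a one-at-a-time perturbation of each input around a baseline, which can be sketched as follows. The function names, the baseline values and the toy prediction model (quadratic in pH) are hypothetical and serve only to show how a nonlinear response can move in the same direction for both +10% and −10% changes; a fitted ANN or SVR predictor would be substituted for `toy_model`.

```python
def behaviour_analysis(predict, baseline, inputs, change=0.10):
    """Percent change in prediction when each input is varied in isolation."""
    base = predict(baseline)
    results = {}
    for name in inputs:
        row = {}
        for label, factor in (("up", 1 + change), ("down", 1 - change)):
            perturbed = dict(baseline)  # all other inputs stay at baseline
            perturbed[name] = baseline[name] * factor
            row[label] = 100 * (predict(perturbed) - base) / base
        results[name] = row
    return results


def toy_model(x):
    """Hypothetical stand-in for a fitted ML model (not from the study)."""
    return 0.15 * x["TGP"] - 20.0 * (x["pH"] - 6.5) ** 2


baseline = {"TGP": 200.0, "pH": 6.5, "FT": 1}
# FT is categorical, so it is held fixed rather than perturbed.
response = behaviour_analysis(toy_model, baseline, inputs=["TGP", "pH"])
print(response)
```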

Correlation Matrix Analysis
Potential X variables were evaluated against each other via correlation matrix analysis, to determine the extent of collinearity between X variables (Table 2). X variables that were highly collinear with each other (correlation >0.500) are highlighted in grey (Table 2). X variables that were highly collinear included TGP and DMD, TGP and VFA/ACp/PRp, VFA and ACp/PRp/BTp, AC and PR/PRp, ACp and PRp/BTp, PR and PRp, PRp and BTp, BT and BTp, pH and BTp. These combinations were therefore avoided in multivariate meta-analysis equation development.

Univariate Meta-Analysis Models
Seventy-eight univariate equations to predict CH4d or CH4i were developed and evaluated with the variables presented in Table 2, in linear, quadratic or cubic form. Those with nonsignificant slope parameters or model fitting (fixed or random) problems were discarded, and the remaining equations (n = 22) were assessed on the evaluation dataset. On average, the CH4i outcome was predicted with higher CCC and lower RMSPE values compared to the CH4d outcome (Table 3). The best six performing equations have their model evaluation results presented in Table 3. The best performing univariate equations included the X variables ACp, PRp, DMD, VLp, VFA and TGP. The best performing univariate equations were those predicting CH4i with TGP as a driving X variable, with a CCC on the evaluation database of 0.644 (quadratic) and 0.650 (linear) (Table 3).
The results of univariate equation development (Table 3) agreed roughly with the correlation analysis (Table 2), where the variables most highly correlated with CH4i (TGP, VFA, ACp, VLp, DMD) and CH4d (DMD, AC, C2C3, PR) (Table 2) appeared in the best performing univariate equations (Table 3). Some differences were evident, for example in the R-values, which may be explained by the difference in approach (correlation across all data points vs. correlation within study). The best performing univariate equations (U6, U12) were as follows:
CH4i (mL CH4/g DM incubated) = 3.00 (± 1.546) + 0.149 (± 0.005) × TGP (mL/g DM incubated) (U12) (11)
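As a worked illustration of Equation U12, the point estimates of the intercept and slope can be applied to a total gas production value. The function name and the TGP input of 200 mL/g DM incubated are hypothetical; the coefficients are those reported above, and the parameter standard errors are ignored in this sketch.

```python
def predict_ch4i_u12(tgp):
    """CH4i (mL/g DM incubated) from TGP (mL/g DM incubated), Equation U12."""
    return 3.00 + 0.149 * tgp


# Hypothetical TGP value, for illustration only: 3.00 + 0.149 * 200 is about 32.8.
ch4i = predict_ch4i_u12(200.0)
print(round(ch4i, 1))
```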

Multivariate Meta-Analysis Models
Seventy-two multivariate equations, to predict CH4d or CH4i, were developed and evaluated with the variables presented in Table 2, in linear combinations. Those with nonsignificant slope parameters, model fitting problems (fixed or random) or multiple X variables previously deemed to be collinear (Table 2) were discarded, and the remaining equations (n = 29) were evaluated on the evaluation dataset. Evaluation of the top six performing multivariate equations (for each of CH4d and CH4i) is reported in Table 4.
The overall best performing equations (from univariate or multivariate origin, CH4d, CH4i) are presented in Table 5, and their predicted vs. observed plots are illustrated in Figure 1.

Support Vector Regression and Artificial Neural Network Models
Evaluation of the SVR and ANN models developed is presented in Table 6. Both SVR and ANN models demonstrated high predictability on the test dataset, with CCC values >0.90 for both CH4d and CH4i. For comparison purposes, meta-analysis equations METd and METi were also developed, via meta-analysis, but included all X variables (in linear form, regardless of significance). The CCC values for these equations were 0.645 and 0.734, respectively (Table 6), indicating that the SVR and MLP models must consider a complex multiple-nonlinear response surface between the X variables and Y variables, in order to achieve substantially higher CCC values. The predicted vs. observed plots for these models are illustrated in Figure 2.

Table 6 notes: X are the explanatory variables (all variables included in this analysis for comparison purposes). Mean = mean of predicted values; SEM = standard error of the mean of predicted values; RMSPE = root mean square prediction error expressed as a percentage of the observed mean; EB, ER and ED = error due to bias, regression and disturbance, respectively (all as % of total MSPE); CCC = concordance correlation coefficient; R = Pearson correlation coefficient (measure of precision); C_b = bias correction factor (measure of accuracy).

Behaviour Analysis-Machine Learning Models
Unlike the meta-analysis method that results in a predictive equation, the ML methods SVR and ANN do not have the same degree of transparency. To understand the causal pathways to obtain the predictive result, behaviour analysis was performed (Table 7) by systematically changing the inputs in isolation and determining the degree of change in the output prediction. This was performed at +10% and −10% to determine direction of change in the response variable.
Results show (Table 7) that the models ANN_2i and SVR_1i (predicting CH4i) were highly sensitive to the X variables pH and TGP, to varying extents (dependent on the model and FT). Secondary to these variables, the CH4i predictions were sensitive to AC, PR, BT and DMD. Each model (ANN, SVR) demonstrated different sensitivity to these driving variables, and the sensitivity differed between the FT 1 (forage) and FT 2 (concentrate) substrates (Table 7).
Some responses had different directional effects in the different models. For example, increasing pH increased CH4i by 14% in ANN_2i, but decreased it by 6% in SVR_1i (FT = 1) (and similarly for FT = 2, pH increased CH4i in ANN_2i by 4% and by 30% in SVR_1i); increasing BT did not change CH4i in ANN_2i, but decreased CH4i by 6% in SVR_1i (FT = 1); and increasing AC reduced CH4i by 3% in ANN_2i but increased CH4i by 7% in SVR_1i (FT = 2). Similar results were found for CH4d predictions, where, for example, increasing pH decreased CH4d with ANN_2d by 14%, but increased it by 9% with SVR_1d (FT = 1), and for FT = 2, raising pH increased CH4d by 11% with ANN_2d, and by 37% with SVR_1d.
Some behaviour responses within the ML methods were also directionally different between FT. For example, CH4i (ANN_2i) increased as pH was increased (+14%, FT = 1), but also increased when pH was decreased (+36%), indicating a nonlinear/polynomial response surface. This is in contrast to when FT = 2, where increasing pH increased CH4i by 4%, and decreasing pH decreased CH4i by 24% (Table 7). For CH4i (SVR_1i, FT = 1), increasing pH decreased CH4i by 6%, while decreasing pH increased CH4i by 5%. This is in contrast to when FT = 2, where increasing pH increased CH4i by 30%, and decreasing it decreased CH4i by 39%. Similar directional differences were observed for CH4d (FT = 1 vs. 2).

Discussion
To predict CH4 (in vivo or in vitro) based on stoichiometry principles alone and considering a H recovery of 100%, Hegarty and Nolan [8] proposed stoichiometric equations for the H released during VFA formation. The resulting H is utilized by methanogens to reduce CO2 to CH4 (CO2 + 8H → CH4 + 2H2O). However, predicting CH4 from the above stoichiometric equations is only valid if (1) these VFAs are the only end-products of fermentation, (2) no free H2 accumulates or escapes, (3) the microbial digestion process is strictly anaerobic, and (4) H2 is not used in other reactions (e.g., reduction of sulphates to sulphides, or saturation of double bonds in fatty acids) [8]. In practice, the production of CH4 will be less than the stoichiometry prediction given by the above equations, because these assumptions are generally not held. Jayanegara et al. [7] used both stoichiometric equations to predict CH4 from VFA concentrations in vitro, and found that indeed, the equations overpredicted CH4, likely due to a much lower observed H recovery (observed range of 28.9% to 56.2%) compared to the recoveries assumed by the models (100% and 90%). In agreement with Jayanegara et al. [7], when the stoichiometric equations [8,9] were applied to the current test dataset, CH4i was overpredicted (observed CH4i (mmol/L) = 11.6 ± 2.44; using [8], predicted CH4i = 16.2 ± 3.07; using [9], predicted CH4i = 14.0 ± 2.68) and had poor CCC evaluation statistics (0.135 and 0.227 for [8,9], respectively). For the test dataset, the average H2 recovery, calculated according to [31], was 80%, a value that is lower than the theoretical recovery rates [8,9], and also different from those observed by Jayanegara et al. [7], indicating the potential value of an empirical approach, such as those developed in our work.
The objective of the current study was to utilize meta-analysis and ML methodologies to predict CH4 emissions from in vitro gas and VFA production data. Results of this work found that via meta-analysis, the best predictive equations of in vitro CH4d included the variables − DMD + VFA, or − DMD + VFA − PR − FT − VL (Equations M6 and M5 respectively, Table 5), while the best predictive equations of in vitro CH4i included the variables + VFA − FT, or − PR + VL + TGP (Equations M11 and M12, respectively, Table 5). The significant positive sign on VL in Equation M12 is concerning, as stoichiometrically, the production of VL utilizes H and therefore is associated with a lower CH4 emission. This illustrates a limitation of empirical modelling (whether it be meta-analysis or ML), that the resulting equations strive to find the best statistical relationship to the data, regardless of biological principles. It is possible this could be related to the relatively small contribution to total VFA made by VL, or correlation with specific feed ingredient properties.
The best performing univariate equations (U6 (CH4d), U12 (CH4i)) were based on DMD and TGP, respectively. The correlation between CH4d and TGP was low (Table 2), indicating that the DMD correction to CH4i (CH4d) accounted for much of the strong relationship between CH4i and TGP. Such simple regressions may be used when VFA data are not reported, but would miss considerable variance explained by defining the type of VFA being produced (see multivariate equations).
The models produced via ML methodologies, ANN and SVR, have much higher predictability (CCC, RMSPE analysis) of the CH4i and CH4d outcomes compared to the meta-analysis models. This was a result of the meta-analysis models being limited to including only significant X variables (p < 0.05), while the ML methodologies have no such limitation. As well, the ML methodologies mapped more complex response surfaces between multiple X variables and the Y variable, based on linear, radial or polynomial shapes. While this resulted in a greatly improved prediction on related (internal evaluation) data (Table 6), it may end up fitting noise or other unrelated data characteristics in the training dataset, resulting in a diminished predictability on unrelated data (external evaluation). Such an external evaluation would be a required next step to test the generalization ability of such ML models, in particular considering the relatively small size of the training dataset and the data-hungry nature of ML models.
Unsurprisingly, in both meta-analysis and ML models TGP was a significant driving variable, as an indicator of the overall extent of fermentation occurring in vitro. Directionally, the meta-analysis and ML methods agree, whereby increasing TGP increases CH4i and CH4d. The variable DMD was particularly relevant with the CH4d models, where increasing DMD resulted in a lower CH4d (Tables 5 and 7).
Interestingly, while pH did not appear in many highly significant meta-analysis equations (Tables 3 and 4), it did appear to have a strong presence in the ML models, as illustrated by the behaviour analysis (Table 7). According to the pH dependent VFA stoichiometry of [32], an increase in ruminal pH causes a shift in soluble carbohydrate fermentation towards AC and away from PR and BT, and a shift in starch fermentation towards AC and BT and away from PR. For FT = 2 (concentrates), the ANN_2d and SVR_1d equations to predict CH4d show a tendency for CH4 to increase as pH increases (by 11% and 37%, respectively) (Table 7). In line with this, when pH was decreased, CH4d also decreased (Table 7). However, when FT = 1 (forages) and pH increases, the ANN_2d prediction decreases (−14%), while the SVR_1d prediction increases (9%). For the ANN_2d equation, it is difficult to conceptualize where the −14% in CH4d comes from, aside from a nuance in the database.

Conclusions
The current study successfully delivered models (using both meta-analysis and ML methodologies) which can be used to estimate CH4 production from in vitro fermentation systems. Meta-analysis results indicate that equations containing DMD, VFA, PR, FT and VL resulted in the best prediction of CH4 on an internal evaluation dataset of in vitro data. The ML models far exceeded the predictability achieved using meta-analysis methods, but should be evaluated on an external database to assess predictability and generalization potential on unrelated data, in particular given the limited database size and the data-hungry nature of such ML methodologies. Between the ML methodologies assessed, ANN and SVR resulted in very similar predictive performance, but differences in fitting, as assessed by behaviour analysis, were evident. The models developed may be utilized to estimate CH4 emissions in vitro, in instances where total gas and VFA production, but not CH4, are measured.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.