Article

Interpretable Optimized Support Vector Machines for Predicting the Coal Gross Calorific Value Based on Ultimate Analysis for Energy Systems

by
Paulino José García-Nieto
1,*,
Esperanza García-Gonzalo
1,
José Pablo Paredes-Sánchez
2 and
Luis Alfonso Menéndez-García
1
1
Department of Mathematics, Faculty of Sciences, University of Oviedo, C/Leopoldo Calvo Sotelo, 18, 33007 Oviedo, Spain
2
Department of Energy, Polytechnic School of Engineering of Gijón, University of Oviedo, C/Luis Ortiz Berrocal, Campus de Gijón, 33203 Gijón, Spain
*
Author to whom correspondence should be addressed.
Modelling 2026, 7(1), 28; https://doi.org/10.3390/modelling7010028
Submission received: 29 December 2025 / Revised: 21 January 2026 / Accepted: 22 January 2026 / Published: 26 January 2026
(This article belongs to the Section Modelling in Artificial Intelligence)

Abstract

In energy production systems, the higher heating value (HHV), also known as the gross calorific value, is a key parameter for identifying the primary energy source. In this study, a novel artificial intelligence model was developed using support vector machines (SVM) combined with the Differential Evolution (DE) optimizer to predict coal gross calorific value (the dependent variable). The model incorporated the elements from coal ultimate analysis—hydrogen (H), carbon (C), oxygen (O), sulfur (S), and nitrogen (N)—as input variables. For comparison, the experimental data were also fitted to previously reported empirical correlations, as well as Ridge, Lasso, and Elastic-Net regressions. The SVM-based model was first used to assess the influence of all independent variables on coal HHV and was subsequently found to be the most accurate predictor of coal gross calorific value. Specifically, the SVM regression (SVR) achieved a correlation coefficient (r) of 0.9861 and a coefficient of determination (R2) of 0.9575 for coal HHV prediction based on the test samples. The DE/SVM approach demonstrated strong performance, as evidenced by the close agreement between observed and predicted values. Finally, a summary of the results from these analyses is presented.

1. Introduction

Coal is a black sedimentary rock composed predominantly of carbon, with varying proportions of elements such as oxygen, nitrogen, sulfur, and hydrogen. It is classified as a fossil fuel. Coal forms from plant material over long geological periods under sedimentary conditions that favor the accumulation and preservation of organic matter, originating from plants that accumulated in wet, low-oxygen environments. Over millions of years, successive layers of organic matter—including plant debris, peat, and decaying vegetation—accumulate in these settings. As these layers continue to build up, they undergo physical and chemical transformations driven by pressure, temperature, and time, ultimately resulting in the formation of coal as a solid fuel [1,2].
Plant material accumulates in wetlands or swamp environments during coal formation, where oxygen availability is limited. The lack of oxygen slows the decomposition process, allowing organic matter to accumulate. As layers of organic material deepen, the weight of overlying sediments and water exerts increasing pressure on the lower layers. This pressure, combined with geothermal heat, initiates the transformation of the organic material. Compaction and heat induce physical and chemical changes in the plant matter, gradually converting it into peat. With continued burial and geological processes, peat undergoes further transformation. The expulsion of water, carbon dioxide, and other volatile components results in an increased concentration of carbon, ultimately leading to coal formation. As coalification progresses, peat is transformed into various coal ranks, ranging from lignite to anthracite. These coal ranks differ in carbon content, energy content, and physicochemical properties, which determine their suitability for energy applications [3,4].
Based on the degree of carbonization and the nature of the plant precursor, coal can be classified into four distinct types, as illustrated in Figure 1.
  • Peat: This represents the earliest stage of coal formation. It is brown, soft, dull, lightweight, and contains clearly visible plant remnants.
  • Lignite: This is a type of coal formed by the compression of peat. It is a crumbly material that still retains some identifiable plant structures. It is typically brown to black in color and exhibits a texture similar to that of the original wood.
  • Bituminous coal: This is a type of coal composed of organic sedimentary rock with a carbon content of approximately 80–90%. It is stratified, dull to greasy in appearance, hard and brittle, and black in color.
  • Anthracite: This is a fully carbonized coal that is black, hard, highly compact, and characterized by a bright, pearlescent luster.
Current global challenges encompass energy security, environmental protection, and sustainable development. Prolonged reliance on fossil fuels, particularly coal, has contributed to the greenhouse effect, global climate change, and resource depletion. Consequently, researchers worldwide are actively exploring alternative fuel utilization strategies that are both environmentally sustainable and efficient within the energy supply chain [3,4].
Coal contributes significantly to global energy production. The increasing demand for energy in thermal and power applications has driven the widespread use of this solid fossil fuel, and it is projected that coal consumption will nearly double by 2030 [5]. Coal is the most abundant fossil fuel worldwide and serves as a chemical repository of solar energy. It consists of over 50% organic matter (primarily carbon), including inherent moisture [6].
Ultimate analysis quantifies the mass fractions of carbon (C), hydrogen (H), oxygen (O), sulfur (S), and nitrogen (N) in coal, which are essential for assessing its combustion characteristics and heating value.
Figure 2 illustrates the steps involved in ultimate coal analysis:
  • Coal sampling: The coal sample is prepared by drying, grinding, and sieving to obtain uniformly small particles.
  • Laboratory test: The coal sample is combusted to convert its components into the corresponding oxides.
  • Detection: The combustion products are analyzed to determine the sample’s elemental composition.
  • Data analysis: The analysis results are used to assess the elemental composition and predict potential applications.
The calorific value, or heating value (HV), of a solid fuel determines its energy yield. Therefore, accurately determining the HV of coal is essential for various purposes, including classification, evaluation of energy potential, assessment of productive use, and precise valuation in the commodity market [7]. Furthermore, knowledge of HV is critical for the proper design and operation of coal-based systems [8]. Consequently, it is desirable to develop and implement methods that allow rapid and accurate determination of coal’s gross calorific value (HHV), offering substantial cost savings compared to traditional laboratory experiments. In this context, several previous studies have developed mathematical and empirical relationships to predict coal HHV based on the key elements obtained from ultimate analysis [9,10,11,12,13,14]. While traditional empirical correlations for predicting coal HHV are based on linear assumptions, machine learning (ML) techniques provide a data-driven approach capable of capturing complex, nonlinear relationships. These models not only help reduce the costs associated with conducting standard experimental tests, but they also have the potential to achieve greater accuracy than the previously mentioned empirical correlations.
Previous research on coal HHV prediction using statistical and machine learning approaches, as documented in earlier studies, includes decision tree regression [15], adaptive neuro-fuzzy inference systems [16], artificial neural networks (ANNs), and genetic algorithms (GAs) [17]. Additionally, a regression approach that integrates elements of proximate analysis with Gaussian process regression has been reported [18]. However, the potential of support vector machines (SVMs) to predict HHV across different coal types, deposits, and locations has not yet been investigated.
To the best of the authors’ knowledge, the methodology employed in this study is significant because it addresses a task that has not been previously undertaken. It combines the support vector machine (SVM) technique [19,20,21,22,23,24,25] with the Differential Evolution (DE) optimizer [26,27,28,29,30,31,32,33] to predict the gross calorific value (HHV) of coal from samples collected across various deposits and locations. For comparative purposes, Ridge, Lasso, and Elastic-Net regressions [34,35,36,37,38,39,40,41] were also applied to the dataset to evaluate the coal HHV as the target variable. The SVM approach [19,20,21,22,23,24,25], a supervised learning method renowned for its robustness and capacity to handle nonlinearities, is particularly well-suited for regression problems.
In several domains, including hydro-climatic parameters [42], solar radiation [43], and photovoltaic energy [44], SVMs have demonstrated considerable efficacy. Several factors support the utility of the proposed SVM approach [19,20,21,22,23,24,25]: (1) because the loss function is data-driven, the majority of the dataset (referred to as training data) is utilized to construct the SVM-based model; (2) to enhance memory efficiency and accuracy in high-dimensional spaces, the SVM model relies on support vectors, which constitute a subset of the training points employed by the decision function; (3) the SVM technique incorporates the kernel trick, facilitating regression of nonlinear data and enabling the resolution of complex problems; and (4) when hyperparameters are properly tuned, the SVM technique is robust to outliers and achieves high predictive accuracy due to its reduced sensitivity to anomalous data points.
Based on the chemical elements determined from the comprehensive analysis of coal samples from diverse sources and locations, the primary objective of the present study is to evaluate the predictive performance of various machine learning approaches for coal HHV. These approaches include DE-optimized SVM models as well as DE-optimized Ridge, Lasso, and Elastic-Net regressions. The study examines the influence of five input elements (C, O, H, S, and N) to accurately predict coal HHV as the target variable.
This study is organized as follows. First, the instruments and methodologies employed in this investigation are described. Next, the results are presented and discussed. Finally, the main conclusions are summarized.

2. Materials and Methods

2.1. Experimental Dataset

The coal dataset used in this study consists of HHV measurements paired with experimental ultimate analysis data, which serve as the primary input variables. Laboratory data from previous research on coal as a solid fuel [45] were employed, encompassing 318 coal samples characterized by the following variables: HHV, C, O, H, S, and N. The HHV values were experimentally determined using a bomb calorimeter in accordance with standard procedures. These values were combined with ultimate analysis data to construct the dataset for model development and validation. In terms of the mathematical methodology, predictive models can estimate the heating value based on the main elemental components obtained from the analysis of fuel samples. Table 1 summarizes the five predictor variables and the output variable used in the regression approaches evaluated in this study.
HHV represents the maximum amount of energy that can be released during the combustion of a fuel, including the latent heat produced by the condensation of water vapor formed during combustion. Currently, the two primary approaches for estimating the HHV of fuels are the use of a bomb calorimeter and equations derived from Dulong’s formula [46]. In laboratory settings, an adiabatic bomb calorimeter is typically employed; however, this method is not always practical and can be costly [47]. The equations, derived from experimental data obtained through proximate or ultimate analyses of coal, are based on empirical modeling. However, empirical correlations often fail to capture nonlinear interactions, limiting their predictive accuracy across diverse coal types.

2.2. Methods for Mathematical Modeling

2.2.1. Support Vector Machines (SVM) Method

The original purpose of SVM was to address binary classification problems. However, it was soon recognized that the underlying principles could be extended to a variety of other tasks, including regression [19,20,21,22,23,24,25]. Consider a training set comprising the covariates $\mathbf{x}_i \in \mathbb{R}^p$, $i = 1, 2, \dots, m$, and the continuous output variable $y_i \in \mathbb{R}$, $i = 1, 2, \dots, m$. The support vector regression (SVR) technique produces a function $f(\mathbf{x}) = \mathbf{p}^T \mathbf{x} + a$, where $\mathbf{p}$ is the vector normal to the hyperplane, also referred to as the direction vector, and $a / \|\mathbf{p}\|$ is the normal distance between the hyperplane and the origin of the coordinate system. The approximation must keep the model as flat as possible while permitting a maximum deviation of $\varepsilon$ from the true value $y_i$ for all training cases $\mathbf{x}_i$. Flatness is achieved by minimizing the Euclidean norm $\|\mathbf{p}\|^2$, and the model is fitted by penalizing deviations greater than $\varepsilon$. The SVR approximation therefore seeks the solution of the following optimization problem [38,39]:
$$\min_{\mathbf{p},\, a,\, \boldsymbol{\xi}^+,\, \boldsymbol{\xi}^-} \; \frac{1}{2}\|\mathbf{p}\|^2 + C \sum_{i=1}^{m} \left(\xi_i^+ + \xi_i^-\right) \tag{1}$$
subject to
$$y_i - \left(\mathbf{p}^T \mathbf{x}_i + a\right) \le \varepsilon + \xi_i^+, \qquad \left(\mathbf{p}^T \mathbf{x}_i + a\right) - y_i \le \varepsilon + \xi_i^-, \qquad \xi_i^+,\, \xi_i^- \ge 0, \qquad \text{for } i = 1, \dots, m, \tag{2}$$
where $\boldsymbol{\xi}^+, \boldsymbol{\xi}^- \in \mathbb{R}^m$ are the slack variables and $C$ is the regularization constant. To limit the penalty applied to data outside the $\varepsilon$ interval and to mitigate overfitting, Equation (1) requires that the constant $C$ take a value greater than zero; this value determines the trade-off between the flatness of the objective function and the reduced complexity of the model [24,25,38,39]. The slack variables, linked to each training vector, allow deviations larger than $\varepsilon$ while penalizing them in the objective function. Furthermore, the region enclosed by $y_i \pm \varepsilon$, $\forall i$, is called the $\varepsilon$-insensitive tube (see Figure 3).
This is a highly nonlinear problem; therefore, the kernelization method is applied. Using this approach, the original dataset is mapped into a higher-dimensional feature space $H$. A kernel function $K(\mathbf{x}_i, \mathbf{x}_j)$ is employed to perform this mapping, allowing the computation of a scalar product in $H$. The primal optimization problem of Equation (1) is then reformulated in its dual form, which is stated as follows, subject to the Karush–Kuhn–Tucker (KKT) conditions [19,20,21,22,23,24,25,38,39,42,43,44]:
$$\max_{\boldsymbol{\beta}^+,\, \boldsymbol{\beta}^-} \; \sum_{i=1}^{m} y_i \left(\beta_i^+ - \beta_i^-\right) - \varepsilon \sum_{i=1}^{m} \left(\beta_i^+ + \beta_i^-\right) - \frac{1}{2} \sum_{i,j=1}^{m} \left(\beta_i^+ - \beta_i^-\right)\left(\beta_j^+ - \beta_j^-\right) K\!\left(\mathbf{x}_i, \mathbf{x}_j\right) \tag{3}$$
subject to
$$\sum_{i=1}^{m} \left(\beta_i^+ - \beta_i^-\right) = 0, \qquad 0 \le \beta_i^+ \le C, \qquad 0 \le \beta_i^- \le C, \qquad \text{for } i = 1, \dots, m. \tag{4}$$
The regression outcome for a new sample $\mathbf{x}$ can then be predicted with the following expression:
$$f(\mathbf{x}) = \sum_{i=1}^{m} \left(\beta_i^+ - \beta_i^-\right) K\!\left(\mathbf{x}, \mathbf{x}_i\right) + a. \tag{5}$$
The technical literature utilizes a variety of commonly used kernel functions which, denoting $r = \|\mathbf{x}_i - \mathbf{x}_j\|_2$, are expressed as follows [19,20,21,22,23,24,25,38,39,42,43,44]:
  • Linear type kernel:
$$K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i \cdot \mathbf{x}_j, \tag{6}$$
  • Polynomial type kernel:
$$K(\mathbf{x}_i, \mathbf{x}_j) = \left(\sigma\, \mathbf{x}_i \cdot \mathbf{x}_j + a\right)^p, \tag{7}$$
  • Sigmoid type kernel:
$$K(\mathbf{x}_i, \mathbf{x}_j) = \tanh\left(\sigma\, \mathbf{x}_i \cdot \mathbf{x}_j + a\right), \tag{8}$$
  • Radial basis function (RBF) type kernel:
$$K(\mathbf{x}_i, \mathbf{x}_j) = e^{-\sigma r^2}. \tag{9}$$
Thus, the type of kernel is determined by the parameters $a$, $p$, and $\sigma$.
In conclusion, the appropriate kernel type must be selected and its optimal parameters determined in advance to map nonlinearly separable data into a feature space (a higher-dimensional space) and to apply the support vector machine (SVM) method for addressing the regression problem.
Furthermore, the following provides a concise description of the characteristic parameters of the SVR method [19,20,21,22,23,24,25,38,39,42,43,44]:
  • $\varepsilon$ hyperparameter: to ensure that errors smaller than $\varepsilon$ are not penalized, the $\varepsilon$-insensitive loss function is used to evaluate the empirical error. This parameter defines the maximum width of the allowable error margin and corresponds to the second term of the objective function, which depends on $\varepsilon$.
  • Regularization constant $C$: also referred to as the cost parameter, $C$ represents the trade-off between the slack variables and the margin. Proper pre-tuning of this parameter is essential for the SVR method.
  • $\sigma$, $a$, and $p$: in the final model, these parameters define the mathematical expressions of the different kernel functions.
Consequently, it is reasonable to apply a mathematical approach that accurately determines the aforementioned hyperparameters. In this study, the Differential Evolution (DE) optimizer, described below, was found to perform effectively.
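To make the preceding description concrete, the following minimal sketch shows how such an ε-SVR can be fitted in R with the e1071 package listed in Section 2.3. It is an illustration, not the authors' script: coal_train and coal_test are hypothetical data frames with columns HHV, C, H, O, N, and S, and the hyperparameter values are placeholders of the kind the DE optimizer of Section 2.2.4 is meant to tune (e1071's gamma plays the role of σ in Equation (9)).

```r
library(e1071)

# Hypothetical data frames coal_train / coal_test with columns
# HHV, C, H, O, N, S; hyperparameter values are illustrative placeholders.
model <- svm(HHV ~ C + H + O + N + S, data = coal_train,
             type = "eps-regression",  # epsilon-SVR of Equations (1)-(5)
             kernel = "radial",        # RBF kernel of Equation (9)
             cost = 80, epsilon = 1e-4, gamma = 1)
pred <- predict(model, newdata = coal_test)
```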

2.2.2. Ridge Regression (RR) and Lasso Regression (LR)

In multiple regression models with highly correlated independent variables, Ridge Regression (RR) is an appropriate technique for estimating coefficients. It has been applied in various fields, including chemistry, engineering, and econometrics. Also known as Tikhonov regularization, RR is a method for regularizing ill-posed problems [34,35]. Specifically, it mitigates the issue of multicollinearity in linear regression, which frequently arises in models with numerous parameters. Overall, the approach introduces a modest bias in exchange for improved efficiency in parameter estimation.
RR has been proposed as a potential solution to address the limitations of least squares estimators in linear regression models with multicollinear (i.e., highly correlated) independent variables. By generally producing lower variance and mean squared error compared to conventional least squares estimates, RR provides more accurate parameter estimates. Accordingly, RR has also been employed to predict coal HHV [34,35,36,37,38,39,40,41].
By adding positive constants to the diagonal elements, the condition number of the nearly singular moment matrix $X^T X$ is reduced, thereby mitigating the associated numerical issues. The simple Ridge estimator can be derived using the same functional form as the conventional least squares estimator [34,35,36,37,38,39,40,41,48,49]:
$$\hat{\boldsymbol{\beta}}_{RR} = \left(X^T X + \lambda I\right)^{-1} X^T \mathbf{y}. \tag{10}$$
Here, $I$ represents the identity matrix, $\mathbf{y}$ is the dependent variable, and $X$ denotes the design matrix. To ensure numerical stability, a constant known as the Ridge parameter—also referred to as the complexity or regularization parameter—is added to the diagonal elements of the moment matrix. By imposing the constraint $\|\boldsymbol{\beta}\|_2^2 \le t$, where the predetermined tuning parameter $t$ controls the degree of regularization, and formulating the problem using a Lagrangian approach, it can be shown that this estimator solves the following least squares optimization problem [34,35,36,37,38,39,40,41]:
$$L_{RR} = \min_{\boldsymbol{\beta}} \left(\mathbf{y} - X\boldsymbol{\beta}\right)^T \left(\mathbf{y} - X\boldsymbol{\beta}\right) + \lambda \|\boldsymbol{\beta}\|_2^2 \quad \text{subject to} \quad \|\boldsymbol{\beta}\|_2^2 \le t, \tag{11}$$
where $\|\boldsymbol{\beta}\|_2 = \left(\sum_{i=1}^{N} \beta_i^2\right)^{1/2}$. Therefore, the main advantage of RR over standard linear regression lies in its ability to flexibly balance the trade-off between bias and variance. Typically, an increase in bias is accompanied by a reduction in variance, whereas a decrease in bias often results in higher variance.
Additionally, the Lasso Regression (LR) methodology is closely related to RR and was also employed in this study to address the problem. Fundamentally, the LR loss function is defined as follows [34,35,36,37,38,39,40,41,48,49]:
$$L_{LR} = \min_{\boldsymbol{\beta}} \left(\mathbf{y} - X\boldsymbol{\beta}\right)^T \left(\mathbf{y} - X\boldsymbol{\beta}\right) + \lambda \|\boldsymbol{\beta}\|_1 \quad \text{subject to} \quad \|\boldsymbol{\beta}\|_1 \le t, \tag{12}$$
where $\|\boldsymbol{\beta}\|_1 = \sum_{i=1}^{N} |\beta_i|$ and the exact correspondence between $t$ and $\lambda$ is determined by the data.
In summary, LR can perform variable selection by setting certain coefficients exactly to zero, unlike RR, which does not have this capability [34,35,36,37,38,39,40,41]. This difference arises from the distinct geometrical shapes of their respective constraint regions.
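As a worked illustration of Equation (10), the following sketch computes the Ridge estimator directly in base R on simulated, deliberately collinear data (not part of the study's dataset) and shows that a positive λ keeps the inversion stable:

```r
# Ridge estimator of Equation (10): beta = (X'X + lambda*I)^(-1) X'y
ridge_estimator <- function(X, y, lambda) {
  solve(t(X) %*% X + lambda * diag(ncol(X)), t(X) %*% y)
}

set.seed(1)
X <- matrix(rnorm(100 * 3), 100, 3)
X[, 3] <- X[, 1] + 0.01 * rnorm(100)   # near-collinear third column
y <- X %*% c(1, 2, 0) + rnorm(100)
ridge_estimator(X, y, lambda = 0.1)    # stable despite collinearity
```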

2.2.3. Elastic-Net Regression (ENR)

ENR combines the penalties of both RR and LR. By integrating these two regularization terms, the Elastic-net Lagrangian function is formulated as follows [34,35,36,37,38,39,40,41]:
$$L_{EN} = \min_{\boldsymbol{\beta}} \left(\mathbf{y} - X\boldsymbol{\beta}\right)^T \left(\mathbf{y} - X\boldsymbol{\beta}\right) + \lambda_1 \|\boldsymbol{\beta}\|_1 + \lambda_2 \|\boldsymbol{\beta}\|_2^2. \tag{13}$$
Indeed, $\lambda_1 = 0$ recovers RR, whereas $\lambda_2 = 0$ recovers LR. Alternatively, the $L_1$-ratio parameter can be employed instead of the pair $(\lambda_1, \lambda_2)$; it specifies the fraction of the total penalty assigned to the $L_1$ term. The value $L_1$-ratio $= 0.5$ is used in the computations herein, since it is the setting most frequently adopted in computational systems.
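In the glmnet package used in this study (Section 2.3), the mixing parameter alpha corresponds to this $L_1$-ratio, so the three regularized models can be sketched as follows; X and y stand for the ultimate-analysis matrix and the HHV vector (the simulated pair above works as a stand-in), and λ is selected by cv.glmnet's default ten-fold cross-validation:

```r
library(glmnet)

# alpha = 0 -> RR, alpha = 1 -> LR, alpha = 0.5 -> ENR with L1-ratio 0.5
fit_rr  <- cv.glmnet(X, y, alpha = 0)
fit_lr  <- cv.glmnet(X, y, alpha = 1)
fit_enr <- cv.glmnet(X, y, alpha = 0.5)
c(RR = fit_rr$lambda.min, LR = fit_lr$lambda.min, ENR = fit_enr$lambda.min)
```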

2.2.4. Differential Evolution (DE) Optimization Algorithm

Selecting an appropriate set of hyperparameters is essential. Hyperparameter tuning refers to the process of choosing or fine-tuning the optimal parameters. Hyperparameters are values that govern the learning behavior of algorithms. Consequently, a set of hyperparameters is identified during the optimization process, leading to the development of an optimal model that minimizes a specified loss function. The performance of these proposed approaches is commonly evaluated using cross-validation [26,27,28,29,30,31,32,33].
DE, a metaheuristic technique in evolutionary computation that iteratively seeks to improve the quality of candidate solutions, is employed here to optimize the problem. The DE optimizer is particularly well-suited for multidimensional, real-valued functions and does not require the objective function to be differentiable. Moreover, DE is well-suited for handling dynamic, noisy, or discontinuous optimization problems. It was chosen for its efficiency in exploring high-dimensional and non-convex search spaces, which are common in SVM hyperparameter tuning. By maintaining a population of candidate solutions and generating new candidates through the recombination of existing ones using a simple formula, DE progressively improves the solution by selecting the candidate with the highest fitness for the specified optimization task [26,27,28,29,30,31,32,33]. The variables of the optimization problem are represented in the algorithm as a vector of real values. There are NP such vectors in the population, and the length of each vector is n, corresponding to the number of parameters involved in the optimization problem.
Let $g$ be the number of the generation and $p$ the index of a vector in the population ($p = 1, \dots, NP$); the vector is then denoted $\mathbf{x}_p^g$. The components of $\mathbf{x}_p^g$ correspond to the problem variables $x_{p,m}^g$, where $m$ indexes the variables within the individual ($m = 1, \dots, n$). Each variable is constrained within an interval bounded by the minimum and maximum values $x_m^{\min}$ and $x_m^{\max}$, respectively. The differential evolution algorithm consists of four primary steps [26,27,28,29,30,31,32,33]:
  • Initialization;
  • Mutation;
  • Recombination; and
  • Selection.
The search process begins after initialization and concludes when a predetermined stopping criterion is met, such as a specified number of generations, elapsed time, or attainment of the desired solution quality. At this point, the processes of mutation, recombination, and selection are completed.
Initialization
The minimum and maximum values of each variable are used to initialize the population (first generation) randomly [26,27,28,29,30,31,32,33]:
$$x_{p,m}^1 = x_m^{\min} + rand(0,1) \cdot \left(x_m^{\max} - x_m^{\min}\right) \quad \text{for } p = 1, \dots, NP \text{ and } m = 1, \dots, n. \tag{14}$$
In Equation (14), the expression $rand(0,1)$ generates a random number uniformly distributed in the interval $(0,1)$.
Mutation
Three distinct individuals $\mathbf{x}_a$, $\mathbf{x}_b$, and $\mathbf{x}_c$ are chosen at random from the population and used to generate the $NP$ new vectors that constitute the mutation. The mutant vectors $\mathbf{n}_p^g$ are constructed as follows [26,27,28,29,30,31,32,33]:
$$\mathbf{n}_p^g = \mathbf{x}_c + F\left(\mathbf{x}_a - \mathbf{x}_b\right) \quad \text{for } p = 1, \dots, NP, \tag{15}$$
where $a$, $b$, $c$, and $p$ index distinct individuals. The mutation rate is controlled by the factor $F$, which lies in the interval $[0, 2]$.
Recombination
Following the creation of the $NP$ mutant vectors, the trial vectors $\mathbf{t}_p^g$ are produced through a stochastic recombination process, followed by a comparison with the original vectors $\mathbf{x}_p^g$, as described below [26,27,28,29,30,31,32,33]:
$$t_{p,m}^g = \begin{cases} n_{p,m}^g & \text{if } rand(0,1) < GR \\ x_{p,m}^g & \text{otherwise} \end{cases} \quad \text{for } p = 1, \dots, NP \text{ and } m = 1, \dots, n. \tag{16}$$
The recombination rate is controlled by the parameter $GR$. The trial vector comprises elements from both the original and the mutated vectors, as the recombination is conducted on a component-wise basis.
Selection
To choose the vectors for the next generation, namely those exhibiting the best values of the fitness function, the original vectors and the trial vectors are compared directly [26,27,28,29,30,31,32,33]:
$$\mathbf{x}_p^{g+1} = \begin{cases} \mathbf{t}_p^g & \text{if } fit\left(\mathbf{t}_p^g\right) > fit\left(\mathbf{x}_p^g\right) \\ \mathbf{x}_p^g & \text{otherwise.} \end{cases} \tag{17}$$
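A compact sketch of these four steps in base R is given below. It assumes an objective function to be minimized, so the selection of Equation (17) keeps the candidate with the lower objective value; bound handling by clipping is one common choice among several.

```r
# Minimal DE sketch implementing Equations (14)-(17) for minimization.
de_optimize <- function(fn, lower, upper, NP = 20, Fm = 0.8, GR = 0.9,
                        generations = 100) {
  n <- length(lower)
  # Initialization, Equation (14)
  pop <- t(replicate(NP, lower + runif(n) * (upper - lower)))
  fit <- apply(pop, 1, fn)
  for (g in seq_len(generations)) {
    for (p in seq_len(NP)) {
      idx <- sample(setdiff(seq_len(NP), p), 3)      # distinct a, b, c
      # Mutation, Equation (15); Fm is the mutation factor F
      mutant <- pop[idx[3], ] + Fm * (pop[idx[1], ] - pop[idx[2], ])
      mutant <- pmin(pmax(mutant, lower), upper)     # clip to bounds
      # Recombination, Equation (16), component-wise
      cross <- runif(n) < GR
      trial <- ifelse(cross, mutant, pop[p, ])
      # Selection, Equation (17), in minimization form
      f_trial <- fn(trial)
      if (f_trial < fit[p]) { pop[p, ] <- trial; fit[p] <- f_trial }
    }
  }
  pop[which.min(fit), ]   # best individual found
}
```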

2.3. Accuracy Assessment of the Approximation

The primary goodness-of-fit metric used in this research for the regression problem is the coefficient of determination ($R^2$). Denoting by $t_i$ and $y_i$ the observed and predicted values, respectively, the following sums of squares are defined [50,51]:
  • $SS_{reg} = \sum_{i=1}^{n} \left(y_i - \bar{t}\right)^2$: the explained sum of squares;
  • $SS_{tot} = \sum_{i=1}^{n} \left(t_i - \bar{t}\right)^2$: the total sum of squares, which is proportional to the sample variance;
  • $SS_{err} = \sum_{i=1}^{n} \left(t_i - y_i\right)^2$: the residual sum of squares,
where $\bar{t}$ represents the mean value of the experimental data:
$$\bar{t} = \frac{1}{n} \sum_{i=1}^{n} t_i. \tag{18}$$
Therefore, the coefficient of determination is expressed as [50,51]:
$$R^2 \equiv 1 - \frac{SS_{err}}{SS_{tot}}. \tag{19}$$
The smaller the difference between the experimental and predicted data, the closer the $R^2$ statistic is to 1.0.
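For reference, Equation (19) translates directly into a one-line R function, where t holds the observed values and y the predictions:

```r
# Coefficient of determination, Equation (19)
r_squared <- function(t, y) 1 - sum((t - y)^2) / sum((t - mean(t))^2)
```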
The approaches used in this study, including RR, LR, and ENR, were tuned to optimize their estimates with respect to the coefficient of determination $R^2$.
As stated earlier, the SVM hyperparameters that most significantly influence the approximation are the width of the permissible error margin $\varepsilon$, the cost constant $C$, and, depending on the kernel, $p$ and $\sigma$, which establish the mathematical formulas of the various kernels. To identify the optimal hyperparameters for the SVM approach, we used the DE algorithm [26,27,28,29,30,31,32,33].
The dataset is randomly divided into two subsets, with 80% allocated to the training set and 20% to the testing set. The DE/SVM model is then developed using the training data. To estimate the SVM hyperparameters, the differential evolution (DE) algorithm is applied in conjunction with a ten-fold cross-validation scheme [49,52]. Once the optimal parameters are identified, the final model is constructed using the training dataset. The trained model is subsequently used to predict the responses of the testing set. The goodness-of-fit of the proposed approach is then evaluated by comparing the predicted values with the corresponding observed values. A schematic representation of the overall procedure (i.e., the process flow diagram) of the DE/SVM approach employed in this study is presented in Figure 4.
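A hedged sketch of this workflow, reusing the de_optimize() function sketched in Section 2.2.4 together with e1071's built-in cross-validation, might look as follows; coal is a hypothetical data frame with columns HHV, C, H, O, N, and S, and the hyperparameters are searched on a log10 scale:

```r
set.seed(42)
idx   <- sample(nrow(coal), floor(0.8 * nrow(coal)))  # 80/20 split
train <- coal[idx, ]
test  <- coal[-idx, ]

cv_error <- function(theta) {  # theta = (log10 C, log10 epsilon, log10 sigma)
  fit <- svm(HHV ~ ., data = train, type = "eps-regression", kernel = "radial",
             cost = 10^theta[1], epsilon = 10^theta[2], gamma = 10^theta[3],
             cross = 10)       # e1071's built-in 10-fold cross-validation
  fit$tot.MSE                  # total cross-validated MSE (to be minimized)
}

best  <- de_optimize(cv_error, lower = c(-2, -6, -4), upper = c(2, 1, 1))
final <- svm(HHV ~ ., data = train, type = "eps-regression", kernel = "radial",
             cost = 10^best[1], epsilon = 10^best[2], gamma = 10^best[3])
r_squared(test$HHV, predict(final, newdata = test))   # goodness-of-fit
```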
The $R^2$ is ascertained using cross-validation [50,51]. The predictive performance of the DE/SVM approximation was assessed using a $k$-fold cross-validation approach (with $k = 10$) [49,52]. The regression modeling process was conducted using the R 4.4.1 software packages outlined below:
  • SVM approximation with different kernels: e1071 package (version 1.7-16) from R project [53];
  • DE optimizer: metaheuristicOpt package (version 2.0.0) [26,27,28,29]; and
  • RR, LR, and ENR models: glmnet package (version 4.1-8) from the R project [53,54,55,56].
The search intervals of the SVM hyperparameters tuned by the DE/SVM approach in this investigation are shown in Table 2.
The DE optimizer performs effectively in tuning the parameters of the SVM: by evaluating the cross-validation error at each iteration, the optimal values of the parameters $C$, $\varepsilon$, and $\sigma$ are determined.
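Equivalently, the search can be expressed through the metaOpt() interface of the metaheuristicOpt package listed above; this is a sketch only, with the ranges mirroring Table 2 on a log10 scale, cv_error as defined in the previous sketch, and illustrative control values:

```r
library(metaheuristicOpt)

rangeVar <- matrix(c(-2, 2,    # log10 C
                     -6, 1,    # log10 epsilon
                     -4, 1),   # log10 sigma
                   nrow = 2)   # row 1: lower bounds, row 2: upper bounds
out <- metaOpt(cv_error, optimType = "MIN", algorithm = "DE",
               numVar = 3, rangeVar = rangeVar,
               control = list(numPopulation = 40, maxIter = 100))
out$result   # optimal (log10 C, log10 epsilon, log10 sigma)
```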

3. Results and Discussion

The optimal hyperparameters for predicting coal HHV, as determined by the DE optimizer for each kernel of the SVM-based technique, are shown in Table 3.
For comparison purposes, the RR, LR, and ENR models were also fitted. The $\lambda$ parameter in these models determines their accuracy [39,40,41]; the optimal $\lambda$ values are presented in Table 4.
The first-order terms of the DE/SVM method employing an RBF kernel are presented in Figure 5. This figure facilitates understanding of the relationships among the multiple input variables used in this approach. For example, HHV is plotted on the Y-axis against C on the X-axis, while the remaining four input variables are held constant (see the first graph in Figure 5). Similarly, with all other input variables kept constant, the second and third graphs in Figure 5 depict HHV on the Y-axis as a function of H and O on the X-axis, respectively. Finally, the fourth and fifth graphs illustrate HHV on the Y-axis as a function of N and S on the X-axis, respectively.
Similarly, the second-order terms of the DE/SVM method employing an RBF kernel are presented in Figure 6. In this case, with all other variables held constant, the first graph in Figure 6 depicts HHV on the Z-axis as a function of H on the Y-axis and C on the X-axis. The remaining graphs in Figure 6 follow a similar pattern, illustrating HHV on the Z-axis as a function of C and O, C and S, H and S, and O and S on the X- and Y-axes, respectively, while the other variables remain unchanged.
A selection of the most widely used empirical formulas for estimating coal HHV reported in the literature is presented in Table 5. These formulas are also based on the mass percentage of the components from the ultimate analysis.
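As a quick illustration of how these correlations are applied, evaluating E6 (Given et al. [12]) at the mean composition reported in Table 1 yields a value close to the mean observed HHV of 30.84 MJ/kg:

```r
# Empirical correlation E6 from Table 5 (inputs in wt%, output in MJ/kg)
hhv_e6 <- function(C, H, O, S) 0.336 * C + 1.418 * H - 0.145 * O + 0.0941 * S
hhv_e6(C = 78.85, H = 5.01, O = 13.13, S = 1.72)   # approximately 31.9 MJ/kg
```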
The determination and correlation coefficients for the empirical correlations of coal gross calorific value (output variable) [9,10,11,12,13,14], as well as for the DE/SVM approach with different kernel types and for RR, LR, and ENR, are all presented in Table 6 and were evaluated using the testing data.
Based on the statistical analyses performed, the SVM approach with an RBF kernel is the most effective model for estimating the gross calorific value (HHV) as a dependent variable across the various coal types studied. This model achieved an $R^2$ of 0.9575 and an $r$ of 0.9861 for coal HHV prediction. The close agreement between the experimentally measured data and the SVM predictions demonstrates the model's consistent goodness-of-fit.
Table 7 and Figure 7 present an additional outcome of these analyses: the significance ranking of the five input variables in predicting coal gross calorific value in this comprehensive study. Variable importance was determined from the weights of a linear-kernel SVR model, where the magnitude of each weight reflects the relative contribution of the corresponding feature to the prediction. Although these weights are not directly interpretable as coefficients in linear regression, due to the effects of regularization and the $\varepsilon$-insensitive loss function, their normalized values provide a relative ranking of feature importance.
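A sketch of how such weights can be extracted with e1071 is shown below: for a linear kernel, the primal weight vector follows from the dual coefficients and the support vectors (train is the hypothetical training set introduced earlier, with inputs scaled by e1071's default):

```r
# Primal weights of a linear-kernel SVR: w = t(coefs) %*% SV
lin_fit <- svm(HHV ~ ., data = train, type = "eps-regression", kernel = "linear")
w <- drop(t(lin_fit$coefs) %*% lin_fit$SV)   # one weight per input variable
sort(abs(w), decreasing = TRUE)              # magnitudes give the ranking
```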
According to the SVM model, C is the most significant factor in predicting HHV, followed by O, H, S, and N. Dulong previously emphasized the importance of C and H for HHV using models that linked these variables in solid fuels [45]. In this context, the literature suggests that theoretical models for solid fuel HHV may exist and that these models are likely to have a strong relationship with C, as well as significant influences from H and O [57].
Generally, C, O, and H are the primary parameters used to determine the potential HHV of a fuel, while S and N are relevant mainly due to their impact on the environmental aspects of fuel combustion. In this context, carbon, as the fundamental element in all carbonaceous fuels, plays a critical role. Meanwhile, the influence of nitrogen is relatively minor, as it is primarily associated with living biomass, where it is required for cellular synthesis.
According to the ranking order of variables (Table 7 and Figure 7), C is the primary component in the proposed model, as it directly contributes to the energy released during combustion and serves as the most significant indicator of coalification grade. Carbon is the predominant element in coal, typically constituting the largest proportion of its elemental composition. O is the second most important parameter in the model; it is also present in coal and influences both the heating value and the reactivity of coal during combustion. O is commonly bonded with carbon or hydrogen to form various functional groups, such as carboxyl and hydroxyl groups in coal [58].
H is the third most important element in the resulting model, as it contributes to coal combustibility. In contrast, S and N have a lower impact on the modeled heating value, likely due to their relatively low abundance in coal. Furthermore, during combustion, sulfur and nitrogen can be oxidized to form sulfur dioxide and nitrogen oxides, respectively, which are air pollutants. It is noteworthy that, for the development of energy applications, it is essential to characterize solid fuels both in terms of their heating value and their potential pollutant emissions [59,60].
The predicted and experimental values of coal HHV are compared in Figure 8 using the following models: the DE/SVM model with a quadratic kernel (Figure 8a), the most accurate empirical correlation E6 (Figure 8b), the DE/SVM model with a cubic kernel (Figure 8c), RR (Figure 8d), the DE/SVM model with a sigmoid kernel (Figure 8e), the DE/SVM model with a linear kernel (Figure 8f), the ENR model (Figure 8g), the LR model (Figure 8h), and the DE/SVM model with an RBF kernel (Figure 8i). These comparisons highlight the importance of the SVM approach for achieving the most effective solution to the regression problem. The results clearly indicate that the DE/SVM model with an RBF kernel provides the best fit while meeting the critical statistical goodness-of-fit criterion ( R 2 ).
In summary, the DE/SVM model with an RBF kernel achieved the highest predictive accuracy ($R^2 = 0.9575$), significantly outperforming the best empirical correlation ($R^2 = 0.4887$). This performance underscores the model's capability to capture nonlinear relationships between coal's elemental composition (C, H, O, N, and S) and HHV.

4. Conclusions

By comparing the numerical and experimental results, the following summarizes the primary findings of the investigation:
  • Initially, calculating coal HHV requires solving a complex heat transfer problem that involves radiation, convection, and conduction—the three distinct modes of heat transfer. The solution to the corresponding partial differential equations (PDEs) is derived from the final comprehensive model. In practice, these PDEs can only be solved numerically using methods such as the finite difference method or the finite element method; moreover, solutions obtained with additional heuristic approximations can differ substantially. Therefore, it is crucial to develop new machine learning-based analysis techniques. Specifically, the most suitable approach for accurately predicting HHV in different types of coal from various deposits and locations is the DE/SVM method with an RBF kernel, as employed in this study. It should be noted that the SVM technique with a cubic kernel also provides strong predictive capability, albeit with slightly lower performance than the SVM approach with the RBF kernel.
  • Secondly, it was demonstrated that the DE/SVM approach can accurately predict coal HHV in the fuel industry.
  • Thirdly, an R2 value of 0.9575 was achieved when coal HHV, the dependent variable, was predicted using the DE/SVM approach with an RBF kernel, based on the testing dataset (20% of the observed data, not used for training).
  • Fourthly, for fuel automation applications, a low-cost microcontroller device can be configured with the SVM approach to achieve reliable and efficient prediction of coal HHV.
  • Finally, the input variables used to predict coal HHV can be ranked according to their significance, which constitutes one of the primary conclusions of this study. Accordingly, the elements O, H, S, and N, in that order, may be considered the secondary most important predictors of coal HHV after C. While the linear SVM model provided initial insights into variable importance, future work could explore alternative approaches, such as permutation importance or SHAP values, to further validate these findings in nonlinear contexts.
While the DE/SVM approach presented in this study demonstrates superior accuracy for coal HHV prediction compared to empirical and regularized regression methods, several limitations should be acknowledged. The dataset employed in this investigation was restricted to ultimate analysis variables (C, H, O, N, and S); in this context, moisture content is typically reported on a dry basis, and ash is excluded through dry ash-free protocols. Future studies could incorporate these variables via proximate analysis to enhance model robustness across diverse coal types. Despite these limitations, this study provides a robust framework for HHV estimation using machine learning, offering a data-driven alternative to traditional empirical correlations.

Author Contributions

Conceptualization, P.J.G.-N., E.G.-G., J.P.P.-S. and L.A.M.-G.; methodology, P.J.G.-N., E.G.-G., J.P.P.-S. and L.A.M.-G.; software, P.J.G.-N., E.G.-G., J.P.P.-S. and L.A.M.-G.; validation, P.J.G.-N., E.G.-G., J.P.P.-S. and L.A.M.-G.; formal analysis, P.J.G.-N., E.G.-G., J.P.P.-S. and L.A.M.-G.; data curation, P.J.G.-N., E.G.-G., J.P.P.-S. and L.A.M.-G.; writing—original draft preparation, P.J.G.-N., E.G.-G., J.P.P.-S. and L.A.M.-G.; writing—review and editing, P.J.G.-N., E.G.-G., J.P.P.-S. and L.A.M.-G.; visualization, P.J.G.-N., E.G.-G., J.P.P.-S. and L.A.M.-G.; supervision, P.J.G.-N., E.G.-G., J.P.P.-S. and L.A.M.-G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors wish to acknowledge the Department of Mathematics at the University of Oviedo for its computational support.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI      Artificial Intelligence
ANFIS   Adaptive Neuro-Fuzzy Inference System
ANN     Artificial Neural Network
C       Carbon (chemical element)/Regularization Constant (depending on context)
DE      Differential Evolution
DE/SVM  Differential Evolution/Support Vector Machines
ENR     Elastic-Net Regression
Eq      Equation
GA      Genetic Algorithm
GR      Recombination Rate (Differential Evolution)
H       Hydrogen
HHV     Higher Heating Value
HV      Heating Value
KKT     Karush–Kuhn–Tucker (conditions)
LR      Lasso Regression
ML      Machine Learning
MLT     Machine Learning Techniques
N       Nitrogen
NP      Number of Population Members (Differential Evolution)
O       Oxygen
PDE     Partial Differential Equation
r       Correlation Coefficient
R2      Coefficient of Determination
RBF     Radial Basis Function
RR      Ridge Regression
S       Sulfur
SSreg   Sum of Squares Explained
SStot   Sum of Squares Total
SSerr   Residual Sum of Squares
SVM     Support Vector Machines
SVR     Support Vector Regression

References

  1. Finkelman, R.B.; Dai, S.; French, D. The Importance of Minerals in Coal as the Hosts of Chemical Elements: A Review. Int. J. Coal Geol. 2019, 212, 103251. [Google Scholar] [CrossRef]
  2. Wang, X.; Tang, Y.; Wang, S.; Schobert, H.H. Clean Coal Geology in China: Research Advance and Its Future. Int. J. Coal Sci. Technol. 2020, 7, 299–310. [Google Scholar] [CrossRef]
  3. Jiang, L.; Xue, D.; Wei, Z.; Chen, Z.; Mirzayev, M.; Chen, Y.; Chen, S. Coal Decarbonization: A State-of-the-Art Review of Enhanced Hydrogen Production in Underground Coal Gasification. Energy Rev. 2022, 1, 100004. [Google Scholar] [CrossRef]
  4. Paredes-Sánchez, J.P.; López-Ochoa, L.M. Bioenergy as an Alternative to Fossil Fuels in Thermal Systems. In Advances in Sustainable Energy; Vasel, A., Ting, D.S.-K., Eds.; Lecture Notes in Energy; Springer International Publishing: Cham, Switzerland, 2019; Volume 70, pp. 149–168. [Google Scholar]
  5. Seervi, K. Prediction of Calorific Value of Indian Coals by Artificial Neural Network. Bachelor’s Thesis, National Institute of Technology, Rourkela, India, 2015. [Google Scholar]
  6. Paredes-Sánchez, J.P.; Las-Heras-Casas, J.; Paredes-Sánchez, B.M. Solar Energy, the Future Ahead. In Advances in Sustainable Energy; Vasel, A., Ting, D.S.-K., Eds.; Lecture Notes in Energy; Springer International Publishing: Cham, Switzerland, 2019; Volume 70, pp. 113–132. [Google Scholar]
  7. Akkaya, A.V. Proximate Analysis Based Multiple Regression Models for Higher Heating Value Estimation of Low Rank Coals. Fuel Process. Technol. 2009, 90, 165–170. [Google Scholar] [CrossRef]
  8. Paredes-Sánchez, B.M.; Paredes-Sánchez, J.P.; García-Nieto, P.J. Evaluation of Implementation of Biomass and Solar Resources by Energy Systems in the Coal-Mining Areas of Spain. Energies 2021, 15, 232. [Google Scholar] [CrossRef]
  9. Channiwala, S.A.; Parikh, P.P. A Unified Correlation for Estimating HHV of Solid, Liquid and Gaseous Fuels. Fuel 2002, 81, 1051–1063. [Google Scholar] [CrossRef]
  10. Mason, D.M.; Gandhi, K.N. Formulas for Calculating the Calorific Value of Coal and Coal Chars: Development, Tests, and Uses. Fuel Process. Technol. 1983, 7, 11–22. [Google Scholar] [CrossRef]
  11. Selvig, W.A.; Wilson, I.H. Calorific Value of Coal. In Chemistry of Coal; Lowry, H.H., Ed.; Wiley: New York, NY, USA, 1945; Volume 1, p. 139. [Google Scholar]
  12. Given, P.H.; Weldon, D.; Zoeller, J.H. Calculation of Calorific Values of Coals from Ultimate Analyses: Theoretical Basis and Geochemical Implications. Fuel 1986, 65, 849–854. [Google Scholar] [CrossRef]
  13. Chelgani, S.C. Estimation of Gross Calorific Value Based on Coal Analysis Using an Explainable Artificial Intelligence. Mach. Learn. Appl. 2021, 6, 100116. [Google Scholar] [CrossRef]
  14. Matin, S.S.; Chelgani, S.C. Estimation of Coal Gross Calorific Value Based on Various Analyses by Random Forest Method. Fuel 2016, 177, 274–278. [Google Scholar] [CrossRef]
  15. Pekel, E.; Akkoyunlu, M.C.; Akkoyunlu, M.T.; Pusat, S. Decision Tree Regression Model to Predict Low-Rank Coal Moisture Content during Convective Drying Process. Int. J. Coal Prep. Util. 2020, 40, 505–512. [Google Scholar] [CrossRef]
  16. Akkoyunlu, M.T.; Pekel, E.; Akkoyunlu, M.C.; Pusat, S.; Özkan, C.; Kara, S.S. Moisture Content Estimation during Fixed Bed Drying Process with Design of Experiment and ANFIS Methods. Int. J. Oil Gas Coal Technol. 2019, 22, 332. [Google Scholar] [CrossRef]
  17. Akkoyunlu, M.C.; Pekel, E.; Akkoyunlu, M.T.; Pusat, S. Using Hybridized ANN-GA Prediction Method for DOE Performed Drying Experiments. Dry. Technol. 2020, 38, 1393–1399. [Google Scholar] [CrossRef]
  18. Akkaya, A.V. Coal Higher Heating Value Prediction Using Constituents of Proximate Analysis: Gaussian Process Regression Model. Int. J. Coal Prep. Util. 2022, 42, 1952–1967. [Google Scholar] [CrossRef]
  19. Vapnik, V.N. Statistical Learning Theory; Adaptive and Learning Systems for Signal Processing, Communications, and Control; John Wiley and Sons: New York, NY, USA, 1998. [Google Scholar]
  20. Cristianini, N.; Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, 1st ed.; Cambridge University Press: New York, NY, USA, 2000. [Google Scholar]
  21. Schölkopf, B.; Smola, A.J.; Williamson, R.C.; Bartlett, P.L. New Support Vector Algorithms. Neural Comput. 2000, 12, 1207–1245. [Google Scholar] [CrossRef]
  22. Witten, I.H.; Frank, E.; Hall, M.A. Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed.; Morgan Kaufmann Series in Data Management Systems; Morgan Kaufmann: Burlington, MA, USA, 2011. [Google Scholar]
  23. Hansen, T.; Wang, C.-J. Support Vector Based Battery State of Charge Estimator. J. Power Sources 2005, 141, 351–358. [Google Scholar] [CrossRef]
  24. Li, X.; Lord, D.; Zhang, Y.; Xie, Y. Predicting Motor Vehicle Crashes Using Support Vector Machine Models. Accid. Anal. Prev. 2008, 40, 1611–1618. [Google Scholar] [CrossRef]
  25. Steinwart, I.; Christmann, A. Support Vector Machines; Information Science and Statistics; Springer: New York, NY, USA, 2008. [Google Scholar]
  26. Storn, R.; Price, K. Differential Evolution–A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces. J. Global Optim. 1997, 11, 341–359. [Google Scholar] [CrossRef]
  27. Feoktistov, V. Differential Evolution: In Search of Solutions; Springer Optimization and Its Applications; Springer: New York, NY, USA, 2006; Volume 5. [Google Scholar]
  28. Price, K.V.; Storn, R.M.; Lampinen, J.A. Differential Evolution: A Practical Approach to Global Optimization; Natural Computing Series; Springer: Heidelberg, Germany, 2005. [Google Scholar]
  29. Rocca, P.; Oliveri, G.; Massa, A. Differential Evolution as Applied to Electromagnetics. IEEE Antennas Propag. Mag. 2011, 53, 38–49. [Google Scholar] [CrossRef]
  30. Chong, E.K.P. An Introduction to Optimization, 4th ed.; Wiley: Hoboken, New Jersey, NJ, USA, 2013. [Google Scholar]
  31. Kennedy, J.; Eberhart, R.C.; Shi, Y. Swarm Intelligence, 8th ed.; The Morgan Kaufmann Series in Evolutionary Computation; Morgan Kaufmann: San Francisco, CA, USA, 2009. [Google Scholar]
  32. Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
  33. Aggarwal, C.C. Linear Algebra and Optimization for Machine Learning: A Textbook; Springer: Cham, Switzerland, 2020. [Google Scholar]
  34. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer Series in Statistics; Springer: New York, NY, USA, 2003. [Google Scholar]
  35. García-Nieto, P.J.; García-Gonzalo, E.; Arbat, G.; Duran-Ros, M.; Pujol, T.; Puig-Bargués, J. Forecast of the Outlet Turbidity and Filtered Volume in Different Microirrigation Filters and Filtration Media by Using Machine Learning Techniques. J. Comput. Appl. Math. 2024, 439, 115606. [Google Scholar] [CrossRef]
  36. Izenman, A.J. Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning; Springer Texts in Statistics; Springer: New York, NY, USA, 2013. [Google Scholar]
  37. Hastie, T.; Tibshirani, R.; Wainwright, M. Statistical Learning with Sparsity: The Lasso and Generalizations, 1st ed.; Chapman and Hall/CRC: Boca Raton, FL, USA, 2015. [Google Scholar]
  38. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: New York, NY, USA, 2013. [Google Scholar]
  39. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning: With Applications in R; Springer Texts in Statistics; Springer: New York, NY, USA, 2021. [Google Scholar]
  40. Zhang, L.; Tedde, A.; Ho, P.; Grelet, C.; Dehareng, F.; Froidmont, E.; Gengler, N.; Brostaux, Y.; Hailemariam, D.; Pryce, J.; et al. Mining Data from Milk Mid-Infrared Spectroscopy and Animal Characteristics to Improve the Prediction of Dairy Cow’s Liveweight Using Feature Selection Algorithms Based on Partial Least Squares and Elastic Net Regressions. Comput. Electron. Agric. 2021, 184, 106106. [Google Scholar] [CrossRef]
  41. Keeney, A.J.; Beseler, C.L.; Ingold, S.S. County-Level Analysis on Occupation and Ecological Determinants of Child Abuse and Neglect Rates Employing Elastic Net Regression. Child Abuse Negl. 2023, 137, 106029. [Google Scholar] [CrossRef] [PubMed]
  42. Shrestha, N.K.; Shukla, S. Support Vector Machine Based Modeling of Evapotranspiration Using Hydro-Climatic Variables in a Sub-Tropical Environment. Agr. Forest Meteorol. 2015, 200, 172–184. [Google Scholar] [CrossRef]
  43. Chen, J.-L.; Li, G.-S.; Wu, S.-J. Assessing the Potential of Support Vector Machine for Estimating Daily Solar Radiation Using Sunshine Duration. Energ. Convers. Manag. 2013, 75, 311–318. [Google Scholar] [CrossRef]
  44. De Leone, R.; Pietrini, M.; Giovannelli, A. Photovoltaic Energy Production Forecast Using Support Vector Regression. Neural Comput. Appl. 2015, 26, 1955–1962. [Google Scholar] [CrossRef]
  45. Richards, A.P.; Haycock, D.; Frandsen, J.; Fletcher, T.H. A Review of Coal Heating Value Correlations with Application to Coal Char, Tar, and Other Fuels. Fuel 2021, 283, 118942. [Google Scholar] [CrossRef]
  46. Hosokai, S.; Matsuoka, K.; Kuramoto, K.; Suzuki, Y. Modification of Dulong’s Formula to Estimate Heating Value of Gas, Liquid and Solid Fuels. Fuel Process. Technol. 2016, 152, 399–405. [Google Scholar] [CrossRef]
  47. Mandavgade, N.K.; Jaju, S.B.; Lakhe, R.R. Determination of Uncertainty in Gross Calorific Value of Coal Using Bomb Calorimeter. In Advanced Instrument Engineering: Measurement, Calibration, and Design; Lay-Ekuakille, A., Ed.; IGI Global: Hershey, PA, USA, 2013; pp. 292–299. [Google Scholar]
  48. Bishop, C.M. Pattern Recognition and Machine Learning; Information Science and Statistics; Springer: New York, NY, USA, 2006. [Google Scholar]
  49. Picard, R.R.; Cook, R.D. Cross-Validation of Regression Models. J. Am. Stat. Assoc. 1984, 79, 575–583. [Google Scholar] [CrossRef]
  50. Wasserman, L. All of Statistics: A Concise Course in Statistical Inference; Springer Texts in Statistics; Springer: New York, NY, USA, 2004. [Google Scholar]
  51. Freedman, D.; Marinho, R.; Purves, R. Statistics; Norton & Company: New York, NY, USA, 2007. [Google Scholar]
  52. Efron, B.; Tibshirani, R. Improvements on Cross-Validation: The 632+ Bootstrap Method. J. Am. Stat. Assoc. 1997, 92, 548. [Google Scholar] [CrossRef]
  53. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2020. [Google Scholar]
  54. Onwubolu, G.C.; Babu, B.V. New Optimization Techniques in Engineering; Studies in Fuzziness and Soft Computing; Springer: Berlin/Heidelberg, Germany, 2004; Volume 141. [Google Scholar]
  55. Friedman, J.; Hastie, T.; Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Soft. 2010, 33, 1–22. [Google Scholar] [CrossRef]
  56. Tay, J.K.; Narasimhan, B.; Hastie, T. Elastic Net Regularization Paths for All Generalized Linear Models. J. Stat. Soft. 2023, 106, 1–31. [Google Scholar] [CrossRef] [PubMed]
  57. Yin, C.-Y. Prediction of Higher Heating Values of Biomass from Proximate and Ultimate Analyses. Fuel 2011, 90, 1128–1132. [Google Scholar] [CrossRef]
  58. Fan, L.; Meng, X.; Zhao, J.; Zhou, Y.; Chu, R.; Yu, S.; Li, W.; Wu, G.; Jiang, X.; Miao, Z. Reaction Site Evolution during Low-Temperature Oxidation of Low-Rank Coal. Fuel 2022, 327, 125195. [Google Scholar] [CrossRef]
  59. Paredes-Sánchez, J.P.; Gutiérrez-Trashorras, A.J.; Xiberta-Bernat, J. Energy Potential of Residue from Wood Transformation Industry in the Central Metropolitan Area of the Principality of Asturias (Northwest Spain). Waste Manag. Res. 2014, 32, 241–244. [Google Scholar] [CrossRef]
  60. Paredes-Sánchez, J.P.; Gutiérrez-Trashorras, A.J.; Xiberta-Bernat, J. Wood Residue to Energy from Forests in the Central Metropolitan Area of Asturias (NW Spain). Urban. For. Urban Green. 2015, 14, 195–199. [Google Scholar] [CrossRef]
Figure 1. Principal varieties of coal and the transformation procedure.
Figure 2. Process diagram that illustrates the experimental approach.
Figure 3. A single ε-insensitive tube in a regression scenario.
Figure 4. DE/SVM optimization workflow with RBF kernel.
Figure 5. First-order effects of coal components on HHV predicted by DE/SVM model with RBF kernel.
Figure 6. The DE/SVM technique with RBF kernel for the coal HHV represented graphically by second-order terms of the four most important independent variables.
Figure 7. Relative importance ranking of coal components for HHV prediction based on linear kernel SVR normalized weights.
Figure 8. Observed vs. predicted HHV for test data using: (a) DE/SVM model with quadratic kernel ($R^2 = 0.2471$); (b) empirical correlation E6 ($R^2 = 0.4887$); (c) DE/SVM model with cubic kernel ($R^2 = 0.6252$); (d) RR ($R^2 = 0.7903$); (e) DE/SVM model with sigmoid kernel ($R^2 = 0.8042$); (f) DE/SVM model with linear kernel ($R^2 = 0.8162$); (g) ENR ($R^2 = 0.8310$); (h) LR ($R^2 = 0.8362$); (i) DE/SVM model with RBF kernel ($R^2 = 0.9575$).
Table 1. Mean and standard deviation of coal properties (inputs) and HHV (output).

Input Variable                  Symbol   Mean    Standard Deviation
Carbon content (wt%)            C        78.85   8.11
Hydrogen content (wt%)          H        5.01    0.95
Oxygen content (wt%)            O        13.13   7.95
Nitrogen content (wt%)          N        1.30    0.43
Sulfur content (wt%)            S        1.72    1.88

Output Variable
Higher heating value (MJ/kg)    HHV      30.84   4.03

Note: wt% means weight percentage.
Table 2. Hyperparameter search ranges for DE/SVM optimization.

SVM Hyperparameter   Lower Limit   Upper Limit
$C$                  $10^{-2}$     $10^{2}$
$\varepsilon$        $10^{-6}$     $10^{1}$
$\sigma$             $10^{-4}$     $10^{1}$
$a$                  $10^{-4}$     $10^{1}$
Table 3. Optimal DE/SVM hyperparameters by kernel type.

Kernel      Values of Optimal Hyperparameters
Linear      $C = 1.275 \times 10^{0}$, $\varepsilon = 1.033 \times 10^{-1}$
Quadratic   $C = 8.787 \times 10^{1}$, $\varepsilon = 3.663 \times 10^{-2}$, $a = 5.035 \times 10^{0}$, $p = 2$
Cubic       $C = 2.854 \times 10^{1}$, $\varepsilon = 5.368 \times 10^{-3}$, $a = 1.890 \times 10^{0}$, $p = 3$
RBF         $C = 8.182 \times 10^{1}$, $\varepsilon = 5.541 \times 10^{-5}$, $\sigma = 1.0 \times 10^{0}$
Sigmoid     $C = 9.415 \times 10^{1}$, $\varepsilon = 1.212 \times 10^{-1}$, $\sigma = 9.278 \times 10^{-3}$, $a = 1.552 \times 10^{-2}$
Table 4. Optimal values of the λ parameter for the RR, LR, and ENR models.

Model   Optimal λ
LR      0.001400628
RR      0.1136093
ENR     0.001759271
Table 5. The empirical correlations of HHV obtained from the components of the ultimate analysis [9,10,11,12,13,14].

Authors                                         Model Equation
Channiwala and Parikh [9]                       HHV (MJ/kg) = (1.0632 + 1.486•10−3)•[(C/3) + H•(O − S)/8] (E1)
Channiwala and Parikh [9]                       HHV (MJ/kg) = 0.3403•C + 1.2432•H − 0.0628•N − 0.0984•O + 0.1909•S (E2)
Channiwala and Parikh [9]                       HHV (MJ/kg) = (0.0152•H + 0.9875)•[(C/3) + H − O − (S/8)] (E3)
Channiwala and Parikh [9]                       HHV (MJ/kg) = 0.3391•C + 1.4357•H + 0.0931•S − 0.1237•O (E4)
Mason and Gandhi [10]; Selvig and Wilson [11]   HHV (MJ/kg) = 0.336•C + 1.418•H − (0.153 − 0.00072•O)•O + 0.0941•S (E5)
Given et al. [12]                               HHV (MJ/kg) = 0.336•C + 1.418•H − 0.145•O + 0.0941•S (E6)
Chelgani [13]                                   HHV (MJ/kg) = −0.110 + 0.385•C (E7)
Matin and Chelgani [14]                         HHV (MJ/kg) = 22124.542 + 0.431•C + 0.283•S + 0.367•H + 0.645•N (E8)
Table 6. Performance metrics ($r$ and $R^2$) for the DE/SVM approach with multiple kernels, regularized regressions, and empirical correlations for the test data.

Model           R2       r
RBF-SVM         0.9575   0.9861
Linear-SVM      0.8162   0.9203
Quadratic-SVM   0.2471   0.4970
Cubic-SVM       0.6252   0.8010
Sigmoid-SVM     0.8042   0.9212
ENR             0.8310   0.9223
LR              0.8363   0.9243
RR              0.7903   0.9075
E1              0.3276   0.5724
E2              0.4305   0.6561
E3              0.1339   0.3659
E4              0.4619   0.6796
E5              0.4668   0.6832
E6              0.4887   0.6991
E7              0.2245   0.4739
E8              0.2471   0.4971
Table 7. Relative importance of coal components in predicting HHV, derived from linear-kernel SVR weights.

Variable   Absolute Weight
C          0.7584
O          0.3555
H          0.3249
S          0.2954
N          0.1774
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
