Artificial neural network as a tool for estimation of the higher heating value of miscanthus based on ultimate analysis

Miscanthus is a perennial energy crop that produces high yields and has the potential to be converted into energy. Ultimate analysis determines the elemental composition of the biomass, while the energy value is expressed as the higher heating value (HHV), which is the most important parameter in determining the quality of the fuel. In this study, an artificial neural network (ANN) model based on the principle of supervised learning was developed to predict the HHV of miscanthus biomass. The developed ANN model was compared with predictive correlation models suggested in the literature, and the accuracy of the developed model was determined by the coefficient of determination (r²). The paper presents data from 192 miscanthus biomass samples based on ultimate analysis and HHV. The developed model showed good properties and the possibility of prediction with high accuracy (r² = 0.975). The paper demonstrates that ANN models can be used in practice to determine the fuel properties of biomass energy crops, and that they predict HHV with greater accuracy than the correlation models offered.


Introduction
Recently, energy crops have been increasingly used as raw materials for energy production. Energy crops can be cultivated on neglected (marginal) agricultural land that is not used for growing food crops. The production of thermal energy from biomass is highly efficient and sustainable. The main advantage of using biofuel from biomass is the reduction of greenhouse gas emissions due to its carbon dioxide neutrality. Research on energy crops for biomass production shows the possibility of environmental protection and economic production efficiency and provides a sustainable way of energy production [1]. By using biomass as an energy source, a significant reduction in greenhouse gas emissions can be achieved. For this reason, biomass is considered a good substitute for fossil fuels and has been increasingly studied recently [2]. According to the European Commission [3], biomass is one of the most important renewable energy sources in the EU and can provide a reliable energy supply. Miscanthus is a perennial energy crop used to produce biomass; it gives high yields per unit area, has low agrotechnical requirements, and can be grown on marginal soils. The quality of biomass-derived fuels is influenced by the physical and chemical properties of the biomass. The contents of carbon, hydrogen, nitrogen, sulphur and oxygen determined by ultimate analysis are important chemical parameters that affect the quality of the fuel [4].
Author [5] states that ultimate analysis is important in determining fuel properties. The heating value indicates the heat energy generated during combustion. HHV is an important energy property of fuels that defines the energy efficiency of feedstock use, and it is influenced by the chemical composition of the raw material. HHV is an important aspect in evaluating the energy properties of biomass [6]. Biomass is composed of various elements, but carbon, hydrogen, and oxygen make up the majority (97-99%) of the biomass content [6]. The combustible properties of energy crops are determined by prescribed laboratory methods that provide high precision of the final results. However, these empirical methods are time-consuming and costly, so mathematical models have recently been developed to facilitate the prediction process. Given this, ANN can be used as a mathematical tool for predicting the energy properties of biomass [7]. As non-linear models, ANN can calculate the HHV of miscanthus biomass from ultimate analysis with high precision and are recognised as a potential method for predicting biomass heating value while reducing the time and cost of the process [8].
ANN belong to the field of artificial intelligence and have recently been increasingly used as a mathematical tool that enables predictions with great precision. ANN have several advantages over correlation-based models. They can handle a large amount of aggregated data and can detect nonlinear relationships between dependent and independent variables as well as possible interactions between variables [9]. The application of ANN as a model for biomass research is still at an early stage, but interest in its use is growing [10]. Author [11] conducted research in which an ANN model was developed as an artificial intelligence model for predicting the higher heating value of biomass. The research shows the practical use of the ANN model as a method for predicting the energy values of biomass. The authors of [12] used ultimate analysis data of different types of waste and developed an ANN model to predict the HHV. The model was used to predict energy properties in order to evaluate the possibility of converting waste into useful energy. The research showed that such algorithms can be successfully used in determining these properties. In a study conducted by the authors of [13], an ANN model was developed to predict the gasification performance of different types of biomass. The developed model successfully simulated the gasification process with an acceptable margin of error. The model also proved successful in predicting the calorific value of different biomass samples.
The aim of this work was to develop an ANN model for predicting the HHV of miscanthus biomass based on ultimate analysis. In addition, already developed predictive correlation models for HHV were collected from the literature and used for the calculations. The input data used for the ANN and the predictive correlations were based on ultimate analysis and included the percentages of nitrogen (N), carbon (C), sulphur (S), hydrogen (H) and oxygen (O). The ANN was developed using the principle of supervised learning, and the predicted HHV values were compared with the experimentally obtained HHV values. Yoon's interpretation method was used to determine the relative importance of the input parameters in the ANN model calculations. The author [8] states that it is of great importance to determine the factors of relevance (influence) of input variables on the target result. The author [14] states that the relevance factor

Crop establishment and data collection
According to [15], the miscanthus plantation was established in 2011 at the Grassland Center (Medvednica). The crop was harvested in March 2020, at the beginning of the next growing season. The miscanthus biomass samples were tested in the laboratory of the Faculty of Agriculture in Zagreb. The samples were dried in a laboratory dryer and then ground in a laboratory mill. Each sample was analyzed three times to ensure accurate analysis. The percentages of C, H, N, and S were determined simultaneously by the dry combustion method in a CHNS analyzer. The calorific value was determined using an oxygen bomb calorimeter and is given in MJ/kg of dry mass. Data from ultimate analysis and HHV data for miscanthus were collected from the literature and are presented in supplementary Table 1.

Statistical analysis
Statistical processing was performed using the software package TIBCO STATISTICA 13.3.0 (StatSoft TIBCO Software Inc.). The analyzed data are presented as means with standard deviation. Analysis of variance (ANOVA) with Tukey's HSD post hoc test to compare sample means was used to examine variation in observed parameters.
To show the performance of the developed ANN model and the predictive correlations for calculating HHV from ultimate analysis inputs (N, C, S, H, and O), the following statistical parameters were calculated: reduced chi-square (χ²), root mean square error (RMSE), coefficient of determination (r²), mean bias error (MBE) and mean percentage error (MPE). The RMSE shows the efficiency of the model by comparing the predicted values with the measured values. The MBE is used as an indicator of the average deviation (bias) of the predicted values from the measured values [17]. The listed parameters are given by Eqs. (1-3) [18].
Yoon's method of global sensitivity (Eq. 4) was used to calculate the direct influence of the input parameters on the output variables, corresponding to the weighting coefficients within the ANN model [19], where w denotes the weighting factor in the ANN model, i the input variable, j the output variable, k the hidden neuron, n the number of hidden neurons, and m the number of inputs.
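The relevance measure described above can be illustrated with a short sketch. It assumes the commonly cited form of Yoon's measure, RI_ij = Σ_k (w_ik · w_kj) / Σ_i |Σ_k (w_ik · w_kj)|, which may differ in detail from Eq. 4 of this paper. The function name and the weights are hypothetical and serve only as an illustration.

```python
def yoon_relative_importance(w_input_hidden, w_hidden_output):
    """Signed relative importance of each input for a single output.

    w_input_hidden[i][k]: weight from input i to hidden neuron k
    w_hidden_output[k]:   weight from hidden neuron k to the output
    The returned values are fractions whose absolute values sum to 1.
    """
    # Numerator of the assumed formula: sum over hidden neurons of w_ik * w_kj.
    products = [
        sum(w_ik * w_kj for w_ik, w_kj in zip(row, w_hidden_output))
        for row in w_input_hidden
    ]
    # Denominator: sum over inputs of the absolute value of each numerator.
    total = sum(abs(p) for p in products)
    if total == 0:
        raise ValueError("all weight products are zero")
    return [p / total for p in products]


# Hypothetical weights for 3 inputs and 2 hidden neurons (not from the paper).
w_ih = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.3]]
w_ho = [1.0, -0.5]
ri = yoon_relative_importance(w_ih, w_ho)
```

With this normalisation each value lies between -1 and 1; multiplying by 100 gives the percentage form that is also used in the literature.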

ANN modelling
ANN are among the most researched areas of neurocomputing. A multilayer perceptron (MLP) is a neural network with hidden layers (Fig. 1). An ANN can adapt its internal structure depending on the input data and the final goal of the function. The basic characteristics of ANN are the ability to learn independently, the ability to adapt the system to the available information and data processing, and the ability to perform complex mathematical operations at high speed. Neural networks are categorized by their architecture, topology, and learning mode [20]. Neural networks take inputs, process them, and convert them into outputs. This process is called the learning process of the network. The learning process of ANN can be supervised or unsupervised. In supervised learning mode, the model has access to output data for its computations, while in unsupervised mode there is no output data [21].
ANN models can provide a link between input and output data without using a complicated computational method. The MLP is recognized as the most effective type of ANN [22,9]. ANN is a mathematical structure inspired by the learning process in the human brain and is a promising modeling technique for datasets with nonlinear relationships. Multilayer feedforward networks (MLP-ANN) consist of interconnected units (neurons) arranged in layers (input, hidden, and output layers). The number of neurons and hidden layers varies and is determined by trial and error so that the model error is minimal [11].
Different transfer functions and random initial values for the weighting coefficients and biases were used. The network was trained during the ANN learning cycle to determine the number of neurons and to adjust the weight coefficients in each neuron [23]. The weight coefficients and biases of the hidden and output layers of the model are represented by the matrices and vectors W1 and B1, and W2 and B2, respectively [15]. The neural network model can be represented in matrix notation by Eq. (5), which calculates the output of the neural network [24], where Y represents the output value, f1 and f2 represent the transfer functions in the hidden and output layers, and X represents the matrix of the input layer [25].
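The matrix form described above can be sketched as a forward pass through a network with one hidden layer. This is a minimal illustration, assuming NumPy, a tanh transfer function in the hidden layer and an identity transfer function at the output. The weights and inputs are random placeholders, not the fitted values of the paper, and `mlp_forward` is a hypothetical name.

```python
import numpy as np


def mlp_forward(X, W1, B1, W2, B2, f_hidden=np.tanh, f_out=lambda z: z):
    """Output of a single-hidden-layer MLP; X holds one sample per column."""
    hidden = f_hidden(W1 @ X + B1)   # shape: (n_hidden, n_samples)
    return f_out(W2 @ hidden + B2)   # shape: (n_outputs, n_samples)


rng = np.random.default_rng(0)
n_inputs, n_hidden, n_outputs = 5, 20, 1   # layout of an MLP 5-20-1 network

# Random placeholder parameters (assumption: the real values come from training).
W1 = rng.normal(size=(n_hidden, n_inputs))
B1 = rng.normal(size=(n_hidden, 1))
W2 = rng.normal(size=(n_outputs, n_hidden))
B2 = rng.normal(size=(n_outputs, 1))

# Three hypothetical samples; rows ordered as N, C, S, H, O, scaled to 0-1.
X = rng.uniform(size=(n_inputs, 3))
Y = mlp_forward(X, W1, B1, W2, B2)
```

Keeping the samples in columns mirrors the matrix notation of the text, so the code reads like the expression W2 · f(W1 · X + B1) + B2.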
The Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm was used for the calculations. The BFGS algorithm is one of the most effective optimization algorithms and can be successfully used for the optimization of multivariate problems [26]. Table 1 presents the models of the proposed equations for the calculation of the HHV of biomass found in the literature [27,6,28]. The models are based on establishing correlations between variables based on ultimate analysis and HHV output values.

Table 1
10    HHV = a + b ⋅ (C)^2    Nhuchhen & Afzal, 2017

Results and Discussion
Table 2 shows the mean values of the variables of the ultimate analysis and HHV of miscanthus, with standard deviation and Tukey's HSD test. The correlation analysis of the parameters of ultimate analysis and HHV was performed in RStudio with the corrplot package.

Figure 2
The diagram of the correlation matrix shows the correlation coefficients between the variables. Positive values of the correlation coefficient are shown in blue, while negative values are shown in red. The intensity of the colour in the circle is proportional to the correlation coefficient. In Fig. 2 it can be observed that the elements O, S, and N are positively correlated with the value of HHV, while C and H are negatively correlated. Variable S has the highest positive correlation coefficient, i.e. a significant influence on HHV, while variables N and O are also positively correlated with HHV but have less influence. The variable H has a negative correlation with HHV. Based on Fig. 2, HHV is most strongly correlated with the concentrations of H, S, and N (blue indicates a positive correlation).
After determining the mean of all parameters, the correlations of the variables and their contributions were determined. The influence of the variables (N, C, S, H, O, and HHV) and the samples are shown together graphically.
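A correlation matrix of the kind shown in Fig. 2 can be computed in a few lines. The paper used R with corrplot; the sketch below uses Python with NumPy instead, purely for illustration. The data are random placeholders, so the resulting coefficients say nothing about miscanthus.

```python
import numpy as np

names = ["N", "C", "S", "H", "O", "HHV"]

# Placeholder data: 30 hypothetical samples, one column per variable.
rng = np.random.default_rng(1)
data = rng.normal(size=(30, len(names)))

# Pearson correlation matrix between the columns.
corr = np.corrcoef(data, rowvar=False)

# Correlation of each element with HHV (last row, excluding HHV itself).
hhv_corr = dict(zip(names[:-1], corr[-1, :-1]))

# Variable with the largest absolute correlation to HHV, regardless of sign.
strongest = max(hhv_corr, key=lambda name: abs(hhv_corr[name]))
```

Ranking by absolute value separates the strength of a relationship from its direction, which is the distinction the colour and intensity of the circles encode in the plot.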

Figure 3
Principal Component Analysis (PCA) searches for the orthogonal directions of greatest dispersion in the data, with the aim of finding patterns in the distribution of individual data points with respect to the original multidimensional space [30]. The analysis is also used to build predictive models, and it makes the impact of individual variables on a given value easy to interpret. The PCA results are shown in Fig. 3.
Table 3 shows the calculated "goodness of fit" statistics for the proposed models for calculating HHV based on ultimate analysis.

Table 3

The presented models did not show sufficient accuracy and precision to be used as a reliable method for predicting the HHV of miscanthus biomass. The coefficient of determination (r²) was used as the most important statistical parameter for evaluating the suitability of the mathematical models; among the 10 models it was lowest for models 1 and 8 (r² = 0.00) and highest for model 9 (r² = 0.47).
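The goodness-of-fit statistics named in the Statistical analysis section can be sketched as follows. The formulas are the forms commonly used for these measures and are an assumption here, since the paper's own equations are not reproduced in the text. In particular, r² is computed as 1 - SS_res/SS_tot, and `n_params` (the number of fitted model constants used in the reduced chi-square) is a hypothetical parameter name. The sample values are invented.

```python
import math


def goodness_of_fit(measured, predicted, n_params=0):
    """Return chi-square, RMSE, MBE, MPE (%) and r2 for paired values."""
    n = len(measured)
    if n != len(predicted) or n <= n_params:
        raise ValueError("need equally long series with more points than parameters")
    residuals = [p - m for m, p in zip(measured, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_measured = sum(measured) / n
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return {
        "chi2": ss_res / (n - n_params),   # reduced chi-square
        "rmse": math.sqrt(ss_res / n),     # root mean square error
        "mbe": sum(residuals) / n,         # mean bias error (signed)
        "mpe": 100.0 / n * sum(abs(r) / abs(m) for r, m in zip(residuals, measured)),
        "r2": 1.0 - ss_res / ss_tot,       # coefficient of determination
    }


# Invented HHV values in MJ/kg, for illustration only.
measured = [17.0, 18.0, 19.0, 20.0]
predicted = [17.5, 17.5, 19.5, 19.5]
stats = goodness_of_fit(measured, predicted, n_params=2)
```

In this example the residuals alternate in sign, so the MBE is zero while the RMSE is 0.5 MJ/kg. This is why the bias and the scatter are reported as separate statistics.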

ANN model
In developing the ANN model, the input variables (N, C, S, H, and O) and the output value (HHV) had to be determined. The weights and biases were initialized with random values and adjusted during training until the model was accurate enough to predict the output.
The ANN model developed for the prediction of HHV showed a good ability to generalize and predict. The model performed best with 20 neurons in the hidden layer, where a high r² value (0.975 overall) and a low overall sum of squares (SOS) were achieved during the training cycle (Table 4). Table 4 shows the weight coefficients and biases of the developed MLP-ANN model. The best results were obtained with 20 hidden neurons, where the experimental HHV values best match the HHV values calculated with the ANN model.

Table 4

Table 5 shows the training performance of the ANN model, expressed by r² (0.975) and a training error of 0.004. It also shows the results of the statistical test indicating the deviations between the observed and expected values. The values shown indicate the ability of the algorithm to predict according to the given model data.

Table 5 (activation function listed: Tanh)

Therefore, the ANN structural model MLP 5-20-1 proved to be sufficiently accurate to predict HHV based on the N, C, S, H, and O contents. The training performance value (0.975) shows that the model is able to predict values almost equal to the measured values.
The scatterplot is one of the most common visualization techniques and displays the behaviour of the entered data [31,32]. Figure 4 shows the predicted HHV versus the target HHV, which largely overlap.
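A training cycle of this kind can be sketched with SciPy's general-purpose BFGS optimizer. This is not the software used in the paper. It is a minimal illustration assuming NumPy and SciPy, a tanh hidden layer and a sum-of-squares loss. The hidden layer is reduced to 4 neurons so that the sketch runs quickly, and the data are synthetic rather than miscanthus measurements. All function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

N_IN, N_HID = 5, 4   # the paper reports 20 hidden neurons; 4 keeps this sketch fast


def unpack(theta):
    """Split the flat parameter vector into W1, B1, W2, B2."""
    i = N_HID * N_IN
    W1 = theta[:i].reshape(N_HID, N_IN)
    B1 = theta[i:i + N_HID].reshape(N_HID, 1)
    W2 = theta[i + N_HID:i + 2 * N_HID].reshape(1, N_HID)
    B2 = theta[i + 2 * N_HID]
    return W1, B1, W2, B2


def predict(theta, X):
    """Network output for inputs X (one sample per column)."""
    W1, B1, W2, B2 = unpack(theta)
    return (W2 @ np.tanh(W1 @ X + B1) + B2).ravel()


def sum_of_squares(theta, X, y):
    """Sum of squared differences between predictions and targets."""
    return float(np.sum((predict(theta, X) - y) ** 2))


rng = np.random.default_rng(42)
X = rng.uniform(size=(N_IN, 40))              # synthetic scaled inputs
y = 0.3 * X[1] + 0.1 * X[3] - 0.1 * X[4]      # synthetic target, not real HHV data
theta0 = rng.normal(scale=0.5, size=N_HID * N_IN + 2 * N_HID + 1)

sos_before = sum_of_squares(theta0, X, y)
# Gradients are approximated numerically because no jac is supplied.
result = minimize(sum_of_squares, theta0, args=(X, y), method="BFGS",
                  options={"maxiter": 200})
sos_after = result.fun
```

In practice the data would be split into training, test and validation subsets, and the hidden layer size would be chosen by comparing such runs, as described in the text.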

Figure 4
The calculated parameters of the "goodness of fit" statistical test are shown in Table 6.

Table 6

Figure 5

The relevance factor ranges between −1 and 1. The increase in HHV is mainly influenced by the increase in the parameters N and S. The influence of the input variables was studied according to Yoon's method.
The predictive correlation models offered in the literature are used as nonlinear models to predict the HHV of miscanthus biomass. As shown in the paper, these predictive models do not provide sufficient accuracy in determining the HHV of miscanthus from the input parameters. Using ANN as a nonlinear model to determine the HHV provides a more convenient way of prediction and yields more accurate weighting coefficients and biases, which are the basis for establishing correlations between input parameters and output data.

Conclusion
The use of ANN models to predict the energy properties of biomass has been increasingly explored recently. The calculations performed with the proposed non-linear mathematical models from the literature are not suitable enough to predict the HHV of miscanthus biomass (r² ≤ 0.47). Using the available data from the ultimate analysis of miscanthus, the developed neural network model showed high accuracy in predicting the higher heating value (overall r² = 0.975). The factors N, C, S, H, and O influence the value of HHV. In the developed model, the increase in HHV is mainly influenced by the increase in the values of the parameters N and S. Although these models are not yet widely used as mathematical models for prediction (especially for variables that have nonlinear relationships), they offer the possibility of obtaining the desired result with less time, lower cost, and satisfactory accuracy, and can replace existing empirical methods.