Prediction of the Calorific Value and Moisture Content of Caragana korshinskii Fuel Using Hyperspectral Imaging Technology and Various Stoichiometric Methods

De, Xuehong; Li, Haoming; Zhang, Jianchao; Li, Nanding; Wan, Huimeng; Ma, Yanhua

doi:10.3390/agriculture15141557


by Xuehong De ^1,2,*,†, Haoming Li ^1,†, Jianchao Zhang ^1, Nanding Li ^1, Huimeng Wan ^1 and Yanhua Ma ^1,2

^1 Faculty of Mechanical and Electrical Engineering, Inner Mongolia Agricultural University, Hohhot 010020, China

^2 Inner Mongolia Engineering Research Center of Intelligent Equipment for the Entire Process of Forage and Feed Production, Hohhot 010018, China

^* Author to whom correspondence should be addressed.

^† These authors contributed equally to this work.

Agriculture 2025, 15(14), 1557; https://doi.org/10.3390/agriculture15141557

Submission received: 24 May 2025 / Revised: 25 June 2025 / Accepted: 9 July 2025 / Published: 21 July 2025

(This article belongs to the Section Agricultural Technology)


Abstract

Calorific value and moisture content are key indices for evaluating the quality and combustion characteristics of Caragana pellet fuel. Calorific value measures the energy released by energy plants during combustion and therefore determines energy utilization efficiency. At present, however, the calorific value of solid fuel is still determined in the laboratory by oxygen bomb calorimetry, which seriously hinders large-scale, rapid detection of fuel particles on industrial production lines. In response to this technical challenge, this study proposes using hyperspectral imaging technology combined with various chemometric methods to establish quantitative models for determining moisture content and calorific value in Caragana korshinskii fuel. A hyperspectral imaging system was used to capture spectral data in the 935–1720 nm range for 152 samples from multiple regions of the Inner Mongolia Autonomous Region. For moisture content and calorific value, three quantitative detection models were established: partial least squares regression (PLSR), random forest regression (RFR), and extreme learning machine (ELM). Monte Carlo cross-validation (MCCV) was used to remove outliers from the raw spectral data to improve model accuracy. Four preprocessing methods were applied to the spectral data; standard normal variate (SNV) preprocessing performed best for the moisture content models, and Savitzky–Golay (SG) preprocessing performed best for the calorific value models. To improve prediction accuracy and reduce redundant wavelength data, four feature extraction methods were chosen, namely competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), the genetic algorithm (GA), and iteratively retaining informative variables (IRIV), and combined with the three models to build characteristic-wavelength models for the moisture content and calorific value of Caragana korshinskii fuel. Finally, a comprehensive comparison of all methods showed that the SNV-IRIV-PLSR combination was the best for moisture content prediction, with a prediction set determination coefficient ($R_{P}^{2}$), root mean square error of prediction (RMSEP), and relative percentage deviation (RPD) of 0.9693, 0.2358, and 5.6792, respectively. A moisture content distribution map of Caragana fuel particles was also produced using this model. The SG-CARS-RFR combination was the best for calorific value prediction, with $R_{P}^{2}$, RMSEP, and RPD of 0.8037, 0.3219, and 2.2864, respectively. This study provides an innovative technical solution for assessing the value and quality of Caragana fuel particles.

Keywords:

hyperspectral imaging technology; characteristic variable selection; biomass fuel; Caragana korshinskii

1. Introduction

Caragana korshinskii is a hardy shrub with strong adaptability and good resistance to drought, cold, and high temperature. It is planted over large areas of northern China, including Inner Mongolia and Shanxi. Its branches are solid, tough, and oily, and the plant offers good ecological and economic benefits. Caragana korshinskii has unique growth characteristics: it must be pruned during its growth period to rejuvenate it and prevent degradation. Within 3 years after harvesting, the green shoots and stalks can be processed into feed, but after more than 3 years Caragana korshinskii becomes lignified and loses its nutrients, making it more difficult to use as feed. To extend the economic value of Caragana korshinskii, processing it into solid fuel particles that are convenient to store and transport is the most suitable technical route [1].

Caragana fuel particles have a large production and sales market in eastern Inner Mongolia. Before processed Caragana fuel particles are put on the market, their quality should be tested. Moisture content and calorific value are important indicators of biomass fuel quality [2], and they are also indispensable reference indicators for pricing biomass particles in trade. Calorific value measures the amount of energy released during fuel combustion; fuels with a high calorific value release more heat when burned [3]. Moisture content affects the mass fraction of combustible substances in biomass fuel: a high moisture content reduces the mass fraction of combustible substances per unit mass of fuel and thus reduces its calorific value. At present, both indicators are still measured by traditional chemical analysis, which must be carried out in the laboratory by professional personnel and involves a heavy workload and high professional requirements. To solve this technical problem, and based on the characteristics of spectral imaging technology [4], this paper proposes a new detection method for measuring the calorific value and moisture content of Caragana pellet fuel, which has important practical significance for improving its fine utilization.

Spectral analysis technology uses the spectral characteristics of substances to study their structure and chemical composition [5]. In recent years, hyperspectral detection and analysis technology has shown great potential in fuel quality detection. In essence, spectral detection extracts information on hydrogen-containing groups (C-H, N-H, O-H, etc.) using appropriate chemometric methods; these groups are related to physical and chemical properties and microstructure parameters [6]. Shen et al. [7] used hyperspectral technology to build partial least squares (PLS) regression models for predicting the moisture, ash, volatile matter, fixed carbon, and calorific value of Chinese fir, pine, and cotton stalks. The results showed that the first-derivative spectral PLS models for ash, moisture, and volatile matter were accurate enough for practical use. Wang et al. [8] established partial least squares regression (PLSR) and modified partial least squares regression (MPLS) models of calorific value and moisture for waste agricultural and forestry biomass fuels of different fineness. The determination coefficient for high calorific value was 0.9367. Zhang et al. [9] established prediction models for the gross calorific value (GCV) of woody and herbaceous samples. The determination coefficient $R^{2}$ of the woody subset model was between 0.88 and 0.90, and the relative percentage deviation (RPD) was between 2.90 and 3.10. The $R^{2}$ of the herbaceous subset model was between 0.91 and 0.95, and the RPD was between 3.31 and 4.53. Han et al. [10] used spectral analysis technology to build PLS, support vector regression (SVR), and backpropagation neural network (BPNN) prediction models for the moisture content of biomass fuel. The BPNN model gave the best prediction of moisture content, and combining it with a supervised deep autoencoder reduced the root mean square error by 15.48%. Li et al. [11] established prediction models for the calorific value and ash content of Salix psammophila. Features were extracted by wavelet transform (LWT), competitive adaptive reweighted sampling (CARS), and the successive projections algorithm (SPA); partial least squares (PLS) and convolutional neural network (CNN) models were built; and the models were optimized with the whale optimization algorithm (WOA). CARS-WOA-CNN gave the best predictions of calorific value (CV) and ash content (AC), with $R^{2}$ values of 0.858 and 0.751, respectively. Fagan et al. [12] built PLS models to predict the moisture, calorific value, ash, and carbon content of two dedicated bioenergy crops. The root mean square errors of cross-validation for moisture, calorific value, ash, and carbon content were 0.90% ($R^{2}$ = 0.99), 0.13 MJ/kg ($R^{2}$ = 0.99), 0.42% ($R^{2}$ = 0.58), and 0.57% ($R^{2}$ = 0.88), respectively, so the moisture and calorific value models were highly accurate. Mancini et al. [13] used a PLS algorithm to build prediction models for the gross calorific value and ash content of wood chips. The root mean square error of cross-validation (RMSECV) was 234 J/g for gross calorific value and 0.44% for ash, indicating good predictive ability. Noushabadi et al. [14] predicted the high calorific value of biomass fuel particles with machine learning, using an ant colony adaptive neuro-fuzzy inference system and a genetic algorithm radial basis function network. The results show that the genetic algorithm radial basis function approach has high application potential for predicting high calorific value. In conclusion, spectral analysis technology has a wide range of applications in the detection of particulate fuel.

The purpose of this study is to develop a non-contact, rapid detection method for the moisture content and calorific value of Caragana korshinskii fuel, so that the quality of Caragana korshinskii particulate fuel can be monitored dynamically in real time during production. The specific research objectives are as follows: (1) A total of 152 Caragana korshinskii fuel particle samples were collected and prepared from four primary production regions within the Inner Mongolia Autonomous Region, China, and spectral data were acquired for these samples. The spectral data of the powder and rod samples were extracted using ENVI 5.6 software, the cv2 library in Python 3.7, and the GDAL module of the osgeo library. (2) The moisture content and calorific value of Caragana fuel were determined using multiple chemometric methods adhering to national standards. (3) Four preprocessing methods and four feature extraction algorithms were used to process the spectral data and select wavelengths for the moisture content and calorific value of the fuel. Three quantitative detection models (PLSR, RFR, ELM) were then established, and the most stable modeling combination was selected by comparison. (4) The moisture content distribution of Caragana korshinskii pellet fuel was visualized by generating a pseudo-color map.

2. Materials and Methods

2.1. Preparation of Samples

Caragana fuel particles from the following sources were selected for this experiment:

(1) Caragana korshinskii was collected from Helingeer County, Hohhot City, Inner Mongolia Autonomous Region, in September 2024. The average diameter of the Caragana korshinskii branches was 16 mm. The material was crushed and sieved at the Inner Mongolia Agricultural University Machinery Plant using existing crushing and molding equipment; the moisture content was then adjusted, and the material was formed into 8 mm Caragana korshinskii fuel pellets. A total of 41 samples were prepared, each by randomly weighing out 150 g. The samples were sealed in 14 cm × 20 cm bags and stored in a refrigerator at 6 °C. The preparation process of Caragana pellet fuel is shown in Figure 1.

(2) In October 2024, standardized 8 mm diameter fuel pellets manufactured from Caragana were acquired from Xingwei Bi-technology Co., Ltd., based in Tongliao City, Inner Mongolia Autonomous Region, China. Caragana (Ningtiao) fuel pellets, 6 mm in size, were purchased from Rui’er Biomass Energy Development Co., Ltd., in Xing’an League, Inner Mongolia Autonomous Region, China, and 4 mm Caragana (Ningtiao) fuel pellets were acquired from Longshunzhuang Agriculture and Animal Husbandry Co., Ltd., in Ulanqab City, Inner Mongolia Autonomous Region, China. The particles from each region were divided into samples of 150 g per bag. The specific sample information is shown in Table 1.

First, 10 g of sample was randomly selected from each region, and the initial moisture content was determined according to the national standard GB/T 28731-2012 [15] using an electric constant-temperature air-drying oven (202-00 T, Shanghai Lichen Bangxi Instrument Technology Co., Ltd., Shanghai, China). The oven was set to 105 °C and the sample was dried for 4 h ± 0.1 h. The weighing bottle containing the dried sample was then weighed to the nearest 1 mg, returned to the oven for 1 h, and weighed again. This was repeated until the change from the previous weighing was within 0.1%, indicating that drying was complete. After the initial moisture content was determined, the Caragana pellet fuel was sprayed evenly with a watering can, then packaged and placed in the refrigerator for 24 h to obtain different moisture content gradients. The moisture content distribution of the samples is shown in Figure 2a.

2.2. Spectral Collection and Composition Determination

The spectral images of the material were collected using a Specim FX17 hyperspectral imager. Before spectral acquisition, the spectrometer was preheated for 30 min, and the position of the moving platform and the scanning speed were adjusted so that the scan fully covered the Petri dish containing the sample. The positioning speed of the platform was set to 40 mm/s, the scanning speed to 16.9 mm/s, the target acquisition range to 71–260 mm, and the spectral resolution to 8 nm. The detection schematic and the sample acquisition diagram of the hyperspectral imaging system are shown in Figure 3.

The moisture content of each sample was determined with an electric constant-temperature blast drying oven (model DHG-9140a; Shanzhi Instrument Equipment Co., Ltd., Shanghai, China), in accordance with GB/T 28731-2012. The formula used for the calculation is

H = \frac{M_{1} - M_{2}}{M_{1}} \quad (1)

where H is the moisture content, M_{1} is the mass before drying, and M_{2} is the mass after drying.
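As a small worked illustration of Equation (1), the sketch below computes H from two weighings. The function name and the example masses are hypothetical and are not taken from the study's data.

```python
def moisture_content(mass_before_g: float, mass_after_g: float) -> float:
    """Moisture content H = (M1 - M2) / M1, returned as a fraction of the initial mass."""
    if mass_before_g <= 0:
        raise ValueError("mass before drying must be positive")
    if not 0 <= mass_after_g <= mass_before_g:
        raise ValueError("mass after drying must lie between 0 and the initial mass")
    return (mass_before_g - mass_after_g) / mass_before_g


# Hypothetical weighings: 10.000 g before drying, 9.150 g after drying.
h = moisture_content(10.000, 9.150)
print(f"H = {h:.4f} ({h * 100:.2f} %)")
```

Multiplying the result by 100 gives the moisture content as a percentage of the mass before drying.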

The fuel particles of Caragana korshinskii were dried in an oven at 45 °C until a constant mass was obtained. After drying, the particles were crushed and passed separately through 20, 30, and 60 mesh sieves. Hyperspectral images of the pulverized samples were then acquired, and the calorific value of each sample was measured with a ZDHW-HN8000 microcomputer automatic calorimeter (Hebi Huaneng Electronic Technology Co., Ltd., Henan, China). Each sample was measured three times, and the average was taken as its calorific value. The distribution range of calorific value is shown in Figure 2b. See Figure 4 for spectrum acquisition and experimental design.

2.3. Data Processing Method

2.3.1. Extraction of Average Spectral Data

The spectral features of the samples were extracted in ENVI 5.6 software by selecting regions of interest (ROIs). Ten ROIs were selected for each sample, and their average spectrum was taken as the representative spectrum of the sample, giving 224 spectral data points per sample.
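The averaging step can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the ROI pixel spectra are random stand-ins, the function name is hypothetical, and weighting each ROI equally (rather than each pixel) is an assumption, since the text does not say which was used.

```python
import numpy as np


def representative_spectrum(roi_pixel_spectra):
    """Average the mean spectrum of each ROI into one spectrum per sample.

    roi_pixel_spectra: list of arrays, each of shape (n_pixels_in_roi, n_bands).
    Each ROI contributes equally, regardless of its pixel count (an assumption).
    """
    roi_means = [np.asarray(roi, dtype=float).mean(axis=0) for roi in roi_pixel_spectra]
    return np.mean(roi_means, axis=0)


# Synthetic stand-in for one sample: 10 ROIs, 224 bands, 20-39 pixels per ROI.
rng = np.random.default_rng(0)
n_bands = 224
rois = [rng.random((rng.integers(20, 40), n_bands)) for _ in range(10)]
spectrum = representative_spectrum(rois)
print(spectrum.shape)
```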

2.3.2. Removal of Abnormal Samples

Because of instrument noise and experimental error, the data may contain abnormal samples that are detrimental to modeling. In this paper, Monte Carlo cross-validation (MCCV) is used to remove them; it can detect outliers in both the spectral and the property arrays. Partial least squares regression (PLSR) models are built on subsets drawn by Monte Carlo sampling (MCS) [16,17]: in each run, 75% of the samples are randomly selected as the training set and the remaining 25% as the test set. This is repeated many times so that every sample is predicted, and the mean (MEAN) and standard deviation (STD) of the prediction residuals of each sample are then calculated.
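The residual statistics behind MCCV can be sketched as below. To keep the example self-contained, ordinary least squares from NumPy stands in for the PLSR model used in the paper; the function name, the synthetic data, and the number of runs are illustrative assumptions.

```python
import numpy as np


def mccv_residual_stats(X, y, n_runs=500, train_fraction=0.75, seed=0):
    """Mean and standard deviation of each sample's prediction residuals over random splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_fraction * n))
    A = np.column_stack([np.ones(n), X])  # design matrix with an intercept column
    sums, sq_sums, counts = np.zeros(n), np.zeros(n), np.zeros(n)
    for _ in range(n_runs):
        perm = rng.permutation(n)
        train, test = perm[:n_train], perm[n_train:]
        # Least-squares fit on the training split (stand-in for PLSR).
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        resid = y[test] - A[test] @ coef
        sums[test] += resid
        sq_sums[test] += resid ** 2
        counts[test] += 1
    counts = np.maximum(counts, 1)  # guard against a sample never drawn for testing
    mean = sums / counts
    std = np.sqrt(np.maximum(sq_sums / counts - mean ** 2, 0.0))
    return mean, std


# Synthetic data with one deliberately corrupted reference value.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))
y = X @ np.array([1.0, -0.5, 2.0, 0.3]) + 0.05 * rng.normal(size=60)
y[7] += 5.0
mean_res, std_res = mccv_residual_stats(X, y)
suspect = int(np.argmax(np.abs(mean_res)))
print("sample with the largest mean residual:", suspect)
```

Plotting `mean_res` against `std_res` gives the kind of mean–std diagram used to pick out samples that lie far from the main cluster.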

2.3.3. Spectral Data Preprocessing

The surfaces of the Caragana korshinskii pellet and powder fuels are uneven, and the sample sizes are not uniform. These factors cause light scattering during imaging, which affects the spectral data. To eliminate the spectral anomalies caused by such external factors and improve modeling accuracy, the collected spectral data are preprocessed. In this paper, four preprocessing methods are selected: multiplicative scatter correction (MSC), standard normal variate (SNV), Savitzky–Golay (SG) smoothing, and min-max normalization (MMN). MSC eliminates baseline shifts caused by scattering due to variations in sample size during spectral acquisition. Its advantage is that it effectively handles scattering effects induced by differences in sample packing density and particle size; a significant drawback is its strong dependence on the choice of reference spectrum. This method is therefore particularly applicable when variations in the physical state of the sample, such as packing density and particle size, have a pronounced influence on spectral scattering. SNV eliminates spectral scattering induced by sample inhomogeneity. Its advantage is that each spectrum is corrected independently using its own data points. However, outliers within a spectrum affect its standard deviation and therefore the effectiveness of the correction. This method is more suitable for processing spectral data of powder and granular samples [18,19]. An SG filter reduces high-frequency noise in spectral data. Its primary advantage is that it preserves the shape and position of spectral absorption peaks while smoothing; however, inaccuracies may arise at the beginning and end of the data because the window is not fully covered. This method is therefore particularly applicable to datasets with significant noise that requires smoothing. MMN rescales the spectral data to a common range, thereby eliminating large intensity variations. Its primary advantage is that scaling mitigates the influence of absolute intensity differences between spectra. However, extreme outliers (maximum or minimum values) directly affect the normalization range and can lead to loss of information. This method is therefore particularly well suited to spectral data acquired under varying acquisition conditions [20,21].
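The four corrections can be written compactly with NumPy. The sketch below is a minimal version under stated assumptions: the mean spectrum is used as the MSC reference, MMN is applied per spectrum, and the SG window length, polynomial order, and edge padding are illustrative choices rather than the settings used in the study.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) by its own statistics."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: the mean spectrum)."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, 1)  # fit row = intercept + slope * ref
        corrected[i] = (row - intercept) / slope
    return corrected


def mmn(spectra):
    """Min-max normalization of each spectrum to the range [0, 1]."""
    lo = spectra.min(axis=1, keepdims=True)
    hi = spectra.max(axis=1, keepdims=True)
    return (spectra - lo) / (hi - lo)


def sg_smooth(x, window=7, polyorder=2):
    """Savitzky-Golay smoothing of a 1-D spectrum; the ends are padded by repeating edge values."""
    half = window // 2
    pos = np.arange(-half, half + 1)
    vander = np.vander(pos, polyorder + 1, increasing=True)
    weights = np.linalg.pinv(vander)[0]  # weights that give the fitted value at the window center
    padded = np.pad(x, half, mode="edge")
    return np.convolve(padded, weights[::-1], mode="valid")


# Synthetic spectra: one base curve with different offsets and gains (scatter-like effects).
base = 1.0 + 0.3 * np.sin(np.linspace(0.0, 6.0, 50))
offsets = np.array([0.0, 0.2, -0.1, 0.05])[:, None]
gains = np.array([1.0, 1.3, 0.8, 1.1])[:, None]
spectra = offsets + gains * base
print(np.ptp(msc(spectra), axis=0).max())  # spread between spectra after MSC
```

The edge padding in `sg_smooth` is one simple way of handling the incomplete windows at the two ends of a spectrum that are mentioned above.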

2.3.4. Characteristic Variable Screening

Feature selection filters irrelevant and redundant variables out of the complex spectral data and keeps the information most closely related to the property of interest, which improves computational efficiency, simplifies the model, and improves accuracy. In this paper, four feature extraction methods are used to select the characteristic wavelengths of Caragana korshinskii biomass fuel.

Competitive adaptive reweighted sampling (CARS) is a characteristic wavelength selection method that combines Monte Carlo sampling with the regression coefficients of a partial least squares model [22]. In each sampling run a PLSR model is built, wavelengths with small absolute regression coefficients (low contribution) are filtered out by an exponential decay function (EDF), and adaptive reweighted sampling (ARS) then selects among the remaining wavelengths according to their weights. After N sampling runs, the subset with the minimum root mean square error of cross-validation (RMSECV) among the N subsets is taken as the set of characteristic wavelengths [23].
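The EDF stage can be illustrated with the decay schedule commonly given for CARS, in which the retained fraction at run i is r_i = a·exp(−k·i), with a and k chosen so that all p wavelengths are kept in the first run and two in the last. The sketch below assumes that form; the function names are hypothetical, and the ARS and PLSR steps are not shown.

```python
import math


def cars_retention_schedule(n_wavelengths, n_runs):
    """Number of wavelengths kept by the exponential decay function in each sampling run."""
    a = (n_wavelengths / 2) ** (1 / (n_runs - 1))
    k = math.log(n_wavelengths / 2) / (n_runs - 1)
    return [round(n_wavelengths * a * math.exp(-k * i)) for i in range(1, n_runs + 1)]


def keep_largest(coefficients, n_keep):
    """Indices of the n_keep coefficients with the largest absolute value, in ascending order."""
    ranked = sorted(range(len(coefficients)), key=lambda j: abs(coefficients[j]), reverse=True)
    return sorted(ranked[:n_keep])


schedule = cars_retention_schedule(n_wavelengths=224, n_runs=50)
print(schedule[:5], schedule[-5:])
print(keep_largest([0.1, -2.0, 0.5, 0.0], n_keep=2))
```

With 224 bands and 50 runs, this schedule retains 27 wavelengths at run 23 and 53 at run 16, which agrees with the counts reported in Section 3.4; that agreement is suggestive but does not confirm the exact implementation used.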

The successive projections algorithm (SPA) is a forward selection algorithm that is widely used in spectral analysis [24]. Starting from one wavelength, it repeatedly projects the remaining wavelengths onto the subspace orthogonal to the wavelengths already selected and adds the wavelength with the largest projection, which is the one least collinear with the current set. The final set of characteristic wavelengths is then determined with a calibration model: candidate sets are compared by their prediction error, and the set that minimizes the overall model error is chosen. The process continues until the number of selected wavelengths reaches a preset value n or until adding new wavelengths no longer significantly improves the prediction performance of the model.
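The projection step of SPA can be sketched in a few lines of NumPy. This shows only one projection chain from a given starting wavelength; the comparison of chains by calibration error is omitted, and the function name and toy data are illustrative assumptions.

```python
import numpy as np


def spa_chain(X, start, n_select):
    """Select n_select column indices by successive orthogonal projections, starting at `start`."""
    Xp = np.asarray(X, dtype=float) - np.mean(X, axis=0)  # column-centered working copy
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]].copy()
        # Project every column onto the subspace orthogonal to the last selected column.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0  # never pick an already selected column again
        selected.append(int(np.argmax(norms)))
    return selected


# Toy data: column 1 is exactly collinear with column 0, column 2 is independent.
rng = np.random.default_rng(0)
a, b = rng.normal(size=30), rng.normal(size=30)
X = np.column_stack([a, 2.0 * a, b])
print(spa_chain(X, start=0, n_select=2))
```

In the toy data the collinear column is skipped in favor of the independent one, which is the redundancy-reducing behavior that motivates SPA.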

Iteratively retaining informative variables (IRIV) is an iterative feature selection algorithm. Its core idea is to gradually identify and retain the features most relevant to the target variable and to eliminate irrelevant or redundant features through repeated rounds of screening, variable importance assessment, and cyclic optimization [25]. In each iteration, the algorithm evaluates how much each feature in the current set contributes to the prediction performance of the model. A matrix of 0s and 1s is generated to simulate the inclusion or exclusion of features, the least squares method with cross-validation is used to evaluate the performance of each feature subset, and a nonparametric test is used to distinguish feature types and remove uninformative or interfering features. Finally, the key feature set most related to the target variable is determined [26].

The genetic algorithm (GA) is a heuristic optimization algorithm based on biological evolution. It searches for the best feature combination through population initialization, fitness evaluation, selection, crossover, mutation, replacement, and a convergence check, repeated over many iterations [27]. By simulating evolution over the whole feature space, it searches for the global optimum. The fitness function evaluates each feature combination and guides the search toward better combinations, while crossover and mutation maintain the diversity of the population and help avoid premature convergence to a suboptimal solution.
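A compact GA for wavelength selection might look like the sketch below. It is an illustration under stated assumptions, not the study's implementation: the fitness is the cross-validated RMSE of an ordinary least-squares model (a stand-in for the models used in the paper), selection is by truncation with one elite individual, and the population size, mutation rate, and synthetic data are arbitrary.

```python
import numpy as np


def cv_rmse(X, y, mask, n_folds=5):
    """K-fold RMSE of a least-squares fit restricted to the variables where mask is True."""
    if not mask.any():
        return np.inf
    A = np.column_stack([np.ones(len(y)), X[:, mask]])
    sq_err = 0.0
    for test in np.array_split(np.arange(len(y)), n_folds):
        train = np.setdiff1d(np.arange(len(y)), test)
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        sq_err += np.sum((y[test] - A[test] @ coef) ** 2)
    return float(np.sqrt(sq_err / len(y)))


def ga_select(X, y, pop_size=20, n_generations=30, p_mut=0.05, seed=0):
    """Return the best boolean variable mask found and the best fitness per generation."""
    rng = np.random.default_rng(seed)
    n_vars = X.shape[1]
    pop = rng.random((pop_size, n_vars)) < 0.5  # random initial population of masks
    history = []
    for _ in range(n_generations):
        fitness = np.array([cv_rmse(X, y, ind) for ind in pop])
        pop = pop[np.argsort(fitness)]
        history.append(float(fitness.min()))
        parents = pop[: pop_size // 2]  # truncation selection: keep the better half
        children = []
        while len(children) < pop_size - 1:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_vars)  # single-point crossover position
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_vars) < p_mut  # bit-flip mutation
            children.append(child)
        pop = np.vstack([pop[:1], children])  # elitism: carry the best mask over unchanged
    fitness = np.array([cv_rmse(X, y, ind) for ind in pop])
    history.append(float(fitness.min()))
    return pop[int(np.argmin(fitness))], history


# Synthetic data: only variables 2 and 7 carry information about y.
rng = np.random.default_rng(3)
X = rng.normal(size=(40, 10))
y = 1.5 * X[:, 2] - 2.0 * X[:, 7] + 0.1 * rng.normal(size=40)
mask, history = ga_select(X, y)
print(np.flatnonzero(mask), history[-1])
```

Because the best individual is always carried over, the best fitness recorded in `history` never gets worse from one generation to the next.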

2.3.5. Model Establishment and Evaluation

In this paper, three prediction models were established: partial least squares regression (PLSR), random forest regression (RFR), and extreme learning machine (ELM). The relationship between the characteristic spectral data and the moisture content and calorific value of Caragana fuel particles was examined, and the optimal prediction model was determined.

Partial least squares regression (PLSR) is the most commonly used modeling method in chemometric analysis. It combines the advantages of several algorithms to achieve regression modeling, simplification of the data structure, and correlation analysis between variables at the same time [28]. Its core is to project the high-dimensional spaces of the independent and dependent variables into corresponding low-dimensional spaces, obtain mutually orthogonal latent vectors for each, and then establish a univariate linear regression between the latent vectors of the independent and dependent variables. In this calculation, the algorithm uses both the spectral data and the measured reference values, so that the covariance between them is taken into account. The foundational steps of the PLSR algorithm are depicted in Figure 5a. Ten-fold cross-validation is used: the number of latent variables is increased step by step, and the root mean square errors obtained with different numbers of latent variables are compared to determine the number that gives the best prediction performance.
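The selection of the number of latent variables by cross-validation can be sketched as follows. The example uses a plain NumPy implementation of single-response PLS (PLS1, NIPALS-style deflation) on synthetic data; the function names, fold assignment, and data are illustrative assumptions rather than the software used in the study.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS; returns (regression coefficients, x mean, y mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yc = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yc                  # weight vector: covariance of residual X with y
        w /= np.linalg.norm(w)
        t = Xr @ w                     # scores of the new latent variable
        p = Xr.T @ t / (t @ t)         # X loadings
        q.append(yc @ t / (t @ t))     # y loading
        Xr = Xr - np.outer(t, p)       # deflate X
        W.append(w)
        P.append(p)
    W, P = np.array(W).T, np.array(P).T
    coef = W @ np.linalg.solve(P.T @ W, np.array(q))
    return coef, x_mean, y_mean


def pls1_predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean


def rmsecv_by_components(X, y, max_components, n_folds=10, seed=0):
    """RMSECV for 1..max_components latent variables using k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    rmsecv = []
    for a in range(1, max_components + 1):
        sq_err = 0.0
        for test in folds:
            train = np.setdiff1d(np.arange(len(y)), test)
            model = pls1_fit(X[train], y[train], a)
            sq_err += np.sum((y[test] - pls1_predict(model, X[test])) ** 2)
        rmsecv.append(np.sqrt(sq_err / len(y)))
    return np.array(rmsecv)


# Synthetic calibration data.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=60)
rmsecv = rmsecv_by_components(X, y, max_components=8)
best = int(np.argmin(rmsecv)) + 1
print("latent variables with the lowest RMSECV:", best)
```

In practice a smaller number of latent variables is sometimes preferred when its RMSECV is close to the minimum, because that tends to reduce overfitting.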

Random forest regression (RFR) is an ensemble learning model based on decision trees. Its core mechanism is to build multiple decision trees using bootstrap sampling and random feature selection and to average the predictions of these trees, which effectively reduces the risk of overfitting and improves the generalization ability of the model [29]. The algorithmic framework of the random forest is illustrated in Figure 5b. Random forest regression can handle high-dimensional data, adapts to complex nonlinear relationships, and is robust to outliers and noise [30]. The number of trees and the number of leaf nodes directly affect the complexity of the random forest model and the probability of overfitting. To find the optimal values, the number of trees was varied from its default value of 50 up to 300 in steps of 10, and the number of leaf nodes from its default value of 1 up to 25 in steps of 2. The change in root mean square error was observed, and the setting with the minimum root mean square error was selected. For the moisture content model, the resulting random forest has 100 trees and 5 leaf nodes; for the calorific value model, it has 100 trees and 15 leaf nodes.

Extreme learning machine (ELM) is a nonlinear regression modeling method based on an artificial neural network. The model is composed of an input layer, a hidden layer, and an output layer. The input layer receives the features, the hidden layer contains a large number of neurons, and the output layer produces the final prediction [31]. The architecture of the ELM network is depicted in Figure 5c. The key feature of ELM is that the connection weights from the input layer to the hidden layer are initialized randomly [32] and remain fixed during training, so that only the output weights have to be solved for. Therefore, determining the number of neurons in the hidden layer and selecting the best activation function are the keys to achieving the best performance of the ELM model. The sigmoid function is selected as the activation function, and the best number of hidden neurons is found by five-fold cross-validation.
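A minimal ELM regressor, assuming the standard formulation (random, fixed input weights and biases, a sigmoid hidden layer, and output weights obtained with the Moore-Penrose pseudoinverse), can be sketched as below. The class name, weight range, and toy data are illustrative assumptions.

```python
import numpy as np


class ELMRegressor:
    """Single-hidden-layer ELM: random fixed hidden layer, least-squares output weights."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.seed = seed

    def _hidden(self, X):
        # Sigmoid activation of the randomly projected inputs.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        # Input-to-hidden weights and biases are drawn once and never trained.
        self.W = rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: least-squares solution via the pseudoinverse.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Toy nonlinear regression problem.
X = np.linspace(-1.0, 1.0, 80).reshape(-1, 1)
y = np.sin(3.0 * X[:, 0])
model = ELMRegressor(n_hidden=20, seed=0).fit(X, y)
pred = model.predict(X)
rmse_train = float(np.sqrt(np.mean((y - pred) ** 2)))
print(rmse_train)
```

The number of hidden neurons (`n_hidden` here) would be chosen by cross-validation, for example by repeating the fit over a range of values and keeping the one with the lowest validation error.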

To evaluate prediction performance, the indices of the models built by the different methods are compared and the best combination is selected. The evaluation indices are the determination coefficients of the test set ($R_{P}^{2}$) and the training set ($R_{C}^{2}$), the root mean square errors of the test set (RMSEP) and the training set (RMSEC), and the relative percentage deviation (RPD). To facilitate comparison and judgment of model performance, the standard error (SE) was also introduced as a key metric. The closer the determination coefficient is to 1, the better the regression; the smaller the root mean square errors of the training and test sets, the higher the accuracy of the model; and higher RPD values and lower SE indicate better and more stable model performance [33].
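For reference, these indices can be computed as in the sketch below. The exact formulas used in the study are not given in the text, so the definitions here are common ones and should be read as assumptions: R² as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the reference values divided by the RMSE. The example values are hypothetical.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, RPD) for reference values y_true and predictions y_pred."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    r2 = float(1.0 - np.sum(resid ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    rpd = float(np.std(y_true, ddof=1) / rmse)  # assumed definition: SD / RMSE
    return r2, rmse, rpd


# Hypothetical reference and predicted values.
r2, rmse, rpd = regression_metrics([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0])
print(f"R2 = {r2:.4f}, RMSE = {rmse:.4f}, RPD = {rpd:.4f}")
```

Applied to the training set the function gives $R_{C}^{2}$ and RMSEC, and applied to the test set it gives $R_{P}^{2}$, RMSEP, and RPD.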

3. Results

3.1. Analysis of Original Data

The moisture content and calorific value of each Caragana fuel sample were measured three times, and the average value was taken. The measurement results are shown in Table 2. The spectral data of the Caragana fuel rod and powder samples were extracted, and the average spectral curves of the 152 samples are shown in Figure 6. The hyperspectral reflectance curves of Caragana korshinskii fuel differ with moisture content and calorific value, but the overall trend of the spectra is consistent. There are multiple absorption peaks over the whole wavelength range, and they are most obvious in the ranges of 1185.88–1234.88 nm and 1431.99–1470.92 nm. The absorption peak near 1200 nm may be related to C-H bonds in the lignin and cellulose of Caragana korshinskii, while the absorption peak near 1450 nm may be related to the C-H combination band (aromatic groups) or the O-H first overtone (water). The specific absorption peak positions are marked as positions 1 and 2 in Figure 6a,b [34].

3.2. Elimination of Abnormal Samples

Abnormal samples were removed from the original spectra using Monte Carlo sampling (MCS). In each run, 75% of the samples were selected as the training set to establish a partial least squares regression model, and the remaining 25% were used as the test set; the prediction residuals of all samples were obtained over 10,000 runs. The mean and standard deviation (STD) of the prediction residuals of each sample were calculated, and a mean–std diagram was drawn to identify abnormal samples. See Figure 7 for the mean–std diagrams of moisture content and high calorific value. The distribution for high calorific value in Figure 7a shows that samples 110, 115, and 137 deviate significantly from the main group of samples, and the distribution for moisture content in Figure 7b shows that samples 42, 87, and 118 deviate significantly from the main group. These samples were therefore removed as abnormal samples.

3.3. Spectral Data Preprocessing

Moisture Content Pretreatment

To improve model performance and reduce the interference caused by external noise, this study selected four pretreatment methods: MMN, MSC, SG, and SNV. To find the best pretreatment method, PLSR, RFR, and ELM models were established on the preprocessed data. The pretreatment results are shown in Table 3 for moisture content and in Table 4 for high calorific value. The moisture content results in Table 3 show that, in the PLSR model, all four pretreatment methods improved prediction accuracy compared with the original data. Among them, SNV pretreatment gave the best result; the spectra after pretreatment are shown in Figure 8a. The determination coefficient $R_{P}^{2}$ of the prediction set is 0.9433, and the root mean square error RMSEP is 0.3387. SNV pretreatment is also the most effective in the RFR and ELM models. In the RFR model, the $R_{P}^{2}$ of the prediction set after SNV processing is 0.9043, and the RMSEP is 0.3857. In the ELM model, the $R_{P}^{2}$ of the prediction set after SNV processing is 0.9242, and the RMSEP is 0.3638. Taken together, the SNV-pretreated data were selected for the subsequent PLSR, RFR, and ELM models of moisture content.

Table 4 shows that, for calorific value, SG is the optimal pretreatment method for the PLSR and RFR models and markedly improves both, which may be related to the removal of high-frequency noise from the spectral data. The spectra after pretreatment are shown in Figure 8b. The SG-pretreated data were therefore used for the subsequent PLSR and RFR models. In the ELM model, the models built from MSC-pretreated and SG-pretreated data perform similarly in terms of $R_{P}^{2}$ (0.6155 and 0.6112, respectively), but the RMSEP with SG pretreatment is 0.4009, clearly lower than the 0.4775 obtained with MSC, indicating better prediction accuracy. Consequently, the SG-pretreated data were also used for the subsequent ELM models.
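The way SG smoothing suppresses high-frequency noise can be illustrated with a short sketch. The window length (5) and polynomial order (2) below are assumptions chosen for illustration, since this section does not state the values used in the study, and the function names are hypothetical.

```python
import numpy as np

def savgol_coeffs(window, order):
    """Convolution weights that evaluate a least-squares polynomial fit at the window centre."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    A = np.vander(offsets, order + 1, increasing=True)  # columns: 1, x, x^2, ...
    return np.linalg.pinv(A)[0]  # row 0 yields the fitted value at x = 0

def savgol_smooth(spectrum, window=5, order=2):
    """Smooth one spectrum; the ends are handled by repeating the edge values."""
    spectrum = np.asarray(spectrum, dtype=float)
    half = window // 2
    padded = np.pad(spectrum, half, mode="edge")
    weights = savgol_coeffs(window, order)
    # Reversing the weights turns np.convolve into a sliding weighted sum.
    return np.convolve(padded, weights[::-1], mode="valid")

weights = savgol_coeffs(5, 2)
x = np.arange(10.0)
smoothed = savgol_smooth(x ** 2)
```

Because the filter fits a local polynomial, a signal that is itself a quadratic passes through unchanged away from the edges, while point-to-point noise is averaged out.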

3.4. Feature Extraction

There are 224 bands in the original spectrum, and the full band is mixed with some useless bands, which affects the speed of modeling as well as the accuracy of the model. In this paper, we choose four feature extraction algorithms, CARS, SPA, GA, and IRIV, to extract the feature bands from the original spectrum.

CARS was run with ten-fold cross-validation and 50 sampling runs for feature wavelength selection. Figure 9 shows the CARS feature extraction for the SNV-pretreated moisture content data. Figure 9a shows that the proportion of selected wavelengths gradually decreases as the number of sampling runs increases, with refined selection starting at the 20th run. Figure 9b shows that the root mean square error of cross-validation (RMSECV) is lowest at the 23rd run, at which point 27 feature wavelengths are selected. Figure 9c shows the regression coefficient paths as the number of sampling runs increases, and Figure 9d shows the locations of the selected feature wavelengths. Figure 10 shows the CARS feature extraction for the SG-pretreated calorific value data. Similarly, Figure 10a shows that the proportion of selected wavelengths gradually decreases as the number of sampling runs increases, with refined selection starting at the 15th run. Figure 10b shows that the RMSECV first decreases and then increases, reaching its minimum at the 16th run, at which point 53 characteristic wavelengths are selected. The error increases after the 16th run, indicating that wavelengths highly correlated with calorific value are being eliminated. Figure 10c shows the regression coefficient paths as the number of sampling runs increases.
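The shrinking proportion of retained wavelengths can be sketched with the exponentially decreasing function used in the common formulation of CARS. This section does not state the exact function the authors used, so the formula and the function name `cars_retention_ratio` are assumptions; only the 224 bands and 50 sampling runs come from the text.

```python
import math

def cars_retention_ratio(run, n_runs, n_vars):
    """Fraction of variables kept at a given run under the exponentially decreasing function."""
    # Constants are chosen so that all variables are kept at run 1 and two at the last run.
    a = (n_vars / 2) ** (1 / (n_runs - 1))
    k = math.log(n_vars / 2) / (n_runs - 1)
    return a * math.exp(-k * run)

# Number of wavelengths retained at each of the 50 runs for 224 bands.
kept = [round(224 * cars_retention_ratio(i, 50, 224)) for i in range(1, 51)]
```

Under these assumptions the curve gives 27 wavelengths at run 23 and 53 at run 16, which agrees with the counts reported above. The adaptive reweighted sampling step of CARS, which decides which wavelengths survive each run, is not shown here.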

SPA was used to extract features from the SNV-pretreated moisture content data, and the results are shown in Figure 11. Figure 11a shows that the RMSE reaches its minimum of 0.2054 when 25 features are selected; the distribution of the selected features is shown in Figure 11b. The SPA results for the SG-pretreated calorific value data are shown in Figure 12: the RMSE reaches its minimum of 0.2356 when 28 features are selected.
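The core projection step of SPA can be sketched as below. This is a simplified illustration on a toy matrix, assuming a fixed starting band and a fixed number of bands to select; the RMSE-based choice of how many features to keep, as reported above, is not included, and the function name `spa_select` is hypothetical.

```python
import numpy as np

def spa_select(X, n_select, start=0):
    """Pick columns one at a time, each with the largest norm after projecting out the last pick."""
    Xp = np.asarray(X, dtype=float).copy()
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]]
        # Project every column onto the orthogonal complement of the last selected column.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0  # never reselect a column
        selected.append(int(np.argmax(norms)))
    return selected

# Toy matrix: column 1 duplicates column 0, so it carries no new information.
X = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
chosen = spa_select(X, n_select=3, start=0)
```

The duplicate column is skipped because its projection vanishes, which is how SPA reduces collinearity among the selected wavelengths.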

For GA-based feature extraction, the number of iterations was set to 100. The results for Caragana fuel moisture content are shown in Figure 13. Figure 13a,b shows that when 32 characteristic wavelengths are selected, the prediction accuracy is 96.58% and the RMSE is 0.2576, making 32 characteristic wavelengths the best choice. Figure 13c is the histogram of wavelength selection counts, which shows that 32 wavelengths were selected six or more times. The results for Caragana fuel calorific value are shown in Figure 14. Similarly, Figure 14a–c shows that 40 characteristic wavelengths were selected six or more times; with these 40 wavelengths, the prediction accuracy reaches its maximum of 81.52%, and the RMSE is 0.2992.
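The frequency threshold mentioned above (wavelengths selected six or more times) can be illustrated with a small counting sketch. It assumes the histogram counts how often each wavelength index appears across repeated GA runs; the run data and the function name `frequent_wavelengths` are hypothetical.

```python
from collections import Counter

def frequent_wavelengths(runs, min_count=6):
    """Indices of wavelengths chosen in at least `min_count` runs."""
    # Each wavelength counts at most once per run.
    counts = Counter(idx for selected in runs for idx in set(selected))
    return sorted(idx for idx, c in counts.items() if c >= min_count)

# Hypothetical selections: indices 10, 57 and 130 appear in six runs, index 200 in five.
runs = [[10, 57, 130]] * 6 + [[200]] * 5
stable = frequent_wavelengths(runs, min_count=6)
```

Only the indices that recur often enough are kept, so wavelengths picked by chance in a few runs are filtered out.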

IRIV was run with five-fold cross-validation to select the characteristic wavelengths. The results for Caragana fuel moisture content are shown in Figure 15. The relationship between the number of iterations and the number of retained variables is shown in Figure 15a. After three iterations, the number of retained variables is reduced from 224 to 55, and it no longer changes at the fifth iteration. After backward elimination, 46 characteristic wavelengths are retained; their distribution is shown in Figure 15b. The results for Caragana fuel calorific value are shown in Figure 16. The number of retained variables decreased sharply during the first three iterations and no longer changed at the seventh iteration. After backward elimination, a total of 33 characteristic wavelengths were retained.

3.5. Modeling Results and Analysis

PLSR, RFR, and ELM models for the moisture content and calorific value of Caragana korshinskii fuel were established using the characteristic wavelengths from the four feature extraction methods above. The moisture content results are shown in Table 5. Among the PLSR models built for the moisture content of Caragana fuel, all four feature wavelength extraction methods improved on the model built with preprocessing alone, and the IRIV-PLSR model performed best, with $R_{P}^{2}$, RMSEP, and RPD of 0.9693, 0.2358, and 5.6792, respectively. Among the RFR models, SPA-RFR performed best, with $R_{P}^{2}$, RMSEP, and RPD of 0.9318, 0.3161, and 3.8796, respectively. Among the ELM models, CARS-ELM performed best, with $R_{P}^{2}$, RMSEP, and RPD of 0.9514, 0.3019, and 4.5960, respectively. Finally, comparing the best feature extraction combination for each of the three models shows that the SNV-IRIV-PLSR model performs best and has the highest prediction accuracy, so this model is the most suitable for quantitative detection of Caragana fuel moisture content. The measured and predicted values for the training set and prediction set of this model are shown in Figure 17.

Based on the results in Table 6, the optimal PLSR model for moisture content clearly outperforms the optimal RFR and ELM models. On the key performance indicators (determination coefficient, root mean square error, and RPD), the SNV-IRIV-PLSR combination consistently achieves the best results. In addition, the standard error (SE) was introduced to evaluate the predictive performance of the models. As a general rule of thumb, a smaller SE is preferred: an SE below 0.3 indicates that the model's error is smaller than the inherent variability, meaning that the model is suitable for practical application. The SE of 0.1761 for the optimal combination in Table 6 confirms the excellent performance of the model.
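The performance indicators used throughout this section can be computed as in the sketch below. It assumes the common chemometric definitions, with RPD taken as the standard deviation of the reference values divided by the RMSE; the section does not spell out the exact formulas, and SE is omitted because its definition is not given here. The numbers are made up for illustration.

```python
import math

def regression_metrics(y_true, y_pred):
    """Determination coefficient, RMSE and RPD for one set of predictions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    sd = math.sqrt(ss_tot / (n - 1))  # sample standard deviation of the reference values
    return {"r2": 1 - ss_res / ss_tot, "rmse": rmse, "rpd": sd / rmse}

# Hypothetical reference and predicted values (not data from the study).
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Under this definition, a higher RPD means the prediction error is small relative to the natural spread of the measured property, which is why it is reported alongside the determination coefficient and RMSEP.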

Table 7 shows that, among the PLSR models built for Caragana fuel calorific value, the IRIV-PLSR combination gives no obvious performance improvement, whereas the other three feature extraction methods clearly improve the prediction; SG-GA-PLSR performs best, with $R_{P}^{2}$, RMSEP, and RPD of 0.7494, 0.3174, and 2.3306, respectively. Among the RFR models, SG-CARS-RFR performs best, with $R_{P}^{2}$, RMSEP, and RPD of 0.8037, 0.3219, and 2.2864, respectively. Among the ELM models, all four feature extraction methods clearly improve prediction performance, and SG-SPA-ELM performs best, with $R_{P}^{2}$, RMSEP, and RPD of 0.7051, 0.3616, and 1.8661, respectively. Finally, comparing the best feature extraction combination for each of the three models, SG-CARS-RFR is the best combination with the highest prediction accuracy. This model is therefore the most suitable for quantitative detection of Caragana fuel calorific value; its measured and predicted values are shown in Figure 18.

The data in Table 8 indicate that, although the determination coefficients of the PLSR, ELM, and RFR models established for Caragana calorific value all exceed 0.7, further improvement is still warranted. Among the optimal combinations of these three models, the SG-CARS-RFR combination achieved a prediction-set $R_{P}^{2}$ of 0.8037, which is clearly superior to the PLSR and ELM models. However, on the prediction set the PLSR model had a slightly lower RMSEP (0.3174 vs. 0.3219) and a slightly higher RPD (2.3306 vs. 2.2864) than the RFR model. To further validate the model's performance, the standard error (SE) was introduced. The SE of the RFR model was only 0.0083 higher than that of the PLSR model. Therefore, considering all factors together, the SG-CARS-RFR combination was selected as the optimal model.

3.6. Moisture Content Visualization

Through modeling and analysis, the SNV-IRIV-PLSR model was determined to be the best quantitative detection model for the moisture content of Caragana fuel particles. This model was then used to visually analyze the moisture content. Using the regression coefficient equation of the SNV-IRIV-PLSR model, the moisture content of each pixel on the hyperspectral images of different Caragana korshinskii particulate fuels was calculated individually. The moisture content of each pixel was then mapped to the grayscale range of 0–255 to form a moisture content distribution map of the Caragana korshinskii particulate fuels. This map was subsequently modified to generate a pseudo-color map, as shown in Figure 19. The results indicate that the moisture content distribution of Caragana fuel particles is relatively uniform. On a single Caragana fuel particle, moisture is mainly concentrated in the central region, while the moisture content at the edges is relatively low. This distribution pattern is likely related to the uniformity of moisture content gradient control and the environmental conditions during spectral image acquisition. By visualizing the pseudo-color map, we can intuitively observe the moisture content distribution within the fuel particles. This not only aids in analyzing the internal characteristics of the particles but also provides a theoretical basis for subsequent quality grading.
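The pixel-wise mapping described above can be sketched as follows. The data cube, regression coefficients, and intercept are random placeholders, and the sketch assumes the cube has already been preprocessed and reduced to the 46 selected wavelengths; the function name `moisture_map` is hypothetical.

```python
import numpy as np

def moisture_map(cube, coef, intercept):
    """Apply a linear model to every pixel spectrum and rescale the result to 0-255."""
    rows, cols, bands = cube.shape
    pred = cube.reshape(-1, bands) @ coef + intercept
    pred = pred.reshape(rows, cols)
    span = pred.max() - pred.min()
    if span == 0:
        return pred, np.zeros((rows, cols), dtype=np.uint8)
    gray = np.round(255 * (pred - pred.min()) / span).astype(np.uint8)
    return pred, gray

# Placeholder inputs: a 4 x 5 pixel image with 46 bands and random coefficients.
rng = np.random.default_rng(0)
cube = rng.random((4, 5, 46))
coef = rng.normal(size=46)
pred, gray = moisture_map(cube, coef, intercept=7.9)
```

The `gray` array can then be passed through any colour map to obtain a pseudo-color image in which the lowest and highest predicted moisture contents map to the two ends of the scale.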

4. Discussion

Biomass resources, as renewable and low-carbon energy sources, are crucial for global sustainable development. Optimizing biomass utilization technologies is a key strategy for the global community to address climate change and achieve a green economic transition. Numerous researchers have explored the utilization of biomass resources within their respective domains. For instance, Ainur Seilkhan et al. [35] investigated the green applications of Taraxacum kok-saghyz (TKS) from Kazakhstan in the pharmaceutical, biofuel, and rubber industries, aiming to identify sustainable alternatives to conventional resources. Similarly, Azhar Zhubanova et al. [36] enhanced ethanol production from cheese whey using cell immobilization technology, targeting the cost and substrate sourcing challenges in biofuel production. Leena Kapoor et al. [37] evaluated the fuel properties and chemical composition of biofuels produced through the co-pyrolysis of rice straw and groundnut shell, primarily seeking a cost-effective and renewable biofuel source. This study shares the objectives of these scholars: it aims to develop novel detection methods for biomass fuel quality assessment, specifically for Caragana biomass fuel particles. Existing quality detection methods for biomass fuel particles, such as traditional oxygen bomb calorimetry and the air-drying method, have some deficiencies: they place strict requirements on the detection conditions and involve a long delay, which makes them difficult to apply directly in production practice [38,39,40]. This study therefore proposes a rapid detection model, based on hyperspectral imaging technology, for measuring the moisture content and calorific value of Caragana korshinskii fuel.
The SNV-IRIV-PLSR model performed best for moisture content, with the highest prediction accuracy. On the training set, $R_{C}^{2}$ is 0.9712 and RMSEC is 0.2207, which shows that the model fits the training data accurately. On the test set, $R_{P}^{2}$ is 0.9693 and RMSEP is 0.2358, which shows that the model also predicts well on data not used for training. The root mean square error of the test set is slightly higher than that of the training set, which may be because the distribution of sample characteristics in the test set differs from that in the training set. The RPD value of 5.6792 indicates that the model has high prediction accuracy and practical application potential. The SG-CARS-RFR model performed best for calorific value, with the highest prediction accuracy. On the training set, $R_{C}^{2}$ is 0.8130 and RMSEC is 0.2941; on the test set, $R_{P}^{2}$ is 0.8037, RMSEP is 0.3219, and RPD is 2.2864. The difference between the determination coefficients of the calibration set and the prediction set is only 0.0093, which indicates that the model generalizes well, is reasonably stable, and has some practical application potential.

Quality detection is a necessary step before Caragana korshinskii fuel particles enter the market. In this study, the moisture content and calorific value of processed, finished Caragana korshinskii fuel particles were detected, which overlaps with the detection methods proposed for raw harvested shrub fuel. A rapid detection method for finished fuel particles produced in the factory has greater market prospects and practical significance. A complete and rapid detection technology is of great significance for refined utilization by large energy enterprises. However, the current study still has some deficiencies. In terms of sample collection and preparation, selecting Caragana fuel from only four regions may impose certain limitations. In addition, more advanced and complex models may improve the prediction of calorific value.

5. Conclusions

In this study, Caragana korshinskii fuel particles were selected as the research samples, and hyperspectral imaging technology was combined with chemometrics. For the moisture content and calorific value of the fuel particles, Monte Carlo sampling and four pretreatment methods were applied, and the three quantitative detection models built with four feature extraction methods were compared and analyzed. For moisture content, the SNV-IRIV-PLSR model performed best. The SNV pretreatment method eliminated the spectral scattering caused by sample nonuniformity and reduced possible test error in the modeling process. IRIV extracted 46 characteristic wavelengths, accounting for 20.5% of the full band. The PLSR model constructed using the characteristic wavelengths extracted by IRIV showed good generalization ability, with $R_{P}^{2}$, RMSEP, and RPD of 0.9693, 0.2358, and 5.6792, respectively. For calorific value, the SG-CARS-RFR model performed best. The SG preprocessing method eliminated the high-frequency noise in the spectral data. CARS extracted 53 characteristic wavelengths, accounting for 23.6% of the full band. The RFR model constructed using the characteristic wavelengths extracted by CARS showed good generalization ability, with $R_{P}^{2}$, RMSEP, and RPD of 0.8037, 0.3219, and 2.2864, respectively. In conclusion, hyperspectral imaging technology can realize the rapid detection of the moisture content and calorific value of Caragana fuel.

Author Contributions

Conceptualization, X.D. and H.L.; methodology, H.L.; software, H.L., J.Z. and N.L.; validation, H.L. and H.W.; formal analysis, H.L.; investigation, H.L.; resources, Y.M.; data curation, X.D.; writing—original draft preparation, X.D. and H.L.; writing—review and editing, X.D.; visualization, N.L.; supervision, X.D.; project administration, X.D.; funding acquisition, X.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Natural Science Foundation of Inner Mongolia Autonomous Region (2024MS05026); Research Program of Science and Technology at Universities of Inner Mongolia Autonomous Region, grant number NJZZ23047; Program for Improving the Scientific Research Ability of Youth Teachers of Inner Mongolia Agricultural University, grant number BR230154; and Inner Mongolia Autonomous Region “First Class Discipline Scientific Research Special Project” (Intelligent Equipment Creation for the Whole Industry Chain of Grassland and Characteristic Economic Coarse Grains, No. YLXKZX-NND-046).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy concerns.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to the readability of Figure 6. This change does not affect the scientific content of the article.

References

1. Zhang, J. Study on the Molding Mechanism, Physical Properties, and Combustion Characteristics of Caragana Korshinskii Solid Fuel. Doctoral Dissertation, Shanxi Agricultural University, Taiyuan, China, 2014.
2. Shao, C.Y.; Zhao, Y.M.; Lu, L.L. Research progress on application of near infrared spectroscopy rapid analysis technology. Chem. Bull. 2024, 87, 898–912.
3. Costa, M.A.M.; da Silva, B.M.; de Almeida, S.G.C.; Felizardo, M.P.; Costa, A.F.M.; Cardoso, A.A.; Dussán, K.J. Evaluation of the efficiency of a Venturi scrubber in particulate matter collection smaller than 2.5 µm emitted by biomass burning. Environ. Sci. Pollut. Res. Int. 2023, 30, 8835–8852.
4. Wang, F.; Lin, H.; Xu, P.; Bi, X.; Sun, L. Egg Freshness Evaluation Using Transmission and Reflection of NIR Spectroscopy Coupled Multivariate Analysis. Foods 2021, 10, 2176.
5. Wu, X.; Liang, X.; Wang, Y.; Wu, B.; Sun, J. Non-Destructive Techniques for the Analysis and Evaluation of Meat Quality and Safety: A Review. Foods 2022, 11, 3713.
6. Beć, K.B.; Grabska, J.; Czarnecki, M.A. Spectra-structure correlations in NIR region: Spectroscopic and anharmonic DFT study of n-hexanol, cyclohexanol and phenol. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2018, 197, 176–184.
7. Shen, Y.Y. Research on Performance Detection of Agricultural and Forestry Biomass Raw Materials and Carbon Briquette Fuel. Master’s Thesis, Zhejiang University, Hangzhou, China, 2013.
8. Wang, X.Y.; Sun, X.B.; Wang, J.; Xie, G.H. Difference and correlation between ash content and calorific value of waste biomass and its determination method. J. China Agric. Univ. 2022, 27, 160–172.
9. Guo, G.; Zhang, M.L.; Gong, Z.J.; Zhang, S.Z.; Wang, X.Y. Construction of biomass ash content model based on near infrared spectroscopy and complex sample partition set. Spectrosc. Spectr. Anal. 2023, 43, 3143–3149.
10. Han, Y.; Dong, C.Q.; Zhang, J.J.; Hu, X.Y.; Xue, J.J.; Zhao, Y.; Wang, X.Q. Moisture prediction of biomass fuel based on near-infrared spectroscopy and deep learning algorithm. Energy Fuels 2024, 38, 6062–6071.
11. Li, Y.; Xu, H.K.; Lan, X.Z.; Wang, J.X.; Su, X.M.; Bai, X.P.; Via, B.K.; Pei, Z.Y. Predicting calorific value and ash content of sand shrub using Vis-NIR spectra and various chemometrics. Renew. Energy 2024, 230, 120805.
12. Fagan, C.C.; Everard, C.D.; McDonnell, K. Prediction of moisture, calorific value, ash and carbon content of two dedicated bioenergy crops using near-infrared spectroscopy. Bioresour. Technol. 2011, 102, 5200–5206.
13. Mancini, M.; Rinnan, Å.; Pizzi, A.; Toscano, G. Prediction of gross calorific value and ash content of woodchip samples by means of FT-NIR spectroscopy. Fuel Process. Technol. 2018, 169, 77–83.
14. Noushabadi, A.S.; Dashti, A.; Ahmadijokani, F.; Hu, J.; Mohammadi, A.H. Estimation of higher heating values (HHVs) of biomass fuels based on ultimate analysis using machine learning techniques and improved equation. Renew. Energy 2021, 179, 550–562.
15. GB/T 28731-2012; Solid Biofuels Fuel Specification and Classes. General Administration of Quality Supervision, Inspection and Quarantine of the People’s Republic of China & Standardization Administration of the People’s Republic of China: Beijing, China, 2012.
16. Hao, Y.; Chen, B.; Zhu, R. Analysis of several wavelet denoising methods in near infrared spectral preprocessing. Spectrosc. Spectr. Anal. 2006, 26, 1838–1841.
17. Sun, J.Y.; Li, M.Z.; Zheng, L.H.; Hu, Y.G.; Zhang, X.J. Real-time analysis of northern fluvo aquic soil parameters based on near infrared spectroscopy. Spectrosc. Spectr. Anal. 2006, 426–429.
18. Luo, J.; Ying, K.; Bai, J. Savitzky–Golay smoothing and differentiation filter for even number data. Signal Process. 2005, 85, 1429–1434.
19. Barnes, R.J.; Dhanoa, M.S.; Lister, S.J. Standard normal variate transformation and de-trending of near-infrared diffuse reflectance spectra. Appl. Spectrosc. 1989, 43, 772–777.
20. Isaksson, T.; Næs, T. The effect of multiplicative scatter correction (MSC) and linearity improvement in NIR spectroscopy. Appl. Spectrosc. 1988, 42, 1273–1284.
21. Helland, I.S.; Næs, T.; Isaksson, T. Related versions of the multiplicative scatter correction method for preprocessing spectroscopic data. Chemom. Intell. Lab. Syst. 1995, 29, 233–241.
22. Yao, K.; Sun, J.; Cheng, J. Development of Simplified Models for Non-Destructive Hyperspectral Imaging Monitoring of S-ovalbumin Content in Eggs during Storage. Foods 2022, 11, 2024.
23. Sun, J.; Cong, S.L.; Mao, H.P.; Wu, X.H.; Zhang, X.D. Hyperspectral-based CARS-ABC-SVR prediction model for water content in lettuce leaves. J. Agric. Eng. 2017, 33, 178–184.
24. Zhang, J.; Rivard, B.; Rogge, D.M. The successive projection algorithm (SPA), an algorithm with a spatial constraint for the automatic search of endmembers in hyperspectral data. Sensors 2008, 8, 1321–1342.
25. Sun, J.; Tang, K.; Wu, X.H.; Dai, C.X.; Chen, Y.; Shen, G.F. Nondestructive identification of green tea varieties based on hyperspectral imaging technology. J. Food Process Eng. 2018, 41, e12800.
26. Yu, L.; Zhang, T.; Zhu, Y.X.; Zhou, Y.; Xia, T. Optimization of hyperspectral characteristic wavelength variables of soybean leaves and estimation of SPAD value based on IRIV algorithm. J. Agric. Eng. 2018, 34, 148–154.
27. Wang, H.; Xiang, Y.Z.; Li, W.Y.; Shi, H.Z.; Wang, X. Estimation of winter rape aboveground biomass based on UAV multispectral remote sensing. J. Agric. Mach. 2023, 54, 218–229.
28. Chen, X.C. Nondestructive Detection of Caragana Microparticle Quality Parameters Based on Hyperspectral Technology. Master’s Thesis, Inner Mongolia Agricultural University, Hohhot, China, 2024.
29. Zhang, M.Y. Detection and Classification of Silage Corn Feed Quality Based on Hyperspectral Data. Master’s Thesis, Inner Mongolia Agricultural University, Hohhot, China, 2023.
30. Song, X.; Xu, Q.; Li, H. Automatic quantification of retinal photoreceptor integrity to predict persistent disease activity in neovascular age-related macular degeneration using deep learning. Front. Neurosci. 2022, 16, 952735.
31. Li, J. Quality Detection of Dry Alfalfa Based on Visible/Near Infrared Spectroscopy and Electronic Nose Information Fusion. Master’s Thesis, Inner Mongolia Agricultural University, Hohhot, China, 2024.
32. Peng, Y.; Lu, B.L. Discriminative manifold extreme learning machine and applications to image and EEG signal classification. Neurocomputing 2016, 174, 265–277.
33. Li, W.; Xiao, Y.M.; Ishag, B.; Akoy, R.; Zhang, J.Y.; Yan, Q. Construction of near-infrared prediction model for key quality indicators of sainfoin hay. Pratacult. Sci. 2024, 45, 123–145.
34. Yu, Z.H.; Chen, X.C.; Zhang, J.C.; Su, Q.; Wang, K.; Liu, W.H. Rapid and non-destructive estimation of moisture content in Caragana korshinskii pellet feed using hyperspectral imaging. Sensors 2023, 23, 7592.
35. Seilkhan, A. An Overview of Green Applications of Natural Products for Pharmaceutical, Biofuel, and Rubber Industries: Case Study of Kazakh Dandelion (Taraxacum kok-saghyz Rodin). ES Energy Environ. 2024, 25, 1171.
36. Zhubanova, A.; Abdieva, G.; Ualieva, P.; Akimbekov, N.; Malik, A.; Tastambek, K. Whey-to-Bioethanol Valorisation: Fermentation with Immobilized Yeast Cells. Eng. Sci. 2024, 27, 995.
37. Kapoor, L.; Mohan, T.B.G.; Ranjith, K. Evaluating Fuel Properties and Chemical Composition of Biofuel Produced Via Co Pyrolysis of Rice Straw and Groundnut Shell. ES Energy Environ. 2024, 25, 1124.
38. Hou, J.Y.; Liu, L.X. Laboratory determination of coal calorific value and study on its influencing factors. Inn. Mong. Coal Econ. 2025, 58–60.
39. Guo, H.R.; Zhao, Z.X.; Bao, Z.; Qiu, Y.H.; Peng, B.; Liu, Q.Y.; He, L. Evaluation of uncertainty in determining the heat value of biomass raw materials using a bomb calorimeter. Chem. Res. Appl. 2022, 34, 399–406.
40. Guan, J. Study on the Determination Method of Calorific Value of Solid Biomass Formed Fuel. Master’s Thesis, Guangdong Technological University, Guangzhou, China, 2020.

Figure 1. Caragana fuel particles preparation process.

Figure 2. Distribution of calorific value and moisture content of Caragana fuel particles. (a) Distribution map of moisture content for 152 samples. (b) Distribution map of calorific value for 152 samples.

Figure 3. Hyperspectral imaging system and sample collection. (a) Spectral data acquisition system. (b) Spectral data acquisition of samples.

Figure 4. Flow chart of sample collection and data determination.

Figure 5. Algorithmic principle flowcharts of three models. (a) Simple-step diagram of PLSR algorithm. (b) Principle of random forest regression algorithm. (c) ELM network structure diagram.

Figure 6. Average spectra of 152 Caragana fuel samples. (a) Average spectrum of 152 Caragana bar samples. (b) Average spectrum of 152 Caragana powder samples.

Figure 7. Mean–variance distribution of the calorific value and moisture content of Caragana fuel. (a) Calorific value mean–variance distribution; (b) moisture content mean–variance distribution.

Figure 8. Spectrum obtained by optimal pretreatment of the moisture content and calorific value. (a) Moisture content spectrogram after treatment by the SNV method; (b) calorific value spectrum after treatment by the SG method.

Figure 9. Feature extraction of water content data after SNV pretreatment by CARS. (a) The changing trend in the number of sampled variables; (b) 10-fold RMSECV; and (c) regression coefficients of each variable with an increment in runs, where the red line represents the position with the lowest 10-fold RMSECV. (d) Selected bands determined using the CARS algorithm.

Figure 10. Feature extraction of calorific value data after SG pretreatment by CARS. (a) The changing trend in the number of sampled variables; (b) 10-fold RMSECV; and (c) regression coefficients of each variable with an increment in runs, where the red line represents the position with the lowest 10-fold RMSECV. (d) Selected bands determined using the CARS algorithm.

Figure 11. SPA feature extraction of moisture content data after SNV pretreatment. (a) RMSE for different numbers of selected features for moisture content; (b) characteristic wavelength distribution selected by SPA.

Figure 12. SPA feature extraction of calorific value data after SG pretreatment. (a) RMSE for different numbers of selected features for calorific value; (b) characteristic wavelength distribution selected by SPA.

Figure 13. Feature extraction of moisture content data after SNV pretreatment by GA. (a) Wavelength-selective frequency; (b) prediction accuracy; (c) RMSE change.

Figure 14. GA feature extraction of calorific value data after SG pretreatment. (a) Wavelength-selective frequency; (b) prediction accuracy; (c) RMSE change.

Figure 15. Feature extraction of moisture content data after SNV pretreatment by IRIV. (a) SNV-IRIV variable selection process of water content; (b) distribution of characteristic variables for SNV-IRIV selection of water content.

Figure 16. Feature extraction of calorific value data after SG pretreatment by IRIV. (a) Calorific value SG-IRIV variable selection process; (b) distribution of characteristic variables selected by calorific value SG-IRIV.

Figure 17. Operation results of the SNV-IRIV-PLSR model. (a) SNV-IRIV-PLSR training set results; (b) SNV-IRIV-PLSR prediction set results.

Figure 18. Operation results of the SG-CARS-RFR model. (a) SG-CARS-RFR training set results; (b) SG-CARS-RFR test set results.

Figure 19. Distribution of moisture content in different ranges (different colors correspond to different moisture contents, with blue representing the lowest moisture content and red representing the highest moisture content).

Table 1. Sample information about Caragana fuel particles in various regions.

Region	Procurement Location	Particle Size (mm)	Number of Samples	Total
Hohhot City	Inner Mongolia Agricultural University Machinery Factory	8	41	152
Tongliao City	Xingwei Biotechnology Co., Ltd.	8	35
Xing’an League City	Rui’er Biomass Energy Development Co., Ltd.	6	40
Wulanchabu City	Longshunzhuang Agriculture and Animal Husbandry Co., Ltd.	4	36

Table 2. Determination results of the moisture content and high calorific value of Caragana fuel.

Sample Quality Parameters	Number of Samples	Min	Max	Mean	Standard Error
Moisture content (%)	152	6.08%	11.06%	7.96%	1.33
High calorific value (kJ/g)	152	18.01	20.72	19.54	0.68

Table 3. Prediction results of different pretreatments corresponding to PLSR, RFR, and ELM modeling for water content.

Model	Pretreatment Method	Number of Latent Variables/Number of Neurons	Training Set		Test Set
Model	Pretreatment Method	Number of Latent Variables/Number of Neurons	$R_{C}^{2}$	RMSEC	$R_{P}^{2}$	RMSEP	RPD
PLSR	RAW	11	0.9701	0.2151	0.8853	0.4349	3.5008
	MMN	10	0.9655	0.2300	0.9239	0.3791	3.9669
	MSC	9	0.9693	0.2176	0.9161	0.3648	3.9596
	SG	10	0.9578	0.2629	0.9288	0.3302	4.1716
	SNV	11	0.9691	0.2212	0.9433	0.3387	4.3748
RFR	RAW		0.9653	0.2552	0.8382	0.4748	2.5184
	MMN		0.9715	0.2208	0.8884	0.4600	3.0330
	MSC		0.9741	0.2139	0.8989	0.4196	3.1863
	SG		0.9687	0.2446	0.8499	0.4449	2.6151
	SNV		0.9738	0.2193	0.9043	0.3857	3.2749
ELM	RAW	20	0.9049	0.4140	0.8047	0.5624	2.2938
	MMN	26	0.9412	0.3175	0.9195	0.3902	3.5725
	MSC	22	0.9476	0.2969	0.8801	0.4825	2.9277
	SG	21	0.8938	0.4409	0.8839	0.4216	2.9751
	SNV	30	0.9588	0.2696	0.9242	0.3638	3.6813

Table 4. Prediction results of different pretreatments corresponding to PLSR, RFR, and ELM modeling for calorific value.

Model	Pretreatment Method	Number of Latent Variables/Number of Neurons	Training Set $R_{C}^{2}$	Training Set RMSEC	Test Set $R_{P}^{2}$	Test Set RMSEP	Test Set RPD
PLSR	RAW	6	0.7243	0.3060	0.5864	0.3960	1.9711
	MMN	12	0.7364	0.3203	0.6692	0.3690	1.7877
	MSC	12	0.7424	0.3063	0.6891	0.3747	1.9979
	SG	11	0.7669	0.2969	0.7234	0.3304	2.1217
	SNV	12	0.7080	0.3282	0.6324	0.3651	1.9415
RFR	RAW		0.8090	0.3020	0.6241	0.4242	1.6524
	MMN		0.7634	0.3262	0.6787	0.4206	1.7873
	MSC		0.7547	0.3345	0.6378	0.4343	1.6834
	SG		0.8235	0.2861	0.7435	0.3663	2.0002
	SNV		0.7240	0.3653	0.6687	0.3906	1.7601
ELM	RAW	8	0.7596	0.3445	0.5510	0.4119	1.5118
	MMN	14	0.6356	0.4135	0.5644	0.4554	1.5355
	MSC	24	0.7189	0.3399	0.6155	0.4775	1.6338
	SG	15	0.7544	0.3505	0.6112	0.4009	1.6252
	SNV	16	0.7732	0.3233	0.6086	0.4515	1.6194
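
Two of the pretreatments compared in Tables 3 and 4, standard normal variate (SNV) and min-max normalization (MMN), are simple per-spectrum transforms. The sketch below illustrates them under stated assumptions: SNV uses the sample standard deviation of each spectrum, and MMN is applied to each spectrum separately rather than to each wavelength across samples. The function names `snv` and `min_max_normalize` are hypothetical, and the authors' implementation may differ in these details.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: center one spectrum and scale it by its own SD.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    if sd == 0:
        raise ValueError("flat spectrum: standard deviation is zero")
    return [(value - mean) / sd for value in spectrum]


def min_max_normalize(spectrum):
    """Min-max normalization: rescale one spectrum to the range [0, 1].

    Assumes scaling per spectrum, not per wavelength.
    """
    low, high = min(spectrum), max(spectrum)
    if high == low:
        raise ValueError("flat spectrum: max equals min")
    return [(value - low) / (high - low) for value in spectrum]


# Toy values standing in for reflectance at three wavelengths.
print(snv([1.0, 2.0, 3.0]))                # [-1.0, 0.0, 1.0]
print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

Both transforms act on each averaged spectrum independently, so they can be applied to training and test samples without sharing statistics between the two sets.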

Table 5. Moisture content of Caragana fuel: comparison of four characteristic wavelength extraction methods based on PLSR, RFR, and ELM models.

Model	Feature Extraction Method	Number of Latent Variables/Number of Neurons	Feature Points	Training Set $R_{C}^{2}$	Training Set RMSEC	Test Set $R_{P}^{2}$	Test Set RMSEP	Test Set RPD
PLSR	RAW	11	0	0.9691	0.2212	0.9433	0.3387	4.3748
	CARS	10	27	0.9689	0.2311	0.9475	0.2863	4.6447
	SPA	8	25	0.9560	0.2802	0.9404	0.2888	4.1180
	GA	7	32	0.9650	0.2540	0.9540	0.2599	4.4559
	IRIV	9	46	0.9712	0.2207	0.9693	0.2358	5.6792
RFR	RAW		0	0.9738	0.2193	0.9043	0.3857	3.2749
	CARS		27	0.9675	0.2438	0.9247	0.3356	3.6912
	SPA		25	0.9765	0.2076	0.9318	0.3161	3.8796
	GA		32	0.9605	0.2573	0.9102	0.4202	3.3798
	IRIV		46	0.9656	0.2288	0.8949	0.4687	3.1244
ELM	RAW	30	0	0.9588	0.2696	0.9242	0.3638	3.6813
	CARS	27	27	0.9719	0.2193	0.9514	0.3019	4.5960
	SPA	30	25	0.9672	0.2403	0.9455	0.2972	4.3401
	GA	27	32	0.9723	0.2222	0.9227	0.3615	3.6445
	IRIV	23	46	0.9557	0.2805	0.9331	0.3373	3.9181

Table 6. Optimal combination results of PLSR, ELM, and RFR detection models for moisture content of Caragana fuel pellets under different variable selection methods.

Algorithm Combinations	Number of Latent Variables/Number of Neurons	Feature Points	Training Set $R_{C}^{2}$	Training Set RMSEC	Test Set $R_{P}^{2}$	Test Set RMSEP	RPD	SE
SNV-IRIV-PLSR	9	46	0.9712	0.2207	0.9693	0.2358	5.6792	0.1761
SNV-CARS-ELM	27	27	0.9719	0.2193	0.9514	0.3019	4.5960	0.2175
SNV-SPA-RFR		25	0.9765	0.2076	0.9318	0.3161	3.8796	0.2578

Table 7. Comparison of the results of four feature wavelength extraction methods based on PLSR, RFR, and ELM models for the calorific value of Caragana fuel.

Model	Feature Extraction Method	Number of Latent Variables/Number of Neurons	Feature Points	Training Set $R_{C}^{2}$	Training Set RMSEC	Test Set $R_{P}^{2}$	Test Set RMSEP	Test Set RPD
PLSR	RAW	11	0	0.7669	0.2969	0.7234	0.3304	2.1217
	CARS	9	53	0.7558	0.3037	0.7415	0.3163	2.2764
	SPA	9	28	0.7849	0.2890	0.7383	0.3728	1.9214
	GA	6	40	0.7794	0.2890	0.7494	0.3174	2.3306
	IRIV	11	33	0.7726	0.2928	0.7272	0.3398	2.1563
RFR	RAW		0	0.8235	0.2861	0.7435	0.3663	2.0002
	CARS		53	0.8130	0.2941	0.8037	0.3219	2.2864
	SPA		28	0.8029	0.2975	0.7784	0.3503	2.1522
	GA		40	0.8045	0.3015	0.7235	0.3640	1.9268
	IRIV		33	0.8027	0.3025	0.7670	0.3355	2.0987
ELM	RAW	15	0	0.7544	0.3505	0.6112	0.4009	1.6252
	CARS	30	53	0.7836	0.3244	0.6962	0.3725	1.8386
	SPA	13	28	0.7718	0.3351	0.7051	0.3616	1.8661
	GA	8	40	0.7777	0.3411	0.6538	0.3487	1.7224
	IRIV	15	33	0.7686	0.3190	0.6801	0.4327	1.7917

Table 8. Results of optimal combinations of PLSR, ELM, and RFR detection models for Caragana fuel calorific value using different variable selection methods.

Algorithm Combinations	Number of Latent Variables/Number of Neurons	Feature Points	Training Set $R_{C}^{2}$	Training Set RMSEC	Test Set $R_{P}^{2}$	Test Set RMSEP	RPD	SE
SG-GA-PLSR	6	40	0.7794	0.2890	0.7494	0.3174	2.3306	0.4291
SG-SPA-ELM	13	28	0.7718	0.3351	0.7051	0.3616	1.8661	0.5359
SG-CARS-RFR		53	0.8130	0.2941	0.8037	0.3219	2.2864	0.4374

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
