# Image-Based Model for Assessment of Wood Chip Quality and Mixture Ratios


## Abstract

Validation tests yielded R^{2} > 0.9 for the brightness-based model and R^{2} > 0.8 for the texture-based one. Even when reducing the data used for model training down to two or three mixture classes—which could be necessary or beneficial for the industrial application of our approach—sampling rates of n < 5 were sufficient to obtain significant predictions.

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Fuel Selection and Classification

With a bulk density of 222 ± 23 kg/m^{3}, WC featured an approximately 75% higher value than FR (125 ± 12 kg/m^{3}). Measuring the bulk density of a 1:1 mixture of these fuels led to 176 ± 9 kg/m^{3}, a value lying close to the calculated average of the pure fuels ((222 + 125)/2 ≈ 174 kg/m^{3}). The lower value of FR is not surprising, as it results from the generally higher aspect ratios and irregular shapes of the individual particles, due to the high amounts of branches and bark. Table 1 sums up the determined fuel properties.

#### 2.2. Image Capturing

#### 2.3. Image Processing

#### 2.3.1. Preprocessing

#### 2.3.2. Feature Vector Formulation

#### 2.3.3. Histogram-Based Feature Vector

Each pixel of the 8-bit greyscale images can take one of 2^{8} = 256 different values, so calculating the histogram reduced the amount of information from width × height (pixels) to 256. The feature vector's size could decrease even further by applying different techniques, such as histogram equalization (i.e., aggregating neighboring brightness values into bigger classes) or simply ignoring low-intensity histogram areas (in photographs, often close to 0 or 255, i.e., pure black and pure white).
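The reduction from pixel array to histogram feature vector can be sketched as follows; the bin-aggregation helper and the toy image are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def histogram_features(grey_image, n_bins=256):
    """Normalized brightness histogram used as feature vector.

    Reduces width x height pixel values to n_bins relative frequencies.
    """
    counts, _ = np.histogram(grey_image, bins=n_bins, range=(0, 256))
    return counts / counts.sum()

def aggregate_bins(features, factor=4):
    """Optionally merge neighbouring brightness classes (e.g., 256 -> 64)
    to shrink the feature vector further."""
    return features.reshape(-1, factor).sum(axis=1)

# hypothetical example image: 100 x 100 pixels of random 8-bit grey values
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(100, 100), dtype=np.uint8)
vec = histogram_features(img)        # shape (256,), sums to 1
small_vec = aggregate_bins(vec)      # shape (64,), still sums to 1
```

Normalizing by the total pixel count makes the vector independent of the image resolution, which matters when training and validation images differ in size.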

#### 2.3.4. Texture-Based Feature Vector

where N_{g} is the total number of an image's grey values and p(i,j) describes the probability that a pixel with grey value i is located adjacent to a pixel with a value of j.
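The study uses the mahotas implementation of the Haralick features; a minimal numpy sketch of the underlying co-occurrence matrix p(i,j) is shown below, with the quantization to 8 grey levels and the purely horizontal adjacency being illustrative simplifications:

```python
import numpy as np

def cooccurrence(image, levels=8):
    """p(i, j): probability that a pixel with quantized grey value i lies
    horizontally adjacent to a pixel with value j (made symmetric)."""
    q = (image.astype(np.uint32) * levels) // 256   # quantize to N_g levels
    counts = np.zeros((levels, levels))
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        counts[a, b] += 1
    counts = counts + counts.T                      # symmetric adjacency
    return counts / counts.sum()

def contrast(p):
    """One of the Haralick features: sum over (i - j)^2 * p(i, j)."""
    i, j = np.indices(p.shape)
    return float(((i - j) ** 2 * p).sum())

# a constant image has no grey-value transitions, so its contrast is zero
flat = np.zeros((10, 10), dtype=np.uint8)
print(contrast(cooccurrence(flat)))  # -> 0.0
```

The remaining Haralick features (homogeneity, entropy, correlation, etc.) are further sums over the same matrix p(i,j).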

#### 2.4. Regression Model Selection and Setup

- Ordinary least squares Linear Regression
- Lasso Linear Regression featuring L1 regularization
- Ridge Linear Regression with L2 regularization

where ${y}_{i}$ and ${\widehat{y}}_{i}$ are the actual and predicted data, respectively. For $\alpha $ = 0, Equation (2) equals the ordinary least squares loss function. The model generally seeks to minimize L by identifying (“learning”) appropriate coefficients ${\beta}_{i}$ in an iterative process for the target function:

- Decision Tree Regression (DT)
- Random Forest Regression (RF)
- Gradient Boosting Regression (GB) and
- Extra Tree Regression (ET).
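For the L2-regularized case, the loss in Equation (2) can be minimized in closed form, which the following numpy sketch illustrates; the toy data are assumptions for demonstration only (the L1/lasso penalty has no closed form and is solved iteratively, e.g., by coordinate descent):

```python
import numpy as np

def ridge_fit(X, y, alpha=0.0):
    """Minimize sum (y_i - X beta)^2 + alpha * ||beta||^2 in closed form.

    alpha = 0 recovers the ordinary least squares solution.
    """
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# assumed toy data: intercept column plus one feature, y = 0.5 + 1.0 * x
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.5, 1.5, 2.5, 3.5])
beta_ols = ridge_fit(X, y, alpha=0.0)       # -> approx. [0.5, 1.0]
beta_reg = ridge_fit(X, y, alpha=1e6)       # strong shrinkage towards zero
```

Increasing α trades a worse fit to the training data against smaller, more stable coefficients, which is the mechanism behind the regularization effects reported in Section 3.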

#### 2.4.1. Model Evaluation and Error Estimation

where ${y}_{i}$ is the actual value of the ith sample, ${\widehat{y}}_{i}$ its corresponding prediction or modelled value, and ${\overline{y}}$ the mean of the actual values.

A further evaluation criterion is the coefficient of determination R^{2}:

$$R^{2} = 1 - \frac{\sum_{i}\left({y}_{i} - {\widehat{y}}_{i}\right)^{2}}{\sum_{i}\left({y}_{i} - \overline{y}\right)^{2}}$$

R^{2} is the squared correlation coefficient, yielding information about the share of data points being well represented by the model; R^{2} is consequently highly affected by outliers. Since the model training happens with 5-fold cross validation, the accuracy can be denoted as a mean value of R^{2} and its standard deviation.
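Both evaluation criteria are straightforward to compute; the toy mixture fractions below are assumed values for illustration:

```python
import numpy as np

def r2_score(y, y_hat):
    """R^2 = 1 - sum (y_i - yhat_i)^2 / sum (y_i - ybar)^2."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y, y_hat):
    """Root mean square error, here in units of fuel quality (%)."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

# assumed toy data: true and predicted mixture fractions in percent
y_true = np.array([0.0, 25.0, 50.0, 75.0, 100.0])
y_pred = np.array([2.0, 24.0, 51.0, 73.0, 99.0])
```

A perfect prediction gives R^{2} = 1 and RMSE = 0; because the residuals enter R^{2} squared, a single outlier degrades the score disproportionately, as noted above.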

#### 2.4.2. Parameter Variations

A first screening of the hyperparameters, sorted by the achieved R^{2} value, revealed that a linear loss function during adaboost regression (the function used for updating the adaboost-internal weights) never led to worse results than the corresponding square or exponential approach. Subsequently, due to the trade-off between the number of estimators and the learning rate, the second optimization happened with a fixed learning rate. After this intrinsic hyperparameter tuning, a total of 201 different parameter sets were used for model training.
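Such a screening amounts to enumerating a parameter grid; the sketch below uses illustrative values (the actual study evaluated 201 parameter sets), with the loss names matching those of scikit-learn's `AdaBoostRegressor`:

```python
from itertools import product

# illustrative screening grid; the learning rate is fixed after the
# first screening, as described in the text
losses = ["linear", "square", "exponential"]
n_estimators = [50, 100, 150, 200]
max_depths = [2, 3, 4, 6]

parameter_sets = [
    {"loss": loss, "n_estimators": n, "max_depth": d, "learning_rate": 1.0}
    for loss, n, d in product(losses, n_estimators, max_depths)
]
# each set would be used to train a boosted tree model and scored by its
# mean 5-fold cross-validated R^2; the best-scoring set is retained
```

Keeping one grid axis fixed (here the learning rate) is what tames the combinatorial growth of such a search.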

#### 2.5. Training Strategy

## 3. Results and Discussion

Figure 6 gives an overview of the models' performance; R^{2} and the RMSE serve as evaluation criteria. The plot is divided into model-specific subsections, which are internally sorted by the R^{2} value based on the validation data set. In general, both feature vectors are, in combination with a properly trained model, capable of reaching high predictive accuracy with R^{2} > 0.8. The corresponding RMSE values lie below 10% (in units of mixture fraction or fuel quality). The best histogram-based model leads to an RMSE of 3.8% (DT model) for unknown validation data, and the best texture-based one reaches 5.1% (Linear Lasso). Overall, the histogram-based approach exhibits higher accuracies and lower errors. It furthermore shows a lower discrepancy between the best and worst models of a similar kind. This will be discussed in the following sections with respect to the model parametrization, input data selection and the training strategy.

#### 3.1. Influence of Regression Model and Parametrization

The linear models reach high training accuracies, with R^{2} lying in the range of 0.87 ± 0.006 for the histogram-based approach and 0.90 ± 0.007 for the texture-based approach. However, regarding the validation accuracy, many of these models are outranked by the best performing nonlinear models. When well-trained linear models exhibit low validation accuracy, this is due to a non-representative (or insufficient) provision of training data or to an underfitted model. Indeed, the bias–variance decomposition shows a trend towards lower variances and higher bias for the underperforming linear models during validation. Consequently, the model either encounters unknown (as in “never seen before”) features during validation or features that did not find representation in the model parameters β_{i}. The first factor can be addressed by increasing the amount or quality of training data (or at least the share used for training); the latter by using more complex regression models, such as the tree-type models.

With regularization, the improvement of the R^{2} value amounts to 0.06–0.1. The adaboost algorithm improves the median accuracy by 0.03–0.10 for histogram-based regression. For texture-based regression, the median improvement is higher, up to ΔR^{2} = 0.20 for the DT model. This is because the DT model is the only non-ensemble model and thus gains the largest benefit from the boosting algorithm. In all cases, there is no correlation between R^{2} and the number of estimators. Increasing the tree depth affects the prediction accuracy only for values up to 4; higher values increase the computational needs without improving the accuracy. We consequently suggest choosing parameters leading to shorter calculation times (e.g., a limited number of tree layers (up to 4) and adaboost estimators (up to 200)).

#### 3.2. Influence of the Training Data

Assessing the robustness of the training takes the standard deviation of R^{2} into account (cf. the error bars in Figure 6). Depending on the (random) choice of the training folds, the regression leads to models of different predictive capabilities. While some trained subsets reach standard deviations as low as 0.02 regarding R^{2}, certain models feature values of 0.2 or more; in these cases, the training results in flawed and insufficient models.
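The fold-to-fold scatter discussed above can be sketched as follows; the fitting callbacks and the noise-free toy data are assumptions for illustration:

```python
import numpy as np

def cross_val_r2(X, y, fit, predict, k=5, seed=0):
    """Mean and standard deviation of R^2 over k random folds; a large
    standard deviation indicates that the model quality depends strongly
    on the (random) choice of training data."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate(folds[:i] + folds[i + 1:])
        model = fit(X[train], y[train])
        residual = y[test] - predict(model, X[test])
        scores.append(1.0 - np.sum(residual ** 2)
                      / np.sum((y[test] - y[test].mean()) ** 2))
    return float(np.mean(scores)), float(np.std(scores))

# assumed toy problem: ordinary least squares on noise-free linear data
fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
predict = lambda beta, X: X @ beta
x = np.linspace(0.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 3.0 * x
mean_r2, std_r2 = cross_val_r2(X, y, fit, predict)
```

On this noise-free problem every fold scores R^{2} ≈ 1 with negligible spread; on real image features, unlucky fold compositions produce the large standard deviations described above.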

#### 3.3. Influence of the Training Strategy

In these cases, a correction ought to be applied in order to obtain real mixture values.
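The reduced-class strategies culminate in a two-point calibration between the pure fuels; a minimal sketch of such an interpolation is given below, where the calibration scores are assumed values, not measured ones:

```python
import numpy as np

# hypothetical two-point calibration: mean model output ("score") obtained
# for the two pure fuels (0% and 100% forest-residue content)
score_pure_wc, score_pure_fr = 0.18, 0.74

def mixture_fraction(score):
    """Linearly interpolate between the two calibration classes,
    returning the forest-residue fraction in percent (clipped to 0-100)."""
    frac = (score - score_pure_wc) / (score_pure_fr - score_pure_wc)
    return float(np.clip(frac, 0.0, 1.0)) * 100.0
```

A score halfway between the two calibration points maps to a 50% mixture; because the mapping is linear, this variant matches the better interpolation behaviour of the linear models noted in the conclusions.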

#### 3.4. Repercussions for the Industrial Application

## 4. Summary and Conclusions

- Both histogram- and texture-based modelling approaches are capable of identifying fuel mixture qualities with an acceptable error. Especially for the former, the obtainable accuracy values (as R^{2}) are larger than 0.90. Furthermore, models based on histogram features are more robust to subpar training than those based on texture information. The generally most reliable results were obtained by the gradient boosting (GB) models and the linear model for both feature vectors, as mirrored in the low RMSE values of their predictions.
- The application of the adaboost algorithm enhances the median model accuracies by ΔR^{2} = 0.02–0.04 for the histogram-based and up to ΔR^{2} = 0.15–0.20 for the texture-based feature vector. L1/L2 regularization improves the linear models’ accuracies, especially for the texture-based regression (for 10^{−5} < α < 10^{−4}), and has close to no effect on the histogram-based models.
- The data preparation and training process are the principal reasons for underperforming models and require the most attention during model preparation. Especially the image labelling (i.e., assigning a class to images and/or providing training images with exactly determined properties) is challenging for lab-scale considerations and even more so for the target purpose in industrial furnaces.
- It is possible to train models relying on a reduced number of distinct classes, interpolating in between. Although the predictive accuracy significantly shrinks when using three instead of nine mixture classes during the training process, we find that it is still possible to obtain reliable predictions, even for sample sizes n < 5. This enables us to reliably distinguish fluctuations of 10% in the fuel mixture quality. If not otherwise possible, even a two-point calibration is applicable to continuously monitor the feedstock quality, still revealing otherwise unobtainable information. In this case, a linear model should be used due to its better interpolation capabilities.

## Author Contributions

## Funding

## Conflicts of Interest

## References

- Reid, W.T. The relation of mineral composition to slagging, fouling and erosion during and after combustion. Prog. Energy Combust. Sci. **1984**, 10, 159–169.
- Plankenbühler, T.; Müller, D.; Karl, J. Influence of Fine Fuel Particles on Ash Deposition in Industrial-Scale Biomass Combustion: Experiments and Computational Fluid Dynamics Modeling. Energy Fuels **2019**, 33.
- Plankenbühler, T.; Müller, D.; Karl, J. Slagging prevention and plant optimisation by means of numerical simulation. In Proceedings of the 25th European Biomass Conference & Exhibition, Stockholm, Sweden, 12–15 June 2017; pp. 653–659.
- Baxter, L.L.; Miles, T.R.; Miles, T.R., Jr.; Jenkins, B.M.; Milne, T.; Dayton, D.; Bryers, R.W.; Oden, L.L. The behavior of inorganic material in biomass-fired power boilers: Field and laboratory experiences. Fuel Process. Technol. **1998**, 54, 47–78.
- Niu, Y.; Tan, H.; Hui, S. Ash-related issues during biomass combustion: Alkali-induced slagging, silicate melt-induced slagging (ash fusion), agglomeration, corrosion, ash utilization, and related countermeasures. Prog. Energy Combust. Sci. **2016**, 52, 1–61.
- Schön, C.; Kuptz, D.; Mack, R.; Zelinski, V.; Loewen, A.; Hartmann, H. Influence of wood chip quality on emission behaviour in small-scale wood chip boilers. Biomass Conv. Bioref. **2019**, 9, 71–82.
- Gendek, A.; Aniszewska, M.; Chwedoruk, K. Bulk density of forest energy chips. Ann. Warsaw Univ. Life Sci. SGGW Agric. **2016**, 67, 101–111.
- Lokare, S.S.; Dunaway, J.D.; Moulton, D.; Rogers, D.; Tree, D.R.; Baxter, L.L. Investigation of ash deposition rates for a suite of biomass fuels and fuel blends. Energy Fuels **2006**, 20, 1008–1014.
- Eriksson, G.; Hedman, H.; Boström, D.; Pettersson, E.; Backman, R.; Öhman, M. Combustion characterization of rapeseed meal and possible combustion applications. Energy Fuels **2009**, 23, 3930–3939.
- Paulrud, S.; Erik, J.; Nilsson, C. Particle and handling characteristics of wood fuel powder: Effects of different mills. Fuel Process. Technol. **2002**, 76, 23–39.
- Rodriguez, J.M.; Edeskär, T.; Knutsson, S. Particle Shape Quantities and Measurement Techniques—A Review. EJGE **2013**, 18, 169–198.
- Rezaei, H.; Lim, C.J.; Lau, A.; Sokhansanj, S. Size, shape and flow characterization of ground wood chip and ground wood pellet particles. Powder Technol. **2016**, 301, 137–746.
- Igathinathane, C.; Pordesimo, L.O.; Columbus, E.P.; Batchelor, W.D.; Sokhansanj, S. Sieveless particle size distribution analysis of particulate materials through computer vision. Comput. Electron. Agric. **2009**, 66, 147–158.
- Ding, F.; Benaoudia, M.; Bédard, P.; Lanouette, R.; Lejeune, C.; Gagné, P. Wood chip physical and measurement. Pulp Pap. Canada **2005**, 2, 27–32.
- Rhe, C. Multivariate NIR spectroscopy models for moisture, ash and calorific content in biofuels using bi-orthogonal partial least squares regression. Analyst **2005**, 130, 1182–1189.
- Fridh, L.; Volpé, S.; Eliasson, L. A NIR machine for moisture content measurements of forest biomass in frozen and unfrozen conditions. Int. J. For. Eng. **2017**, 28, 42–46.
- Daugbjerg, P.; Hartmann, H.; Bo, T.; Temmerman, M. Moisture content determination in solid biofuels by dielectric and NIR reflection methods. Biomass Bioenergy **2006**, 30, 935–943.
- Pan, P.; Mcdonald, T.P.; Via, B.K.; Fulton, J.P.; Hung, J.Y. Predicting moisture content of chipped pine samples with a multi-electrode capacitance sensor. Biosyst. Eng. **2016**, 145, 1–9.
- Tomas, E.; Andersson, T. A Machine Learning Approach for Biomass Characterization. Energy Procedia **2019**, 158, 1279–1287.
- Tao, J.; Liang, R.; Li, J.; Yan, B.; Chen, G. Fast characterization of biomass and waste by infrared spectra and machine learning models. J. Hazard. Mater. **2019**, 387, 121723.
- Nieto, P.G.; Garcia-Gonzalo, E.; Paredes-Sánchez, J.P.; Sánchez, A.B.; Fernández, M.M. Predictive modelling of the higher heating value in biomass torrefaction for the energy treatment process using machine-learning techniques. Neural Comput. Appl. **2019**, 31, 8823–8836.
- Xing, J.; Luo, K.; Wang, H.; Gao, Z.; Fan, J. A comprehensive study on estimating higher heating value of biomass from proximate and ultimate analysis with machine learning approaches. Energy **2019**, 188, 116077.
- Samadi, S.H.; Ghobadian, B.; Nosrati, M. Prediction of higher heating value of biomass materials based on proximate analysis using gradient boosted regression trees method. Energy Sources Part A Recover. Util. Environ. Effect **2019**.
- Gatternig, B.; Karl, J. Prediction of ash-induced agglomeration in biomass-fired fluidized beds by an advanced regression-based approach. Fuel **2015**, 161, 157–167.
- Nixon, M.; Aguado, A.S. Feature Extraction & Image Processing, 2nd ed.; Academic Press, Inc.: Cambridge, MA, USA, 2008.
- Haralick, R.M.; Dinstein, I.; Shanmugam, K. Textural features for image classification. IEEE Trans. Syst. Man Cybern. **1973**, 3, 610–621.
- Coelho, L.P. Mahotas: Open source software for scriptable computer vision. J. Open Res. Softw. **2013**.
- Tasdemir, S.B.Y.; Tasdemir, K.; Aydin, Z. ROI Detection in Mammogram Images Using Wavelet-Based Haralick and HOG Features. In Proceedings of the 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA, 17–20 December 2018; pp. 105–109.
- Lishani, A.; Boubchir, L.; Khalifa, E.; Bouridane, A. Human gait recognition based on Haralick features. Signal Image Video Process. **2017**.
- Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: New York, NY, USA, 2013; Volume 26.
- Freund, Y.; Schapire, R.E. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. Comput. Syst. Sci. **1997**, 55, 119–139.
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. **2011**, 12, 2825–2830.
- von der Lippe, P.M. Induktive Statistik: Formeln, Aufgaben, Klausurtraining; Oldenbourg-Verlag: München, Germany, 1998.

**Figure 3.** Averaged normalized histograms of the 0%, 50% and 100% fuel quality classes, based on the greyscale image. Grey lines indicate the positions used for the feature vector formulation, exemplarily highlighting feature 3.

**Figure 4.** Extracted histogram-based features as mean values with respect to the fuel class; the axis is not normalized for the purpose of better visibility only. Error bars indicate the standard deviation.

**Figure 5.** Extracted texture-based Haralick features as mean values with respect to the fuel quality class; error bars indicate the standard deviation.

**Figure 6.** Overview of the models’ average performances for the histogram-based (**top**) and texture-based models (**bottom**) as R^{2} (left axis) and RMSE values (right axis); error bars indicate the standard deviation of the average 5-fold cross validation training.

**Figure 7.** Relation between training and validation accuracy (as R^{2}) of the predicted fuel quality for both feature vector types.

**Figure 8.** Exemplary high and low accuracy predictions for different fuel classes (settings: GB regressor, R^{2} (validation set) = 0.885, RMSE = 6.91).

**Figure 9.** Results for the fuel quality obtained from different training strategies. (**a**): reference case with all classes used for training; (**b**): training using the 0, 25, 50, 75 and 100% classes; (**c**): 0, 50 and 100%; (**d**): 0 and 100%; (**e**) and (**f**): as (b) and (d), but with the linear model. The grey shade indicates the range of ±10% error.

**Figure 10.** (**left**): Estimating fuel properties from calculated mixture ratios; (**right**): interpolation of bulk density and ash content from calculated mixture ratios for two and three training classes.

**Table 1.** Determined fuel properties of the wood chips (WC) and forest residues (FR).

| Parameter | Standard | Wood Chips WC | Forest Residues FR |
|---|---|---|---|
| water content | DIN EN 14774-3 | 18.9 ± 0.9% | 22.6 ± 1.0% |
| ash content | DIN EN 14775 | 3.8 ± 1.2 wt.% | 5.9 ± 0.2 wt.% |
| bulk density | ISO 17828 | 222 ± 23 kg/m^{3} | 125 ± 12 kg/m^{3} |
| fine particles < 5 mm | – | 3–12 wt.% | 3–12 wt.% |
| fine particles < 1 mm | – | 0.5–1.2 wt.% | 0.5–1.2 wt.% |

**Table 2.** Overview of the average predictive accuracy as R^{2} of the predicted fuel quality for the different methods and algorithms.

| Model | Hist.: Best | Hist.: Average | Hist.: Median | Texture: Best | Texture: Average | Texture: Median |
|---|---|---|---|---|---|---|
| *Tree-type models* | | | | | | |
| DT | 0.875 ± 0.05 | 0.771 ± 0.09 | 0.780 ± 0.09 | 0.650 ± 0.31 | 0.479 ± 0.29 | 0.472 ± 0.33 |
| DT with adaboost | 0.909 ± 0.04 | 0.842 ± 0.09 | 0.848 ± 0.09 | 0.825 ± 0.12 | 0.669 ± 0.18 | 0.672 ± 0.17 |
| ET | 0.869 ± 0.03 | 0.773 ± 0.08 | 0.801 ± 0.08 | 0.690 ± 0.12 | 0.576 ± 0.19 | 0.592 ± 0.17 |
| ET with adaboost | 0.936 ± 0.02 | 0.853 ± 0.08 | 0.879 ± 0.06 | 0.869 ± 0.09 | 0.700 ± 0.11 | 0.743 ± 0.12 |
| RF | 0.900 ± 0.02 | 0.807 ± 0.10 | 0.811 ± 0.11 | 0.778 ± 0.11 | 0.670 ± 0.14 | 0.700 ± 0.13 |
| RF with adaboost | 0.934 ± 0.01 | 0.798 ± 0.09 | 0.797 ± 0.09 | 0.839 ± 0.06 | 0.683 ± 0.14 | 0.670 ± 0.12 |
| GB | 0.892 ± 0.06 | 0.783 ± 0.14 | 0.800 ± 0.11 | 0.773 ± 0.06 | 0.609 ± 0.21 | 0.645 ± 0.17 |
| GB with adaboost | 0.913 ± 0.03 | 0.885 ± 0.11 | 0.868 ± 0.08 | 0.819 ± 0.09 | 0.713 ± 0.16 | 0.722 ± 0.13 |
| *Linear models* | | | | | | |
| Linear | 0.884 ± 0.05 | 0.821 ± 0.08 | 0.829 ± 0.09 | 0.895 ± 0.04 | 0.795 ± 0.11 | 0.815 ± 0.08 |
| Linear-Lasso | 0.813 ± 0.03 | 0.748 ± 0.11 | 0.781 ± 0.09 | 0.911 ± 0.03 | 0.698 ± 0.19 | 0.755 ± 0.13 |
| Linear-Ridge | 0.890 ± 0.04 | 0.818 ± 0.07 | 0.833 ± 0.06 | 0.880 ± 0.10 | 0.678 ± 0.11 | 0.720 ± 0.08 |

**Table 3.** Overview of the average predictive accuracy as RMSE of the predicted fuel quality for the different methods and algorithms.

| Model | Hist.: Best | Hist.: Average | Hist.: Median | Texture: Best | Texture: Average | Texture: Median |
|---|---|---|---|---|---|---|
| *Tree-type models* | | | | | | |
| DT | 6.95 | 10.78 | 10.23 | 9.77 | 13.85 | 12.64 |
| DT with adaboost | 5.91 | 10.56 | 8.59 | 8.16 | 12.37 | 11.80 |
| ET | 5.68 | 10.55 | 9.77 | 11.04 | 12.79 | 13.91 |
| ET with adaboost | 6.61 | 10.85 | 9.13 | 9.12 | 13.60 | 12.13 |
| RF | 6.27 | 9.86 | 10.04 | 8.49 | 12.94 | 12.06 |
| RF with adaboost | 6.51 | 8.56 | 8.89 | 8.32 | 12.42 | 11.38 |
| GB | 5.66 | 8.45 | 8.60 | 7.94 | 10.87 | 10.97 |
| GB with adaboost | 5.29 | 7.52 | 6.93 | 7.56 | 10.16 | 9.78 |
| *Linear models* | | | | | | |
| Linear | 8.90 | 11.93 | 12.17 | 5.56 | 7.17 | 7.12 |
| Linear-Lasso | 6.21 | 8.27 | 7.30 | 8.27 | 10.19 | 8.52 |
| Linear-Ridge | 6.03 | 7.79 | 7.80 | 7.57 | 8.72 | 8.84 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Plankenbühler, T.; Kolb, S.; Grümer, F.; Müller, D.; Karl, J.
Image-Based Model for Assessment of Wood Chip Quality and Mixture Ratios. *Processes* **2020**, *8*, 728.
https://doi.org/10.3390/pr8060728
