Classification of Wood Chips Using Electrical Impedance Spectroscopy and Machine Learning

Wood chips are extensively utilised as raw material for the pulp and bio-fuel industries, and advanced material analyses may improve the processes that use them. Electrical impedance spectroscopy (EIS) combined with machine learning was used to analyse the heartwood content of pine chips and the bark content of birch chips. A novel electrode system integrated into a sampling container was developed for the tests, covering the frequency range 42 Hz–5 MHz. Three electrode pairs were used to measure the samples in the x-, y- and z-directions. Three machine learning methods were used: k-nearest neighbor (KNN), decision tree (DT) and support vector machines (SVM). The heartwood content of pine chips and bark content of birch chips were classified with an accuracy of 91% using EIS from pure materials combined with a k-nearest neighbor classifier. When using mixed materials and multiple classes, 73% correct classification for pine heartwood content (four groups) and 64% for birch bark content (five groups) were achieved.


Introduction
The variety of solid bio-based raw materials has expanded rapidly during the last decade, and with it the variation in quality and specific properties. Wood chips are extensively utilised as a raw material for many bio-refining industrial processes, including bio-energy production and the pulp and liquid bio-fuel industries. High quality wood chips used for pulp production are commonly known as pulp chips. If the properties of chip materials are known beforehand, the processes may be improved, for example by adjusting the amount of chemicals. Large amounts of resins or bark may cause problems in bio-refining processes.
To the best of our knowledge, there is no feasible online technique to determine the extractive content of wood chips [1][2][3][4][5]. Perhaps the main challenges in the measurement are the inhomogeneous nature of the material and the temperature variation, which starts below zero in Northern countries. Near-infrared/infrared (NIR/IR) techniques have been used for laboratory analyses of extractives and for online moisture content (MC) measurement of wood chips. The drawback of these techniques is that the calibration is often very specific and only the surface layer can be measured. In contrast, the EIS technique makes it possible to obtain information from a relevant volume of biomass, and not only from the surface.
Electrical impedance spectroscopy (EIS) has been widely used to study different types of biological materials [6]. One of the main applications has been to study fundamental electrical properties of materials and correlate these properties with material structure. It may be used to investigate the dynamics of bound or mobile charge in the bulk or interfacial regions of liquid or solid materials (e.g., ionic or insulator materials). Many processes take place throughout the material when it is electrically [...]
[...] the bark was separated from the stem wood. Then the materials were mixed to get birch chip specimens with bark contents of 100%, 75%, 50%, 25% and 0%. The pure materials are shown in Figure 1. The size distributions of the wood samples were similar, but the bark samples included more small particles (Figure 1).
Electrical impedance (Z) can be considered a complex quantity, which consists of a real part (resistance R) and an imaginary part (reactance X). Reactance can take two forms, inductive and capacitive. The impedance can be plotted as separate functions of frequency (e.g., capacitance and conductance spectra) or in a complex plane with frequency as the parametric variable. In the impedance plane, the measurements are presented by plotting the imaginary part as a function of the real part.
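The relation between the complex impedance, its real and imaginary parts, and the modulus and phase values can be sketched as follows. This is a minimal illustration only: the parallel resistor-capacitor model, the component values and the function names are assumptions made for the example and are not taken from the study.

```python
import cmath
import math


def parallel_rc_impedance(freq_hz, resistance_ohm, capacitance_f):
    """Complex impedance Z = R + jX of a resistor in parallel with a capacitor."""
    omega = 2 * math.pi * freq_hz
    return resistance_ohm / (1 + 1j * omega * resistance_ohm * capacitance_f)


def modulus_and_phase(z):
    """Return (|Z| in ohms, phase angle in degrees) of a complex impedance."""
    return abs(z), math.degrees(cmath.phase(z))


# Hypothetical component values, chosen only to make the numbers readable.
R_OHM, C_FARAD = 1.0e6, 10e-12

for f in (42.0, 10e3, 1e6, 5e6):
    z = parallel_rc_impedance(f, R_OHM, C_FARAD)
    mod, phase = modulus_and_phase(z)
    # z.real is the resistance R and z.imag the reactance X (negative = capacitive);
    # plotting z.imag against z.real gives one point of the impedance-plane plot.
    print(f"{f:>9.0f} Hz  R={z.real:.3e}  X={z.imag:.3e}  |Z|={mod:.3e}  phase={phase:.1f} deg")
```

For such a model the reactance is negative at every non-zero frequency, which corresponds to the capacitive form of reactance mentioned above.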
The impedance measurements were carried out in a container (polypropylene, wall thickness 1.6 mm) with the measurement electrodes attached to the walls of the container. Prior to the impedance spectrum measurements, the measurement container was weighed with a plastic bag (polyethylene, thickness 10 µm) without sample chips. Then, the plastic bag in the container was filled with chips and weighed again. The plastic bag was first put into the container and then filled carefully with the tested material until the container was full. After filling, the lid was installed. Temperature (T) and relative humidity (RH) were also measured. The measurements were conducted under normal laboratory conditions (T = 22-24 °C; RH = 25-40%). Weighing was used to determine the moisture content but not in further analyses.
A Hioki 3531Z HiTester impedance analyser was used to measure the complex impedance spectrum from 42 Hz to 5 MHz. The total number of different frequencies was 34. The excitation was a sine wave with an open circuit voltage of 10 V peak-to-peak, and each value was averaged over 16 measurements at slow speed. Before the measurement series, open and short circuit compensations were carried out to calibrate the impedance system. Aluminum sheet electrodes were used in this study (Figures 2 and 3). They were glued inside the measurement container and connected to the impedance analyser via coaxial cables. Three electrode pairs were used to measure the wood chip samples in the x-, y- and z-directions. The size of the electrodes was 50 × 80 mm (height × width) in the x- and y-directions and d = 100 mm in the z-direction; the distance between electrodes was 118 mm in the z-direction and 160 mm in the x- and y-directions. During the measurement, the electrodes were in contact with the plastic bag containing the wood chip sample, and thus there was a capacitive connection between sample and electrodes. The electrodes in the x- and y-directions (electric field in the horizontal direction) were curved, and those in the z-direction (electric field in the vertical direction) were flat. The measurement was carried out in a Faraday cage to reduce electromagnetic field noise.
Sensors 2020, 20, x FOR PEER REVIEW
A series of impedance measurements was conducted, including drying and re-wetting of the samples. The first impedance measurements were made in the original condition after sample preparation (originally fresh wood). The next impedance measurements were made while the samples were drying in the laboratory (T = 22-24 °C, RH = 17-19%) for several days. Before each measurement, the sample was mixed to even out the moisture distribution inside it. When the MCs of the samples were below 20%, water was added to the samples to increase the MC. After adding the water, the samples were conditioned for at least 24 h before the next impedance measurements. The measurement series, including drying and re-wetting, was repeated twice. After the series, the samples were dried at +103 °C and the accurate MC was determined by weighing. Finally, the dried samples were measured again using EIS.
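The gravimetric MC determination described above (weighing before and after oven drying) can be sketched as a short calculation. The text does not state whether MC is expressed on a wet or a dry basis, so the sketch supports both and defaults to wet basis purely as an assumption; the function names and the masses in the usage lines are hypothetical.

```python
def net_mass(gross_mass_g, tare_mass_g):
    """Mass of the chips alone: gross reading minus container and plastic bag."""
    return gross_mass_g - tare_mass_g


def moisture_content(wet_mass_g, dry_mass_g, basis="wet"):
    """Moisture content in percent from the masses before and after oven drying.

    basis="wet": water mass relative to the moist sample mass (assumed default).
    basis="dry": water mass relative to the oven-dry sample mass.
    """
    if basis not in ("wet", "dry"):
        raise ValueError("basis must be 'wet' or 'dry'")
    if dry_mass_g <= 0 or wet_mass_g < dry_mass_g:
        raise ValueError("masses must satisfy 0 < dry_mass_g <= wet_mass_g")
    water_g = wet_mass_g - dry_mass_g
    reference_g = wet_mass_g if basis == "wet" else dry_mass_g
    return 100.0 * water_g / reference_g


# Hypothetical readings: 950 g gross with a 150 g container and bag, 480 g after drying.
chips_g = net_mass(950.0, 150.0)
print(moisture_content(chips_g, 480.0))              # wet basis
print(moisture_content(chips_g, 480.0, basis="dry"))  # dry basis
```

The two conventions diverge quickly at high moisture levels, so the basis should be fixed before MC values are compared between studies.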


The Extractive Content Determination
The extractive contents of the samples were determined according to the method described in [41]. The method includes acetone extraction and gas chromatography analysis. The amount of acetone-soluble matter in wood chips provides a measure of the content of wood extractives, often called resin. The acetone-soluble matter includes, e.g., fatty acids, resin acids, fatty alcohols, sterols, di- and tri-glycerides, steryl esters and waxes. In addition, acetone extracts of wood chips and mechanical pulps may also contain phenolic compounds such as lignans.

Data Analysis
Three classification methods were used: k-nearest neighbor (KNN), decision tree (DT) and support vector machines (SVM) (Figure 4). The classification methods were tested and compared by using training and testing sets. The input from the training set was fed into a classifier and the classifier was trained. After the training, the trained classifier was applied to the testing set and the correctness of its output was determined. The tests were carried out using Matlab 2016b and the Classification Learner app (The MathWorks, Inc., Natick, MA, USA).
The k-nearest neighbor classifier is a nonparametric classifier. The training set for each class represents that class, and an unknown pattern from the testing set is classified by finding its nearest neighbors among the training patterns. Statistically more reliable results can be achieved by using more than one nearest neighbor. In KNN, the unknown pattern is assigned to the class that holds the majority among its k nearest neighbors in the training set.
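The majority-vote rule can be illustrated with a short sketch. It is not the Matlab implementation used in the study: the function name, the two-feature toy data and the tie handling are assumptions made for the example. The Euclidean distance matches the metric reported for the KNN models.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Classify `query` by majority vote among its k nearest training patterns."""
    # Pair every training pattern's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    # On a tied vote, the label encountered first (the nearer pattern) wins.
    return votes.most_common(1)[0][0]


# Invented two-feature patterns, e.g. two standardized impedance features.
train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = ["sapwood", "sapwood", "heartwood", "heartwood"]

print(knn_predict(train_x, train_y, (0.2, 0.3), k=3))
print(knn_predict(train_x, train_y, (5.5, 5.0), k=1))
```

With k = 1 the decision follows the single closest pattern, whereas a larger k smooths the decision at the cost of blurring small classes.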
A decision tree is a nonparametric classifier. It builds a tree model based on the training data, where the root of the tree is the entire population of the input and each leaf represents one of the classification outputs. The output leaf is reached through decision nodes, each of which tests an input value.
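How a single decision node may choose its test can be sketched with the Gini index, the split criterion reported for the decision tree models. This is a simplified illustration for one numeric feature, with invented data and hypothetical function names; a full tree repeats the search recursively over all features.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini index of a set of class labels: 0 for a pure node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(values, labels, threshold):
    """Size-weighted Gini index of the two child nodes created by `threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini_impurity(left) + len(right) / n * gini_impurity(right)


def best_threshold(values, labels):
    """Threshold (midpoint between sorted unique values) with the lowest Gini index."""
    unique = sorted(set(values))
    if len(unique) < 2:
        raise ValueError("need at least two distinct feature values to split")
    midpoints = [(a + b) / 2 for a, b in zip(unique, unique[1:])]
    return min(midpoints, key=lambda t: split_gini(values, labels, t))


# Invented feature values for two classes.
values = [1.0, 2.0, 8.0, 9.0]
labels = ["wood", "wood", "bark", "bark"]
print(best_threshold(values, labels))
```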
A support vector machine is a nonparametric classifier. It builds a hyperplane that maximizes the margin between the classes. The hyperplane is built based on the training observations that lie closest to the boundary between the classes. The training observations that are used to build the separating hyperplane are called support vectors. The decision boundary can be linear or, when a kernel function is used, nonlinear.
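The notions of margin and support vectors can be made concrete with a small sketch. It does not train an SVM: it only evaluates a given linear hyperplane w·x + b = 0 on invented two-dimensional points, and all names and values are assumptions made for the example.

```python
import math


def decision_value(w, b, x):
    """Value of the linear decision function w.x + b at point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def point_margins(w, b, points, labels):
    """Signed distance of each point to the hyperplane (positive = correct side)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [y * decision_value(w, b, x) / norm for x, y in zip(points, labels)]


def support_vectors(w, b, points, labels, tol=1e-9):
    """Points lying at the minimum distance from the hyperplane."""
    margins = point_margins(w, b, points, labels)
    smallest = min(margins)
    return [x for x, m in zip(points, margins) if abs(m - smallest) <= tol]


# Invented data: labels are -1 and +1; the line x1 = 2 separates the classes
# and is the maximum-margin separator for these four points.
points = [(0.0, 0.0), (1.0, 1.0), (3.0, 1.0), (4.0, 2.0)]
labels = [-1, -1, 1, 1]
w, b = (1.0, 0.0), -2.0

print(min(point_margins(w, b, points, labels)))
print(support_vectors(w, b, points, labels))
```

Only the points returned by `support_vectors` constrain the position of the hyperplane; moving the other points away from the boundary would not change it.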
The machine learning methods were validated using cross-validation, which separates the data into multiple training and test sets. The training sets are used to train the model and the test set is used to calculate the error of the trained model. Leave-one-out cross-validation was used: the method uses one sample as the test set and the rest of the data set as the training set. Leave-one-out cross-validation builds as many different models as there are samples and then calculates the average error of the models. Every sample is used once as a test set and the error is calculated based on these samples. The average accuracy of the models was used to determine the accuracy of the selected model. As leave-one-out cross-validation creates a single result for each of the models, the deviation of the results was not calculated.
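The leave-one-out procedure can be sketched as a simple loop. The one-nearest-neighbor classifier plugged in here, the function names and the toy data are assumptions made for the illustration; any classifier with the same call signature could be substituted.

```python
import math


def nearest_neighbor_label(train_x, train_y, query):
    """Label of the single training pattern closest to `query` (Euclidean distance)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def leave_one_out_accuracy(samples, labels, classify=nearest_neighbor_label):
    """Fraction of samples classified correctly when each is held out once."""
    correct = 0
    for i in range(len(samples)):
        # Train on everything except sample i, then test on sample i alone.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Invented one-feature data: the isolated "b" sample can only be misclassified.
samples = [(0.0,), (1.0,), (10.0,)]
labels = ["a", "a", "b"]
print(leave_one_out_accuracy(samples, labels))
```

Each held-out sample yields a single correct or incorrect outcome, which is why the text reports an average accuracy rather than a deviation.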
Neighborhood component analysis (NCA) was used to select optimal frequencies from the spectral data. Different frequencies were used for different classification sets.
Trial and error was used to choose the best hyperparameters for each of the models. The models were validated using different hyperparameters and the model with the best accuracy was used as the classifier. KNN was tested using 1, 10 and 100 as the number of neighbors. The Euclidean distance metric was used in all KNN models. Support vector machine models were tested using linear, polynomial and Gaussian kernel functions. One was used for a multiclass method in all support vector machine models. Decision tree models were tested using 4, 20 and 100 as the maximum number of splits. The Gini index was used as the split criterion in all decision tree models. All data were standardized before training the models.
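Standardization and the trial-and-error choice among a few candidate hyperparameter values can be sketched as follows. The use of the sample standard deviation, the function names and the scores in the usage lines are assumptions made for the illustration; the scores are invented and are not results from the study.

```python
import statistics


def standardize(rows):
    """Scale every feature (column) to zero mean and unit sample standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


def pick_best(candidates, score):
    """Return the candidate hyperparameter value with the highest validation score."""
    return max(candidates, key=score)


# Invented feature rows with very different scales per column.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
print(standardize(rows))

# Invented validation accuracies for three candidate numbers of neighbors.
toy_scores = {1: 0.80, 10: 0.85, 100: 0.60}
print(pick_best([1, 10, 100], toy_scores.get))
```

In practice the `score` argument would be a validation routine such as leave-one-out accuracy evaluated on the standardized data.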

Results
Impedance spectra were measured from three different directions, all consisting of 34 different frequencies from 42 Hz to 5 MHz. The impedance modulus and phase values with standard deviations are presented at 10 kHz and 1 MHz (Table 1). Examples of the measured complex electrical impedance spectra are shown for fresh pine (Figure 5) and for fresh birch (Figure 6).
The impedance spectra of pine sapwood chips and heartwood chips were considerably different (Figure 5a,d) with respect to the dispersion frequency and the magnitudes of the real and imaginary parts. For the mixed materials (Figure 5b,c) the differences in spectra were reduced but observable. The measurements in the x- and y-directions (Figure 3) were quite similar, but the z-direction was different.
The impedance spectra of birch wood chips and birch bark were different (Figure 6a,d). In particular, the ratios of the real and imaginary parts of the spectra differed. The real and imaginary parts of the impedance measured from bark were smaller than those of stem wood chips. When spectra of mixed materials (Figure 6b,c) were measured, the differences were reduced but could still be recognized. As with the pine chips, the measurements in the x- and y-directions were quite similar, but the z-direction was different. The differences between the complex impedance spectra measured in the three directions were markedly different for stem wood material and bark.
The three classification techniques were used to determine the efficiency of multi-parameter EIS in categorizing the raw materials. The reported accuracy is the average over the individual model results.
Several classification tests were made. Table 2 shows the classes and MC range for the tests including 75% or 100% pure material content. Classification results are shown in Tables 3 and 4.
Table 2. Classification of heartwood content from pine chips and bark content from birch chips: material classes according to the content percentages and moisture content range of each class.
Table 3. Classification results, correct classification (%). Classification of heartwood content from pine chips, bark content from birch chips and pine chips from birch chips and birch bark. MC range 0-60%. Test groups and correct classification using decision tree (DT), support vector machine (SVM) and k-nearest neighbor (KNN). N = number of samples. Birch/pine classification included three classes: birch, pine and birch bark.
Test group       DT    SVM   KNN   N
Pine100          73    85    90    41
Pine75           69    85    89    80
Birch100         89    93    93    53
Birch75          74    83    75    101
Birch/pine100    65    78    69    80
Birch/pine75     71    84    86    152
Table 4. Classification results, correct classification (%). Pine heartwood/sapwood and birch bark/wood, 5 classes: 0, 25%, 50%, 75% and 100% mix ratios. Classification by using decision tree (DT), support vector machine (SVM) and k-nearest neighbor (KNN). N = number of samples.
Classification of pure materials gave the best results: by using SVM or KNN it was possible to achieve a correct classification rate better than 90% in the MC range 0-60%. The best results were achieved using KNN with 10 as the number of neighbors. SVM gave the best results using a polynomial kernel. When the number of classes was increased, the correct classification (%) was reduced (Tables 3 and 4). When the MC range was above the fiber saturation point (FSP, about 30% MC), the classification results improved: the KNN classifier gave 73% classification accuracy for pine heartwood content, and both KNN and SVM gave 66% for birch bark content. The best results were achieved using KNN with 1 as the number of neighbors. SVM gave the best results using a polynomial kernel.
The results of the extractive content analyses showed that pine heartwood and birch bark contain substantial amounts of extractives compared with birch wood or pine sapwood. The extractive content decreased when the material was dried, except for pine sapwood. The largest change was observed in the birch wood samples.

Discussion
The study presents novel methods to improve the analysis of biomass. So far, the main interest of studies has been moisture content, using a wide range of measurement methods that cover the electromagnetic spectrum from very low frequencies [19,20] to high frequencies, e.g., gamma-rays [24]. This study showed that it is possible to classify materials according to their electrical impedance spectrum, and thus it will be possible to develop more accurate models for MC, as one of the main issues affecting the accuracy of MC determination is the inhomogeneity of the studied material.
In the MC range 0-60%, MC dominates the impedance measurement, as expected. With a narrower MC range (30-60%), which is the typical MC range for industrial processes utilizing wood chips, the classification results improved significantly. It was possible to classify materials even into five classes according to heartwood/sapwood ratio or bark/stem wood ratio. In addition, the pure raw materials could be distinguished from each other using EIS with good accuracy in all MC ranges. The plastic insulator layer (plastic bag) between the aluminum electrodes and the sample effectively inhibited the electrode polarization effect, which would otherwise have a high impact on the measurement. On the other hand, because of the insulation, there is no direct current going through the sample and the response is mainly capacitive. If the contact between electrodes and material had been galvanic, accumulation of ions would have had a significant effect on the results (effect of current density). By using thin plastic bags around the sample, it was possible to eliminate the electrode polarization effect, and the approach is also practical because plastic bags are used in industry.
The determined extractive contents were in accordance with previous studies [23]. Birch bark contains significantly more extractives than birch wood and pine heartwood, and is rich in extractives compared to pine sapwood. The effect of extractives has been reported earlier: electrical impedance spectroscopy may be used to distinguish wood materials according to their extractive content [9]. If the MC variation is high and MC is below the fiber saturation point (FSP), the MC dominates the EIS measurement. When MC is above the fiber saturation point or if the MC is limited to a certain range, classification according to the characteristics of the wood material is possible using EIS. In addition to the extractive content, other characteristics of the material also affect the results. This is especially important for different types of materials such as birch stem wood and bark. The cellulose, hemicellulose, lignin and ash contents are also very different between stem wood and bark [23,42,43].
When comparing the heartwood/sapwood ratio in the MC range 30-60%, there were no pure heartwood samples in that MC range. Thus, a four-group classification was used. On the other hand, MC may be determined accurately using EIS, and thus the MC value might be useful for the material analyses.
When the data was divided into smaller classes, the distributions got narrower. For analyzing materials with 2-3 classes, the distributions were workable for the classification. When the number of classes increased to five, some of the test samples differed considerably from the ones used in the training, which affected the classification results.
One of the new findings in this study was that birch material could be distinguished from pine material with good accuracy in all tested MC ranges. Birch has a higher density on average, which affects the capacitance. In addition, the anatomical structures of hardwood and softwood are different, which affects the results. Structural differences affect the electrical properties, e.g., when measuring in different directions of wood. Here, we measured the electrical properties of samples in the x-, y- and z-directions (Figure 3). The distributions of electrical parameters at different frequencies differ between the top-to-bottom and side-to-side measurements, even though the material is cut into chips and blended. The difference between bark and stem wood materials was detectable especially when comparing the difference between the x- and y-directions and the z-direction.
When comparing the classification methods, the DT classification gave the poorest results. The overall results of SVM and KNN were better and similar to each other. The inhomogeneous materials, with their distributions and complex non-linear data, affected the results. The SVM uses support vectors and KNN uses measurement points for classification, and both are non-linear methods. Thus, the methods produced similar classification results. If only a few neighbor elements are used, KNN may handle non-linear data effectively [44]. All the studied machine learning methods could be used effectively to classify the studied material despite the varying MC. The result shows the potential of electrical spectrum analysis combined with machine learning for advanced analysis of wood chips.
Artificial intelligence-based non-linear methods, with EIS, may be even more advantageous for industrial material analyses than the equivalent model analyses because of the efficiency of the models to extract useful non-linear information from the electrical spectra. The models may be implemented for real time measurement, which is often not possible for equivalent model analyses because of time consumption. In theory, the raw data from electrical spectra contain all information about the measurement. Even though the equivalent model analysis is perhaps the best way to make a theoretical model for an electrical system, it is not always the best choice for industrial application when quick real-time analysis is required.
EIS may be used together with other measurement techniques. By using two or more methods together, the analysis may be improved. Machine vision and X-ray techniques are among the potential methods to increase the biomass analysis accuracy together with EIS. X-rays and EIS can measure the whole volume of biomass, but machine vision systems can get information only from the surfaces. Tomographic systems, including electrical impedance tomography and/or X-ray tomography, may give accurate spatial information about the biomass material.

Conclusions
Good classification efficiency was achieved by using electrical impedance spectrum analysis combined with multiple parameter analyses and machine learning techniques. It was not possible to distinguish small levels of bark or heartwood content if the MC range was 0-60%, but pure materials were distinguished from each other. The results improved significantly when the MC range was limited to typical industrial process moisture contents above the FSP (30-60%). An overall 73% correct classification was achieved for different degrees of pine heartwood content (four classes) when using the KNN classifier. When birch bark content was studied, both KNN and SVM classifiers gave a 66% correct classification rate (five classes). When pure materials were classified (two classes), the KNN classifier gave 93% classification accuracy for birch bark content (MC range 0-58%) and 91% accuracy for pine heartwood content (MC range 0-58%). The EIS method combined with machine learning holds potential as an advanced analysis method for wood chips in laboratory and industrial use.