Influence of Heartwood on Wood Density and Pulp Properties Explained by Machine Learning Techniques

Abstract: The aim of this work is to develop a tool to predict some pulp properties, e.g., pulp yield, Kappa number, ISO brightness (ISO 2470:2008), fiber length and fiber width, using the sapwood and heartwood proportion in the raw material. For this purpose, Acacia melanoxylon trees were collected from four sites in Portugal. The percentage of sapwood and heartwood, the cross-sectional area and the stem eccentricity (in the N-S and E-W directions) were measured on transversal stem sections of A. melanoxylon R. Br. The relative position of the samples with respect to the total tree height was also considered as an input variable. Different configurations were tested until the maximum correlation coefficient was achieved. A classical mathematical technique (multiple linear regression) and machine learning methods (classification and regression trees, multi-layer perceptron and support vector machines) were tested. Classification and regression trees (CART) was the most accurate model for the prediction of pulp ISO brightness (R = 0.85). The other parameters could be predicted with fair results (R = 0.64-0.75) by CART. Hence, the proportion of heartwood and sapwood is a relevant parameter for pulping and pulp properties, and should be taken as a quality trait when assessing a pulpwood resource.


Introduction
Blackwood and other acacias were introduced in Portugal at the beginning of the twentieth century in order to colonize dry and poor sandy soils along the coast. Some of the species are considered invasive, as they are characterized by vigorous tree growth, root sprouting and fire-stimulated seed germination. Given the high timber value of Acacia melanoxylon in its region of origin, a research effort was made to study its wood quality in view of its valorization as a timber species in Europe. For instance, studies were made on heartwood development [1], density and mechanical properties [2], as well as on its utilization as a raw material for pulp and paper [3,4].
Blackwood is a medium-density hardwood with basic density ranging between 450 and 850 kg/m³ [2]. The density, which is usually higher for heartwood, will influence the paper properties [5,6].
The wood quality of blackwood trees is primarily given by the amount of heartwood, which generally shows a rich brown color and a high natural durability [7,8]. Heartwood, which has a high content of extractives, mostly dark-colored and phenolic, impacts negatively on wood chemical pulping [9], pulp yields and pulp brightness [10-13], and affects paper manufacturing [14-16]. However, the heartwood color can be a positive feature in furniture. The variation of heartwood and sapwood in Acacia melanoxylon grown in Portugal showed no significant variation between trees and sites [1].
Knowledge about the influence of heartwood percentage on the properties of the final product is important for stand management and raw material selection in the industry.
Machine learning focuses on the biological learning process and tries to emulate it through algorithms that are able to learn from given data and provide new results. Machine learning techniques have been shown to be an effective tool for modelling and predicting natural parameters in different fields.
Among the examples in the literature, these techniques were used to evaluate manufactured paper [17], to obtain the mechanical properties of wood and cork from physical properties [18][19][20], to predict paper properties from its density [5], to discriminate wood types [21] and to evaluate tree growth [22], to name a few.
The objective of this work is to develop a tool to predict pulp properties (pulp yield, Kappa number, ISO brightness (ISO 2470:2008), fiber length and fiber width) from characteristics of the raw material, such as the sapwood and heartwood proportion, the stem eccentricity in the N-S and E-W directions, and the relative position of the samples with respect to the total tree height. For this purpose, we first applied a classical mathematical technique (multiple linear regression), and then used different machine learning techniques, namely classification and regression trees (CART), multi-layer perceptron (MLP), which is a particular case of artificial neural networks, and support vector machines (SVM).

Dataset
Transversal stem sections of A. melanoxylon belonging to 20 trees from four sites in Portugal were used: the Camarido National Forest (MNC), at the mouth of the Minho River, in the littoral north close to Caminha; the Forest Perimeter of Ovar Dunes (PFDOVM), in the littoral north close to Ovar; the Forest Perimeter of Rebordões de Santa Maria (PFRSM), in the north mid-interior close to Ponte de Lima; and the Forest Perimeter of Crasto Mountain (PFC), in the centre interior close to Viseu [23]. Selective harvesting was done for sawn timber with a diameter at breast height (dbh) above 40 cm over-bark, corresponding to a rotation age of about 50 years. Six samples were taken from each tree at different heights: one from the top, one from the bottom, and one each from 65%, 35%, 15% and 5% of the total height of the tree.
The heartwood was marked out visually by its color, and its area was estimated as the area of a circle, using the mean radius of direct measurements on the four geographic exposures of each disc. The tree and heartwood volumes were calculated per stem section comprised between the different height levels of the sampling. Conical sections for 0%-5%, 5%-15%, 15%-35%, 35%-65% and 65%-top were considered. The sapwood volume was calculated as the difference between the total wood and heartwood volumes, according to Rucha et al. [24].
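The heartwood area estimate described above can be sketched as follows; this is an illustrative reconstruction of the circle-area calculation, and the function name and sample radii are hypothetical, not from the original study.

```python
import math

def heartwood_area(radii_cm):
    """Estimate heartwood area (cm^2) as the area of a circle whose radius
    is the mean of the direct measurements taken on the four geographic
    exposures (N, S, E, W) of a disc."""
    mean_r = sum(radii_cm) / len(radii_cm)
    return math.pi * mean_r ** 2

# Example: four radius measurements (cm), one per exposure
area_cm2 = heartwood_area([6.1, 5.8, 6.3, 5.9])
```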
The samples were subsequently chipped into particles of approximately 50 mm length, 10 mm width and 5 mm thickness, taking care to avoid any knotwood, since its extractive content may influence pulping [25], and submitted to conventional kraft pulping in a forced-circulation digester under the following reaction conditions: active alkali charge 21.3% (as NaOH); sulfidity index 30%; liquor/wood ratio 4/1; time to temperature 90 min; time at temperature (160 °C) 90 min. Experiments were carried out with 25 g (oven dry) of wood, as reported in Santos et al. [6].
For each sample, the following parameters were gathered: position of the sample with respect to the relative height (%), total tree height (m), absolute height of the sample (m), tree diameter at different heights (cm), stem eccentricity in the N-S and E-W directions (cm), and sapwood and heartwood proportion and total area (cm²), according to [23]. In addition, the density (kg/m³) of the wood chips was measured [26]. The pulp was characterized by measuring pulp yield (%), Kappa number, ISO brightness (%), fiber length (mm) and fiber width (µm), according to Santos et al. [6].

Classical Techniques: Multiple Linear Regression
In a multiple linear regression model, two or more explanatory variables relate to a single response variable by means of an adjusted linear function, ŷ_i = β_0 + β_1 x_1i + ... + β_k x_ki, where ŷ_i represents the estimated response of y_i [27,28]. The fit is obtained with the least squares method, and the coefficient of determination is defined as R² = 1 − SSE/(n S²_Y), where SSE is the sum of squared prediction errors and S²_Y is the variance of Y.
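A minimal numpy sketch of this procedure, an ordinary least squares fit with an intercept plus the coefficient of determination; the function names are illustrative, not from the original work.

```python
import numpy as np

def fit_mlr(X, y):
    """Fit y ~ b0 + b1*x1 + ... + bk*xk by ordinary least squares."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SSE/SST, where SST = n * S2_Y."""
    sse = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - sse / sst
```

For exactly linear data, `fit_mlr` recovers the true coefficients and `r_squared` returns 1; on data the model cannot explain, R² can become negative, which is how the regression failures reported later in this paper manifest.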

Machine Learning Techniques: CART
Classification and regression trees (CART) are defined as a non-parametric model that is able to explain the response of a dependent variable given a set of independent variables [29]. The process is based on the creation of subgroups from the initial input data, then analyzing their variance and minimizing this parameter until they are homogeneous.
The prediction of pulp properties was tackled as a regression problem, so the regression trees defined were optimized using mean-squared error (MSE).
Depending on the type of problem (regression or classification), different algorithms can be used to minimize the variance of the subgroups: mean-squared error (MSE) for regression trees, and the twoing rule, cross entropy or Gini's diversity index for classification trees [30].
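The MSE splitting criterion used to grow a regression tree can be sketched as follows, for a single predictor; this is an illustrative sketch of one split search, not the authors' implementation.

```python
import numpy as np

def best_split(x, y):
    """Find the threshold on predictor x that minimizes the weighted MSE
    (variance) of the two resulting subgroups - the criterion used to
    grow regression trees."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    best_t, best_mse = None, np.inf
    for i in range(1, len(x)):
        left, right = y[:i], y[i:]
        # weighted within-group variance of the candidate partition
        mse = (left.var() * len(left) + right.var() * len(right)) / len(y)
        if mse < best_mse:
            best_t, best_mse = (x[i - 1] + x[i]) / 2, mse
    return best_t, best_mse
```

A full CART model applies this search recursively over all predictors, which is also what makes the tree structure interpretable, as exploited in the Discussion.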

Machine Learning Techniques: MLP
Artificial neural networks (ANNs) emulate the natural connections between neurons and how they process information. The structure of the multilayer perceptron (MLP) is based on layers (input, intermediate or hidden, and output) formed by perceptrons arranged in a feed-forward structure, connected by activation functions and weights that are adjusted by training the algorithm until the best output is obtained [31].
Hence, an ANN can be defined by (1) how its layers are interconnected; (2) how its weights are updated, i.e., how it learns; and (3) how the activation function converts a neuron's inputs into its output [31,32].
The training of the network was completed with a 10-fold cross-validation process that guaranteed that all the data were used as training data and as testing data to optimize the weights.
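The 10-fold cross-validation scheme described above can be sketched as an index-splitting routine; this is a generic illustration (function name and seed are arbitrary), not the exact procedure used by the authors.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index arrays so that every observation is used
    exactly once for testing and k-1 times for training."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test
```

Iterating over the generator gives k disjoint train/test partitions whose test parts jointly cover the whole dataset, which is what guarantees that all data serve both roles.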

Machine Learning Techniques: SVM
Since Cortes and Vapnik introduced a new methodology to approach binary classification problems [33], support vector machines (SVM) have become an important machine learning technique for pattern recognition problems due to their global optimum character, parsimony and flexibility [34]. This method is based on finding an optimal separating hyperplane able to classify the observations of an initial input space into a certain number of classes that are at maximum distance from each other. However, SVM have also been applied to regression problems, where the method is usually renamed support vector regression (SVR) [35,36].
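The regression variant replaces the classification margin with an epsilon-insensitive tube around the fitted function; a minimal sketch of that loss (illustrative function name, arbitrary tube width) is:

```python
import numpy as np

def eps_insensitive_loss(y, y_hat, eps=0.1):
    """Epsilon-insensitive loss used in support vector regression:
    deviations inside the eps-tube cost nothing; outside it, the cost
    grows linearly with the distance to the tube."""
    return np.maximum(np.abs(np.asarray(y) - np.asarray(y_hat)) - eps, 0.0)
```

SVR then minimizes this loss plus a regularization term over the model weights, which is what yields the sparse support-vector solution.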

Predictive Model: Input and Output Data
The predictive model defined in this paper consists of some input variables used to obtain other output variables (the desired prediction results) by means of the selected machine learning technique (Figure 1). The problem was approached as a regression problem, where a numeric output is given. Several input sets were tested in order to find the most suitable input data for the model:
• Input 1: sapwood in the wood discs (%), heartwood in the wood discs (%), total area of the wood discs (cm²);
• Input 2: heartwood in the wood discs (%), total area of the wood discs (cm²);
• Input 3: heartwood in the wood discs (%), total area of the wood discs (cm²), stem eccentricity in N-S direction (cm), stem eccentricity in E-W direction (cm);
• Input 4: heartwood in the wood discs (%), total area of the wood discs (cm²), stem eccentricity in N-S direction (cm), stem eccentricity in E-W direction (cm), relative position of the samples with respect to the total tree height (m).
The relevance of input variables was studied by means of PCA, whose results will be discussed in the corresponding section.
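The PCA relevance check can be sketched as follows: variables are standardized and the fraction of total variance carried by each principal component is read off the eigenvalues of the correlation matrix. This is a generic illustration of the procedure, not the authors' exact code.

```python
import numpy as np

def explained_variance(X):
    """Standardize the input variables (zero mean, unit variance) and
    return the fraction of total variance explained by each principal
    component, from the eigenvalues of the correlation matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    eigvals = np.linalg.eigvalsh(np.cov(Z, rowvar=False))[::-1]  # descending
    return eigvals / eigvals.sum()
```

When two variables are linearly dependent (as % sapwood and % heartwood are, since they sum to 100%), one component absorbs essentially all the variance, flagging the redundancy.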
Regarding the output variables, the prediction of the following properties was assessed: density of the wood chips (kg/m³), pulp yield (%), Kappa number, pulp ISO brightness (%), fiber length (mm) and fiber width (µm).

Results
The basic statistical parameters of the selected samples used as input variables to obtain other output variables with the machine learning techniques are reported in Table 1, while Table 2 contains a summary of the results. PCA was performed in order to study the relevance of the variables of each input set. Since these variables have different properties and magnitudes, they were standardized before performing PCA. The results regarding the total variance explained by each principal component are included in Table 3. The performance of the predictive models was assessed by means of Pearson's correlation coefficient (R). Furthermore, the root mean square error (RMSE) was also considered to assess the CART, MLP and SVM predictive models. It gives the standard deviation of the model prediction error, so the smaller its value, the better the model performance. For the multiple linear regression results, R² was obtained.
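The two evaluation metrics used here can be sketched in a few lines; the function names are illustrative.

```python
import numpy as np

def pearson_r(y, y_hat):
    """Pearson's correlation coefficient between observed and predicted values."""
    return np.corrcoef(y, y_hat)[0, 1]

def rmse(y, y_hat):
    """Root mean square error: the standard deviation of the prediction error."""
    return np.sqrt(np.mean((y - y_hat) ** 2))
```

Note that a perfect linear association gives R = 1 even if predictions are offset from the observations, whereas RMSE is zero only for exact predictions, which is why the two metrics are reported together.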
Table 4 shows in more detail the performance of the CART models for the different inputs and outputs. Figures 2 and 3 display the real and predicted values for Output 3 and Output 4, respectively, when obtained by means of the trained CART from Input 4.

Discussion
The PCA provides valuable information regarding the role played by each variable in the predictive model (Table 3). The four input sets were studied and their principal components show the following: regarding Input 1, there is a linear dependence between two of the variables (quite obvious, since % sapwood and % heartwood sum to 100%). The remaining input sets do not show a linear dependence between their variables, and none of the variables could be removed without losing a considerable amount of information.
According to the results obtained, multiple linear regression turned out not to be appropriate for the given data when the input was Input 2, Input 3 or Input 4, since the resulting R² statistic was negative. When Input 1 was used, multiple linear regression performed poorly whatever the output was, with Output 3 showing the highest R² = 0.46. These results are coherent with the PCA. Furthermore, given the complexity of the data, the results of multiple linear regression with Inputs 2, 3 and 4 are proof of the non-linear relationship among the variables considered. Bearing this in mind, machine learning techniques were applied to the problem.
Among the machine learning techniques tested, MLP and SVM performed similarly: poor results were obtained regardless of the input and output, as shown in Table 2. This could be explained by the limited number of input variables and observations available, since these methods usually perform well with large databases [37].
Based on our results, CART is the technique that best predicts wood pulp properties, better than multiple linear regression, MLP and SVM. All the output variables tested showed higher correlation coefficients and lower RMSE values when they were obtained through CART models, regardless of the input parameters (Table 4). Pulp ISO brightness gave the best results (R = 0.85, RMSE = 2.425), followed by Kappa number (R = 0.74, RMSE = 0.758), whereas the remaining properties (wood chip density, pulp yield, fiber length and fiber width) had similar results (R = 0.64-0.68, with RMSE consistent with the dispersion of the data). Furthermore, the results show that Input 3 and Input 4 perform better in general terms than Inputs 1 and 2, which is in accordance with the PCA results that indicated the distributed importance of the variables.
No models were found in the literature that predict these output variables from these input variables. However, other models with similar Pearson's correlation coefficients were found for wood quality prediction. Fritz et al. [38] used structure-from-motion with multi-view stereo-photogrammetry and obtained R = 0.70 for estimating the stem radius of oak (Quercus robur), hornbeam (Carpinus betulus) and maple (Acer pseudoplatanus) trees. Knapic and Pereira [39] found significant positive correlations of heartwood diameter with tree diameter at breast height (R = 0.63).
Furthermore, CART provides valuable information regarding the influence of the different input parameters, since the structure of the tree represents the regression process until the result is reached [29]. Stem eccentricity in both the N-S and E-W directions showed up as an important input parameter for an accurate prediction of all the output variables. The same holds for the relative position of the samples with respect to the total tree height, which is included in the Input 4 configuration that gave the most accurate results for five of the six outputs predicted.
Stem eccentricity is frequently linked to the presence of reaction wood, which is produced on inclined trunks or branches. Depending on the reaction wood proportion, the wood properties can be affected to a greater or lesser extent. The eccentricity of a log strongly influences timber processing and the yield of wood components. Tension wood, i.e., reaction wood in hardwoods, produces paper with inferior strength properties (namely tensile index, burst index and tear index) compared to normal wood, given the differences in the morphological ultrastructure [40].

Conclusions
This research work focuses on predicting pulp properties from raw-material characteristics, testing the suitability and applicability of different mathematical techniques to correlate the available data and obtain the density of wood chips, pulp yield, Kappa number of the pulp, pulp ISO brightness and fiber length and width.
A classical mathematical approach, namely multiple linear regression, as well as machine learning techniques, namely CART, MLP and SVM, were used for this purpose. CART was shown to be the most appropriate technique to predict pulp properties among the four tested. The poor results of the remaining techniques may be due to the limited database and the heterogeneity of the data, which make it difficult to model. The pulp properties best predicted were ISO brightness and Kappa number, which could be modelled reasonably well by means of CART.
Taking advantage of the CART structure, the influence of the different input parameters could be analyzed. Thereby, stem eccentricity, an indicator of the presence of reaction wood, was shown to influence the strength of the final paper products due to the resulting differences in the morphological ultrastructure.
This kind of study provides valuable information to correct or prevent undesirable properties in the final products, as well as highlighting the existing relationships between the different variables of the process.

Figure 1. Predictive model defined in this paper. CART, Classification and Regression Trees; MLP, Multi-Layer Perceptron; SVM, Support Vector Machines; N, North; S, South; E, East; W, West.

Figure 2. Prediction of Output 3 with the trained CART model from Input 4.

Figure 3. Prediction of Output 4 with the trained CART model from Input 4.

Table 1. Basic descriptive statistics for the parameters that compose our dataset.

Table 2. Summary of the best results obtained for multiple linear regression, Classification and Regression Trees (CART), Multi-Layer Perceptron (MLP) and Support Vector Machines (SVM) for each output.

Table 4. Results for the CART model. Best results are displayed in bold.