A Newly Built Model of an Additive Stem Taper System with Total Disaggregation Model Structure for Dahurian Larch in Northeast China

Stem taper function is an important concept in forest growth and yield modeling, and forest management. However, the additivity of the function and the inherent correlations between stem components (diameter outside bark—dob, diameter inside bark—dib, and double-bark thickness—dbt) are seldom considered. In this paper, a total disaggregation model (TDM) structure was developed based on the well-known Kozak (2004) model to ensure the additivity of the stem components. The reconstructed model was fitted with the data of 1281 felled Dahurian larch trees from three regions of Daxing’anling Mountains in Northeast China. The results from TDM were compared with other additive model structures including adjustment in proportion (AP), non-additive taper models (NAM), and three logical structures of NSUR (AMO, SMI, SMB). The results showed that the difference was significant among the three regions. The performance of TDM was slightly better than those of other model structures. Therefore, TDM was considered as another optimal additive system to estimate stem, bark thickness, and volume predicting for Dahurian larch in Northeast China besides NSUR, a method widely used in calculating additive volume or biomass throughout the world. We believe this work is cutting-edge, and that this methodology can be applied to other tree species.


Introduction
The diameter outside bark (dob), inside bark (dib), and bark thickness are crucial measurements in forestry. In particular, dib is used to calculate the wood volume of logs and trees, and is usually measured by subtracting the bark thickness from the dob, whose accuracy mainly depends on the precise estimation of the bark thickness and dob. Incorrect estimation of bark thickness may lead to an inaccurate estimation of bark volume and stand volume in forest inventory, increment study or the log trade [1]. In recent years, accurate predictions of bark products have become a matter of interest for forest managers [2][3][4][5]. The utility of bark has transformed from unnecessary residue to high-value biomass energy [6]. In addition, the value of bark volume has been growing progressively, as it can be used to estimate biomass and quantify carbon stocks [7]. Therefore, new bark thickness models are being developed, and their importance has been gradually recognized by forestry scholars.
Previously, model researchers usually considered the dob, dib or bark thickness separately, and often neglected the inherent correlations among them, even if the accuracy of a single model was very high. Recently, many scholars have solved the additive problem by constructing additive models in forestry, such as additive volume, biomass or crown width equations [8][9][10][11][12][13]. In this study, we used the stem taper function to construct additive models for dob, dib and bark thickness. Through stem taper function, were able to quantify and estimate the whole stem volume, single log volumes of any length, merchantable height in response to any diameter or from any stump height, and diameter inside/outside bark at any point of the stem [14]. Numerous taper functions have been evaluated for a wide range of tree species in different regions. In most cases, they were referenced to dbh, total height, and height from the stump to predict the diameter inside bark [15][16][17], and the diameter outside bark [18][19][20][21]. The bark thickness was rarely predicted in these studies [22,23].
In forestry, four methods are often used for forcing additivity of a set of the nonlinear models. These models include adjustment in proportion (AP), OLS with separating regression (OLSSR), nonlinear seemingly unrelated regression (NSUR), and total disaggregation method (TDM) [24][25][26][27][28][29][30][31]. The AP method directly partitions the total dob of a tree into two stem components-dib and double-bark thickness (dbt) by weighting, where dob, dib and dbt are estimated by OLS first. NSUR, known as a joint-generalized least square regression, proposed in the 1980s, is more flexible than AP and OLSSR [24,25], and has been widely used in additive volume or biomass equations throughout the world [32][33][34]. However, TDM is the best way to reflect the inherent correlations among stem components, integrating the advantages of NSUR and AP, which is also confirmed in the study.
In our study, data of 1281 harvested trees were used to establish an optimal additive system of bark thickness and stem taper (TDM) for Dahurian larch in Northeast China. The newly built system was also compared to other model structures. The additive system of stem taper was shown to provide consistent stem volume estimates for Dahurian larch with or without bark, and that the system would feasibly be a reliable method for scientific forest management.

Study Area and Data Collection
The study site is located to the north of Daxing'anling Mountains in the Heilongjiang and Inner Mongolia provinces of Northeast China (from 50 • 04 N to 53 • 32 N and from 121 • 50 E to 127 • 00 E). This region covers an area of 84,600 km 2 and is among the main areas producing high-quality wood in China. The natural secondary forest of Dahurian larch (Larix gmelinii) is the major forest type here. Elevation ranges from 400 to 1000 m, with a continental climate. The average annual precipitation is from 360 to 550 mm, of which 80% occurs in the summer. The average annual temperature ranges from −1.2 to −5.6 • C. The study area was assigned to three distinct regions [35]. The northwest and the southeast of the northern slope of Yilehuli Mountain are designated as regions one and two, respectively. The eastern slope of the northern part of Daxing'anling Mountain is described as region three ( Figure 1).
The data used in this study were collected from 1281 felled Dahurian larch trees. Before felling, diameter at breast height (D, 1.3 m) was measured to the nearest 0.1 cm for each tree. The trees were then felled and total height (H) was measured to the nearest 0.1 m. The diameters over bark (dob) were measured at heights (h) 2,4,6,8,10,15,20,30,40,50,60,70,80, and 90% of total height, averaging 14 sections per tree (Table 1). Bark thickness was taken with the bark gauge at each relative height. The diameter inside bark (dib) was estimated with the formula: dib = dob − dbt, where dbt is double-bark thickness.
The data of the three regions were combined and randomly divided into two groups: 964 trees (75%) for model fitting and 317 trees (25%) for model validation. Summary statistics for diameter at breast height and total height of sampled trees are provided in Table 1. The data used in this study were collected from 1281 felled Dahurian larch trees. Before felling, diameter at breast height (D, 1.3 m) was measured to the nearest 0.1 cm for each tree. The trees were then felled and total height (H) was measured to the nearest 0.1 m. The diameters over bark (dob) were measured at heights (h) 2,4,6,8,10,15,20,30,40,50,60,70,80, and 90% of total height, averaging 14 sections per tree (Table 1). Bark thickness was taken with the bark gauge at each relative height. The diameter inside bark (dib) was estimated with the formula: dib = dob − dbt, where dbt is double-bark thickness. The data of the three regions were combined and randomly divided into two groups: 964 trees (75%) for model fitting and 317 trees (25%) for model validation. Summary statistics for diameter at breast height and total height of sampled trees are provided in Table  1.

Base Model
To date, the stem taper function of Kozak [14] has performed well in several stem taper studies around the world [18,20,[36][37][38][39][40]. Therefore, we selected this variable exponent function to model the stem taper in this study: where d is stem taper; D is the diameter at breast height; H is total height; T = h/H, h is the height from the stump to any given height; m and b i are parameters, and m = 0.1; For Equation (1), the stem taper function of Kozak [14], various simplified forms can be generated by modifying some independent variables. Such modification can improve the model fitting precision and decrease the multicollinearity of complex models with many correlated variables. Initial model fitting results showed that Equation (1) could be simplified by removing the H b 3 term, and the achieved model provided better fitting and lower multicollinearity for modeling dob and dib. Removal of the b 4 T 4 and b 7 D terms from Equation (1) further improved the fitting precision for dbt. The final modified forms are as follows: where Y dob and Y dib is the stem taper of dob and dib, Y dbt is the double-bark thickness of dbt, ε dob , ε dib , ε dbt is an error term of dob, dib and dbt. Other parameters are the same as aforementioned.
We introduced a dummy variable r to account for stem taper difference of three regions (i.e., r 1 = 1 and r 2 = 0 for region one; r 1 = 0 and r 2 = 1 for region two; r 1 = 0 and r 2 = 0 for region three). After imposing r on b i , models (2)-(4) take the following forms: where λ i are the parameters of dummy variables. Other parameters are the same as mentioned above.

Total Disaggregation Method (TDM)
Tang et al. [26] proposed an additive method, disaggregation model structure. As proposed, a total model was first developed, and the estimated total model was disaggregated into components based on their proportions in the total model. The essence of this method is also NSUR. In this paper, the dob was a total model with the components of dib and dbt. This approach ensured that the sum of values of two components was equal to dob, satisfying the additive property between the total and components. The models are as follows: Based on the above model, the dob was separately fitted by nonlinear OLS. The dib and dbt were estimated by disaggregating the total model into components based on their proportions, through deducing and combining like terms, respectively, and using the NSUR method. The formulas took the following forms: where parameters are the same as described before.

Adjustment in Proportion (AP)
Adjustment in proportion method is another additive method proposed by Tang et al. [26], which ensures that the sum of values of trees components is equal to dob. For example, when an indicator variable r was considered, base models (5)-(7) were separately fitted by nonlinear OLS for dob, dib and dbt. There are many similarities between AP and TDM. We made a comparison, firstly, and chose the best method to compare with other model structures. The estimates of dib and dbt were calculated as follows: estimates of the parameter vectors obtained by fitting models (5)-(7) for dob, dib and dbt, respectively.

Non-Additive Taper Models (NAM)
The dib taper and dbt were separately fitted by OLS, and dob taper estimation was obtained by adding the two components.

Additive Taper Models with Dob Constraint (AMO)
Following the model structure, total amount control method as specified by Parresol [25], the dib taper function and dbt with NSUR approach were constrained to equal the dob taper equation as follows:

Subtraction Taper Models with Dib Constraint (SMI)
First logistic transformation of the total amount control method, the dib taper equation had the difference of simultaneous estimations of dob taper function and dbt taper equation with those of the NSUR approach. The formulas are as follows:

Subtraction Taper Models with Dbt Constraint (SMB)
Second logistic transformation of total amount control method, the dbt had the difference of simultaneous estimations of dob taper function and dib taper equation with those of the NSUR approach. The formulas are as follows: (14) where parameters are the same as aforementioned.
The model residuals of stem taper data often show heteroscedasticity. To overcome the problem, we chose 1/D k as the weight factor (where: k was determined for each model). This factor was multiplied and programmed in the PROC MODEL procedure [41] by specifying resid.D i = resid.D i / D i k (where: resid.D i is the model residual of D i ). The heteroscedasticity was obvious in the models of dbt which was corrected accordingly. For the models of dob and dib, it was almost absent.

Model Assessment and Evaluation
Initially, the accuracy of the single estimate models with and without dummy variables was assessed with the fitting dataset. Later, the predictive abilities of TDM models and other models were evaluated with both fitting and validation dataset. Five statistical indexes, i.e., adjusted coefficient of determination (R a 2 ), mean error (e), residual variance (δ), root mean square error (RMSE) and total relative error (TRE), were tested. The notations for these indexes are as follows: where Y i is the observation data,Ŷ i is the predicted value, N is the total number of observations, and p is the number of parameters. The condition number (CN) was used to test the multicollinearity. The criteria for a CN value that indicated the extent of multicollinearity were arbitrary. It was considered that no collinearity existed in the model when the CN value was less than 30. The value of CN between 30 and 100 signified the existence of multicollinearity, however the model was still acceptable [42].

Ranking of Models
The ranking method proposed by Poudel and Cao [43] was used to obtain the relative and exact position of different models. The relative rank of model i is defined as: where: Rank i is the relative rank of model i (i = 1, 2, . . . , m), S i is the goodness-of-fit statistics, including R a 2 , e, δ, RMSE and TRE, produced by model i. S min is the minimum value of S i , and S max is the maximum value of S i . Relative ranks of 1 and m indicate the best and the worst model. Since the magnitude and the order of the S i 's both are included in the ranking system, it should provide additional information than the conventional ordinal ranks.

Fitting the Base Model
Fitting statistics for dob, dib and dbt taper equations with and without dummy variables are displayed in Table 2. The CN values of models (2)-(4) were less than 40, indicating less or no multicollinearity in models. As expected, the models with dummy variables (5)-(7) reflected some degree of variation and outperformed models (2)- (4). The values of RMSE, R a 2 , e and δ of the models with dummy variables were superior to those of the models without dummy variables. Based on three optimal models, additive systems were constructed in the following section. Note: Models (2)-(4) are reduced models; Models (5)-(7) contain dummy variables.

Model Fitting for Six Methods
The six additive methods (Equations (9)- (14)) were constructed based on models (5)- (7). The goodness-of-fit statistics of six methods are shown in Table 3. Both total disaggregation methods (TDM, AP) provided the same results for dob and both of them were estimated by OLS. The same values of R a 2 were rendered for dib. However, there was a marginal difference in the values of RMSE, e, and δ between the methods. As per average rankings of the methods, the TDM appeared to be more attractive than AP for dib and dbt models.
Among the six models, the dob of TDM/AP, dib of NAM, and dbt of SMB were superior to the others. However, the dbt of SMI, and dob and dib of SMB, behaved inadequately while fitting. This was reflected by the average of total ranks. In general, TDM still performed slightly better than those with other methods (i.e., dib of TDM was slightly better than that of AP and SMB, dbt of TDM was second only to that of SMB). In this case, the TDM was slightly superior to the AMO, which has been widely used, in RMSE, e, δ, TRE and R a 2 by 1.8, 6.5, 3.1, 3.9 and 1.6%, respectively.   Table 4. All parameters were significant (p < 0.0001), including the estimates of dummy variables λ i , which indicated that there was a significant difference in the taper equation systems among the three regions.  Table 4. All parameters were significant (p < 0.0001), including the estimates of dummy variables λi, which indicated that there was a significant difference in the taper equation systems among the three regions.      Table 5 shows the slight superiority of TDM, compared with other model structures. The methods of NAM, AMO, SMI and SMB differed slightly, but were less accurate. The prediction precision of TDM was higher than these four methods as evident from the average ranks. The precision of TDM was closely followed by AP for all variables. The dob and dib were overestimated by TDM model structure. In contrast, they were underestimated by the NAM model. The dob was overestimated and the dib was underestimated consistently by AMO, SMI and SMB models. All model structures underestimated dbt. The differences in mean error (e) of dob, dib and dbt of each model structure reveals the essence of additivity. To evaluate the predictive ability of each method across the entire stem, the relative height was divided into nine sections. The six methods were further assessed on the basis of graphical analysis (Figures 3-5). The three figures show the e, δ, RMSE, and TRE for the six model structures across relative height classes for dob, dib and dbt.  Table 5 shows the slight superiority of TDM, compared with other model structures. The methods of NAM, AMO, SMI and SMB differed slightly, but were less accurate. The prediction precision of TDM was higher than these four methods as evident from the average ranks. The precision of TDM was closely followed by AP for all variables. The dob and dib were overestimated by TDM model structure. In contrast, they were underestimated by the NAM model. The dob was overestimated and the dib was underestimated consistently by AMO, SMI and SMB models. All model structures underestimated dbt. The differences in mean error ( ̅ ) of dob, dib and dbt of each model structure reveals the essence of additivity. To evaluate the predictive ability of each method across the entire stem, the relative height was divided into nine sections. The six methods were further assessed on the basis of graphical analysis (Figures 3-5). The three figures show the ̅ , , RMSE, and TRE for the six model structures across relative height classes for dob, dib and dbt.           Figure 3 shows that additive models of dob with TDM and AP methods slightly outperformed other models in terms of , RMSE, and TRE for most relative height classes, with the exception of relative height classes (rh = 0.7 and rh = 0.8), which could be attributed to the base of live crown. The change of tapering grades at the base of the live crown may result in the defective performance of taper models. The results were in accordance with those of Lee et al. [44] and Rodríguez et al. [23]. All methods underestimated the dob (0.2 = < rh < 0.6 and rh = 0.9) and exhibited the smallest residuals. Figure 4 shows that the additive models of dib with TDM and AP slightly outperformed other methods in terms of , RMSE, and TRE for most relative height classes, with the exception of relative height classes (rh = 0.2 and rh = 0.9). Most additive models with six methods nearly overestimated the dib, with the exception of relative height classes (0.2 = < rh < 0.6) for the model with the NAM method, relative height classes (0.3 = < rh <0.4) for the models with SMI and SMB methods, and relative height classes (rh = 0.9) for the models with TDM, AP and NAM methods. Figure 5 shows that additive models of dbt with the TDM method slightly outperformed other models in terms of , RMSE, and TRE for most relative height classes (0.3 = < rh < 0.8), and the model with the SMB method also outperformed other models in terms of ̅ , , RMSE, and TRE for lower section (rh = 0.1). All model structures underestimated values of dbt. These conclusions are consistent with Table 5.

Model Validation for Six Methods
The six methods were subsequently used to describe the stem tapers in the three regions ( Figure 6), each graph describing trees of the same size (i.e., same D and H), representing the average tree. There were no great differences, with graphs showing that all stem tapers for dob and dib were almost identical, and that these function plots had consistent trends, except for a slight difference in the lower section among the three regions. However, the graphs showed differences in the dbt curves across the six methods, and the predicted values of SMB for dbt were relatively smaller than those of other methods in region two at the interval (0.4 ≤ rh). The predicted values of TDM for dbt were relatively bigger to those of other methods in region three at the interval (0.5 ≤ rh ≤ 0.7).  Figure 3 shows that additive models of dob with TDM and AP methods slightly outperformed other models in terms of δ, RMSE, and TRE for most relative height classes, with the exception of relative height classes (rh = 0.7 and rh = 0.8), which could be attributed to the base of live crown. The change of tapering grades at the base of the live crown may result in the defective performance of taper models. The results were in accordance with those of Lee et al. [44] and Rodríguez et al. [23]. All methods underestimated the dob (0.2 = < rh < 0.6 and rh = 0.9) and exhibited the smallest residuals. Figure 4 shows that the additive models of dib with TDM and AP slightly outperformed other methods in terms of δ, RMSE, and TRE for most relative height classes, with the exception of relative height classes (rh = 0.2 and rh = 0.9). Most additive models with six methods nearly overestimated the dib, with the exception of relative height classes (0.2 = < rh < 0.6) for the model with the NAM method, relative height classes (0.3 = < rh <0.4) for the models with SMI and SMB methods, and relative height classes (rh = 0.9) for the models with TDM, AP and NAM methods. Figure 5 shows that additive models of dbt with the TDM method slightly outperformed other models in terms of δ, RMSE, and TRE for most relative height classes (0.3 = < rh < 0.8), and the model with the SMB method also outperformed other models in terms of e, δ, RMSE, and TRE for lower section (rh = 0.1). All model structures underestimated values of dbt. These conclusions are consistent with Table 5.
The six methods were subsequently used to describe the stem tapers in the three regions ( Figure 6), each graph describing trees of the same size (i.e., same D and H), representing the average tree. There were no great differences, with graphs showing that all stem tapers for dob and dib were almost identical, and that these function plots had consistent trends, except for a slight difference in the lower section among the three regions. However, the graphs showed differences in the dbt curves across the six methods, and the predicted values of SMB for dbt were relatively smaller than those of other methods in region two at the interval (0.4 ≤ rh). The predicted values of TDM for dbt were relatively bigger to those of other methods in region three at the interval (0.5 ≤ rh ≤ 0.7).

Discussion
In this study, our data were collected across a large geographical region in Northeast China covering a variety of topographic and stand conditions. Six forms of nonlinear additive model structures were built and tested with the data. Our results demonstrated that the TDM model structure was nearly as accurate as other models in goodness-of-fit and predictive abilities in terms of average prediction errors for dob, dib and dbt. While other models had a slight advantage, differences between the six model structures were small. All model structures took into account the intrinsic correlations among the stem components and provided efficient parameter estimation. In NSUR model structures (i.e., AMO), the prediction accuracy of dob depended on the accuracy of dib and dbt models. The aggregation nature of the systems required that each component be estimated to obtain dob. A relatively large prediction error in any component model could affect the prediction accuracy of dob. The prediction precision of TDM and AP model structures depended on the accuracy of the dob model, which was the most accurate among the dob, dib and dbt models. The essence of TDM model structure is also the NSUR, which theoretically led to the unbiased estimation, satisfying the additive property of the total and components, and guaranteeing their proportions in the total model. It was also recognized that AMO, the NSUR method, had been widely used in additive volume or biomass equations throughout the world [32][33][34]. Admittedly, it was difficult to deduct and fit the additive system derived from the TDM. However, the advantages of TDM were substantiated in this study. These appealing characteristics of the TDM model structure were supported by our results. Therefore, the TDM was shown to be another better choice to develop additive

Discussion
In this study, our data were collected across a large geographical region in Northeast China covering a variety of topographic and stand conditions. Six forms of nonlinear additive model structures were built and tested with the data. Our results demonstrated that the TDM model structure was nearly as accurate as other models in goodness-of-fit and predictive abilities in terms of average prediction errors for dob, dib and dbt. While other models had a slight advantage, differences between the six model structures were small. All model structures took into account the intrinsic correlations among the stem components and provided efficient parameter estimation. In NSUR model structures (i.e., AMO), the prediction accuracy of dob depended on the accuracy of dib and dbt models. The aggregation nature of the systems required that each component be estimated to obtain dob. A relatively large prediction error in any component model could affect the prediction accuracy of dob. The prediction precision of TDM and AP model structures depended on the accuracy of the dob model, which was the most accurate among the dob, dib and dbt models. The essence of TDM model structure is also the NSUR, which theoretically led to the unbiased estimation, satisfying the additive property of the total and components, and guaranteeing their proportions in the total model. It was also recognized that AMO, the NSUR method, had been widely used in additive volume or biomass equations throughout the world [32][33][34]. Admittedly, it was difficult to deduct and fit the additive system derived from the TDM. However, the advantages of TDM were substantiated in this study. These appealing characteristics of the TDM model structure were supported by our results. Therefore, the TDM was shown to be another better choice to develop additive taper equations. This methodology would be useful for forest researchers interested in developing more precise systems of stem taper models in the future.
It is known that stem volume is important for forest management. Therefore, we further verified the prediction ability of six additive models for corresponding volume. The following Table 6 shows the model validation results of each part of volume calculated by numerical integration for each additive model. Comprehensive indicators and the precision of TDM were closely followed by AP for all variables. The two additive volume model methods are better than the other four models. They reflect the advantages of the additivity model and volume model constructed by TDM. It should be noted that correlated error structure in the data was not taken into account in the model fitting process due to convergence problems. For instance, a test of autocorrelation for the TDM method showed that the models of dib and dbt failed to achieve convergence. Prediction accuracy is little affected by the correlated error structure [45]. For practical applications, auto-correlation is generally ignored when using models for prediction [46][47][48][49][50].
To date, additivity has been used in the forestry field, for calculations such as biomass, volume [12,13,32], and crown [10,11]. However, it is rare to study the additivity of taper equation. As stated earlier, only Rodríguez et al. [23] have conducted relevant research with the NSUR (SMB) method based on 351 Corsican pines, where the authors considered auto-correlation, and predicted the whole-tree volume and the different components of Corsican pine. Inspired by this, our paper constructed a new additive model, which compared the advantages and disadvantages of different existing additive structures. Our work expanded and perfected the additive theoretical system of taper equation.

Conclusions
This research is believed to be a novel attempt to present a preliminary additive system for Dahurian larch in Northeast China. Four approaches of TDM, AP, OLSSR, and NSUR were used to develop the additive systems of stem taper models for Dahurian larch. All model structures, particularly the TDM, demonstrated the additive property of stem taper models efficiently, with TDM obtaining a slightly better performance. In addition, the systems of stem taper models with the TDM method and dummy variable performed much better than those without the dummy variable. This methodology would be useful for forest researchers seeking to develop more precise systems of stem taper models, to predict volume in the future.
Author Contributions: Y.X. performed data analysis, and wrote most of the paper. M.K.S. assisted in data analysis and article revision. L.J. supervised and coordinated the research project, designed the experiment, and contributed to writing the paper. All authors have read and agreed to the published version of the manuscript.
Funding: This research was supported by Heilongjiang University of Science and Technology Research Start-up Fund granted to Yanli Xu.