Models for Tree Taper Form: The Gompertz and Vasicek Diffusion Processes Framework

Abstract: In this work, we employ stochastic differential equations (SDEs) to model tree stem taper. SDE stem taper models have some theoretical advantages over the commonly employed regression-based stem taper modeling techniques, as SDE models have both simple analytic forms and a high level of accuracy. We estimate the fixed- and mixed-effect parameters of the stem taper models by developing an approximated maximum likelihood procedure and using a data set of longitudinal measurements from 319 mountain pine trees. The symmetric Vasicek-type and asymmetric Gompertz-type diffusion processes used adequately describe stem taper evolution. The proposed SDE stem taper models are compared to four regression stem taper equations and four volume equations. Overall, the best goodness-of-fit statistics are produced by the mixed-effect parameter SDE stem taper models. All results are obtained in the Maple computer algebra system.


Introduction
Stem taper modeling is an attractive way to estimate tree size variables, both in forestry research and practice, due to its analytical flexibility. Models of this type are increasingly used in forest inventory and growth forecasting tools. The most difficult challenge in stem taper modeling is handling tree species whose stems differ from an idealized shape, which often leads to the exclusion of many tree species from the stem taper modeling process.
Many types of taper modeling techniques have been proposed and applied over the years. The most traditionally used parametric stem taper equations are regression models; overall, the available methods can be classified into segmented polynomial [1], variable exponent [2], mechanistic [3], and stochastic differential equation [4,5] methods.
In the early stages of the study of forest tree stem taper [6], it was hypothesized that the squared diameter and the height above ground at the point of measurement are related by a straight line. Later, Kozak et al. [7] presented a parabolic model for which the parameters were estimated using the ordinary least squares method. Max and Burkhart [1] published segmented polynomial models to describe the evolution of squared relative diameters by relative heights. New classes of stem taper models have been derived with reference to widely accepted mechanical and biological principles [3,8]. Sloboda [4] introduced the diffusion process analogy for stem taper modeling, focused on a brief methodological discussion of the stem taper model, and theoretically demonstrated that the evolution of the stem taper can be successfully formulated as a diffusion process. The proposed stochastic differential equation (SDE) stem taper model has led to a very simple approximation of the stem taper dynamic and the corresponding probabilities. Petrauskas et al. [9] successfully extended the SDE stem taper model to a segmented model that uses two joining points to link three stem sections along the bole, where each section is modeled by a different type of fixed-effect parameter SDE.
Stochastic differential equations are widely used to model biological dynamical systems in which stochasticity plays a leading role, embracing more complex variations in the dynamics and generalizing classical ordinary differential equation models [10,11]. The analytical theory of univariate diffusion processes has been successfully utilized for modeling the growth dynamics of an even-aged stand [12,13]. Over the last decade, a variety of fixed- and mixed-effect parameter diffusion processes (both symmetrical and nonsymmetrical forms) have been applied in forest tree- and stand-level growth models. Recent research has focused on developing techniques to account for between-individual variability in stands. Stochastic growth theories, for example, have been built around the Vasicek, Gompertz, Bertalanffy, and gamma SDEs in both univariate [14] and bivariate [15,16] forms. Multivariate diffusion processes allow us to further consider the underlying covariance structure driving the changes in the state variables; for example, a trivariate case [17], a quadrivariate Vasicek case [18], and a quadrivariate Bertalanffy case [19] have been investigated.
The main goal of this paper is to develop a segmented mixed-effect parameter SDE stem taper model that uses one joining point to link two sections, where each section is modeled by a different type of mixed-effect parameter SDE, and to describe the maximum likelihood procedure for fixed- and mixed-effect parameter estimates. Another goal is to compare the developed stem taper models with well-known regression models for the prediction of diameter at any specified height and of volume. In the results, we consider a possible application of the SDE stem taper modeling technique to measurements from mountain pine stands in Lithuania. All results are obtained in the Maple computer algebra system using the Statistics and VectorCalculus packages.

SDE Taper Framework
A large number of stem taper equations have been developed for different tree species and regions, which mathematically describe the tree diameter at an arbitrary height. Stem taper equations meet multifarious interests in forest management: first, we can calculate the stem diameter at any specified height; second, we can calculate the tree height for any specified value of stem diameter; and, finally, we can calculate the stem volume and the product volume, given specified dimensions. The estimates of specified product volume are significant in modern forest planning, for example, in evaluating the economics of different stand management regimes. This paper focuses on two stem taper modeling frameworks, which can be classified into stochastic differential equations and regression models.

SDE Stem Taper Models
In this study, we determine a taper model using a univariate stochastic process, the relative diameter $Y^i(x_i)$ of the $i$th stem at relative height $x_i$, governed by the SDEs of Equations (1) and (2). Many regression analysis techniques have recently been used for stem taper modeling, such as linear or nonlinear fixed- and mixed-effect models or generalized additive models [21]. The mixed-effect SDE models can describe stem variability and analyze data from several individual trees simultaneously. This modeling approach describes individual stems by a common mean and variance of the global distribution, with some of the model parameters varying between stems (called random effects), while other parameters remain invariant between stems (called fixed-effect parameters). Mixed-effect models use longitudinal data from all stems to estimate the fixed- and mixed-effect parameters. For the sake of simplicity, we assume that each of the parameters $\alpha_B$ and $\alpha_T$ is composed of a fixed parameter and a normally distributed random variable, $\phi^i_B$ and $\phi^i_T$ respectively, with zero mean and constant variance (Equations (3) and (4)).

In this study, the stem taper SDE with one joining point at the stem height $h_i = 1.3$ m is defined using the two different forms. The diffusion process representing the solution of Equation (2) has a univariate normal distribution. For both forms, the mean functions (Equations (7) and (10)), the variance functions (Equations (8) and (11)), and the conditional probability density functions (Equations (9) and (12)) are available in closed form. Using the mean and variance functions, we can define the mean trajectory of diameter for Models 1-4.
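To make the role of the closed-form mean and variance functions concrete, the following Python sketch evaluates them for textbook Vasicek and Gompertz diffusions. The parameterizations written in the docstrings are assumptions (the standard forms of these processes), not necessarily those of Equations (1) and (2), and the function and argument names are hypothetical.

```python
import math


def vasicek_moments(y0, dx, alpha, beta, sigma):
    """Conditional mean and variance of Y(x0 + dx) given Y(x0) = y0.

    Assumed form: dY = beta * (alpha - Y) dx + sigma dW.
    The transition distribution is normal.
    """
    decay = math.exp(-beta * dx)
    mean = alpha + (y0 - alpha) * decay
    var = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * beta)
    return mean, var


def gompertz_log_moments(y0, dx, alpha, beta, sigma):
    """Conditional mean and variance of ln Y(x0 + dx) given Y(x0) = y0.

    Assumed form: dY = (alpha - beta * ln Y) * Y dx + sigma * Y dW.
    By Ito's formula ln Y is a Vasicek-type process, so Y is lognormal.
    """
    decay = math.exp(-beta * dx)
    level = (alpha - 0.5 * sigma ** 2) / beta  # long-run mean of ln Y
    mean_log = level + (math.log(y0) - level) * decay
    var_log = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * beta)
    return mean_log, var_log


def gompertz_mean(y0, dx, alpha, beta, sigma):
    """Conditional mean of Y itself, from the lognormal mean formula."""
    mean_log, var_log = gompertz_log_moments(y0, dx, alpha, beta, sigma)
    return math.exp(mean_log + 0.5 * var_log)
```

The Vasicek transition is symmetric around its mean, whereas the Gompertz transition is lognormal and therefore asymmetric, which is the distinction the two model families rely on.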

SDE Parameters Estimation
In this section, we discuss the parameter estimation methods for the SDE stem taper Models 1-4. This study focuses on discretely observed longitudinal tree stem measurements without measurement noise. This assumption has a great advantage for formulating the maximum likelihood function in closed form. Maximum likelihood estimation of the fixed- and random-effect parameters in Models 1-4 is possible due to the availability of exact expressions for the conditional probability density functions defined by Equations (9) and (12). Assume that the relative diameter process $Y^i(x_i)$ is directly observed at discrete relative height points, where $n_i$ is the number of diameter measurements of the $i$th stem.

For Models 1 and 2, the log-likelihood function is expressed through these conditional probability density functions and depends only on the fixed-effect parameters. For Models 3 and 4, the combined log-likelihood function for all M stems (Equation (22)) depends on the fixed-effect parameter vector and involves an integral over the random effects.

As the integral in Equation (22) has no closed-form solution, a two-step maximization procedure based on the Laplace approximation can be developed for estimating the fixed and random parameters. In the first optimization step, we estimate the random-effect vector $\psi$ for every tree after plugging in the current estimates of the fixed-effect parameters, $\hat{\theta}_m$. In the second optimization step, we estimate the fixed-effect vector $\theta_m$ after plugging in the estimates of the random effects, $\hat{\psi}$.
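The estimation idea described above can be sketched in Python for a single random effect. This is only an illustration under stated assumptions: a textbook Vasicek parameterization, a random effect `phi` added to `alpha` with a normal prior of standard deviation `s_phi`, and finite-difference Newton steps for the inner maximization. All function names are hypothetical, and the block does not reproduce the paper's Equations (22)-(30).

```python
import math


def vasicek_loglik(x, y, alpha, beta, sigma):
    """Exact log-likelihood of one stem's noise-free observations (x[j], y[j]).

    Assumed form: dY = beta * (alpha - Y) dx + sigma dW. The value is the sum
    of log normal transition densities, conditioning on the first observation.
    """
    total = 0.0
    for j in range(1, len(x)):
        decay = math.exp(-beta * (x[j] - x[j - 1]))
        mean = alpha + (y[j - 1] - alpha) * decay
        var = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * beta)
        total += -0.5 * math.log(2.0 * math.pi * var) - (y[j] - mean) ** 2 / (2.0 * var)
    return total


def newton_maximize(g, start=0.0, steps=20, h=1e-4):
    """Maximize a smooth concave scalar function with finite-difference Newton steps."""
    psi = start
    for _ in range(steps):
        first = (g(psi + h) - g(psi - h)) / (2.0 * h)
        second = (g(psi + h) - 2.0 * g(psi) + g(psi - h)) / h ** 2
        psi -= first / second
    return psi


def laplace_log_integral(g, psi_hat, h=1e-4):
    """Laplace approximation to log(integral of exp(g(psi)) d psi) around psi_hat."""
    second = (g(psi_hat + h) - 2.0 * g(psi_hat) + g(psi_hat - h)) / h ** 2
    return g(psi_hat) + 0.5 * math.log(2.0 * math.pi / -second)


def approx_marginal_loglik(stems, alpha, beta, sigma, s_phi):
    """Laplace-approximated log-likelihood over all stems (inner step per stem)."""
    total = 0.0
    for x, y in stems:
        def g(phi):
            # Joint log-density of the stem data and its random effect phi.
            prior = -0.5 * math.log(2.0 * math.pi * s_phi ** 2) - phi ** 2 / (2.0 * s_phi ** 2)
            return vasicek_loglik(x, y, alpha + phi, beta, sigma) + prior
        phi_hat = newton_maximize(g)  # random-effect estimate for this stem
        total += laplace_log_integral(g, phi_hat)
    return total
```

An outer optimizer would then maximize `approx_marginal_loglik` over the fixed-effect parameters, alternating with the inner per-stem step. For this particular Vasicek setup the joint log-density is quadratic in `phi`, so the Laplace approximation is exact; for other model forms it is only an approximation.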

Standard Errors of the Parameter Estimates of the SDE Models
The standard error measures the uncertainty of the estimated model parameters. To evaluate the asymptotic standard errors of the maximum likelihood estimators, we use the observed Fisher information matrix [22]. The approximate asymptotic variance of the approximated maximum likelihood estimators of the fixed-effect parameters, $\hat{\theta}$, is given by the inverse of the observed Fisher information matrix, which is defined for the parameter vector of each Model s, s = 1, 2, 3, 4. The approximate asymptotic standard errors of the fixed-effect parameters are then the square roots of the diagonal terms of the resulting variance-covariance matrix.
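As a hedged illustration of this recipe, the sketch below computes standard errors from a numerically differentiated log-likelihood. It takes the observed Fisher information to be the negative Hessian of the log-likelihood at the estimates, which is the usual definition. The helper names and the normal-sample log-likelihood are hypothetical stand-ins, not the paper's models.

```python
import math


def numerical_hessian(f, theta, rel_step=1e-4):
    """Central-difference Hessian of a scalar function f at the point theta."""
    n = len(theta)
    steps = [rel_step * max(1.0, abs(t)) for t in theta]

    def shifted(i, di, j, dj):
        point = list(theta)
        point[i] += di
        point[j] += dj
        return f(point)

    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            hi, hj = steps[i], steps[j]
            hess[i][j] = (shifted(i, hi, j, hj) - shifted(i, hi, j, -hj)
                          - shifted(i, -hi, j, hj) + shifted(i, -hi, j, -hj)) / (4.0 * hi * hj)
    return hess


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def standard_errors(loglik, theta_hat):
    """Square roots of the diagonal of the inverse observed information."""
    info = [[-v for v in row] for row in numerical_hessian(loglik, theta_hat)]
    cov = invert(info)
    return [math.sqrt(cov[i][i]) for i in range(len(theta_hat))]


def normal_loglik(params, data):
    """Toy log-likelihood of an i.i.d. normal sample; params = [mean, variance]."""
    mu, s2 = params
    n = len(data)
    return -0.5 * n * math.log(2.0 * math.pi * s2) - sum((v - mu) ** 2 for v in data) / (2.0 * s2)
```

For the toy normal sample the result can be checked against the known values: the standard error of the mean is the square root of the variance estimate divided by n, and that of the variance is the estimate times the square root of 2/n.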

Nonlinear Regression Stem Taper Models
Over the past few decades, forest statisticians have proposed advanced regression models for describing tree stem taper using simple functions, such as linear and more complex nonlinear functions. In developed countries, including the U.S. and Canada, the segmented polynomial model was traditionally used at the end of the twentieth century. It is clear, however, that no single stem taper model corresponds well to longitudinal data sets for all tree species. To validate the proposed SDE stem taper models, we selected for comparison four different regression-type stem taper models of variable form reported in the forest research literature [3,23,24]. In these models, d is the diameter at any particular height h, D is the diameter at breast height, H is the total tree height, and z is the relative height (z = h/H). In the first model, α1 and α2 are the joining points of three segments and β1-β4 are the unknown parameters to be estimated.

Segmented q-Exponential
Tsallis [25] described the q-exponential function, which has been applied successfully to model complex problems in a broad range of scientific areas, such as information theory [26] and forestry [27]. The pooled segmented function obtained from the q-exponential and polynomials can be utilized to describe the changes of tree stem diameter at a specified stem height. In this model, α is the joining point of two segments, β1-β9 are the unknown parameters to be estimated, and the auxiliary function follows the definition in [28].
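For readers unfamiliar with the q-exponential, a minimal Python sketch of the standard Tsallis definition follows. It is not the segmented taper model itself, whose exact form is not reproduced here, and the handling of the cut-off region for q > 1 is a choice made for this illustration.

```python
import math


def q_exp(x, q):
    """Tsallis q-exponential: [1 + (1 - q) * x] ** (1 / (1 - q)).

    Reduces to exp(x) as q tends to 1. For q < 1 the function is set to 0
    where the bracket is non-positive (the usual cut-off). For q > 1 the
    bracket must stay positive; this sketch raises an error otherwise.
    """
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        if q < 1.0:
            return 0.0
        raise ValueError("q-exponential is not finite for this x when q > 1")
    return base ** (1.0 / (1.0 - q))
```

The extra parameter q controls how heavy the tail is compared with the ordinary exponential, which is what gives q-exponential taper and volume functions their additional flexibility.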

Model 7. The stem taper model published by Lee et al. [24] and modified by Berhe and Arnoldsson
where β1-β5 are the unknown parameters to be estimated.

Model 8. The stem taper model published by Kozak [2] is defined as
where β1-β9 are the unknown parameters to be estimated.

Nonlinear and Linear Regression Stem Volume Models
Based on review papers on compiled stem volume regression models [29], four regression models were selected and compared with the techniques based on stem taper. All models include diameter at breast height, D, and total height, H, as the independent variables, and the last additionally includes a form factor.

Model 9. Schumacher and Hall [30] developed the power form stem volume model
where β1-β3 are the unknown parameters to be estimated.
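The Schumacher and Hall power form is commonly written as V = β1 · D^β2 · H^β3. Assuming that this is the form intended for Model 9, a short Python sketch is given below; the function name and the parameter values used in any call are hypothetical and are not estimates from this study.

```python
import math


def schumacher_hall_volume(D, H, b1, b2, b3):
    """Power-form stem volume V = b1 * D**b2 * H**b3 (assumed Model 9 form)."""
    return b1 * D ** b2 * H ** b3


def schumacher_hall_log_volume(D, H, b1, b2, b3):
    """The same model on the log scale, where it is linear in ln D and ln H."""
    return math.log(b1) + b2 * math.log(D) + b3 * math.log(H)
```

The log-scale version shows why this model is often fitted by linear regression on log-transformed data, although nonlinear least squares on the original scale is equally possible.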

Model 10.
Using the q-exponential function, the stem volume can be defined following [31], where β1-β5 are the unknown parameters to be estimated.

Model 11.
In this model, β1 and β2 are the unknown parameters to be estimated.

Model 12.
A cylindrical form factor, F, has been used for tree stem volume modeling by the Lithuanian National Forest Inventory [9]. The tree stem volume is defined through this form factor, where β1-β6 are the unknown parameters to be estimated.

Evaluation of Models to Data
A main goal of stem taper simulation as a descriptive model is to accurately predict the diameter value at a specified stem height from a set of predictors (tree diameter at breast height and total tree height). In order to interpret the performance of the statistical models on observational data, this study focuses on numerical statistics and visualization of residuals. Four statistics were used to evaluate the goodness-of-fit:
1. Adjusted coefficient of determination;
2. Mean prediction error (percent prediction error, %);
3. Mean absolute prediction error (percent absolute prediction error, %);
4. Root-mean-square error (percent root-mean-square error, %).
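The four statistics listed above can be computed as in the following Python sketch. The exact normalizations are assumptions, since several conventions exist: residuals are taken as observed minus predicted, percent errors are averaged per observation, the RMSE divides by n, and the percent RMSE is relative to the observed mean. The function name is hypothetical.

```python
import math


def fit_statistics(observed, predicted, n_params):
    """Goodness-of-fit summary for paired observed and predicted values."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    sse = sum(r * r for r in residuals)                 # residual sum of squares
    sst = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    rmse = math.sqrt(sse / n)
    return {
        "r2_adj": 1.0 - (n - 1) / (n - n_params) * sse / sst,
        "bias": sum(residuals) / n,
        "bias_pct": 100.0 * sum(r / o for r, o in zip(residuals, observed)) / n,
        "abs_bias": sum(abs(r) for r in residuals) / n,
        "abs_bias_pct": 100.0 * sum(abs(r) / o for r, o in zip(residuals, observed)) / n,
        "rmse": rmse,
        "rmse_pct": 100.0 * rmse / mean_obs,
    }
```

The percent versions assume strictly positive observed values, which holds for diameters and volumes.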

Results and Discussion
The newly developed models are nondeterministic models that consider symmetric and asymmetric diffusions of stem diameter at a particular height. Various regression models that incorporated diameter at breast height and tree height have been proposed [1,2,6,7,27,31]. However, there has been debate about the mathematical form of stem taper. Our developed models are not affected by the mathematical form of the stem taper, because we start from the evolution of the distribution [33] of the relative diameter. Another characteristic of our models is that the mixed-effect parameters SDE model deals with random growth rate for a given stem diameter and height. Such a model can express the variation in individual stem taper due to the random effects.

Estimation Results
We illustrate the proposed modeling technique by using a mountain pine tree (Pinus mugo Turra) data set. The data used for modeling consisted of measurements collected from 30 mountain pine plots located in western Lithuania (Kuršių Nerija). This tree species was planted in the nineteenth and twentieth centuries to stop the drifting sand dunes of the Curonian Spit. The age of the sampled stands varies from 53 to 123 years. Each circular sample plot covered 150 m². Diameter at breast height was measured for 7002 trees, height for 702 trees, and stem diameter at 0.5 m intervals along the stem for 319 trees. 102 trees were cut down for biomass estimation. A total of 319 trees were used. At plot establishment, the following data were recorded for every sample tree: diameter over bark at 1.30 m, tree height, and stem volume (calculated by Equation (40)). The data set was randomly divided into estimation and validation data sets. A random sample of 217 stems was selected for model estimation, and the remaining data set of 102 stems was used for model validation. The observed longitudinal measurements of the diameter at a specified height are presented in Figure 1. Summary statistics for the diameter at breast height (d), the total height (h), and volume (v) for all of the stems used in the model estimation and validation data sets are presented in Table 1.
Parameter estimation, known as training in machine learning, fits the parameters so that the model reproduces the observed data set. The approaches most commonly used in the estimation of parameters are the least squares method and the maximum likelihood procedure. The fixed- and mixed-effect parameters of the SDE stem taper Models 1-4 were estimated utilizing the maximum likelihood methodology developed in Sections 2.1.2 and 2.1.3, and the parameters of the regression Models 5-12 were estimated by the least squares methodology [34].
The algorithms for the parameters and their standard error estimates were implemented in the symbolic algebra system Maple [35]. The results of parameter estimation and the standard errors for Models 1-12 are listed in Table 2.
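The random division of the 319 stems into 217 estimation stems and 102 validation stems could be reproduced along the following lines. This is a Python sketch with hypothetical stem identifiers and an arbitrary seed, not the procedure actually run in Maple.

```python
import random


def split_stems(stem_ids, n_estimation, seed=0):
    """Shuffle stem identifiers reproducibly and split them into two sets."""
    ids = list(stem_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_estimation], ids[n_estimation:]


# Hypothetical identifiers 1..319 standing in for the measured stems.
estimation_ids, validation_ids = split_stems(range(1, 320), 217)
```

Splitting by stem, rather than by individual diameter measurement, keeps all measurements of one tree in the same set, which matters for longitudinal data.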

Comparison of Stem Taper Models
To be applicable, a taper model must accurately predict the diameter at any particular height along the stem. To evaluate taper model performance, the adjusted coefficient of determination, mean prediction error (percent prediction error, %), mean absolute prediction error (percent absolute prediction error, %), and root-mean-square error (percent root-mean-square error, %) were calculated; they are presented in Table 3. The stem taper modeling techniques presented in this study are generally grouped into SDE models and nonlinear regression models. As can be seen from Table 3, taper Models 1-8 produced very similar statistical indices for both estimation and validation data sets. The stem taper SDE models (1-4) had very similar goodness-of-fit statistics (see Table 3); however, the SDE taper models used additional measurements that fixed the tree diameter at stem height h = 0. Comparison of the statistical indices produced by the SDE taper models (1-4) and the regression taper models (5-8) reveals that the SDE taper models were superior to the others. Random effects were calibrated, by Equation (30), using two additional stem diameters measured at stem heights of 1.0 m and 1.5 m for all stems in the validation data set. This additional information marginally improved the values of all statistical indices.
Visualization of the residuals in Figure 2 shows that the SDE stem taper models (1-4) produced a more homogeneous residual variance than the regression stem taper models (5-8). The inclusion of random effects was found to be significant for Models 3 and 4.

Comparison of Stem Volume Models
The predicted stem volume was calculated by integrating the taper equations over the height (Equation (39)), where di is the diameter at breast height (mm) of the ith tree, and hi is the total tree height (dcm). The observed volume (V, cm³) of each stem was computed using a truncated cone formula for each section up to the last one, where the final apex was calculated as a cone [31] (Equation (40)); here dij and Lij are the diameter (mm) and length (dcm), respectively, of the jth section of the ith tree.
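Both volume computations can be illustrated with a short Python sketch. It uses the textbook frustum and cone volume formulas and composite Simpson integration of the cross-sectional area, and it assumes consistent units throughout (no mm or dcm conversions are attempted). The function names and the data layout are hypothetical and may differ from Equations (39) and (40).

```python
import math


def observed_volume(diameters, lengths):
    """Sectional stem volume from truncated cones plus a conical apex.

    diameters[j] is the diameter at the lower end of section j, counted from
    the base; lengths[j] is the length of section j. The last section is the
    apex, treated as a cone standing on diameters[-1].
    """
    volume = 0.0
    for j in range(len(diameters) - 1):
        d1, d2 = diameters[j], diameters[j + 1]
        volume += math.pi * lengths[j] / 12.0 * (d1 * d1 + d1 * d2 + d2 * d2)
    volume += math.pi * lengths[-1] / 12.0 * diameters[-1] ** 2
    return volume


def taper_volume(diameter_at, total_height, n_intervals=100):
    """Integrate the cross-sectional area (pi / 4) * d(h)**2 over [0, H]
    with the composite Simpson rule."""
    if n_intervals % 2:
        n_intervals += 1  # Simpson's rule needs an even number of intervals
    step = total_height / n_intervals

    def area(h):
        return math.pi / 4.0 * diameter_at(h) ** 2

    total = area(0.0) + area(total_height)
    for k in range(1, n_intervals):
        total += (4.0 if k % 2 else 2.0) * area(k * step)
    return total * step / 3.0
```

For an ideal cone both functions return the exact cone volume, which makes a convenient sanity check before applying them to fitted taper curves.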
The goodness-of-fit results for all presented stem taper models and volume predictions are given in Table 4. As can be seen from the table, the statistical indices of all volume models fell into three clusters (SDE taper volume, regression taper volume, and volume models). These clusters were characterized by very similar statistical indices within each cluster, for both estimation and validation data sets. The volumes predicted using the mixed-effect parameter SDE stem taper Models 3 and 4 (integrated by Equation (39)) showed superior goodness-of-fit statistics, compared to the others, for both estimation and validation data sets (see Table 4). However, all SDE taper models (1-4) used additional measurements, which fixed the tree diameter at stem height h = 0. Comparison of the statistical indices produced by the regression taper models (5-8; integrated by Equation (39)) and the regression volume models (9-12) revealed that the regression volume models were superior to the regression taper models. In Models 3 and 4, the random effects calibrated by Equation (30) (using two additional stem diameter observations measured at stem heights of 1.0 m and 1.5 m for all stems in the validation data set) improved all statistical indices for the volume predictions (see Validation Data Set columns in Table 4). Visualizations of the residuals plotted against the predicted diameters (see Figure 3) show that volume prediction using the SDE stem taper models (1-4; integrated by Equation (39)) produced a more homogeneous residual variance than the regression stem taper models (5-8; integrated by Equation (39)) and the volume regression models (9-12). The inclusion of random effects had a considerable effect on the volume predictions of Models 3 and 4.

Illustration of Tree Stem Tapers
Traditionally, stem form is described by the stem taper technique, which relates the changes in stem diameter with increasing height (i.e., from the tree stump to the tree tip). Stem taper equations can predict the diameter at any specified height, total stem volume, merchantable volume, and height at any specified stem diameter. In this study, two approaches for determining an equation-like stem taper profile were examined: stochastic differential equation models and regression equation models. All stem taper equations for both modeling techniques are visualized in Figure 4. For illustration, three stems from the validation dataset, corresponding to large, medium, and small trees, were randomly selected. Figure 4a,b shows the improved fit of mixed-effect parameters SDE stem taper Models 3 and 4 to the three randomly selected trees due to the inclusion of random effects. The better fit of the SDE stem taper models (1-4; Figure 4a,b), compared with the regression stem taper models (5-8; Figure 4c,d) can be observed by visual inspection, as the predicted stem taper curves are closer to the observed data set of stem measurements. The SDE stem taper modeling technique can better interpret stem structure evolution by utilizing higher central moments of the relative diameter. Figure 5 shows the mean stem taper dynamic, as well as its 0.025 and 0.975 quantile dynamics, for three randomly selected stems from the validation data set.
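A mean curve with 0.025 and 0.975 quantile curves, of the kind shown in Figure 5, follows directly from a normal transition distribution. The Python sketch below does this for a textbook Vasicek process started from a known value `y0` at position `x0`. The parameterization and all names are assumptions for illustration; for a Gompertz-type process the same construction would be applied to the logarithm of the relative diameter.

```python
import math
from statistics import NormalDist


def vasicek_band(y0, x0, x, alpha, beta, sigma, level=0.95):
    """Lower quantile, mean, and upper quantile of Y(x) given Y(x0) = y0.

    Assumed form: dY = beta * (alpha - Y) dx + sigma dW, so the transition
    distribution is normal and the central band is mean -/+ z * sd.
    """
    decay = math.exp(-beta * (x - x0))
    mean = alpha + (y0 - alpha) * decay
    sd = math.sqrt(sigma ** 2 * (1.0 - decay ** 2) / (2.0 * beta))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% band
    return mean - z * sd, mean, mean + z * sd
```

Evaluating the function on a grid of relative heights gives the three curves; the band has zero width at the conditioning point and widens toward the stationary spread further along the stem.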

Conclusions
The SDE stem taper equations proposed in this paper are simple to use and provide more accurate predictions of both diameter at a specified height and stem volume than the most common forms of regression equations. The results of our statistical analyses (see Tables 2 and 3; Figures 2 and 3) indicate that the SDE stem taper Model 3 had the overall best performance in predicting tree diameter at a specified height. The calibration of random effects for a newly observed tree requires some additional information, but it provides better performance than the fixed-effect parameter SDE stem taper model alone. All presented regression stem taper models showed very similar performance in predicting tree diameter at a specified height. The results of this study also indicate that the SDE stem taper models can be used to accurately estimate the total stem volume of mountain pine trees. Future investigation will cover applications of SDE models for predicting tree biomass.
Funding: This research received no external funding.