Construction of Reducible Stochastic Differential Equation Systems for Tree Height–Diameter Connections

: This study proposes a general bivariate stochastic differential equation model of population growth which includes random forces governing the dynamics of the bivariate distribution of size variables. The dynamics of the bivariate probability density function of the size variables in a population are described by the mixed-effect parameters Vasicek, Gompertz, Bertalanffy, and the gamma-type bivariate stochastic differential equations (SDEs). The newly derived bivariate probability density function and its marginal univariate, as well as the conditional univariate function, can be applied for the modeling of population attributes such as the mean value, quantiles, and much more. The models presented here are the basis for further developments toward the tree diameter–height and height–diameter relationships for general purpose in forest management. The present study experimentally confirms the effectiveness of using bivariate SDEs to reconstruct diameter–height and height–diameter relationships by using measurements obtained from mountain pine tree ( Pinus mugo Turra) species dataset in Lithuania.


Introduction
The structural class hierarchy in a forest stand has interested researchers for more than a century. This study shows that the framework of the stochastic differential equations (SDEs) can easily be generalized to incorporate symmetric or non-symmetric size component distributions to explain empirical datasets in a theoretically consistent way.
Tree height (h) and tree diameter at breast height (d) (in the sequel-diameter) are traditionally used variables for tree volume calculation, aboveground carbon-stock estimation, and modeling of tree growth and yield. Unfortunately, it is difficult to measure tree height in the field, and errors tend to be observed. Tree height and diameter are traditionally formalized by using their regression relationship [1], the artificial neural network (ANN) [2,3], or stochastic differential equations [4,5]. Height-diameter regression equations have been developed for the local (stand) level and the generalized (ecoregional) level, by introducing additional stand variables, as well as random parameters [6,7]. The application of stand-level models to a wider region would probably lead to bias in predictions, as the deterministic height-diameter equations are related to the growth conditions and stand characteristics. The bias of tree-height predictions may affect the accuracy of the estimation of the stand volume and aboveground carbon stocks. To achieve reliable predictions of tree height, European forest statisticians have applied different techniques, such as generalized height-diameter models that include additional stand variables [8]. A generalized height equation evaluates the contribution of stand attributes, such as the stand density per hectare, the basal area per hectare, the quadratic mean diameter, and others in height-diameter models [9,10]. The inclusion of stand variables improves the predictive capacity of the selected height-diameter equation, but this technique requires additional sampling effort and reduces its practical application. The second technique that is used to introduce stand attributes to the height-diameter equation uses standspecific effects defined for each stand by a normally distributed random variable, named a random effect [7], which allows models to adapt to diverse growth environments.
In evenly aged stands, the use of the previously listed approaches is appropriate for explaining and modeling tree development. On the other hand, the height-diameter relationship is not static; instead, it varies with stand age within the same stand. Despite the progress in linear and nonlinear regression modeling techniques, these earlier-proposed height-diameter models do not address fundamental factors of tree height and diameter development, such as tree age and covariance between size variables. In seeking a more comprehensive understanding of the dynamics of the treesize variables, diffusion processes defined by SDEs have been proposed to express the shape of the tree height or crown width distribution versus the diameter [4,11]. SDE individual-tree growth models are a fundamental framework to link tree-size variables with stand age and to warrant mean stand volume predictions at a particular stand age. Most of the published SDE height-diameter models are applied to medium-and large-sized trees. However, the difficult challenge of forest growth and yield modeling is in tree species that differ from the idealized shape, size (diameter and height), age, and density. Previous SDE models have focused on bivariate (height and diameter) [12], trivariate (quadratic diameter, mean height, and density per hectare) [13], four-variate (height, diameter, crown base height, and crown width) [14][15][16], and five-variate (height, diameter, crown base height, crown width, and density per hectare) cases [17] for Scots pine trees in Lithuania. The main reason for developing SDE models is to gain the capacity to model highly nonlinear biological dynamics and their abnormalities [18].
The goal of this study was to develop SDE-type individual height-diameter models that are specific to the mountain pine tree (Pinus mugo Turra) species in Lithuania and to offer new insights into stand dynamics. The specific objectives were (1) to compare the newly developed SDE-type height-diameter equations with traditionally used regression equations; (2) to evaluate individualtree and stand size variables such as the mean, median, mode, quantiles, and much more; (3) to evaluate and present the long-term behavior of predictive size variables; and (4) to account for between-stand variability by introducing random effects (random variables). All results were determined by using the symbolic algebra system Maple.

Height-Diameter Regression Models
One of the most important links between tree-size variables is the relationship between tree height and diameter. Estimating tree height from tree diameter at breast height is a common practice for forest inventory, model simulation, and forest management. Height-diameter equations enable a better understanding of the rates of changes in tree-size variables and the development of stand volume, forest biomass, and carbon stocks [19,20]. Previous studies have summarized a great number of available height-diameter linear and nonlinear relationships and compared their performance for different tree species [1,21]. It should be noted that the main limitation of the height-diameter equations is that they can produce very different results if applied to stands that differ from fitted stands. Therefore, mixed-effects modeling techniques were developed to account for stand variability [8,22]. The use of mixed-effects modeling techniques has made it possible to calibrate random effects to local stands by using a subsample of heights and diameters [22].
This study compared three nonlinear regression models named the power, Richards-Chapman, and q-exponential models [23][24][25]. The stand-specific random effects, i (i=1,…,M), were included in each examined model for only one of parameters that possessed higher variability. The general equations of the mixed-effects power, Richards-Chapman, and q-exponential models are as follows: Model 1. Power function ℎ = ( + ) . (1)

Model 2. Richards function
Model 3. Q-exponential function where β1-β3 are the unknown fixed-effect parameters to be estimated, [ ] = , ≥ 0, 0, < 0 and φi (i = 1,…,M) are normally distributed random variables with a mean of 0 and constant variance , and M is the number of fitted stands.

SDE Models
More forest statisticians promote the use of bivariate distribution (diameter and height) for the derivation of the static height-diameter equation [26,27]. The mathematical description of the diameter or height distribution provides more detailed information about the stand [28]. It is worth noting that, by using measurements conducted by remote-sensing technologies, the height-diameter problem might be reversed, generating the need for a diameter-height relationship. A bivariate distribution is a framework used for symmetric interchange of diameter and height [29]. A bivariate SDE diameter and height stochastic process would be more appropriate for describing the development of the diameter and height structure in a stand over time (age). Therefore, we can define a dynamic form of the probability density function representing the diameter and height distribution, which changes in shape over time. The height-diameter or diameter-height equation can be derived by using conditional probability density functions. This study focused on four types of nonhomogeneous bivariate SDE models, ( ) = ( ( ), ( )) , of the tree diameter, ( ), and height, ( ), dynamics, as follows [30]: Model 4. Vasicek-type stochastic process Model 5. Gompertz-type stochastic process Here, ( ) = ( ), ( ) represents independent bivariate Brownian motions; ∈ [ ; ] is a finite horizon, T<∞; and , i=1, ..., M are independent and normally distributed random variables with zero mean and constant variances ( ~ (0; ) and ~ (0; ) ); and { , , , , , , , , } are fixed-effect parameters to be estimated. In the sequel, the initial condition takes the following form: if t=t0, then X(t0) =x0=(d0, h0).

SDE Trajectory Type Equation
Vasicek Mean, median and mode ( ) is the inverse of the standard normal distribution function.

SDE Models
The maximum likelihood parameter estimator of the SDE seeks to make the conditional (transition) probability density function the most likely fit for the observed diameter and height estimation dataset , ℎ , , ℎ , … , , ℎ at discrete times , , … , , starting at the initial time, t0, diameter, and height ( ) = = (0.1, 1.3) , from which a given set of the SDEs (4)- where ( , ℎ| , 0) is a normal (the Vasicek type) or lognormal (the others) probability density function, as defined in Table 1, with random effects = ( ) = 0. For the mixed-effect parameter scenario models, the two-step approximated maximum loglikelihood procedure takes the following form: where = , ,

Standard Errors of Parameter Estimates
To assess the accuracy of the parameter estimates obtained through the maximum-likelihood and approximated-maximum-likelihood procedures, we considered the standard errors associated with the estimates. Estimates of these standard errors can be obtained as the inverse of the observed Fisher [33] information matrix (see, for instance, Reference [30]). The approximate asymptotic standard errors of the fixed-effects parameters are defined by the diagonal elements of the observed where the matrix is

Random Effects Calibration
To calibrate the random effects, , , of the bivariate SDE models defined by Equations (4)-(7) for a new subsample, we used a full dataset of observations taken from the validation sample unit. Using new subsample observations {( , ℎ ), ( , ℎ ), … , ( , ℎ )} at discrete times { , , … , }, the random effects of the SDE models = ( , ) for diameter and height can be calibrated by the following: The random effect g of the regression models can be calibrated by the following:

Data
The study site was located in western Lithuania (Kuršių Nerija). A mountain pine tree (Pinus mugo Turra) species was planted to stop sand dunes of Curonian Spit from drifting in the nineteenth to twentieth centuries. The data used for modeling consisted of measurements collected from 31 mountain pine plots located in Western Lithuania (Kuršių Nerija). The ages of the sampled plots varied from 53 to 123 years. The size of the round sample plot was 150 m 2 . In all plots, the diameter at breast height of all trees larger than 1 mm was measured (7005 trees). The total diameter at breast height and tree height was measured for 702 trees. At plot establishment, the following data were recorded for every sample tree: diameter over bark at 1.30 m, tree height, and age. Diameter was measured to an accuracy of 1 mm. Height was measured by using clinometer, with precision to the nearest 0.1 meter. Tree age was identified with stand age, which provides a general timeframe for when stands were first established. The dataset was randomly divided into estimation and validation datasets. A random sample of 23 plots was selected for model estimation, and the remaining dataset of 8 plots was used for model validation. The summary statistics for the diameter at breast height (d), the total height (h), and age (t) for all of the trees used in the model estimation and validation datasets are presented in Table 5.

Estimates of Parameters
All used fixed and mixed-effect parameter regression models described above, and the newly developed bivariate fixed and mixed-effect parameter SDE models, are nonlinear. In order to establish the mixed-effect parameter height-diameter and diameter-height models, the sample plot-level effects were considered by including the random effects for a particular parameter. This study estimated the parameters of all used regression and bivariate SDE models by maximizing the maximum likelihood functions defined by Equations (8)- (12). Standard errors of parameter estimates were calculated by using the observed Fisher information matrix, defined by Equation (15). All results were determined by using Maple mathematical software. The results of the parameter estimation and the standard errors for all used models are listed in Table 6 and Table 7. Parameter estimates of all fitted models were significant (p < 0.05).

Bivariate Diameter and Height Distributions
Newly developed bivariate probability density functions obtained from the bivariate Vasicek-, Gompertz-, von Bertalanffy-, and gamma-type SDEs defined by Equations (4)-(7) can be visualized in graphs for different stand and parameter scenarios. Additionally, to demonstrate that data from the validation dataset follow the bivariate probability density functions derived from SDEs (4)- (7), we used a simple graphical technique represented by contour plots. The random effects, , , of the diameter and height for a randomly selected stand from the validation dataset were calibrated by Equation (16). The estimated bivariate mixed-effect parameter probability density functions and contour plots for a particular randomly selected stand from the validation dataset and the observed values are given in Figure 1. We can see that our developed lognormal probability density functions for the Gompertz, von Bertalanffy, and gamma diffusions inherit circular structures but also have unequal tails and show signs of skewness. For mountain pine trees, the height and diameter probability density function presented steeper and more symmetrical surfaces for the Vasicek-type diffusions (see Figure 1); however, with increasing age, the surfaces became flatter.

Comparison of Diameter, Height, Height-Diameter, and Diameter-Height Models
Tree size dimensions vary continuously through natural processes and specific silvicultural activities. Accurate tree-height relationships are necessary for most forest inventories, as knowledge of tree heights is needed to calculate tree volumes, biomass, carbon content, and economic value [34]. The newly developed fixed-and mixed-effect marginal and conditional univariate normal and lognormal probability density functions, which are listed in Tables 1 and 2, respectively, enable the results of the diameter and height behavior to be formalized from an evolutionary (time) perspective, with an emphasis on models of the mean diameter and mean height in a forest stand. By using marginal and conditional univariate normal and lognormal probability density functions, trajectories of the mean tree diameter and height in a forest stand were derived (see Tables 2 and 4). Fixed-effect parameter height-diameter SDE models are suitable for local application, and more generalized mixed-effect parameter SDE models, which do not use additional stand variables, can be applied at the regional level.
The principal objective of the newly framed SDE height-diameter and diameter-height models, as presented in Table 4, is to show their precedence over regression models. The statistical measures commonly used for the comparison of all used height-diameter and diameter-height models are shown in Table 8 and Table 9. The SDE height-diameter and diameter-height models were found to be better than the regression models. The mixed-effect parameter SDE models showed better statistical measures than those of fixed-effect parameter models, as, by definition, they do not take into account the variability between stands and therefore do not provide good predictions for a dataset sampled from a new stand. Statistical measures for all fitted height-diameter and diameterheight SDE models gave very similar values. For the validation dataset, the SDE height-diameter and diameter-height models appeared to work better than the regression models. With reference to all used statistical measures and to all used height-diameter and diameter-height models, the bivariate Gompertz-type SDE, in general terms, provided the highest accuracy.    Figure 2 shows plots of the residuals against the predicted values resulting from the validation dataset. The p-values (entered up in the plots) of the Shapiro-Wilk test for the residuals demonstrate that the residuals follow a normal distribution. The visual examination of the residual plots presented in Figure 2 does not show the presence of systematic errors (bias). Based on the statistical measures presented in Tables 8 and 9, the residual analysis of the mixed-effect parameter height-diameter and diameter-height SDE models, the impossibility of the negative values, and the asymmetry of the tree size component distribution, it was concluded that the Gompertz-type diffusion is superior to other diffusions. On the other side, the four mean curves from the SDE models did not show a notable difference. For the mixed-effect parameter Gompertz-type SDE using the marginal diameter and height distributions presented in Table 1, the mean, mode, median, 5% quantile, 95% quantile, and lower and upper quartile trajectories (see Table 2) for two randomly selected plots from the validation dataset are presented in Figure 3 (random effects were calibrated by Equation (16)). Figure 3 demonstrates that the stochastic bivariate Gompertz-type differential equations with a sigmoidal form solution are more reliable for future diameter and height predictions and allow better interpretability of model parameters than nonlinear regression models. An essential research finding pertaining to the tree-diameter and -height distributions is the fact that they are positively skewed (see Figure 3). Figure 3 illustrates that the tree diameter distribution is more asymmetric than the treeheight distribution. Theoretical studies on the growth of the tree-size components proved that the tree-size-component frequency distribution is characterized by many small trees and few larger trees [35]. For the bivariate Gompertz-type diffusion process, it is possible to derive the stationary univariate marginal and conditional distributions if parameters βd and βh are positive [36]. As time, t, goes to infinity for the Gompertz-type diffusion, the diameter and height marginal distributions (see Table 1) and conditional distributions (see Table 3) converge to a lognormal stationary distribution with the means and variances listed in Table 10 ( ∈ { , ℎ}). The asymptotic mean, median, mode, p-quantile (0 < p < 1), and variance trajectories of the tree diameter and height marginal and conditional processes are listed in Table 11. The stationary conditional lognormal type distribution derived from the Gompertz SDE (see Table 10) enables us to estimate the full distribution of the dependent variable (diameter or height) when the value of the independent variable (height or diameter) is known. This technique is a flexible rule that can be used to describe the development of the diameter or height distribution against the height or diameter. One advantage of the stationary conditional distribution is that the conditional quantile functions, and different measures of central tendency and variability can be obtained, providing a more comprehensive analysis of the relationship between diameter and height. Until now, the quantile regression has been widely conducted in tree-variable modeling [10,16,37].

Marginal
Mean + is the inverse of the normal distribution function.

Mean Tree Basal Area Models
The mean tree basal area, BA, is one of the central attributes of a forest stand. It is simply the average of the cross-sectional area at breast height of all trees in a forest stand. In order to better understand how the mean tree basal area is changing over time, we used the Gompertz diffusion process framework developed in previous subsections. The mean tree basal area can be used to estimate the mean tree volume and to generate management procedures. The mean tree basal area dynamics describe tree development with age and are defined by the following: where the moments ( ), ( ), ( , ℎ), ( ), , , and (ℎ), are listed in Tables 1, 3, 10, and 11, respectively.
An important contribution to the development of tree basal area prediction methodology is compatibility between its growth and increment equations, which can be derived from the growth rate equation by differentiation. Traditionally, the majority of authors have used a system of permanent experimental plots to provide partial time-series data covering different stand densities and thinning strategies [38]. Using the mean tree basal area growth, Equation (18), the current and mean annual increments of the tree basal area (CAIB and MAIB, respectively) can be defined in the following forms: According to the mean diameter relationships listed in Table 2, the current and mean annual increments of the diameter (CAID and MAID, respectively) have the following forms: Figure 5 illustrates the mean basal area dynamics, over time, with the observed values for two randomly selected stands from the validation dataset, using the mixed-effect scenario (the random effects were calibrated by Equation (16) and for all stands from the validation dataset, using the fixedeffect scenario). Figure 6 provides graphs of the basal area and diameter increments (acceleration) over time. We should note that all increments illustrated in Figure 6 are positive; therefore, they correspond to mathematical principles of the derivative, because trees increase in basal area and diameter for as long as they live. Figure 6 demonstrates that maximum basal area growth occurs later than the maximum diameter growth. Our conclusion is not related to the selection of a specific function to model the basal area or diameter increments. This is the fundamental quality of the used modeling technique that deals with probability density functions of the tree diameter and height over time. The maximum values of CAI and MAI, defined by Equations (20)- (23), occur at different times, according to the quality of the stand.  . Dynamics of the current and mean annual tree basal area increments of the basal area and diameter, over time, for two randomly selected stands from the validation dataset and mixed-effects scenario: (a) basal area increments; (b) diameter increments; black color represents the first stand; red color represents the second stand; solid line-the current increment; dotted line-the mean annual increment. Table 12 shows the goodness-of-fit statistics for tree basal area predictions. From Table 12, it is clear that the mixed-effect scenario model is more accurate for predicting the tree basal area than the fixed-effect scenario model. Statistical indexes for the estimation and validation datasets produced similar values. The best statistical index values were obtained with the mixed-effect parameter model, which includes the tree height as an additional predictor. The mixed-effect parameter stationary models produced statistical index values very close to the non-stationary process.

Mean Tree Volume Models
Prediction of growth and yield in trees are important for decision-making in sustainable forest management. In order to estimate the mean tree volume, the diameter at breast height and total height variables are traditionally used as predictors. The newly developed bivariate lognormal probability density function of the diameter and height distributions (see Table 1) and an additional stem volume regression function allow us to assess the mean tree volume and its dynamics in the forward and backward directions, as follows: where, in this study, ( , ℎ) is the individual tree volume regression function of the q-exponential form [25]: The estimates of the parameters β1, β2, and β3 were calculated by the weighted least-squares technique. The estimators were The SDE framework allows the construction of dynamic volume growth models, which offer a better description of the development of a stand over time than static models and can be considered superior for prediction purposes. Additionally, the dynamic growth model provides forest decisionmakers with more detailed information about the volumes of merchantable products. The variations in the mean tree volume with age and among stands are shown in Figure 7. Insight from diffusion processes allowed us to formalize a unique way to study the long-term dynamics of the mean tree volume at the stand level. Figure 8 provides a graph of the mean tree volume increments over time. Figures 6 and 8 demonstrate that the maximum individual tree volume growth occurs at a later age than the maximum basal area and diameter growth. Table 13 shows the goodness-of-fit statistics for mean tree volume predictions. From Table 13, it is clear that the mixed-effect scenario bivariate SDE model can accurately predict the mean tree volume. Statistical indexes for both estimation and validation datasets produced similar values. Figure 7. Dynamics of the mean tree volume, over time, for two randomly selected stands from the validation dataset and mixed-effect scenario. Black color-the first stand; red color-the second stand; circles-the observed values. Figure 8. Dynamics of the current and mean annual tree volume increments versus age for two randomly selected stands from the validation dataset and mixed-effect scenario. Black color-the first stand; red color-the second stand.

Conclusions
SDEs were developed at the beginning of the twentieth century, to quantify all aspects of stochastic processes. This study evaluated the use of SDEs to model the tree diameter and height at any given age in mountain pine tree (Pinus mugo Turra) species of Lithuania. We developed a few new models for diameter and height evolution by using well-defined diffusion processes, such as the symmetric Vasicek diffusion process and asymmetric Bertalanffy-, Gompertz-and gamma-diffusion processes. In both scenarios, fixed-and mixed-effect parameters were applied to model the evolution of the diameter and height. We performed fixed-and mixed-effect parameters estimation via maximum likelihood procedure and using a real-world dataset of mountain pine trees in Lithuania. These models were compared with traditionally used regression models by using performance statistics and residual analysis. Overall, the best goodness-of-fit statistics were produced by the Gompertz-type diffusion model. On the other side, the superiority of the Gompertz-type diffusion can been confirmed by the existence of the steady state variance, the impossibility of the negative values, and the asymmetry of the tree-size component distribution. All results were determined by using the Maple computer algebra system.