Modelling Individual Tree Diameter Growth of Quercus mongolica Secondary Forest in the Northeast of China

: Quercus mongolica secondary forest is widely distributed in the northeast of China, but it usually has low productivity, unstable structure, poor health, and low biodiversity. Diameter is a tree variable that is commonly used for forest growth measurement, to provide the basis for forest management decision. Two level generalized linear mixed effects individual diameter growth model were developed using data from two times surveys of 12 Q. mongolica secondary forest permanent plots that were distributed among Wangqing forest farms. Random effects of 14 tree species and 12 plots were introduced into the basic model consisting of three factors: tree size, competition of surrounding trees, and site quality. The results showed that initial diameter at breast height(DBH) was the most important variable affecting diameter growth, followed by competition, while the effect of site quality on diameter growth was not signiﬁcant. Compared with the basic model, the prediction accuracy of the mixed effect model was improved by 17.69 %, where R 2 reached to 0.6805, indicating that it is suitable for the individual-tree diameter growth prediction of the secondary forest of Q. mongolica .


Introduction
As forest management is a long-term activity, the response of growth indicators to management measures may not be reflected in the short term. Forest growth models can estimate the long-term production by using the growth law of trees or stands, and then evaluate the impact of management measures to provide a theoretical basis for the formulation of forest management decision, so it is highly valued. The research of the forest growth model mainly includes model construction, parameter determination, highprecision estimation, and the prediction of individual tree level and stand level [1,2]. A large number of growth and harvest models have been developed and applied to forestry production practice [3]. The stand growth model may not able to represent the growth dynamics of individual trees and is only applicable to regular and even-aged stands, as tree sizes and competition largely varies in structurally complex stands [4][5][6]. The individual tree growth model is more flexible, detailed and accurate for the prediction of individual tree growth dynamics, so compared with the stand level model, it can more accurately evaluate the impact of different management measures on growth [7]. In addition, it is more suitable for uneven-aged mixed forest, because it can more accurately show the competition relationship between different tree species or different sizes of trees [2,8]. The individual tree growth models usually consist of diameter increment, height increment, crown ratio, and mortality models as core models [9] which can be used as input in forest simulators for decision making in sustainable forest management [10,11].

Data Materials
The study area is located in Tazi Gou Forest farm (129 • 56 -131 • 04 E, 43 • 05 -43 • 40 N), Wangqing Forestry Bureau, Jilin Province, China ( Figure 1). This region is dominated by low mountains and hills, with an altitude between 300-1200 m and a slope between 5 • and 25 • . It belongs to a temperate continental monsoon climate with an annual average temperature of about 3.9 • C and annual precipitation 600-700 mm, mostly in summer. The lowest temperature is about −32 • C in January and the highest temperature is about 22 • C in July. The area is mainly dark brown soil with fertile soil texture, acidic pH, and high humus content. The main tree species are Q. mongolica, Betula platyphylla, Pinus koraiensis, Acer pictum, Tilia mandshurica, Populus ussuriensis, Larix olgensis, Betula dahurica, Fraxinus mandshurica, Abies nephrolepis, Ulmus davidiana var. japonica, Picea jezoensis, and Phellodendron amurense, among others. Twelve one-hectare permanent sample plots (100 m × 100 m each) were established in Q. mongolica secondary forest in August 2013. After the establishment of the sample plot, measurements were made and then the thinning operation was taken. The measurements of plots include height (H), diameter at breast height (DBH), crown width and coordinates of all living trees (D ≥ 5.0 cm) as well as slope, elevation, and other variables. The thinning operation was designed for comparing the influence of traditional thinning and crop tree releasement. The felled trees of traditional thinning are of poor quality, low vitality, and high density, while the felled trees of crop tree releasement are the competitors of crop trees.
In August 2018, 12 sample plots were remeasured. The data for establishing the diameter growth model are from the two measurements of investigation data of 12 Q. mongolica sample plots in 2013 and 2018. The calculation of independent variables in 2013 is based on the stand status after cutting. A total of 9365 trees with measurements taken twice were used for the model establishment. The basic information at the beginning of the establishment of the sample plots is shown in Table 1.

Basic Model
Naturally, the main factors affecting the diameter growth of individual trees are tree size, competition of surrounding trees, and site quality [30][31][32][33][34]. Combined with the biological importance and the correlation with the diameter growth of the dependent variable, a total of 14 variables were selected, among which the diameter at breast height (D), basal area (BA), tree height (H), average crown width (CW), the coefficient of slenderness (CS), and crown ratio (CR) were related to the tree size. The distance independent individual tree competition index include basal area of the trees larger in diameters than a subject tree (BAL, m 2 ha −1 ), total basal area per ha (m 2 /ha) for all species in the stand (BASUM), the stand density index (SDI = N(D g /20) 1.605 ) [35], the ratio of basal area of target tree species to stand basal area(SC) and the ratio of DBH of the target tree to the average DBH of the stand(CI). The distance-dependent individual tree competition indexes include Hegyi [36] and aCI based on the intersection angle [37], which can be depicted as: where Hegyi i is the competition index of the object tree i (ith), D j is the DBH of the competitive tree j (jth), D i is the DBH of the ith object tree, D ij is the horizontal distance between the object tree and the competitive tree. where where aCI i is the competition index of the ith tree based on the intersection angle, H i is the initial height of of the ith object tree, H j the height of the jth neighboring tree. L ij is the horizontal distance between the object tree and the neighboring tree. The spatial structure of forest can be quantified by describing the spatial relationships between the reference tree and its four immediate neighboring trees [38]. Therefore, the competition trees selected in this study are the 4 trees directly adjacent to the target trees. Site index is most commonly used for site productivity, however, it is suitable for regular stands [39]. Therefore, in this study, the relationship between tree height and diameter was used to represent site productivity (SPI) [40]. It is common to use the relationship between tree height and diameter to evaluate the site productivity of mixed forest [4], as it is difficult to precisely know the site index and age.
where k = 14.4204, b = 0.1657, D 0 = 20 [41], H i is the dominant height of the ith plot, D i is the average diameter of dominant trees in the ith plot. Therefore, the individual tree diameter growth model of Q. mongolica natural secondary forest was constructed by regression analysis, as shown below (Equation (4)).
where D 2 is the final DBH, D 1 is the initial DBH.
After determining the form of the basic model, these candidate factors are gradually introduced to measure the accuracy improvement of the basic model by different combinations. The variables with strong collinearity were excluded by combining the Variance inflation factor [42,43]. We start with graphs and relevant statistical analysis to see whether the variables are suitable for model fitting [44,45]. The variables with significant coefficients and higher fitting accuracy are selected primarily. Moreover, optimal basic model is determined by the stepwise regression method of linear model. Finally, four variables (D, CW, CS and CR) reflecting the size of the forest and four variables (SDI, aCI, SC and Hegyi) reflecting the competition were used as the independent variables to establish the individual tree diameter growth model of Q. mongolica natural secondary forest.

Two Level Nested Mixed Effect Model of Individual Tree Diameter Growth
Based on linear basic model of individual tree diameter increment, estimation models with sample plot nested in tree species two level random effects were formulated according to the linear mixed estimation modeling technique (Equation (8)): where ln (D 2 2 − D 2 1 ) ijk was the diameter increment of the kth tree on the jth sample plot of the ith tree species, M was the number of tree species, n i was the number of sample plots in the ith tree species, n ij was the number of individual tree in the jth sample plot of the ith tree species, and f (·) was a linear function of a parameter vector ϕ ijk and a covariate vector x ijk . The within-group error vector ε ij = (ε ij1 , . . . , ε ijn ij ) T that accounted for the within-group variance and correlation was assumed to follow a normal distribution that had an expectation of zero and an identity matrix variance-covariance structure. where β was a parameter vector of fixed effects, u ij was an independently and normally distributed vector of random effects with zero means and variance-covariance matrix Ψ, A ijk and B ijk were design matrices, and u ij and ε ijk were independent of each other.

Model Estimation and Evaluation
Generally, data sets are divided into a modeling set and a verification set. The modeling set is used to build models and estimate parameters, while the verification set is used to evaluate the extrapolation ability of models. The cross validation method was used to evaluate the prediction accuracy of the two-level mixed effect model to make better use of the existing data. The data structure of this study is based on sample plots, including 12 sample plots. In the cross-validation method, 1 sample plot was taken as the validation set each time, and the remaining 11 sample sites were taken as the modeling set for model construction. Then, the constructed model was used to predict the 1 sample plot. This cycle was repeated 12 times, and the predicted values of the 12 sample sites were obtained. The Akaike information criterion (AIC, Equation (9)), root mean square error (RMSE Equation (10)), total relative error (TRE, Equation (11)) and maximum coefficient of determination (R 2 , Equation (12)) were calculated based on the observed and predicted values of these dependent variables. The larger R 2 is, the smaller AIC, RMSE and TRE are, the stronger prediction ability of the model is.
where n is the total number of trees, y i is the diameter growth of the ith tree,ŷ i is the estimated value of the ith tree diameter growth, y is the average value of all tree diameter growth, p is the number of parameters in the model, and l is the maximum likelihood function value of the model. The parameters of the basic model and mixed effect model were estimated in R 3.6.2, and the mixed effect model was calculated by Nonlinear Mixed Effect (NLME) package [46].

Determination of the Optimal Basic Model
The fitting evaluation indexes of each basic model with different variables are shown in Table 2. With the gradual introduction of variables, the evaluation indexes RMSE, TRE, and AIC all decreased, and the decreasing rate slowed down when the number of independent variables exceeded 5. R 2 increased with the increase of independent variables, and the increase rate also slowed down when the number of independent variables exceeded 5. When all the 9 independent variables were taken into account, the fitting effect of the basic model was the best, with a maximum R 2 of 0.5782 and a minimum RMSE of 0.7285. The optimal basic model is as follows (Equation (13)): ln(D 2 2 − D 2 ) = a 1 + a 2 ln D + a 3 D 2 + a 4 ln CW + a 5 SDI + a 6 aCI + a 7 CS + a 8 CR + a 9 SC + a 10 Hegyi where D and D 2 are the DBH values from the first and second measurements, respectively, a 1 , a 2 , a 3 .... a 10 are parameters.  Table 3 shows that all the selected variables have a significant impact on the basic model, and their variance inflation factors are less than 6, so there is no multicollinearity problem [43]. In the aspect of tree size factor, the coefficient of the logarithm of D (lnD) is positive, while the coefficient of D 2 is negative, which indicates that the diameter growth of tree increases with the increase of diameter, and to a certain extent, its speed will decrease, showing a parabolic form, which is in line with the law of tree growth. The coefficients of CW and CR are both positive, indicating that the larger the crown width is, the higher the ratio of crown length is, the faster the growth of tree diameter is. While the coefficient of CS is negative, reflecting that the larger the ratio of height to DBH is, the smaller the growth of tree diameter is, and trees allocate more resources to the growth of tree height. The coefficients of SDI, aCI, SC and Hegyi are all negative in reflecting the competition factors around trees, which indicates that the competition among natural secondary forest trees of Q. mongolica already exists and inhibited the diameter growth of trees.

Mixed Effect Model of Diameter Growth
The model (13) has 10 parameters (a 1 -a 10 ), therefore, there will be 58,236 different combinations by introducing random effects of tree species and sample plots into the parameters, and each combination forms a mixed effect model of diameter growth. In order to find the best combination of random effects of tree species and sample plots, the covariance matrix [47] of random effects of tree species and sample plots was set as diagonal matrix due to the convergence of mixed effects model.
According to the evaluation indexes of mixed effect model (see Table 4), combined with the variance of random effect of tree species and sample plot on different parameters, the greater the variance, the greater the impact of random effect on this parameter. It is determined that the random effects of tree species and sample plot are added to parameters a 1 , a 2 , a 3 and a 7 , and the evaluation index of diameter growth mixed effect model is the best. Then, considering the structure of variance covariance matrix of random effects of tree species and sample plot, the likelihood ratio test [48,49] is used to avoid over fitting. Finally, it is found that when the variance and covariance matrices of random effects of selected tree species and sample plots are diagonal matrices, each evaluation index of the model is the best. Therefore, the two-level mixed effect model of individual tree diameter growth of Q. mongolica secondary forest was finally determined as (Equation (14)): See model (13) for the meaning of variables in the formula. The parameters of model (14) were significant (p < 0.001). Compared with the model (13), each evaluation index was significantly improved, in which RMSE decreased by 14.89%, R 2 increased by 17.69%, TRE and AIC decreased by 24.93% and 9.14%, respectively. It is shown that the introduction of random effects of tree species and sample plots greatly improves the fitting accuracy of the model, and the model (14) could be used to describe the diameter growth of individual tree in Q. mongolica secondary forest.

Model Evaluation
Based on the cross-validation method, the prediction evaluation of the three prediction methods (average response, random effect prediction at tree species level, and sample plot level) of model (13) and model (14) are shown in Table 5. The prediction results of the model (14) based on different levels of random effects are better than the prediction of its mean response and the prediction of the model (13), with smaller RMSE and TRE and larger R 2 . In particular, the prediction accuracy of the model (14) based on the sample plot level nested in the tree species level was the highest. Compared with the average response and tree species level prediction of the model (13) and (14), its RMSE decreased by 6.35%, 10.02% and 0.59%, and its R 2 increased by 9.37%, 16.80%, and 0.73%, respectively. The results showed that the random effect of tree species level was significant, but the random effect of sample plot level was not as obvious as tree species. Diameter growth curve of Q. mongolica natural secondary forest (Figure 2) was produced by model (14). It can be seen that the diameter growth curve well passes through the observed value, which verifies the validity of the mixed effect model (14). Table 5. The results of evaluation metrics for model (13) and mixed random effects model (14)   Therefore, the mixed effect model (14) is recommended to predict the individual tree diameter growth of Q. mongolica secondary forest.
The results showed that the variable D was the most important factor affecting the diameter growth, and the influence of the other 7 variables on the diameter growth from large to small order is as follows: CW, aCI, CS, CR, SDI, SC, and Hegyi. CW, CS, and CR were positively correlated with the diameter growth, while the other 5 variables were negatively correlated. The regression coefficient of SPI is not significant, so this index was not considered in the model construction. The results show that the mixed effect model has better prediction results, and the prediction accuracy based on the sample plot level nested in the tree species level is the highest, with RMSE of 0.6907 and R 2 of 0.6210. It can be seen from the distribution map of the predicted values and residuals ( Figure 3) that there is no obvious heterogeneity in the distribution of residuals, which indicates that the two models have good prediction results, and model (14) is obviously better than model (13).

Discussion
Tree growth is affected by many factors, including physical, physiological, and ecological aspects [50]. Both biological importance and statistical fitting were considered when variables were selected in this study. If the model has no biological significance, it is just for model development and has no guiding significance for production and research [51]. This study showed that the variables significantly related to the diameter growth of Q. mongolica natural secondary forest were size and competition. The effect of tree size factor on diameter growth is higher than that of competition factor. R 2 of the model reached 0.5081 after introducing the first variable lnD. The introduction of the remaining eight variables only increased R 2 by 13.80%, indicating that initial DBH had the greatest impact on individual tree diameter growth. Many related studies also show this, but most of these related studies are based on traditional regression models (least square method), which is not suitable for data with complex structure [34,47,52,53]. Meanwhile, it is found that the basal area of the stand seems to be more important than the relative size of the tree in other studies [50,54,55]. CW (Figure 4a) has the greatest effect on diameter growth in addition to diameter, which may be because there is a close positive correlation between crown and diameter growth. The crown characteristics were also included in some individual tree diameter growth models [56,57]. , and CS (c) on the diameter increment of individual trees. The curves were produced using parameter estimates of model (13). The mean values of the observed data were used for other variables rather than the variable shown varying (lowest to highest range in the observed data).
In this study, the effect of site condition on diameter growth is not significant, it may because the 12 sample plots are in the similar site conditions, where the microenvironment does not change much. The 12 plots were all in the west direction, with a gradient between 6 and 10 degrees, and were all in the middle slope (except for the No.5 plot, which was uphill) (see Table 1). Another reason why site conditions have little effect on diameter growth may be that Q. mongolicus, which is mainly distributed in arid and poor soil in China, does not have a high demand for site conditions. The above results are consistent with the conclusions of relevant scholars [58,59]. Related studies also show that site quality has a greater impact on tree height growth, while competition factor has a greater impact on diameter, which is also verified by this study [58]. With the increase of independent variables and the introduction of random effects, the coefficient of determination of the model has been improved, likely due to the variety of species in the stands. Many related studies have also improved the model performance by introducing multiple predictors and random effects [60,61].
The independent variables in the model have different influences on diameter growth, among which the coefficients of lnD, CW and CR are positive. It can be seen that the higher the values of these three items are, the more vigorous the tree is and the faster the growth rate is. However, the coefficient of D 2 is negative, which indicates that the growth rate of trees will decrease when the diameter grows to a certain extent. In this study, CS is regarded as a variable representing the tree size, which can also be treated as a variable representing competition. The coefficients of CS and four competition variables SDI, aCI, SC and Hegyi are all negative. Many related studies also show that diameter growth decreases with the increase of competition [47,53,61,62]. Both stand density and competition have significant effects on diameter growth, as trees compete for limited resources such as space, nutrients, water, and light requirements during growth. In case of excessive competition, trees may grow taller, but with smaller crown and smaller diameter. Especially for the trees in the sub-forest layer, more energy will be allocated for tree height growth in order to obtain more lights [63,64]. As for the dominant trees, because the canopy can receive enough sunlight, more energy will be distributed to the diameter growth, and the stability is better than that of the compressed trees, which also explains the positive coefficient of lnD [65]. This study is based on the observation data from two times surveys of 12 sample plots in a forest farm. Long term and large-scale observation data are needed in order to predict the growth dynamics of Quercus mongolica secondary forest more accurately.