Stand Volume Growth Modeling with Mixed-Effects Models and Quantile Regressions for Major Forest Types in the Eastern Daxing’an Mountains, Northeast China

The relative growth rate (RGRnv) is the standardized measurement of forest growth, whereby excluding the size differences between individuals allows their performance to be compared equally. The RGRnv model was developed using the National Forest Inventory (NFI) data on the Daxing’an Mountains, in Northeast China, which contain Dahurian larch (Larix gmelinii Rupr.), white birch (Betula platyphylla Suk.), and mixed coniferous–broadleaf forests. Four predictor variables—i.e., quadratic mean diameter (Dq), stand basal area (G), average tree height (Ha), and altitude (A)—and four different methods—i.e., the nonlinear mixed-effects models (NLME), three nonlinear quantile regression (NQR3), five nonlinear quantile regression (NQR5), and nine nonlinear quantile regression (NQR9) models—were used in this study. All the models were validated using the leave-one-out method. The results showed that (1) the mixed coniferous–broadleaf forest presented the highest RGRnv; (2) the RGRnv was negatively correlated with the four predictors, and the heteroscedasticity reduced significantly after the weighting function was integrated into the models; and (3) the quantile regression models performed better than NLME, and NQR9 outperformed both NQR3 and NQR5. To make more accurate predictions, parameters of the adjusted mixed-effects and quantile regression models should be recalculated and localized using sampled RGRnv in each region and then applied to predict all the other RGRnv of plots. MAPE% indicates the mean absolute percentage error. The values were stable when the sample numbers were greater than or equal to six across the three forest types, which showed relatively accurate and lowest-cost prediction results.


Introduction
Forests represent the habitat for most of the earth's terrestrial biodiversity, occupying almost 4.06 billion ha or 31% of the earth's total land area. Deforestation and forest degradation are two reasons for the decrease in forest area, which has been lost at an alarming rate of 4.7 million ha per year during 2010-2020 [1]. Consequently, China has gradually banned the commercial logging of natural forests in the eastern Daxing'an Mountains since 2015 to recover the natural forests. This ban has been successfully implemented with the current total volume of 547 million m 3 or about 92.01% of the total natural potential of the eastern Daxing'an forest [2]. However, many natural forests with no production demand need a new management plan to specifically address more appropriate and effective monitoring and adjustment methods to ensure their continuity. One of the most effective ways is to monitor and update real-time forest growth, which significantly impacts forest structure and composition over time [3]. Unfortunately, to the best of the authors' knowledge, no research has been found regarding the forest growth model in the eastern Daxing'an Mountains.
Forest growth is defined as the size of a tree or a stand produced within a certain period [4]. Forest growth and yield research has provided forest managers with an abundance of tools for simulating stand dynamics [5]. This research was begun in the 1850s in Central Europe when the basic yield tables became the primary standard for growth and yield estimation until the 1950s [6]. The basic yield tables might be helpful in actual application; however, their low accuracy is gradually becoming unacceptable to both academicians and practitioners. Hence, the mathematical-statistical analysis and modeling of forest growth have been increasingly developed to fulfill the demand for more accurate estimation. As a result, numerous classic functions can be used to better estimate forest growth, such as those of Chapman-Richards, Kangas, Monomolecular, Weibull, and Korf [7][8][9].
However, the growth function is not suitable for the uneven forest, because of the difficulty in determining the age of many natural forests [10]. Moser and Hall [11] proposed a function of a measurable size characteristics of the stand under investigation to eliminate the problem of indefinite age. This growth rate equation provides a yield function expressed with an elapsed time from a given initial condition, indicating that compatible growth rate and yield functions for the uneven-aged stand can be derived from permanent plot data. Dale et al. [12] classified growth and yield models into stand model, size class model, and individual tree model. Peng [13] detailed the differences between these three models: stand models use stand parameters to simulate the stand growth and yield that are usually simple and robust; however, they provide little or no detail about individual trees within a stand; size class models only require overall stand values as input and provide some information relating to stand structure, but the output is not flexible enough to evaluate a broad range of stand treatments; and the individual tree models simulate each individual tree as a basic unit that provides maximum detail and flexibility for evaluating the stand but requires more detail and an expensive database to be developed and implemented. Although the stand models cannot provide specific information about the individual tree or size class, it is generally cheaper and relatively effective for practical applications because of lesser data and the need for complex variables [14].
There are several different basic concepts of growth, which have not been properly explained and often appear confusing. Cumulative growth means the total volume attained by a tree or a stand at any particular time. Generally, cumulative growth time is a fixed interval, which is not convenient for practical application. Absolute growth rate (AGR nv ) is the mean absolute changes in mass over a given time period, which can be defined as the value of a particular characteristic at different times (t k and t k−1 divided by ∆t). AGR nv depends on the current state of the plant size characteristic and is therefore not helpful to growth analysts when comparing plants of different sizes [15]. In such situations, RGR nv is often studied in addition to absolute growth. RGR nv is also a function of time and is defined as the increase in size relative to the growth characteristic [3]. Hunt [7] decomposed RGR nv into the product of three components: (1) the net assimilation rate (N AR), which is the absolute growth rate per unit leaf area; (2) the leaf mass ratio (LMR), which is the proportion of biomass invested in leaves; and (3) the specific leaf area (SLA), which is the leaf area divided by leaf mass. Given the advantages of RGR in physiology and ecology, it was used in this study as the dependent variable.
Permanent plot data from different regions are mostly used for forest growth modeling. However, heteroscedasticity always exists in model residuals; hence, a weighted regression should be used to neutralize the inconsistency [16,17]. The traditional ordinary leastsquares (OLS) method obeys the assumption of independent observations, providing the proper parameter estimates for the overall average. However, their variances are biased when there are significant differences between groups. Since the measured trees or stands are usually grouped into plots or regions, the nonlinear mixed-effects (NLME) modeling approach has been used to reduce the bias by considering the differences between groups [18][19][20][21]. The NLME is composed of fixed-and random-effects parameters, in which the variance-covariance structure analyzes hierarchically structured data more efficiently and subsequently increases the prediction accuracy of the models. NQR has been widely used in the last few years to describe the response variable, given a set of explanatory variables [22][23][24]. Compared to the conventional OLS method, quantile regression can characterize the entire conditional distribution of the outcome variable and is more robust to the presence of outliers and misspecification of the error distribution [25]. However, in actual application, both the NLME and NQR should be calibrated for the adjusted random parameters and the best quantile curve [26][27][28].
The main research tasks of this study were to (i) choose the best basic model equation for RGR nv and solve model autocorrelation and heterogeneity using the NLME and NQR methods; (ii) evaluate different RGR nv models using a leave-one-out cross-validation approach and compare the RGR nv of three forest types and analyze the trend differences between mixed and relatively pure stands; and (iii) determine an appropriate sample size that considers both sample cost and predictive accuracy. The finding will be useful in forest management planning for the uneven-aged forests in the eastern Daxing'an Mountains. It will also expand the application of mixed-effects and quantile regression models in stand growth modeling, which will be useful in effectively estimating the dynamics of future forest accumulation and providing a theoretical basis for carbon neutrality. Continuously improving our understanding of RGR nv impacts on the comparison of pure and mixed forests will support forest managers seeking to develop effective adaptation measures and determine sustainable forestry production.

Study Sites
We selected the eastern Daxing'an Mountains located in Heilongjiang Province, Northeast China (50 • 05 -55 • 33 N, and 121 • 11 -127 • 01 E). It is the largest and best-preserved natural forest in China (Figure 1), covering approximately 6.62 million ha (97.46% of the total forest area), with a total volume of living trees of about 540 million m 3 (98.72% of the total forest volume). The average width is approximately 200 km, stretching 1200 km from northeast to southwest. The altitude ranges from 300 to 1520 m above the sea level. This area belongs to the typical cold temperate continental monsoon climate, with mean annual temperature and total precipitation ranging from −1 to −2.8 • C and 430 to 460 mm, respectively. The soils are mostly cryumbrepts or humic cambisols (or brown coniferous forest soils in the Chinese taxonomic system) [29,30]. The main dominant tree species are Dahurian larch (Larix gmelinii Rupr.) and white birch (Betula platyphylla Suk.), accompanied by Dahurian poplar (Populus davidiana Dode.), Mongolian oak (Quercus mongolica), and Mongolian pine (Pinus sylvestris var. mongolica). There are three main forest types in the eastern Daxing'an Mountains: LF, WBF, and CBMF. The areas of the three main forest types are 2.54 million (37.39% of the total forest area), 2.23 million hectares (32.83% of the total forest area), and 0.69 million hectares (10.16% of the total forest area), respectively.

Modeling Data
NFI data are network representatives designed to continuously monitor the macrochanges of forest resources at the national and provincial levels at 5 years. The main objectives of the NFI are to identify the forest extent, such as volume, growth, consumption, function, and dynamics during each interval. In eastern Daxing'an Mountains, the NFI sampling design is an 8 km × 8 km grid, and a permanent sampling plot is located at every grid.
Measurements that were used in this study came from 384 permanent NFI plots. These plots were located in 12 forest bureaus (or 12 regions), which covered most of the eastern Daxing'an Mountain area. The data were collected in three periods of [2000][2001][2002][2003][2004][2005]2005-2010, and 2010-2015, respectively, with a total of 1152 samples investigated. The data represented three main natural forest types in the eastern Daxing'an Mountains, comprising LF, WBF, and CBMF (dominated by the intermixture of Dahurian larch and white birch). The rectangular plots had a size of 10 × 60 m, which was equal to 0.06 ha. No logging or human intervention was detected within the sample plots. All trees with diameter at breast height greater than 5 cm were measured, and information including the species name, DBH, tree status, and number of trees per sample plot was collected. For each plot, the heights of 3-5 standard trees (determined by the D q ) for the dominant species in a plot were measured using a Blume-Leiss hypsometer to calculate the mean height of these trees as H a . The stand basal area was calculated with Equation (1). Individual tree volume was calculated with the one-way volume table for each forest bureau based on the diameter at breast height, and the system error is within ±1% [31]. The stand volume was calculated with Equation (2). The Pressler growth rate formula, Equation (3), was used to calculate the RGR nv , in which the periodic average growth rate was substituted for the instantaneous growth rate [32]. The abbreviations for the main variables and terms involved in the article are shown in Table 1. The descriptive statistics of the stand variables are provided in Table 2.

Area
(2) where V i t−n and V i t represent the initial and final stand volume of ith plot, respectively, which are equivalent to the accumulation of the live stand volume, ingrowth volume, and harvest volume; k represents the numbers of plot; n represents the investigation interval in Equation (3). The National Forest Inventory data NLME/MIXED Nonlinear mixed-effects model NQR (3,5,9) Nonlinear quantile regression model (3, 5, 9 quantile

Basic Model
Several basic models (liner model, reciprocal model, power model, exponential model, and sigmoid model) were used to describe the relationship between relative growth rate and dependent variables. The equations are listed below: RGR nv = 1 1 + exp − (α + βX) (8) α and β are the models' parameters and X is the independent variable vector. Seven variables were used in the model: All models with insignificant variables (p-value > 0.05) were eliminated. The relationship between variables and growth rate are provided in Figure 2. The model fitting performance was evaluated by three evaluation statistics: adjusted coefficient of determination (R a 2 ), root mean square error (RMSE), and Akaike information criterion (AIC). The equation with the largest R a 2 and the smallest RMSE and AIC was chosen as the best basic model.
where n is the plots size, p is the number of the model parameters, LL is the log-likelihood value of the equations, and y i ,ŷ i , and − y i are the observed, predicted, and mean observed volume growth rate, respectively.
The heteroscedasticity is commonly found in growth models. It is well known that the error variance of the i th observation is functionally related to one or more predictors, which can be modeled with a power function, such as σ 2 i = (X k i ) [33]. The power coefficient k can be obtained by iteratively using the residual variance model e 2 i = (X k i ), in which e i is the model residual of the unweighted basic model. In this study, we chose 1/ŷ i k as the weight function. The Breusch-Pagan test was used to measure the heteroscedasticity corrected model. It uses a variance function and a χ 2 -test to test the null hypothesis that heteroscedasticity is not present (i.e., homoscedastic) against the alternative hypothesis that heteroscedasticity is present [34]. To ensure that there is no "multicollinearity" in the model, a correlation test method was used.

Nonlinear Mixed-Effects Models
The population average model is commonly used in forestry modeling. However, it cannot exhibit the trends of the specific subjects within the data. A nonlinear mixed-effects model was built to solve this problem, specifically for dealing with the intrinsic variation and the nested structure of the sample data [20,35]. In this study, the best-performing basic model was used to formulate a generalized one-level NLME growth model by introducing region-level (forest bureau) random effects into the model. All possible combinations of the fixed-effects and random-effects parameters were fitted to the data. The model with the smallest AIC and the largest log-likelihood (LL) was selected for further analyses. A likelihood-ratio test (LRT) was performed to avoid over-parameterization [36]. The one-level NLME growth model can be written as follows: where . . ε in i T , respectively, represent the vectors of the observation value of RGR nv , variables, and the random errors; n i is the number of RGR nv measurements for region i; b and u i are vectors of fixed-and random-effects parameters, respectively; u i and ε i are assumed to follow the multivariate normal distributions with mean 0 and variance D and R, respectively. D is the variance-covariance matrix of random-effects parameters used in the generalized positive-definite matrix throughout this article, and R is the within-group variance-covariance matrix. The mathematical form is as follows: where σ 1 2 is the variance of the first parameter; σ n 2 is the variance of the nth parameter; σ 1 σ n and σ n σ 1 are the mutual covariances of the first parameter and the nth parameter, respectively; σ 2 is the variance of model residual; G i accounts for within-group variance heteroscedasticity and its diagonal elements are provided by variance function; and Γ i accounts for within-group autocorrelations of the residual errors and considered as an identity matrix as there is no time autocorrelation. The heteroscedasticity of the mixed model can be solved by weighted function based on the power variance function, Equation (15).
where e ij and RĜR nvij are, respectively, the residual and estimated RGR nv calculated with fixed-effects for the jth sample plot within the ith sample region, and γ and δ are the estimated parameters for the best basic model. The NLME package in R software was used to obtain the fixed-and random-effects parameters of the nonlinear mixed model of RGR nv [37].
To calculate the prediction of RGR nv for a new plot, at least one plot of RGR nv within each region should be available. The random parameters u i for the plot i can be computed using the first-order of Taylor series expansion [38]: whereû i is the estimate of the random parameters for region i;D is the estimate of D; Z i is the designed matrices for random effects, is the estimate of R i ; y i and x i are the same as defined in Equation (12); andb is estimates of parameter variable. The validation statistic was similar as the basic model, Equations (7)-(9).

Nonlinear Quantile Regressions
The quantile regression model was first introduced by Koenker and Bassett [22] and is also known as a location model. It can characterize the entire conditional distribution of a dependent variable, providing a robust measure of location that reduces the sensitivity of the estimated coefficient to the outlier observations. Quantile regression may bring more efficient estimates than ordinary least-squares regression, specifically when the error term is not normal [39]. NQR was used to predict RGR nv with the τth quantile expressed as: is the estimated RGR nv under the τth quantile, and τ is the quantile point (ranging from 0 to 1). In contrast to the mean regression technique, which employs the least-squares procedure, the parameters of NQR are obtained by minimizing the following equation [18]: where S is the sum of the weighted residuals of the τth quantile; y x ij is the observed RGR nv . under the τ th quantile; andŷ τ x ij is the predicted RGR nv under the τth quantile. The quantreg package in R software was used for fitting the NQR growth rate model [40]. The quantile model can be calibrated with different curves of several quantile points (τ). Three different numbers of quantile points were constructed for the NQR models-i.e., NQR3 (τ = 0.1, 0.5, 0.9), NQR5 (τ = 0.1, 0.3, 0.5, 0.7, 0.9) and NQR9 (τ = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9).
When the prior information (a few sampling plot measurements for each region) is utilized in the RGR nv NQR model, the two closest quantile regression curves encompassing the majority of these data points need to be correctly identified [23,41]. One or more RGR nv of sample plots were measured in each region as calibration samples to compute the mean difference between the observed and the predicted values of all sample plots for each quantile regression. Two consecutive quantile regressions (i.e., mth and m + 1th) were then chosen for interpolation of the RGR nv curve: whereŷ i is the adjusted quantile regression predicted value of RGR nv and the interpolation ratio is α =ŷ m+1 −y m y m+1 −ŷ m , which was obtained by minimization: If the mean difference was positive for all the quantile regression curves, then the majority of RGR nv were over the highest (qth) quantile regression curve, andŷ m and y m+1 in Equation (19) were defined asŷ q−1 andŷ q , respectively. In contrast, if the mean difference was negative for all quantile regression curves, theŷ m andŷ m+1 were defined aŝ y 1 andŷ 2 , respectively.

Model Evaluation of Validation and Prediction
Six methods evaluated in this study were: (1) basic model; (2) fixed-effects model; (3) mixed-effects model; (4) NQR3; (5) NQR5; and (6) NQR9. The models were validated using the leave-one-out cross-validation method, which is widely used and considered to be a useful method to test the model's predictive performance [35]. Three statistics-MAE is mean absolute error, MPE(%) is mean percent error, and MAPE(%) is mean absolute percent error-were calculated with the model bias generated from validation for assessing and comparing the predictive performance of the models.
where y i is the observed RGR nv ,ŷ i,−1 is the predicted RGR nv obtained by the leave-one-out cross-validation method, and n is the size of all plots.
To make more accurate predictions, parameters of the adjusted mixed-effects and quantile regression models should be recalculated and localized using the sampled RGR nv in each region and then applied to predict all the other RGR nv of plots. One to 15 sample plots were used in this study for each region. Each sample scenario was repeated 500 times to ensure the stability of the results. The evaluation statistics computed for the validation data set are as Equations (21)-(23). We used the paired t-test to compare and analyze the validated results of different sampling numbers and to find the best sample number, considering both accuracy and cost.
Several methods have been suggested for evaluating model predictions, aimed in general at quantifying the relative contribution of different error sources to the unexplained variance [42][43][44]. The most popular method was using graphic plots of observed and predicted data (OP scatter plot). It should be emphasized that the OP scatter plot must set observed values in the y-axis vs. predicted values in the x-axis. The opposite setting will result in incorrect conclusions about the performance of the model. Piñeiro et al. [45] suggested that the coefficient of determination of the regression of OP values (R 2 ) is an effective parameter that provides important information about model performance to indicate how much of the linear variation of observed values is explained by the variation of predicted values. It was calculated by Equation (24), where y i was the observed RGR nv andˆi y was the predicted RGR nv . Another suggestion was to test the hypothesis of slope = 1 and intercept = 0 to assess statistically the significance of regression parameters of fitting relationship between the observed and predicted data.

Best Basic Model
The fitting results of the basic models are provided in the Appendix A (Tables A1-A3). The best basic RGR nv model form was the exponential function model for all the three forest types. The R a 2 and RMSE for the best basic model is 0.6841 and 0.0111 (LF), 0.6991 and 0.0087 (WBF), and 0.8031 and 0.0122 (CBMF), respectively ( Table 3). The parameter estimates and standard errors are shown in the Appendix A (Tables A4-A6). The relationship between the variables and the RGR nv is presented in Figure 2. The heteroscedasticity of the model significantly stabilized (p > 0.05) after the model was weighted (Figure 3). The variables correlation test ( Figure 4) showed that there was no obvious strong correlation (−0.6 < coefficient values < 0.6) between the model independent variables [46]. The specific form of the model is as follows:

Mixed-Effects Models
The parameter estimations of the mixed-effects models are presented in the Appendix A (Tables A4-A6). Different random parameter combinations were sequentially added to the basic model. The best mixed-effects models for the three forest types were finally determined after considering the model convergence and the comparison of fitting indicators. The model forms for LF, WBF and CBMF are as follows (u 1 -u 4 are random parameters): The model fitting accuracy significantly improved after the addition of random effects to the basic model, in which their R a 2 and RMSE were 0.7706 and 0.0095 (LF), 0.7978 and 0.0071 (WBF), and 0.8546 and 0.0104 (CBMF), respectively ( Table 3). As in the previous subsection, the heteroscedasticity of the mixed-effects models also significantly stabilized (p > 0.05) after it was weighted (Figure 3).

Nonlinear Quantile Regression Models
The parameter estimates and standard errors of the NQR models across different quantile points are shown in Appendix A (Tables A4-A6). Quantile regression properly represented the characteristics of invariant to monotonic transformations; hence, the heteroscedasticity or outliers had only an insignificantly minor effect on the NQR estimations. Table 4 shows no difference in MAE values between NQR3, NQR5, and NQR9. NQR9 had smaller MPE(%) and MAPE(%) than NQR3 and NQR5, indicating that it provided slightly better predictions than the other two methods.

Comparison of RGR nv between Three Forest Types
Four variables were included in the basic model (Equation (25)), which were D q , G, H a , and A. To compare RGR nv between the three forest types, the relationship between the predicted RGR nv and G at different variable levels was visualized, since G had a superior performance in the basic model compared to the other variables. The common data interval range of the remaining variables in each stand was determined according to the maximum and minimum value, then the average value was calculated and divided into three levels. The results are shown in Figure 5. For LF, the RGR nv across three levels of D q differed more than H a and A. This result indicated that the effect of the individual's radial size on the growth rate was higher than that of the longitudinal size and site in LF. The opposite result was found in WBF, where the RGR nv differences among the average H a values were greater than other variables. For CBMF, the differences among both D q and H a were greater than A. The RGR nv rank among the three stand types varied across diameter levels ( Figure 5A), when the D q were 5 and 13 cm, the RGR nv was ranked as CBMF > LF > WBF. The order changed when D q was 21 cm: WBF > CBMF > LF. For both H a and A, CBMF consistently had the highest RGR nv across all the three levels. Table 4 presents the model validation results using the leave-one-out validation method. The results showed that the basic model provided a slightly higher accuracy than the fixed-effects model. In addition, the MPE(%) and MAPE(%) values of the mixed-effects and quantile model were significantly lower compared to the basic model, in which the quantile model resulted a way much better prediction than the mixed-effects model. For the quantile model, NQR9 was better than NQR5 and NQR3 in both LF and CBMF. There was only a slight prediction difference between the NQR3, NQR5, and NQR9 in WBF. Figure 6 shows the scatter plots between the observed and predicted RGR nv of the basic, fixed-effects, mixed-effects, and quantile models. NQR9 had the biggest R 2 , indicating that it performed best among all the methods. Both null hypotheses were not rejected; thus, disagreement between model predictions and observed data was due entirely to the unexplained variance. Figure 7 presents the MAPE(%) variation across various sample numbers for the four models (NQR9, NQR5, NQR3, and the mixed-effects model). Zero sample number means that MAPE(%) was calculated using the mixed-effects model with fixed parameters only, and for the quantile regression, τ = 0.5 refers to the median effect. The MAPE(%) values decreased as the sample numbers increased across all forest types, in which the three quantile regressions (i.e., NQR3, NQR5, and NQR9) had lower MAPE(%) values than the mixed-effects model. The difference between NQR9, NQR5, and NQR3 was not obvious, where NQR9 consistently outperformed NQR3 and NQR5 by producing the lowest MAPE(%) values across all sample numbers. The above conclusion was confirmed statistically by a series of test of contrasts: sample of one plot per region vs. two or more plots; two plots vs. three or more plots, etc. Results indicated that sampling six plots per region yielded non-significant differences among the evaluation statistics as compared to sampling seven or more plots (Table 5). Hence, more stable MAPE(%) performances were seen when the sample numbers were greater than or equal to six.

Comparison of Sample Size for Calibration
For the mixed-effects models, the result of sampling indicated that when the sample size was more than six (Appendix A:

Discussion
In this study, Equation (25) was the best basic RGR nv model constructed using four primary predictors: D q , G, H a , and A. The data were obtained from the NFI's permanent sample plots across the three forest types in eastern Daxing'an Mountains. The exponential form was found to be the best basic RGR nv model. Several previous studies have proven that the exponential form can visualize the important biological characteristics of various animal and plant growth. Jobidon [47] studied the competition effects in the boreal mixed wood forest across Quebec and Canada; he concluded that the exponential form should be considered to analyze the stand growth, specifically if young trees exist. Scolforo et al. [48] also preferred to utilize the exponential form to model the relative diameter growth rate of the six tree species in Brazil, which included stand density as the primary predictors. The exponential model was proposed in this study since it was able to clearly describe the growth rate of the cold temperate forests and avoid negative growth rate estimation. Most of the native tree species had small DBH, while only a few individual trees had large DBH. Hence, these larger trees have smaller predicted growth rate values compared to other trees. The larger trees may highly influence the model; however, the exponential formulation does not allow the predicted growth rate to be negative.
The RGR nv was the dependent variable of the model in this paper and was calculated with Pressler's formula using the average growth rate to replace the consecutive annual growth rate. RGR nv is a standard variable to determine the productive capacity of a tree and can be used to compare the trees that differ in initial size, age, and environmental conditions [49,50]. All four parameter estimates were negative, indicating that the RGR nv decreased with the increment of various variables. This trend was consistent with the actual survey results (Figure 2). D q was used for the average tree diameter in the research of tree growth, having a better performance than the arithmetic mean diameter [51,52]. Curtis et al. [53] explained the difference between the average and the arithmetic mean diameter in detail and their respective applicable scenarios. Reineke's [54] SDI (stand density index) was used for the various relative density measures and stand management diagrams derived from the Reineke relationship, which was based on D q . H a was also an important variable for describing stand structure, specifically for the vertical direction. G was the density index, combining the tree diameter and the number of trees within a stand. Site effect variables that contained altitude, slope, and aspect have been recommended in growth, forest species composition, and productivity research [55][56][57]. Unfortunately, in this study, although the altitude showed significant contributions to the RGR nv model, the slope and aspect of the plot were not significant in the model. The RGR nv declined with altitude, which could be the result of a short growing season and reduction in mean summer temperatures [58,59]. Ma and Lei [60] showed that H a can replace dominant height when there is no information available. However, Inoue [61] found that stand volume and bole surface area were not independent of stand age or site quality. Hence, when the stand age is unknown, H a can be the average feedback of stand vertical structure rather than the stand site condition.
The growth of the trees has played an important role in recycling the forest. Plantation forests emerged mainly in Germany and Austria, while the concept of natural forests originated in France and Switzerland [62]. Although some research on heterogeneous forests was conducted in 1940s, presently, natural forests are still lacking a forest management philosophy. Dahurian larch is a very intolerant and cold-resistant species that can grow in dry, infertile, and swamped soil, or on permanently frozen ground; however, it is best grown in drained, moist, and sandy soil [63]. Jia and Zhou [64] summarized the growth characteristics of natural and planted Dahurian larch in Northeast China. They emphasized that the growth rates are considered essential indexes in assessing forest recovery processes and carbon sequestration potentials, which can supply post-fire or sustainable management strategies. White birch is an excellent, fast-growing pioneer species and suitable for well-drained soil; it can grow in heavy clay and nutritionally poor soils [65]. In recent years, research on the mixed forest has developed rapidly, including stand stability [66], stand resource utilization [67], stand diversity [68], and stand dynamic change [69]. It has shown that the mixed forest has obvious advantages over the pure forest. The CBMF studied in this paper mainly involved the mixed forest of LF and WBF. Figure 5 shows that the RGR nv of CBMF was higher than LF and WBF across different levels, which proved that CBMF had more potential growth superiority compared with LF and WBF. CBMF also has the highest productivity or stand stability, which was corroborated by the well-known competitive yield criterion [70].
In forestry applications, two or more measurements are often taken from each sampling unit. These repeated measurements are not statistically independent; hence, the ordinary least-squares techniques may underestimate the variance of parameters. In this paper, the nonlinear mixed model and quantile model were used to deal with the longitudinal data. According to Table 3, the prediction accuracy of all NLME models was improved and presented a more stable estimate (indicating the reduction of spatial heterogeneity in the model residuals). WBF had the most significant improvement, which was then followed by LF and CBMF. Figure 6 shows that the relationship between the observed value and the predicted value of the mixed model was more concentrated compared with the basic model. The random effect was integrated to D q and A for the LF mixed model, while G and H a were used for WBF and CBMF, corroborating the results of Condés and García-Robredo [71]. Zhao et al. [72] reported that modeling the autocorrelation structure alone cannot completely match the possible common effects, and they suggested introducing random period effects for describing the autocorrelation. Unfortunately, their research relied only on one to three observations; hence, it is unreliable to employ period as an autocorrelation structure.
Cade and Noon [73] mentioned that the quantile regression estimates multiple rates of change (slopes) from the minimum to maximum response, providing a more complete trend of the variables relationships. Table 4 shows that the result of the quantile regression model was better than the basic and NLME models. MPE(%) and MAPE(%) indicated that more improvement in model accuracy existed for the quantile regression than for the NLME models. NQR9 was a better quantile regression than the others (NQR3 and NQR5), although the difference was negligible. Bohora and Cao [26] showed that NQR3 was better than NQR5 and NQR9, while Zang et al. [24] argued that NQR9 was the best and used it to model the height-diameter relationship. This kind of contradictory result may be caused by the size and the distribution of the data. The higher quantiles require a much greater proportional increase in data volume to maintain a constant risk ratio, and the excessive numbers of the quantile may cause over-fitting. Therefore, we recommend choosing NQR3 or NQR5 when the data distribution range is narrow or when the size of the data is limited. In this study, the quantile model had a better result than the NLME, which was indicated by smaller MAE, MPE(%), and MAPE(%). The reason might be the obvious differences in growth rate trends between different regions.
Random sampling was used to calibrate the mixed-effects and quantile models. The results showed that the model performance improved with an increase in the sampling number. The quantile model performed better than the mixed model across all sample numbers, and NQR9 was best among them. Six (or more) sample plots per region were adequate for delivering high-accuracy predictions (Figure 7, Table 5). In the mixed-effects or the quantile regression model, the subject-specific random effect or entire conditional distribution of the response makes up the shortcomings of instability and uncertainty of the population-averaged response. The sampling process is a hypothetical simulation of the survey in reality. It can provide prior information to adjust the random parameters or interpolation ratio of the local best quantile curve and effectively improve the accuracy of estimation. It has generally been taken for granted that the inclusion of additional stand variables into the base models automatically results in better predictions (helping to justify the increased costs associated with measuring the additional variables). As Huang et al. [74] stated, the subject-specific model allows the plot level variations related to many known and unknown factors such as topography, soil type, nutrient status, genetics, climate, silvicultural regime, environment, intra-and inter-specific competitions, etc., to be accounted for without requiring that they be identified or measured. This is a noted advantage of the NLME and NQR techniques. In this study, each region contained an average of 32 plots. Assuming the data from the last period are known, the next period RGR nv of all plots can be estimated when only six plots from each region are surveyed. This can save about 80% of costs.

Conclusions
Plantation forests have developed rapidly in modern forestry because of their simple structure, short management cycle, and high timber output. However, there is still a lack of a theoretical basis for natural pure forests and mixed forests. Hence, in this study three forest types from Daxing'an Mountains were studied to determine the growth differences between the natural pure forest and mixed forest and establish the RGR nv model. RGR nv was recalculated with the Pressler growth rate formula. The liner model, reciprocal model, power model, exponential model, and sigmoid model were used to develop the basic model, and of these, the exponential model performed the best. The best RGR nv basic model was established using D q , G, H a , and A surveyed from three periods, using NFI data from 12 forest bureaus in the eastern Daxing'an Mountains, Northeast China, for a total of 1152 samples. A correlation index plot showed that there was no obvious strong relationship between the independent variables. The exponential model was proposed in this study since it was able to clearly describe the growth rate of the cold temperate forests and avoid negative growth rate estimation. All four parameter estimates were negative, indicating that RGR nv decreased with an increment in various variables. This trend was consistent with the actual survey results. The result showed that the trend of RGR nv differed among the three forest types, in which CBMF had higher RGR nv than both LF and WBF, indicating the superiority of mixed forest.
In this paper, four different methods, i.e., NLME, NQR3, NQR5, and NQR9, were used to deal with the longitudinal data. All models were validated using the leaveone-out method and random sampling technique. Both quantile regression and NLME methods showed better performance than the basic model. After the addition of random effects, the R 2 a improved: 0.6841 to 0.7706 (LF), 0.6990 to 0.7978 (WBF), and 0.8031 to 0.8546 (CBMF). NQR9 outperformed all methods evaluated in this study and provided the best MAE, MPE(%), and MAPE(%) values. Model weighting effectively removed the heteroscedasticity effect of the basic and mixed models. Performance of the methods improved as the sample size increased. Six (or more) sample plots was the most appropriate number and is recommended for model calibration when both prediction accuracy and sampling cost are considered.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to confidentiality.

Acknowledgments:
The authors would like to thank the faculty and students of the Department of Forest Management, Northeast Forestry University (NEFU), China who provided and collected the data for this study.

Conflicts of Interest:
The authors declare no conflict of interest. Table A1. Goodness-of-fit statistics of the basic RGR nv equations for larch forest (LF).