2.2. Weibull Distribution of Stand Diameter
In this study, Weibull distributions with two and three parameters were selected to model the diameter distribution of Chinese fir plantations. The corresponding distribution functions are given in Equations (1) and (2):

$$f(x) = 1 - \exp\left[-\left(\frac{x}{b}\right)^{c}\right] \tag{1}$$

$$f(x) = 1 - \exp\left[-\left(\frac{x - a}{b}\right)^{c}\right] \tag{2}$$

where x is the diameter value at the upper or lower limit of a diameter class; f(x) is the cumulative percentage of the number of trees up to that diameter; a is the location parameter, i.e., the lower limit of the smallest diameter class of the diameter distribution; b is the scale parameter, b > 0; and c is the shape parameter, c > 0.
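Based on the distribution functions, the number of trees in each diameter class can be derived, as in Equations (3) and (4):

$$n_i = N\left\{\exp\left[-\left(\frac{L_i}{b}\right)^{c}\right] - \exp\left[-\left(\frac{U_i}{b}\right)^{c}\right]\right\} \tag{3}$$

$$n_i = N\left\{\exp\left[-\left(\frac{L_i - a}{b}\right)^{c}\right] - \exp\left[-\left(\frac{U_i - a}{b}\right)^{c}\right]\right\} \tag{4}$$

where $n_i$ is the number of trees of diameter class i of the stand; N is the number of trees per unit area of the stand, trees·ha⁻¹; and $U_i$ and $L_i$ are the upper and lower limits of diameter class i of the stand.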
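As an illustration of Equations (3) and (4), the class frequencies can be obtained by differencing the cumulative distribution at the class limits. The sketch below is a minimal example under stated assumptions: the function names, the 2 cm class width and the parameter values (a = 4, b = 10, c = 3, N = 2000 trees·ha⁻¹) are hypothetical and are not taken from the study data.

```python
import math

def weibull_cdf(x, b, c, a=0.0):
    """Weibull cumulative distribution; a = 0 gives the two-parameter form."""
    if x <= a:
        return 0.0
    return 1.0 - math.exp(-(((x - a) / b) ** c))

def class_counts(edges, n_per_ha, b, c, a=0.0):
    """Expected trees per hectare in each class bounded by consecutive edges."""
    return [
        n_per_ha * (weibull_cdf(upper, b, c, a) - weibull_cdf(lower, b, c, a))
        for lower, upper in zip(edges[:-1], edges[1:])
    ]

# Hypothetical stand: 2 cm classes from 4 to 30 cm, 2000 trees per hectare.
edges = list(range(4, 32, 2))
counts = class_counts(edges, n_per_ha=2000, b=10.0, c=3.0, a=4.0)
print([round(v, 1) for v in counts])
```

Because the class counts are differences of one cumulative curve, they sum to N times the probability mass between the first and last class limits.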
The Weibull parameters can be estimated by a variety of mathematical methods. The nonlinear regression procedure in SAS software seeks the optimal solution of the model parameters through repeated iterations. This procedure was selected in this study, with the Marquardt method as the iterative algorithm.
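The fitting step can be sketched outside SAS as well. The example below is not the SAS procedure used in the study; it swaps in SciPy's `curve_fit` with `method="lm"` (a Levenberg-Marquardt implementation) and fits the three-parameter cumulative form to synthetic, noise-free data. The data, starting values and function name are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def weibull3_cdf(x, a, b, c):
    """Three-parameter Weibull cumulative distribution (0 below the location a)."""
    z = np.clip((x - a) / b, 0.0, None)
    return 1.0 - np.exp(-(z ** c))

# Synthetic, noise-free data generated from assumed parameters a=4, b=10, c=3.
upper_limits = np.arange(6.0, 32.0, 2.0)   # upper limits of 2 cm classes
cum_fraction = weibull3_cdf(upper_limits, 4.0, 10.0, 3.0)

# method="lm" selects SciPy's Levenberg-Marquardt implementation.
params, _ = curve_fit(weibull3_cdf, upper_limits, cum_fraction,
                      p0=(3.0, 8.0, 2.0), method="lm")
rss = float(np.sum((weibull3_cdf(upper_limits, *params) - cum_fraction) ** 2))
print(params, rss)
```

With real plot data the result can depend on the starting values, so it is worth trying several initial guesses.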
2.3. MaxEnt Distribution of Stand Diameter
The maximum entropy principle was applied to the stand diameter distribution. The maximum entropy model of stand diameter seeks the distribution whose entropy is maximal under given constraints:

$$\max S = -\sum_{i=1}^{n} p_i \ln p_i \tag{5}$$

$$\sum_{i=1}^{n} p_i = 1 \tag{6}$$

$$\sum_{i=1}^{n} p_i x_i^{k} = \mu_k, \quad k = 1, 2, \ldots, m \tag{7}$$

where i is the order number of the diameter class, i = 1, 2, …, n; n is the number of diameter classes; $p_i$ is the tree number probability in the ith diameter class of the stand; $x_i$ is the diameter class value of the ith diameter class of the stand; k is the order of the origin moment of the diameter classes, k = 1, 2, …, m; m is the number of origin moments and is also the highest order of the origin moment; $x_i^{k}$ is the value with which diameter class i enters origin moment k; and $\mu_k$ is the expected value of origin moment k.
The central task of the maximum entropy model is to determine the class probabilities $p_i$. Since the entropy function S is concave, the problem can be transformed into a convex programming problem with separable variables according to the duality theory of mathematical programming, and its global optimum can then be obtained in closed form from the stationarity condition of the Lagrange function. Under the constraints of Equations (6) and (7), the maximum of Equation (5) is found by the Lagrange multiplier method. First, the left sides of Equations (6) and (7) are multiplied by their Lagrange multipliers and the resulting expressions are added. The sum is then subtracted from Equation (5) to obtain the Lagrange function, Equation (8).
Setting the first-order partial derivative of the Lagrange function, Equation (8), with respect to $p_i$ equal to zero gives Equations (9) and (10). Combining Equation (10) with the constraints, Equations (6) and (7), then gives Equations (11) and (12). Equation (12) is a system of non-linear equations in $\lambda_k$, which is solved numerically. Substituting $\lambda_k$ into Equation (11) gives $\lambda_0$; substituting $\lambda_0$ and $\lambda_k$ into Equation (10) gives $p_i$; and substituting $p_i$ into Equation (5) gives S, which completes the solution for $p_i$ and S.
where $\lambda_0$ is the Lagrange multiplier and $\lambda_k$ are the k-order moment coefficients.
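For orientation, the derivation described above can be written out under one common sign convention. This is a sketch, not a transcription of Equations (8) to (12): the symbols $p_i$ (class probability), $x_i$ (class diameter), $\mu_k$ (k-th origin moment), the multipliers $(\lambda_0 - 1)$ and $\lambda_k$, and the resulting signs are assumptions, and the numbered equations of the paper may be arranged differently.

```latex
\begin{align*}
L &= -\sum_{i=1}^{n} p_i \ln p_i
     - (\lambda_0 - 1)\Big(\sum_{i=1}^{n} p_i - 1\Big)
     - \sum_{k=1}^{m} \lambda_k \Big(\sum_{i=1}^{n} p_i x_i^{k} - \mu_k\Big) \\
\frac{\partial L}{\partial p_i} &= -\ln p_i - \lambda_0 - \sum_{k=1}^{m} \lambda_k x_i^{k} = 0 \\
p_i &= \exp\Big(-\lambda_0 - \sum_{k=1}^{m} \lambda_k x_i^{k}\Big) \\
\lambda_0 &= \ln \sum_{i=1}^{n} \exp\Big(-\sum_{k=1}^{m} \lambda_k x_i^{k}\Big) \\
\mu_k &= \frac{\sum_{i=1}^{n} x_i^{k} \exp\big(-\sum_{j=1}^{m} \lambda_j x_i^{j}\big)}
              {\sum_{i=1}^{n} \exp\big(-\sum_{j=1}^{m} \lambda_j x_i^{j}\big)},
              \qquad k = 1, \ldots, m
\end{align*}
```

The fourth line follows from substituting the expression for $p_i$ into the normalization constraint, and the last line from substituting it into the moment constraints; the last line is the non-linear system that has to be solved numerically for $\lambda_k$.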
Choosing the value of m involves a trade-off. If m is small, there are few constraint equations and little computation, but the constraints may not describe the distribution comprehensively. If m is large, there are many constraint equations and the constraints are relatively comprehensive, but the computational load is heavy. Once m is large enough, further increases contribute less and less to the accuracy of the model. A suitable value of m therefore lets the model describe the actual distribution accurately while keeping the amount of calculation moderate. Because the minimum number of diameter classes in a sample plot in this paper was 6 (MIN (sample size) = 6), the maximum number of model parameters in fitting was 5. The range of m from 1 to 4 was therefore explored. The diameter distribution models corresponding to m = 1 to 4 and their parameters are given in Table 2.
The algorithm for solving the parameters is as follows:
Step 1: taking m = 3 as an example, the corresponding MaxEnt model is fitted using SAS PROC MODEL (SAS Institute 2010), which gives the estimated coefficients for each plot.
Step 2: taking these estimates as initial values, the system of non-linear Equations (13) is solved numerically, which gives the predicted parameter values PREλk.
Step 3: PREλk is substituted into Equation (11) to obtain the predicted value of λ0.
Step 4: PREλk and the predicted value of λ0 are substituted into Equation (10) to obtain the prediction formula.
Substituting the obtained parameters into Equation (10) gives the theoretical probability of each diameter class, that is, the theoretical probability distribution. The diameter distribution model is then confirmed by a Chi-squared test of the theoretical distribution.
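The numerical solution of the moment equations can be sketched as follows. This is not the SAS workflow described in the steps above: it swaps in a damped Newton iteration on the convex dual problem, started from zero rather than from PROC MODEL estimates, and it assumes the form $p_i = \exp(-\lambda_0 - \sum_k \lambda_k x_i^k)$ with diameters rescaled by their maximum. The function name, the plot data and the sign convention are assumptions made for illustration.

```python
import numpy as np

def fit_maxent(x, mu, max_iter=200, tol=1e-9):
    """Find lambda so that p_i = exp(-lambda_0 - sum_k lambda_k x_i^k) has moments mu."""
    m = len(mu)
    G = np.column_stack([x ** k for k in range(1, m + 1)])  # G[i, k-1] = x_i^k

    def log_z(lam):
        e = -G @ lam
        return e.max() + np.log(np.sum(np.exp(e - e.max())))

    def dual(lam):
        return log_z(lam) + lam @ mu       # convex dual of the entropy problem

    lam = np.zeros(m)
    for _ in range(max_iter):
        p = np.exp(-G @ lam - log_z(lam))
        mean_g = G.T @ p
        grad = mu - mean_g                 # zero when the moment equations hold
        if np.max(np.abs(grad)) < tol:
            break
        hess = (G * p[:, None]).T @ G - np.outer(mean_g, mean_g)
        step = np.linalg.solve(hess, grad)
        t, f0 = 1.0, dual(lam)
        while dual(lam - t * step) > f0 + 1e-12 and t > 1e-8:
            t *= 0.5                       # halve the Newton step until the dual stops increasing
        lam = lam - t * step
    lam0 = log_z(lam)
    return lam0, lam, np.exp(-G @ lam - lam0)

# Hypothetical plot: class midpoints (cm) and observed trees per class.
d = np.arange(6.0, 28.0, 2.0)
counts = np.array([5, 20, 60, 120, 160, 150, 110, 60, 25, 8, 2], dtype=float)
x = d / d.max()                            # rescaled to keep the powers well conditioned
p_obs = counts / counts.sum()
mu = np.array([np.sum(p_obs * x ** k) for k in (1, 2, 3)])   # m = 3
lam0, lam, p = fit_maxent(x, mu)
print(lam0, lam, p.round(4))
```

The returned `p` is the theoretical class probability vector; its first three origin moments match those of the observed distribution, which is the condition the moment equations express.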
2.4. Dynamic Prediction of Stand Diameter Distribution
For the dynamic prediction of stand diameter distribution in this paper, the Weibull diameter distributions were predicted with the PPM and the PRM, and the MaxEnt diameter distributions of unknown stands were predicted with the PPM and the PSIM.
When the PPM was applied, two cases were considered: (1) a regression relationship was established between the parameters and 8 factors, comprising the stand characteristic factors (t, D, Dg, H and N) and the stand DBH characteristic factors (DBH_KURT, DBH_SKEW and DBH_CV); (2) a regression relationship was established between the parameters and the 5 stand characteristic factors only (t, D, Dg, H and N). In both cases, the parameter prediction model was established by stepwise regression analysis.
When the PRM was applied, two key steps were implemented. First, the recovery equations of the model parameters were established from key points on the distribution curve. The three parameter recovery equations are shown below:
Second, the equations relating the stand factors to the diameters at the key points on the distribution curve (the recovery model) were established. Relevant studies [44,45] have shown a power function relationship between the diameter at the key points and Dg, so Equation (15) was used to estimate the diameter at the key points in this paper. The diameters at the key points were thus calculated using Equation set (14), and their regression relationship with the stand factor was established to form a complete prediction system. If the stand factor is known, the diameter distribution of the unknown stand can be predicted:

$$D_i = m \cdot D_g^{\,n} \tag{15}$$

where $D_i$ is the diameter corresponding to key point i; the three key points are 0.333, 0.9 and the ordinate of the inflection point of the equation; $D_{0.333}$ and $D_{0.9}$ are the diameters corresponding to the key points 0.333 and 0.9; and m and n are the parameters to be estimated.
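The power function relationship between a key-point diameter and Dg can be fitted in a few lines. The sketch below assumes the form D = m·Dg^n and estimates m and n by ordinary least squares on the log-log scale, which is a simplification and not necessarily the regression procedure used in the study; the function name and the data are hypothetical.

```python
import math

def fit_power(dg, d_key):
    """Fit d_key = m * dg**n by least squares on the log-log scale."""
    lx = [math.log(v) for v in dg]
    ly = [math.log(v) for v in d_key]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    n = sxy / sxx                       # exponent = slope on the log-log scale
    m = math.exp(mean_y - n * mean_x)   # coefficient = exp(intercept)
    return m, n

# Hypothetical plots: quadratic mean diameter and a key-point diameter (cm).
dg = [8.0, 10.0, 12.0, 15.0, 18.0]
d_key = [0.7 * v ** 1.1 for v in dg]
m_hat, n_hat = fit_power(dg, d_key)
print(m_hat, n_hat)
```

Fitting on the log scale weights relative rather than absolute errors, so the estimates can differ slightly from a direct nonlinear fit when the data are noisy.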
Because the maximum entropy model is a machine learning algorithm, this paper proposes a dynamic prediction method for stand diameter distribution based on a plot similarity index. The idea is to find, in the fitting data set, the sample plot most similar to the unknown stand and to use the parameters of that plot to predict the diameter distribution of the unknown stand. Skewness and kurtosis characterize the shape of a stand diameter distribution, and the coefficient of variation indicates its range, so Equation (16) was used to calculate the similarity index of two plots from these three variables. For each plot in the test data set, the corresponding values were entered into Equation (16) to calculate the similarity index, and the fitting plot with the minimum value was taken as the similar plot. The maximum entropy model parameters of the similar plots were then used to predict the diameter distribution of stands in the test data:
where PSI represents the similarity index of the diameter class distributions of the two plots, calculated from the kurtosis, skewness and coefficient of variation of a plot in the validation data set and the kurtosis, skewness and coefficient of variation of a plot in the fit data set.
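The matching step can be sketched as a nearest-plot search. Equation (16) is not reproduced here, so the distance used below, a Euclidean distance over the (kurtosis, skewness, coefficient of variation) triple, is an assumed stand-in and may differ from the PSI formula of the study. The plot identifiers and values are hypothetical.

```python
import math

def psi(plot_a, plot_b):
    """Assumed similarity index: Euclidean distance between
    (kurtosis, skewness, CV) triples. A stand-in for Equation (16)."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(plot_a, plot_b)))

def most_similar_plot(target, fit_plots):
    """Return the id of the fit-set plot with the smallest index to the target."""
    return min(fit_plots, key=lambda plot_id: psi(target, fit_plots[plot_id]))

# Hypothetical fit data set: plot id -> (kurtosis, skewness, CV).
fit_plots = {
    "P01": (-0.40, 0.10, 0.22),
    "P02": (0.85, 0.60, 0.31),
    "P03": (-0.10, -0.25, 0.18),
}
target = (0.70, 0.55, 0.29)   # a plot from the validation data set
best = most_similar_plot(target, fit_plots)
print(best)
```

The MaxEnt parameters stored for the returned plot would then be reused for the target stand.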
2.5. Comparison of the Models
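The performance of the two-parameter and three-parameter Weibull functions was examined by comparing the residual sum of squares (RSS) and the coefficient of determination ($R^2$). The results of the maximum entropy model fitting were assessed by RSS and mean square error (MSE):

$$\mathrm{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

where $y_i$ is the observed value for the ith observation, $\hat{y}_i$ is the predicted value for the ith observation, $\bar{y}$ is the mean of the $y_i$, and n is the number of observations in the dataset.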
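These fit statistics are simple to compute directly. The sketch below uses the usual textbook definitions; taking MSE as RSS divided by n (rather than by the residual degrees of freedom) is an assumption, and the function name and numbers are hypothetical.

```python
def fit_stats(observed, predicted):
    """Return RSS, R2 and MSE for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    return {"RSS": rss, "R2": 1.0 - rss / tss, "MSE": rss / n}

# Hypothetical observed and predicted values.
stats = fit_stats([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(stats)
```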
We used the Chi-squared ($\chi^2$) test [3,6,7,9,10,11,21,22,23] and Fisher’s test as goodness-of-fit measures to assess whether an estimated diameter distribution corresponds to the real distribution. P-values of less than 0.05 were considered statistically significant. According to the total sample size of each plot (n) and the theoretical number of each diameter class (T), three situations can be distinguished [46]:
Here, the test statistic compares the observed diameter class value with the estimated diameter class value in the ith diameter class of the stand. At the significance level α = 0.05, if the computed $\chi^2$ is smaller than the critical value $\chi^2_{0.05}$, the distribution of the current stand data is considered consistent with the corresponding distribution function.
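The basic Pearson form of the test can be sketched as follows. It covers only the plain Chi-squared case, not the corrections selected by n and T in the three situations above, and the degrees-of-freedom rule (number of classes minus 1 minus the number of estimated parameters) is the usual convention rather than something stated in the text. The function name and the class counts are hypothetical.

```python
from scipy.stats import chi2

def chi_square_gof(observed, expected, n_params, alpha=0.05):
    """Pearson Chi-squared statistic, critical value and acceptance flag."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - n_params      # assumed degrees-of-freedom convention
    critical = float(chi2.ppf(1.0 - alpha, df))
    return stat, critical, bool(stat < critical)

# Hypothetical observed and theoretical trees per diameter class.
observed = [12, 30, 45, 28, 10]
expected = [10.0, 32.0, 44.0, 29.0, 10.0]
stat, critical, consistent = chi_square_gof(observed, expected, n_params=2)
print(round(stat, 4), round(critical, 3), consistent)
```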
Using Wilcoxon’s nonparametric rank-sum (or Mann–Whitney–Wilcoxon, MWW) test, we can decide whether two population distributions are identical without assuming that they follow the normal distribution [47]. It is also an effective goodness-of-fit measure for diameter distribution estimates. A two-sided probability value of less than 0.05 was considered statistically significant. The statistical analysis was performed in SAS PROC NPAR1WAY (SAS Institute 2010).
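An equivalent two-sided rank-sum test is available outside SAS. The sketch below swaps in SciPy's `mannwhitneyu` for PROC NPAR1WAY, so small numerical differences (for example in tie or continuity handling) are possible; the observed and predicted class counts are hypothetical.

```python
from scipy.stats import mannwhitneyu

# Hypothetical observed and predicted trees per diameter class for one plot.
observed = [12, 30, 45, 28, 10, 4]
predicted = [10, 32, 44, 29, 11, 3]

# Two-sided test of whether the two samples come from the same distribution.
u_stat, p_value = mannwhitneyu(observed, predicted, alternative="two-sided")
no_significant_difference = bool(p_value >= 0.05)
print(u_stat, p_value, no_significant_difference)
```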