Article

A Bayesian Approach to Estimating Seemingly Unrelated Regression for Tree Biomass Model Systems

1 Key Laboratory of Sustainable Forest Ecosystem Management-Ministry of Education, School of Forestry, Northeast Forestry University, Harbin 150040, China
2 Department of Sustainable Resources Management, College of Environmental Science and Forestry (SUNY-ESF), State University of New York, One Forestry Drive, Syracuse, NY 13210, USA
* Author to whom correspondence should be addressed.
Forests 2020, 11(12), 1302; https://doi.org/10.3390/f11121302
Submission received: 13 October 2020 / Revised: 1 December 2020 / Accepted: 2 December 2020 / Published: 4 December 2020
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)

Abstract

Accurate estimation of tree biomass is required for accounting for and monitoring forest carbon stocking. Allometric biomass equations constructed by classical statistical methods are widely used to predict tree biomass in forest ecosystems. In this study, a Bayesian approach was proposed and applied to develop two additive biomass model systems: one with tree diameter at breast height as the only predictor and the other with both tree diameter and total height as the predictors for planted Korean larch (Larix olgensis Henry) in the Northeast, P.R. China. The seemingly unrelated regression (SUR) was used to fit the simultaneous equations of four tree components (i.e., stem, branch, foliage, and root). The model parameters were estimated by feasible generalized least squares (FGLS) and Bayesian methods using either non-informative priors or informative priors. The results showed that adding tree height to the model systems improved the model fitting and performance for the stem, branch, and foliage biomass models, but much less for the root biomass models. The Bayesian methods on the SUR models produced narrower 95% prediction intervals than did the classical FGLS method, indicating higher computing efficiency and more stable model predictions, especially for small sample sizes. Furthermore, the Bayesian methods with informative priors performed better (smaller values of deviance information criterion (DIC)) than those with the non-informative priors. Therefore, our results demonstrated the advantages of applying the Bayesian methods on the SUR biomass models, not only obtaining better model fitting and predictions, but also offering the assessment and evaluation of the uncertainties for constructing and updating tree biomass models.

1. Introduction

In the practice of sustainable forest management, tree biomass is primarily used to calculate the carbon storage and sequestration of forests and further to comprehend climate change and forest health, productivity, and nutrient cycling [1,2]. It is well known that the most appropriate method for estimating tree biomass is to use biomass models developed from direct measurements of tree biomass (response variable) and predictor variables [3,4]. Over the past decades, hundreds of biomass equations have been developed for different tree species around the world [5,6,7]. Tree diameter at breast height (DBH) is the most commonly used predictor variable and is considered the most readily attainable and reliable sole predictor [8]. Tree height (H) is also used in biomass models to improve model performance and account for intra-species divergence [9,10]. Biomass models with both DBH and H are recognized as more stable and reliable for predicting tree biomass [11,12]. Furthermore, various modeling approaches have been explored and applied for developing tree biomass models [13,14].
To date, the biomass of different tree components (i.e., stem, branch, foliage, and root) is usually estimated jointly in order to account for the inherent correlations among the components, because they are measured on the same sample trees [11]. Additive model systems of either linear or nonlinear functions are used to deal with the correlations between the biomass components from the same tree and are commonly known as seemingly unrelated regression (SUR) models [15]. Over the past decades, different parameter estimation methods have been proposed for the additive model systems from the perspective of traditional statistics, such as two-step feasible generalized least squares (FGLS), iterative FGLS (IFGLS), maximum likelihood (ML), etc. [16,17]. FGLS has been widely used for parameter estimation of the SUR models [18] and has been considered more favorable in recent years because of its simplicity and flexibility in practical applications [19]. However, it is recognized that these classical methods are not able to assess and evaluate the uncertainty of parameter estimates and the reliability of model predictions.
Zellner [20] proposed the use of Bayesian analysis in modeling practice to take advantage of its capability of estimating model parameters using probability statements and, consequently, fulfilling the purposes of uncertainty assessment [21]. In the past, however, the applications of Bayesian approaches were limited due to the lack of numerical optimization theory and computing power. In addition, Bayesian methods can be readily applied to linear models, but many allometric relationships are nonlinear in nature. Following the widespread application of the Markov Chain Monte Carlo (MCMC) method and the rapid development of computing technology, Griffiths [22] implemented Bayesian methods for SUR models in the field of econometrics.
For many years, Bayes' theorem was not widely accepted or well recognized among researchers, even though it was proposed by Thomas Bayes in the mid-18th century [23]. Scientists began to accept and deepen their understanding of the Bayesian modeling philosophy because it has unique benefits in solving some practical problems, supported by the remarkable growth in computing power. The fundamental advantage of applying Bayesian methods lies in their ability to produce the full posterior distributions of model parameters; hence, several statistics (e.g., mean, median, quantiles, etc.) can be easily computed from the samples generated during the model fitting process [24]. In addition, Bayesian methods consider the model parameters as random variables and assume that each parameter has a prior distribution, which is then naturally integrated with the sample information to obtain the posterior probability distribution of the parameter [25]. To date, Bayesian analysis is considered an essential and important element in modeling practices across many scientific fields, such as biology [26], engineering [27], finance [25], genetics [28], etc. In general, Bayesian analysis can be divided into two major categories, namely objective and subjective Bayesian methods, according to the type of prior distribution. The objective Bayesian method often uses a non-informative prior (e.g., the Jeffreys prior), while the subjective Bayesian method utilizes expert judgment or other available information as an informative prior for estimating the model parameters [29]. However, questions still arise among researchers regarding the appropriate determination and selection of the prior distribution in Bayesian methods. In fact, this is one of the most debated issues among researchers, because it influences the computation and results of the posterior distribution [30]. In recent years, the MCMC method has opened broad prospects for the promotion and application of Bayesian methods, and it has become a reliable tool for handling complex problems in statistics and modeling practice [31].
Bayesian methods have been widely applied in forestry research in recent years, including studies of tree diameter and height growth, mortality, tree biomass, etc. [1,13,21,24]. Zapata-Cuartas [13] established the aboveground biomass model for the Colombian tropical forest using Bayesian approaches incorporating 132 biomass equations from 72 published articles as the prior information. Zhang et al. [32] utilized Bayesian approaches to estimate stem, branch, foliage, root, and total tree biomass models with both non-informative and informative priors. They showed that the Bayesian methods with informative priors performed better than those with non-informative priors and classical methods. In this sense, Bayesian statistics offer an advantage in model performance when fitting individual models. However, the biomass of different tree components is usually estimated jointly, whereas the published studies cited above were based only on single biomass models, ignoring the inherent correlations among the biomass of tree components. Although the additive model systems with linear or nonlinear functions have been widely used and have become a standard for developing individual tree biomass models [2,8], to the best of our knowledge, there has been no publication in the literature on the application of Bayesian methods with SUR to jointly estimate the additive biomass model systems of tree components.
Korean larch is one of the most important fast-growing conifer species for timber production in northeast P.R. China. The larch plantations encompass a wide geographical range, from the northeastern to the northern subalpine areas of China [33]. It is desirable to maximize and maintain the ecological and economic benefits of the Korean larch plantations. Therefore, reliable biomass models with the capability of uncertainty assessment are necessary for estimating, accounting, and monitoring tree biomass and forest carbon stocking. In this study, we used data consisting of 174 destructively sampled Korean larch (Larix olgensis Henry) trees to develop the additive biomass model systems of tree components (i.e., stem, branch, foliage, and root) for planted Korean larch. Two additive model systems were constructed to facilitate the application of different data types; the first one used tree DBH as the only predictor, and the second one used both tree DBH and H as the predictors. For each model system, we compared two parameter estimation methods: classical FGLS (as the benchmark) and Bayesian methods. A comprehensive sample of allometric biomass equations for Larix spp. in different locations gathered from the published literature provided sufficient prior information on the model parameters.
The objective of this study was to explore the advantages of Bayesian approaches on the SUR models of individual tree biomass. We expected that the SUR model systems using Bayesian methods would produce superior model fitting and predictions, and that the Bayesian methods with informative priors would perform better than those with non-informative priors. We investigated the impacts of five prior specifications of the model parameters on the model fitting and validation results: (1) direct Monte Carlo simulation with the Jeffreys invariant prior (DMC); (2) the Gibbs sampler using the Jeffreys invariant prior (Gs-J); and (3) the Gibbs sampler using three priors of multivariate normal distribution, i.e., an artificial setting (Gs-MN), a self-sampling estimate (Gs-MN1), and other research results on biomass models (Gs-MN2), respectively.

2. Materials and Methods

2.1. Study Sites and Data Collection

The data used in this study were collected from 9 sites of Korean larch plantations of different ages in Heilongjiang Province, northeast China (Figure 1). A total of 38 plots (20 × 30 m or 30 × 30 m in size) were established in July and August of 2007–2016. Sample trees of different classes (dominant, codominant, medium, and suppressed) were felled from each plot (5–7 trees/plot). In the fieldwork, tree variables such as DBH and H were measured and recorded first. Then all live branches of each tree were cut and weighed; the subsamples of branch and foliage were taken from an average-sized branch in each pseudo-whorl. After that, the stems were each cut into 1 m sections and weighed. A disc of about 3 cm in thickness was cut at the end of each 1 m section, weighed, and taken to the laboratory for further sampling. The green weight of roots included large (diameter more than 5 cm), medium (diameter between 2 cm and 5 cm), and small (diameter less than 2 cm) roots; fine roots (diameter less than 5 mm) were not considered. The subsamples of roots were taken from the large, medium, and small roots, respectively. In total, 174 sample trees were selected for both above- and below-ground biomass. All subsamples of each component were oven-dried at 80 °C in a blast drying oven until a constant weight was reached, and then weighed in the lab. The dry biomass of each tree component was calculated as the product of the green weight and the dry/fresh ratio of the corresponding tree component. The tree DBH, H, and dry weight of each tree component are summarized in Table 1.
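For example, under this calculation, if the total fresh weight of a tree's stem sections were 120.0 kg and its stem subsamples weighed 1.50 kg fresh and 0.78 kg oven-dry, the stem dry biomass would be 120.0 × (0.78/1.50) = 62.4 kg; these numbers are purely illustrative and are not taken from the dataset.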

2.2. Biomass Model

The power function of allometric equations has been commonly used for biomass modeling [10], in which tree biomass is the dependent variable and tree attributes (i.e., DBH and/or H) are the independent or predictor variables [34]. In this study, we applied a likelihood analysis to determine which of the two error structures (additive and multiplicative) was more appropriate to the biomass data of Korean larch [2,35]. It was found that the multiplicative error structure was preferred so that the logarithmic transformation was applied to linearize the power function to construct the linear seemingly unrelated regression (SUR) models. Two additive model systems were constructed in this study as follows:
SURM1:
$\begin{cases} \ln W_s = \beta_{s0} + \beta_{s1}\ln DBH + \varepsilon_s \\ \ln W_b = \beta_{b0} + \beta_{b1}\ln DBH + \varepsilon_b \\ \ln W_f = \beta_{f0} + \beta_{f1}\ln DBH + \varepsilon_f \\ \ln W_r = \beta_{r0} + \beta_{r1}\ln DBH + \varepsilon_r \end{cases}$ (1)
SURM2:
$\begin{cases} \ln W_s = \beta_{s0} + \beta_{s1}\ln DBH + \beta_{s2}\ln H + \varepsilon_s \\ \ln W_b = \beta_{b0} + \beta_{b1}\ln DBH + \beta_{b2}\ln H + \varepsilon_b \\ \ln W_f = \beta_{f0} + \beta_{f1}\ln DBH + \beta_{f2}\ln H + \varepsilon_f \\ \ln W_r = \beta_{r0} + \beta_{r1}\ln DBH + \beta_{r2}\ln H + \varepsilon_r \end{cases}$ (2)
where $W_s$, $W_b$, $W_f$, and $W_r$ are the biomass of stem, branch, foliage, and root, respectively; DBH is the tree diameter at breast height; H is the total tree height; ln is the natural logarithm; $\beta_{kj}$ are the model coefficients to be estimated from the data, where the subscript $k = s, b, f, r$ corresponds to the four tree components and $j = 0, 1, 2$; and $\varepsilon$ is the model error term.
A general expression of linear SUR models (in matrix notation) is
$Y = X\beta + \varepsilon$ (3)
where $Y$ is a vector of the response variables; $\varepsilon$ is a vector of the error terms; $X$ is a matrix of the predictors, including $\ln DBH$ for SURM1 and $\ln DBH$ and $\ln H$ for SURM2; $\beta$ is a vector of the model coefficients; and $\varepsilon \sim N(0, \Omega \otimes I)$, where $I$ is an identity matrix, $\Omega$ is the variance–covariance matrix of the error terms, and $\otimes$ is the tensor (Kronecker) product, such that:
$Y = \begin{pmatrix} \ln W_s \\ \ln W_b \\ \ln W_f \\ \ln W_r \end{pmatrix}, \quad X = \begin{pmatrix} X_s & 0 & 0 & 0 \\ 0 & X_b & 0 & 0 \\ 0 & 0 & X_f & 0 \\ 0 & 0 & 0 & X_r \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_s \\ \beta_b \\ \beta_f \\ \beta_r \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_s \\ \varepsilon_b \\ \varepsilon_f \\ \varepsilon_r \end{pmatrix}, \quad \Omega = \begin{pmatrix} \sigma_s^2 & \sigma_s\sigma_b & \sigma_s\sigma_f & \sigma_s\sigma_r \\ \sigma_s\sigma_b & \sigma_b^2 & \sigma_b\sigma_f & \sigma_b\sigma_r \\ \sigma_s\sigma_f & \sigma_b\sigma_f & \sigma_f^2 & \sigma_f\sigma_r \\ \sigma_s\sigma_r & \sigma_b\sigma_r & \sigma_f\sigma_r & \sigma_r^2 \end{pmatrix}$ (4)

2.3. Classical Approach of SUR

Generalized least squares (GLS) is an efficient technique for parameter estimation when the error terms between regression models are correlated [2,36]. However, if the covariance Ω of the model errors is unknown, one can get a consistent estimate of Ω using an implementable version of GLS known as the feasible generalized least squares (FGLS). The package systemfit in R software 3.5.3 [37,38] was used to obtain the classical estimations of model coefficients and the covariance matrix in this study.
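As an illustration of this step, the sketch below fits SURM1 by FGLS with systemfit; the data frame dat and its column names (lnWs, lnWb, lnWf, lnWr, lnDBH) are hypothetical, and only the general call pattern is intended to be taken from it.

library(systemfit)

# log-transformed biomass equations of the four components (SURM1, DBH only)
eqs <- list(
  stem    = lnWs ~ lnDBH,
  branch  = lnWb ~ lnDBH,
  foliage = lnWf ~ lnDBH,
  root    = lnWr ~ lnDBH
)
fit.sur <- systemfit(eqs, method = "SUR", data = dat)  # two-step FGLS estimation
summary(fit.sur)     # coefficients and standard errors for each equation
fit.sur$residCov     # estimated error covariance matrix of the four equations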

2.4. Bayesian Approaches of SUR

In Bayesian inference, both model coefficients β and covariance Ω are considered random variables with probability distributions that can be calculated using the data D such that
$\pi(\beta, \Omega \mid D) = \dfrac{\pi(\beta, \Omega)\, L(D \mid \beta, \Omega)}{\int \pi(\beta, \Omega)\, f(D \mid \beta, \Omega)\, d(\beta, \Omega)} \propto \pi(\beta, \Omega)\, L(D \mid \beta, \Omega)$ (5)
where π ( β ,   Ω | D ) and π ( β ,   Ω ) are the posterior- and prior-probabilities, respectively, L ( D |   β ,   Ω ) is the likelihood, and the denominator of Equation (5) is the marginal probability of the data D [13]. Based on Equation (5), algorithms of Bayesian approaches were proposed on the SUR models [39]. To implement the Bayesian estimation, we fit the SUR models using both the Gibbs sampler and direct Monte Carlo (DMC) sampling along with different priors [39] (the detailed calculation processes of the Gibbs sampler and DMC are attached in Appendix A).
We proposed five estimations of the SUR models as follows:
(1) Direct Monte Carlo simulation using the Jeffreys invariant prior (namely DMC).
(2) Gibbs sampler using the Jeffreys invariant prior (namely Gs-J).
(3) Gibbs sampler using a multivariate normal distribution with mean vector 0 and variance 1000 (covariances of 0) as the prior of the model parameters, and a default inverted Wishart distribution [25] as the prior of the error variance–covariance matrix (namely Gs-MN); this was considered a non-informative prior.
(4) Gibbs sampler using a multivariate normal prior consisting of the parameters estimated by the FGLS method with subsamples (namely Gs-MN1). The subsamples were 10%, 20%, 30%, …, and 90% of the 174 independent sample trees, drawn without replacement, and each simulation was repeated 10,000 times.
(5) Gibbs sampler using a multivariate normal prior consisting of the parameters of logarithmic biomass functions of the different components previously published for Larix spp. (namely Gs-MN2). The parameters of biomass models with the same forms for Larix spp. were collected from a normalized tree biomass equation dataset in Luo et al. [10]. A default inverted Wishart distribution proposed by Rossi and Allenby [25] was used as the prior of the error variance–covariance matrix. Even though the biomass functions were developed for different areas of China, their parameters were well represented by a multivariate normal distribution with a mean vector and variance–covariance matrix, as confirmed by a multivariate normality test (p > 0.05).
In this study, the FGLS estimates of the model parameters were used as the starting values for the Gs-J method to reach faster convergence and save computational time. For the Gs-MN, Gs-MN1, and Gs-MN2 methods, the starting values were automatically generated by the R package bayesm once the prior distributions were given [40]. For the Gibbs sampler settings, the size of the posterior sample was set to 33,000, including a burn-in of 3000 and a thinning length of 3, to ensure stable performance and relatively small autocorrelation of the constructed Markov chains. A thinning length of 3 means that we kept 1 of every 3 samples from the remaining 30,000 draws and thus obtained 10,000 posterior samples; thinning in this way helps reduce the autocorrelation between samples. The convergence of the Gibbs posterior samples was assessed using Geweke's and Heidelberger and Welch's convergence diagnostics [41,42,43]. For the DMC approach, the process was repeated 10,000 times, corresponding to the Gibbs posterior samples. We used the R package bayesm to implement the Gibbs sampler with a multivariate normal prior [40] and the package coda [43] to conduct the convergence diagnostics. However, there were no published packages that could implement the Bayesian analysis with the Jeffreys prior, and thus we used custom R code to implement the Bayesian estimation for DMC and Gs-J.
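The sketch below illustrates this setup for SURM1 with the bayesm function rsurGibbs and the coda diagnostics. The data frame dat, its column names, and the prior objects (betabar, A.prior, V0) are placeholders for the values reported in Section 3.1, and the argument names follow the bayesm and coda documentation as we used them, so the exact calls should be checked against the installed package versions.

library(bayesm)
library(coda)

# one list entry (y, X) per biomass component equation of SURM1
regdata <- list(
  list(y = dat$lnWs, X = cbind(1, dat$lnDBH)),
  list(y = dat$lnWb, X = cbind(1, dat$lnDBH)),
  list(y = dat$lnWf, X = cbind(1, dat$lnDBH)),
  list(y = dat$lnWr, X = cbind(1, dat$lnDBH))
)
Prior <- list(betabar = betabar, A = A.prior, nu = 4, V = V0)  # multivariate normal / inverted Wishart prior
Mcmc  <- list(R = 33000, keep = 3)                             # 33,000 draws, thinning length of 3

out <- rsurGibbs(Data = list(regdata = regdata), Prior = Prior, Mcmc = Mcmc)

beta.post <- mcmc(out$betadraw[-(1:1000), ])  # discard the burn-in (3000 raw draws = 1000 kept draws)
geweke.diag(beta.post)                        # Geweke convergence diagnostic
heidel.diag(beta.post)                        # Heidelberger and Welch convergence diagnostic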

2.5. Model Evaluation and Validation

In this study, the deviance information criterion (DIC) was used to evaluate and compare the model fitting of the Bayesian estimations with different priors [28]. In addition, the coefficient of determination ( R 2 ) and root mean square error (RMSE) were calculated for the description of model performance [44]. The models with higher R 2 and lower DIC and RMSE were preferred.
Model validation is necessary to assess the applicability of the models, because better model fitting does not necessarily indicate better model prediction quality. In this study, we used leave-one-out cross-validation to test the model prediction performance [45]. The mean absolute error (MAE) and mean absolute percent error (MAE%) were calculated to validate the model prediction performance [2]. The mathematical formulas of DIC, $R^2$, RMSE, MAE, and MAE% can be expressed as follows:
$DIC = \bar{D} + p_D$ (6)
$R^2 = 1 - \dfrac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}$ (7)
$RMSE = \sqrt{\left(\sum_{i=1}^{n}(Y_i - \hat{Y}_i)/n\right)^{2} + \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^{2}/(n-1)}$ (8)
$MAE = \dfrac{\sum_{i=1}^{n}\left|Y_i - \hat{Y}_{i,-i}\right|}{n}$ (9)
$MAE\% = \dfrac{\sum_{i=1}^{n}\left|Y_i - \hat{Y}_{i,-i}\right| / Y_i}{n} \times 100$ (10)
where $\bar{D}$ is the posterior mean of the deviance (−2 × log likelihood of the given data and parameters); $p_D$ is the model complexity, summarized by the effective number of parameters; $Y_i$ and $\hat{Y}_i$ are the observations and predictions, respectively; $\hat{Y}_{i,-i}$ are the predictions obtained by leave-one-out cross-validation; $\bar{Y}$ is the mean of the observations; and $n$ is the sample size.
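As a small illustration, the helper functions below compute R², MAE, and MAE% as defined in Equations (7), (9), and (10); obs, pred, and pred.loo stand for hypothetical vectors of observations, fitted values, and leave-one-out predictions on the log scale.

# R-squared from observations and fitted values (Equation (7))
r_squared <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

# MAE and MAE% from observations and leave-one-out predictions (Equations (9) and (10))
mae     <- function(obs, pred.loo) mean(abs(obs - pred.loo))
mae_pct <- function(obs, pred.loo) mean(abs(obs - pred.loo) / obs) * 100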

2.6. Anti-Logarithm Correction Factors

When the logarithmic transformation is applied to the power function, the correction factors (CFs) are commonly used to correct the bias produced by the anti-log transformation. Several CFs from previous studies were compared in this study [2,46,47]:
$CF_1 = \exp(S^2/2)$ (11)
$CF_2 = 1 + S^2/2$ (12)
$CF_3 = \exp\!\left(\dfrac{S^2}{2}\left(1 - \dfrac{S^2(S^2+2)}{4n} + \dfrac{S^4(3S^4 + 44S^2 + 84)}{96n^2}\right)\right)$ (13)
where $S^2$ is the mean square error, calculated as $S^2 = \sum (Y_i - \hat{Y}_i)^2/(n - p)$; $n$ is the sample size; $p$ is the number of model parameters; and $Y_i$ and $\hat{Y}_i$ are the observations and predictions, respectively, taken from Equations (1) and (2). To compare the effectiveness of the three CFs, three measures were used [2,48,49] as follows:
$B = \dfrac{CF - 1}{CF} \times 100$ (14)
$G = (CF - 1) \times 100$ (15)
$MPD = \dfrac{\sum (Y_i - \hat{Y}_i)/Y_i \times 100}{n}$ (16)
where $B$ is the percent bias, $G$ is the percent standard error, and $MPD$ is the mean percentage difference; $Y_i$ and $\hat{Y}_i$ represent the anti-log observations and the anti-log predictions corrected by the CFs, respectively; and $n$ is the sample size.
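For reference, a minimal sketch of the three correction factors computed from S² and n is given below; CF3 follows Equation (13) as reconstructed above, so the expression should be verified against the cited sources before use.

# correction factors for back-transforming log-scale predictions (Equations (11)-(13))
correction_factors <- function(S2, n) {
  CF1 <- exp(S2 / 2)
  CF2 <- 1 + S2 / 2
  CF3 <- exp(S2 / 2 * (1 - S2 * (S2 + 2) / (4 * n) +
                         S2^2 * (3 * S2^2 + 44 * S2 + 84) / (96 * n^2)))
  c(CF1 = CF1, CF2 = CF2, CF3 = CF3)
}
correction_factors(S2 = 0.05, n = 174)  # e.g., with an illustrative mean square error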

2.7. Stability Analysis for Classical and Bayesian Approaches

Simulations were performed using the subsamples of all biomass data to compare the classical SUR models (FGLS) against the Bayesian SUR models. Specifically, we compared the characteristics of the model parameters and prediction performance using different sample sizes. In the simulation process, we sampled 10, 15, 20, 30, 40, 60, 80, 110, 140, and 170 individuals without replacement from the entire dataset (174 trees), and then the procedure was repeated 5000 times. Within each subsample, we estimated the model parameters using both FGLS and Bayesian estimations. The distributions of estimated parameters were analyzed, and corresponding predictions were summarized and used to calculate the indices of the efficiency of estimation (MAE and MAE%). All simulations were performed using R software 3.5.3 [38].
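The sketch below outlines one way to run such a subsampling simulation for the FGLS variant of SURM1; it is an illustration rather than the exact code used in the study, and dat, the formula list eqs from the earlier sketch, and the column lnWs are assumed names.

library(systemfit)

sizes   <- c(10, 15, 20, 30, 40, 60, 80, 110, 140, 170)
n.rep   <- 100   # 5000 repetitions were used in the study; reduced here for illustration
results <- expand.grid(size = sizes, rep = seq_len(n.rep))
results$MAE <- NA_real_

for (i in seq_len(nrow(results))) {
  sub <- dat[sample(nrow(dat), results$size[i]), ]                 # subsample without replacement
  fit <- try(systemfit(eqs, method = "SUR", data = sub), silent = TRUE)
  if (inherits(fit, "try-error")) next                             # skip subsamples that fail to fit
  pred <- predict(fit, newdata = dat)                              # reapply the fitted model to all 174 trees
  results$MAE[i] <- mean(abs(dat$lnWs - pred[[1]]))                # e.g., MAE of the stem equation
}

aggregate(MAE ~ size, data = results, FUN = mean)                  # average MAE by sample size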

3. Results

3.1. Fitting the SUR Models

For the Bayesian method Gs-MN1 on both SURM1 and SURM2, the prior information consisted of the model parameter values estimated by the FGLS method with subsample sizes of 10%, 20%, 30%, …, and 90% of the 174 trees, sampled without replacement. For Gs-MN2, the prior information in the normalized tree biomass equation dataset [10] was used to estimate the SURM1 and SURM2 model parameters. The results of statistical tests showed that these derived parameter values indeed followed a multivariate normal distribution. The specific forms and values were as follows:
Gs-MN1 of SURM1:
β ¯ = ( 4.0167 ,   2.7273 ,   3.3781 ,   2.7429 ,   3.5230 ,   2.0626 ,   2.9308 ,   1.4780 ) T ,
A ¯ = ( 1760 5024 801 2212 43 198 7 126 5024 14 , 633 2220 6349 163 652 77 545 801 2220 2210 6090 673 2017 52 267 2212 6349 6090 17 , 393 1985 5974 258 1030 43 163 673 1985 1357 3866 726 2023 198 652 2017 5974 3866 11 , 273 2011 5821 7 77 52 258 726 2011 1473 4082 126 545 267 1030 2023 5821 4082 11 , 717 ) ,
ν 0 = 4 ,
V 0 = ( 0.0668 0.0270 0.0193 0.0074 0.0270 0.0547 0.0344 0.0161 0.0193 0.0344 0.1169 0.0566 0.0074 0.0161 0.0566 0.0811 ) ;
Gs-MN1 of SURM2:
β ¯ = ( 5.0939 ,   2.2007 ,   0.7288 ,   4.4558 ,   1.6630 ,   1.1908 , 2.6587 ,   2.9240 ,   1.1908 , 2.5218 ,   1.8870 ,   0.5649 ) T ,
A ¯ = ( 1760 5045 4980 638 1786 1725 7 71 46 46 196 181 5045 14 , 732 14 , 478 1728 5067 4820 58 298 231 159 679 614 4980 14 , 478 14 , 268 1671 48.7 4627 38 244 176 162 662 606 6380 1728 1671 7547 20 , 716 20 , 647 233 878 783 29 227 190 1786 5067 4837 20 , 716 58 , 708 57 , 962 837 2870 2639 162 835 721 1725 4820 4627 20 , 647 57 , 962 57 , 478 759 2684 2447 1372 7545 653 7 58 387 233 837 759 1319 3732 3668 696 1939 1916 71 298 244 878 2871 2685 3732 10 , 845 10 , 857 1922 5550 5423 45 231 176 783 2639 2447 3668 10 , 587 10 , 373 1913 5455 5359 46 159 162 29 162 137 696 1922 1913 1430 3954 3826 196 679 662 227 835 754 1939 5550 5455 3954 11 , 323 11 , 127 181 6149 606 190 721 653 1916 5423 5359 3926 11 , 128 10 , 990 ) ,
ν 0 = 4 ,
V 0 = ( 0.0568 0.0060 0.0025 0.0005 0.0060 0.0117 0.0001 0.0002 0.0025 0.0001 0.0899 0.0438 0.0005 0.0002 0.0438 0.0752 ) ;
Gs-MN2 of SURM1:
β ¯ = ( 4.0167 ,   2.4730 ,   0.7288 ,   2.9983 ,   2.5546 ,   4.4397 , 2.4903 ,   4.5258 ,   2.1122 ) T ,
A ¯ = ( 0.7675 0.2753 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.2753 0.1203 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.8751 0.3542 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.3542 0.1554 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.1078 0.3631 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.3631 0.1380 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0443 0.3479 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.3479 0.1581 ) ;
Gs-MN2 of SURM2:
β ¯ = ( 3.8163 ,   1.7587 ,   0.8807 ,   3.3932 ,   1.8392 ,   0.9196 , 4.6487 ,   1.7310 ,   0.8655 , 5.0326 ,   1.5111 ,   0.7555 ) T ,
A ¯ = ( 0.5112 0.2052 0.5271 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.2052 11.5670 6.4188 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.5271 6.4188 43.0293 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.9842 1.1609 0.1566 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.1609 14.2652 2.0570 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1566 2.0570 51.6092 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.2258 0.1337 0.01687 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1337 3.5121 1.2614 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0169 1.2614 13.7326 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1698 0.0415 0.1603 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0415 2.6080 0.5663 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1603 0.5663 10.5430 ) ,
where, in $\beta \sim N(\bar{\beta}, A^{-1})$, $\bar{\beta}$ is the prior mean and $A$ is the prior precision matrix; and, in $\Omega \sim \mathrm{InvWishart}(\nu_0, V_0)$, $\nu_0$ is the degree of freedom parameter and $V_0$ is the scale parameter of the inverted Wishart prior.
The posterior means, standard deviations, and 95% posterior intervals for the Bayesian methods of both SURM1 and SURM2 are reported in Table 2 and Table 3, respectively; the posterior draws from the Gibbs sampler successfully passed the convergence tests. Furthermore, the parameter estimates and standard errors of FGLS are also listed in Table 2 and Table 3. The means of the posterior draws from the different samplers were roughly equal, but there were clear differences in their standard deviations. The tables indicate that DMC produced the smallest standard deviations, followed by Gs-MN1, Gs-J, Gs-MN2, and Gs-MN, and the same ordering held for the widths of the 95% posterior intervals. To simplify the comparison, Figure 2 visualizes the posterior draws in the form of probability density plots for the Bayesian methods, and the same trend is clearly shown for each model parameter. Furthermore, we calculated the correlation matrix of the model errors between the four tree components after the model parameters were obtained. Because the parameter estimates were similar across the six modeling methods, those correlation matrices did not differ noticeably among them. Hence, we only report the correlation matrices for DMC in Table A1 (see Appendix B).

3.2. Model Evaluation and Validation

Table 4 presents the $R^2$, RMSE, and DIC of the six modeling methods for the four tree components (i.e., stem, branch, foliage, and root). According to the DIC values, Gs-MN1 performed better than the other methods for both SURM1 (DBH as the only predictor) and SURM2 (DBH and H as the predictors). Among the non-informative prior Bayesian methods, there was little difference in DIC between DMC and Gs-J, but there was a significant reduction in DIC for Gs-MN. Among the informative priors, only Gs-MN1 produced a relatively better performance for most tree components, while Gs-MN2 performed similarly to the non-informative priors. The results also indicated that all models fitted the larch biomass data well, with $R^2$ higher than 0.75 for all models (Table 4). The stem biomass models obtained the highest $R^2$, while the foliage biomass models obtained the smallest $R^2$. Incorporating H as the second predictor (SURM2) improved the model fitting for the stem, branch, and foliage biomass models, while unexpectedly it resulted in a slight reduction in $R^2$ for the root biomass models. In general, the Gs-MN1 method for both SURM1 and SURM2 yielded smaller DIC and relatively similar $R^2$ and RMSE compared with the other modeling methods.
The averages and 95% prediction intervals (PIs) of the DBH-only prediction curves for the classical method (FGLS), the non-informative prior Bayesian methods (DMC, Gs-J, and Gs-MN), and the informative prior Bayesian methods (Gs-MN1 and Gs-MN2) are shown in Figure 3. It is clear that the 95% PIs of the five Bayesian methods were much narrower than those of the FGLS models across all tree components, indicating that the Bayesian models are more reliable and stable than FGLS. Overall, DMC evidently produced the narrowest 95% PI among the five Bayesian methods, followed by Gs-MN1, while the differences between the other three Bayesian methods (i.e., Gs-J, Gs-MN, and Gs-MN2) were indistinguishable.
Leave-one-out cross-validation was used to validate the log-transformed SUR models for the six methods (Table 5). According to MAE and MAE%, the model prediction bias varied across the tree components. The stem and branch models consistently produced the smallest and the largest MAE%, respectively, across all methods for both SURM1 and SURM2, while Gs-MN1 yielded the best predictions for both SURM1 and SURM2 for most tree components (Table 5).

3.3. Comparison of Correction Factors on Anti-Log Transformation

In order to further analyze the effects of applying different correction factors, the percent bias (B, %) and percent standard error (G, %) of the three correction factors (CF1, CF2, and CF3) for both SURM1 and SURM2 were calculated using the Gs-MN1 method (Table 6). In addition, the mean percentage difference (MPD, %) of CF0, CF1, CF2, and CF3 is also listed in Table 6 (CF0 = 1, i.e., without applying any correction factor). Notably, SURM2 always had smaller values of the correction factors and of B, G, and MPD across the four tree components than SURM1, indicating that adding H into the biomass equations indeed improved the model fitting and performance. However, the values of the correction factors, B, and G were relatively similar for CF1, CF2, and CF3, revealing that the three correction factors were practically indistinguishable for either SURM1 or SURM2 in this study. Further, the mean percentage difference of the uncorrected predictions (MPD0) was the lowest value (Table 6), so a correction factor might not be necessary for the anti-log transformation in this study.

3.4. Stability Analysis in Repeated Trials between Various Sizes of Data and Methods

In this study, the parameter estimates obtained by the FGLS and Bayesian methods were similar when the entire biomass dataset was used (Table 2 and Table 3). However, the methods produced parameter estimates with different ranges when the sample size decreased: the FGLS models had a wider range of parameter estimates than the Bayesian models for SURM1 (Figure 4), where five methods (all except Gs-J) were selected to illustrate the stability of the parameter estimation methods. Figure A1 (see Appendix B) shows the trend for SURM2, which is similar to that of SURM1. It was evident that the ranges of the parameter estimates decreased as the subsample size increased. The distributions of the parameter estimates were nearly identical between DMC and FGLS for each subsample size, and both methods produced estimates outside the confidence intervals obtained with the entire dataset (174 trees). For the other non-informative prior, the simulation results of Gs-MN were slightly different from both DMC and FGLS. The Bayesian methods with the informative priors produced less variation in the parameter estimates, with narrower ranges than those with the non-informative priors (Figure 4 and Figure 5). For the small subsamples of Gs-MN2, the means of the parameter estimates clearly deviated from those obtained with the entire dataset, while Gs-MN1 performed well at every subsample size. These results indicate that using an inaccurate informative prior will produce misleading results.
The MAEs and MAE%s were calculated by reapplying the SURM1 models produced in the simulations to the entire dataset (Figure 5); the simulation results for SURM2 were highly similar to those for SURM1, as shown in Figure A2 (see Appendix B). A summary of the MAEs and MAE%s for all methods and tree components across the ten sample sizes is given in Table A2 (see Appendix B). Judging from the MAE and MAE% values, SURM2 had a higher prediction efficiency than SURM1. Further, the prediction bias of Gs-MN1 was the most stable, indicating that a good informative prior is advantageous when only a small amount of data is available. For Gs-MN2, the prediction bias calculated from the parameter estimates obtained with the small samples deviated significantly from that calculated using all the data, although the ranges of MAE and MAE% were smaller than those of both the Bayesian models with non-informative priors and FGLS.

4. Discussion

4.1. Biomass SUR Models

Tree biomass is the basis for estimating the net primary production and carbon of forest ecosystems, especially as global carbon emissions and carbon sinks have gradually become prominent topics [50,51]. Usually, allometric equations are used to develop tree biomass models for specific species and areas [8,14]. In our research, two additive model systems (i.e., SURM1 and SURM2) were constructed to estimate the tree biomass of planted Korean larch by both classical and Bayesian approaches. The main objective of this study was to explore the advantages of Bayesian approaches on the SUR biomass models of the four tree components (i.e., stem, branch, foliage, and root). As we expected, our results demonstrated that applying the Bayesian methods on the SUR model systems not only produced better model fitting and predictions, but also provided the assessment and evaluation of the uncertainties for constructing and updating tree biomass models.
For computational efficiency, we compared the computing time of both the FGLS and Bayesian approaches on the SURM1 estimations using a computer with a 3.2 GHz Intel(R) Core(TM) i7-8700 CPU and 32.0 GB of RAM. The results showed that it took about 0.46 s to obtain the posterior samples of the parameters for the Bayesian approaches (i.e., Gs-MN, Gs-MN1, and Gs-MN2) when the sample size was 33,000 (including a 3000-sample burn-in and a thinning length of 3) using the R package bayesm, while the classical FGLS took only 0.03 s to obtain the parameter estimates using the R package systemfit. It was clear that the computing time of both approaches was small enough to be ignored. However, DMC and Gs-J took more time (about 15 min) to estimate the models with the same setup described above, which may be caused by the limitations of our customized R code; the computation would have been faster if the code had been optimized.
Published research has revealed that the model parameters of biomass equations can be described by a multivariate normal distribution. In other words, the variability in these model parameters reflects the different biotic and abiotic factors influencing the allometric biomass relationships of specific species and locations [13]. With the Bayesian approaches, the model parameters for new locations can be estimated and updated from the prior information, which differs from the calibration of the random-effect parameters in mixed-effects models [14], because the random-effect parameters are fixed for a specific level. Compared to the classical approach, the Bayesian approaches produced narrower 95% prediction intervals, so Bayesian statistics have the advantages of reducing uncertainty and obtaining more stable predictions (Figure 3) [13,32], and good prior information improved the accuracy of the biomass predictions (Table 5). However, the Bayesian approaches sometimes performed worse when the prior information was poor, as with Gs-MN2.
Our stability analysis showed, as we expected, that the Bayesian approaches with informative priors provided more stable parameter estimates and reduced the uncertainty of the parameters for small sample sizes, similar to the findings of Zapata-Cuartas et al. [13] (see also the discussion in Section 4.2). The smaller fluctuations of both MAE and MAE% for the Bayesian methods with informative priors indicated that the Bayesian methods had a higher estimation efficiency than the classical method, especially for small sample sizes (N = 10, 20, 30). However, the difference in estimation efficiency between FGLS and the Bayesian methods decreased as the sample size increased. Overall, the sample size reflected the amount of data information, on which the values of the posterior parameters mainly depended [28,30]. Increasing the sample size continuously corrected the prior distribution and thus improved the posterior distribution. This can be considered one of the reasons why the estimates of the non-informative Bayesian methods and FGLS were nearly identical.
Considering the inherent correlations between tree components (additivity or compatibility) when developing biomass equations has been recognized as an essential characteristic of biomass model systems [2,4]. Numerous SUR models in the literature have used one or more constraints in the simultaneous model fitting process [18,52]. Zhao et al. [53], Dong et al. [14], and Widagdo et al. [54] conducted comprehensive simulations and comparisons of applying zero, one, and three restriction(s) in the SUR model systems. All of these studies confirmed that applying SUR without any constraint, as proposed by Affleck and Diéguez-Aranda [55], gave slightly better model predictions than using one or more constraints. Therefore, we decided to use no constraint to account for the correlations between tree components. Our results indicated that the correlation coefficients ranged from 0.1062 to 0.5836 across the tree components for SURM1, while they were slightly smaller (from 0.0026 to 0.5432) for SURM2. Therefore, it was appropriate to jointly fit the simultaneous biomass equations.
The DBH-only SUR model (SURM1) produced relatively large variation for the tree components, and the inclusion of H as the second predictor variable in the stem, branch, and foliage biomass models (SURM2) yielded a relatively large enhancement of the model performance in both model fitting and validation. These results are consistent with previous publications [8,56]. Further analyses showed that the performance of the root biomass model using DBH as the sole predictor was better than that with two predictors (DBH and H), although the parameter estimates of H were statistically significant. These findings indicate that the benefit of adding H as an additional predictor in root biomass models is relatively small [56,57]. Overall, SURM2 apparently produced better parameter estimation and model prediction than SURM1 for three of the four tree components.

4.2. Non-Informative vs. Informative Priors in Bayesian Methods

Both DMC and Gs-J used the Jeffreys non-informative prior but obtained slightly different model fitting results. The DIC values indicated that DMC was better than Gs-J, with smaller standard deviations (Table 2 and Table 3, and Figure 2). The Gs-J method required checking the convergence of the posterior samples, whereas DMC did not. Zellner [39] reported that one advantage of DMC is that it generates independent samples with weak autocorrelation. Hence, we also evaluated the autocorrelation of the posterior samples of the parameter βr0 (Table A3 in Appendix B). The autocorrelation of the various methods at every non-zero lag was very small, indicating that autocorrelation in the posterior samples rarely existed. Thus, it was not apparent that DMC's autocorrelation was lower than that of the other methods based on the Gibbs sampler, which differs slightly from Zellner's inference [39]. As described in Section 2.4, the thinning length of the Gibbs sampler's posterior samples was set to 3 in order to weaken the autocorrelation, whereas this step was absent in Zellner [39].
The DMC method appeared to be more efficient in the computation process according to the actual number of generated posterior samples (10,000 for DMC versus 33,000 for the Gibbs sampler), similar to the inferences of other studies [58]. However, computational efficiency may not be the primary indicator for evaluating the quality of an estimation method in the future, since computing capacity has increased exponentially over the years. Furthermore, the Gs-MN method also used a non-informative prior, a multivariate normal distribution with mean vector 0 and variance 1000 (covariances of 0), and its standard deviations of the model parameters were larger than those of Gs-J. This may be because the prior variances of the model parameters were set to 1000: although the distribution range of the prior parameters was relatively wide, the finite variances still imposed upper and lower limits. Thus, this prior can be categorized as a pseudo-non-informative prior with remarkably insufficient information.
In our study, we specified two types of informative priors according to the sources of the samples to further implement the Gibbs sampler-based Bayesian analysis. The first type was derived from repeated self-sampling of our own data (Gs-MN1), while the other was acquired from the literature (Gs-MN2). The self-sampling informative prior gave more accurate parameter means and variances as well as better fitting and validation performance than the literature-based informative prior. These findings confirmed that accurate informative priors bring a greater improvement in model performance. Overall, the Gs-MN1 method produced the smallest DIC value, followed sequentially by DMC, Gs-J, Gs-MN, and Gs-MN2. However, our results also suggested that, at least for the data obtained in this research, the literature-based informative priors may not be suitable for biomass modeling in the study area. Further research on the determination of prior information is still needed to improve biomass model performance.

4.3. Anti-Logarithm Correction Factor

Logarithmic transformation in biomass modeling is used to linearize the nonlinear relationships between biomass and the predictors [59,60]. It is important to decide whether or not a bias correction is needed, and this is usually determined by the size of the error term [61]. In this study, the three evaluated correction factors (CFs) were all small (ranging from 1.0058 to 1.0604). For the statistics of the CFs, B and G were relatively similar, suggesting that the differences between the CFs were not significant. Furthermore, we also compared the mean percentage difference (MPD) of the four types of CFs (CF0, CF1, CF2, and CF3). The results showed that MPD0 (calculated with CF0) was the smallest, indicating that the anti-log correction may not be necessary for our biomass data. In general, the inferences on the CFs obtained in this study were consistent with previous research [61,62].

5. Conclusions

In this study, we applied Bayesian methods to construct two seemingly unrelated regression model systems (i.e., SURM1 and SURM2) with no constraints, accounting for the inherent correlations between the four tree components (i.e., stem, branch, foliage, and root) collected from the same trees. The classical FGLS method was also used to estimate the parameters of the SUR models for comparison. The Bayesian methods with informative priors not only produced better model fitting, but also higher prediction accuracy and computing efficiency. In addition, the Bayesian methods gave more stable predictions for small sample sizes and, as we suggested, could be used to estimate model parameters for new locations. However, finding good informative priors for the model parameters is crucial for improving model fitting, prediction accuracy, and efficiency. On the other hand, the anti-logarithm correction factors on the model predictions may not be necessary for our developed models, as the model error terms were relatively small.
Overall, applying the Bayesian methods to the SUR models of tree biomass showed clear benefits compared to the classical modeling method, especially when the amount of available data was relatively small. These newly developed SUR models based on Bayesian methods can be used to accurately estimate the tree biomass of planted Korean larch in northeast China, to estimate carbon storage, and to further understand the distribution of energy and matter in larch plantations.

Author Contributions

L.D. and F.L. provided the data, conceived the ideas, and designed methodology; L.X. analyzed the data and wrote the manuscript. F.R.A.W. and L.Z. helped in analyzing the data and writing the paper. All authors contributed critically to the data collection and manuscript preparation. All authors have read and agreed to the published version of the manuscript.

Funding

This research was financially supported by the National Natural Science Foundation of China (No. 31971649), the National Key R&D Program of China (No. 2017YFD0600402), Provincial Funding for National Key R&D Program of China in Heilongjiang Province (No. GX18B041), the Fundamental Research Funds for the Central Universities (No. 2572019CP08) and the Heilongjiang Touyan Innovation Team Program (Technology Development Team for High-Efficient Silviculture of Forest Resources).

Acknowledgments

The authors would like to thank the faculty and students of the Department of Forest Management, Northeast Forestry University (NEFU), China, who provided and collected the data for this study. The authors also thank the two anonymous reviewers for suggestions and comments that improved the content and quality of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Classical Approach of SUR
An m-dimensional linear SUR model system can be expressed as follows:
$y_i = X_i\beta_i + \varepsilon_i$ (A1)
where $E[\varepsilon_i\varepsilon_j^{T}] = \begin{cases}\omega_{ij} I & i \neq j \\ \omega_i^2 I & i = j\end{cases}$, and $i$ and $j$ are $1, 2, 3, \ldots, m$. A general expression of the linear SUR models (in matrix notation) is
$Y = X\beta + \varepsilon$ (A2)
FGLS for estimating SUR model system:
Step 1. Fit the m linear functions of the SUR models individually and calculate the error terms ε i for each function, where i = 1 ,   2 ,   ,   m ;
Step 2. Compute the variance–covariance matrix Ω ^ of ε i obtained in step 1 by Equation (A3):
$\hat{\Omega} = \frac{1}{n}\,\varepsilon_i^{T}\varepsilon_j$ (A3)
Step 3. Estimate the model coefficients by GLS as follows:
$\hat{\beta} = \left(X^{T}(\hat{\Omega}^{-1}\otimes I)X\right)^{-1} X^{T}(\hat{\Omega}^{-1}\otimes I)Y$ (A4)
where the superscript $T$ denotes the transpose of a matrix, and $\hat{\Omega}$ is the estimated variance–covariance matrix.
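As an illustration only, the sketch below implements these three steps in base R (plus the Matrix package for the block-diagonal design matrix); Y.list and X.list are assumed to hold the response vectors and design matrices of the m equations in the order of Equation (A2).

library(Matrix)

fgls_sur <- function(Y.list, X.list) {
  m <- length(Y.list)
  n <- length(Y.list[[1]])
  # Step 1: fit each equation separately by OLS and keep the residuals
  res <- sapply(seq_len(m), function(i) lm.fit(X.list[[i]], Y.list[[i]])$residuals)
  # Step 2: estimate the error variance-covariance matrix (Equation (A3))
  Omega.hat <- crossprod(res) / n
  # Step 3: GLS with the estimated covariance matrix (Equation (A4))
  X <- as.matrix(bdiag(X.list))
  Y <- unlist(Y.list)
  W <- kronecker(solve(Omega.hat), diag(n))
  as.vector(solve(t(X) %*% W %*% X, t(X) %*% W %*% Y))  # estimated coefficients
}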
Bayesian approaches of SUR
Based on Equation (A2), the likelihood function by giving the data D is
$L(D \mid \beta, \Omega) = \dfrac{1}{(2\pi)^{nm/2}}\,|\Omega|^{-n/2}\exp\!\left[-\tfrac{1}{2}(Y - X\beta)^{T}(\Omega^{-1}\otimes I)(Y - X\beta)\right]$ (A5)
Gibbs Sampler on SUR models
  • A simple prior of β and Ω was expressed as follows:
$\begin{cases}\pi(\beta, \Omega) = \pi(\beta)\,\pi(\Omega) \\ \beta \sim N(\bar{\beta}, A^{-1}) \\ \Omega \sim \mathrm{InvWishart}(\nu_0, V_0)\end{cases}$ (A6)
where $N(\bar{\beta}, A^{-1})$ denotes the multivariate normal distribution, in which $\bar{\beta}$ is the prior mean vector of dimension $p_i \times 1$ and $p_i$ is the number of parameters in the $i$-th single model of Equation (A1), and $A$ is the prior precision matrix of dimension $p_j \times p_j$; $\mathrm{InvWishart}(\nu_0, V_0)$ denotes the inverse Wishart distribution, in which $\nu_0$ is the degree of freedom and $V_0$ is the $m \times m$ scale matrix of the inverse Wishart prior. The normal prior of $\beta$ is conjugate with the conditional likelihood of Equation (A5), and the posterior of $\beta$ given $\Omega$ is normal:
$\pi(\beta \mid \Omega, D) = N(\hat{\beta}, \hat{\Omega}_\beta)$ (A7)
where $\hat{\beta} = \left(X^{T}(\Omega^{-1}\otimes I)X + A\right)^{-1}\left(X^{T}(\Omega^{-1}\otimes I)Y + A\bar{\beta}\right)$ and $\hat{\Omega}_\beta = \left(X^{T}(\Omega^{-1}\otimes I)X + A\right)^{-1}$. Meanwhile, the inverted Wishart prior is conditionally conjugate and leads to the following conditional posterior:
$\pi(\Omega \mid \beta, D) = \mathrm{InvWishart}(\nu_0 + n, V_0 + R)$ (A8)
where $R$ is the $m \times m$ matrix of $(r_{ij})$, and $r_{ij} = (y_i - X_i\beta_i)^{T}(y_j - X_j\beta_j)$ in Equation (A1).
• A widely used non-informative prior, namely the Jeffreys invariant prior, is expressed as follows:
$\pi(\beta, \Omega) = \pi(\beta)\,\pi(\Omega) \propto |\Omega|^{-\frac{m+1}{2}}$ (A9)
The conditional posteriors are [39]
$\pi(\beta \mid \Omega, D) = N(\hat{\beta}, \hat{\Omega}_\beta)$ (A10)
$\pi(\Omega \mid \beta, D) = \mathrm{InvWishart}(R, n)$ (A11)
where $\hat{\beta} = \left(X^{T}(\Omega^{-1}\otimes I)X\right)^{-1} X^{T}(\Omega^{-1}\otimes I)Y$ and $\hat{\Omega}_\beta = \left(X^{T}(\Omega^{-1}\otimes I)X\right)^{-1}$.
According to the conditional posteriors on Equations (A7), (A8), (A10), and (A11), a Markov Chain Monte Carlo (MCMC) method with a Gibbs sampler can be used to obtain posterior samples of the SUR model parameters. The steps of the Gibbs sampler on SUR models are summarized as follows:
Step 1. Give the starting values of Ω, namely Ω0; the starting values are generally obtained from the prior distributions;
Step 2. Draw β1 from Equation (A7) or (A10);
Step 3. Draw Ω1 from Equation (A8) or (A11);
Step 4. Repeat steps 2 and 3 to draw βj and Ωj for N times, j = 1, 2, 3, …, N, where N is the sample size, which can be set manually.
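To make these steps concrete, the sketch below implements the Jeffreys-prior case (Gs-J) of this Gibbs sampler in base R plus the Matrix package; it is a simplified illustration rather than the custom code used in the study, the starting value of Ω defaults to an identity matrix instead of the FGLS estimate, and the inverted Wishart draw is obtained by inverting a Wishart draw under the usual parameterization, which should be checked against the convention in Equation (A11).

library(Matrix)

gibbs_sur_jeffreys <- function(Y.list, X.list, n.draws = 1000, Omega.start = NULL) {
  m <- length(Y.list)
  n <- length(Y.list[[1]])
  X <- as.matrix(bdiag(X.list))                                 # block-diagonal design matrix
  Y <- unlist(Y.list)
  p <- ncol(X)
  Omega <- if (is.null(Omega.start)) diag(m) else Omega.start   # Step 1: starting value of Omega
  beta.draws <- matrix(NA_real_, n.draws, p)
  for (k in seq_len(n.draws)) {
    # Step 2: draw beta | Omega, D from Equation (A10)
    W        <- kronecker(solve(Omega), diag(n))
    V.beta   <- solve(t(X) %*% W %*% X)
    beta.hat <- V.beta %*% t(X) %*% W %*% Y
    beta     <- as.vector(beta.hat + t(chol(V.beta)) %*% rnorm(p))
    # Step 3: draw Omega | beta, D from Equation (A11) via an inverted Wishart draw
    E     <- matrix(Y - X %*% beta, nrow = n, ncol = m)          # residuals, one column per equation
    R.mat <- crossprod(E)
    Omega <- solve(rWishart(1, df = n, Sigma = solve(R.mat))[, , 1])
    beta.draws[k, ] <- beta                                      # Step 4: store the draw and repeat
  }
  beta.draws
}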
DMC for estimating SUR model system
The SUR models (Equation (A1)) are reformulated as follows:
$\begin{cases} Y_1 = X_1\beta_1 + e_1 = Z_1 b_1 + e_1 \\ Y_2 = X_2\beta_2 + \rho_{21}\varepsilon_1 + e_2 = Z_2 b_2 + e_2 \\ \quad\vdots \\ Y_m = X_m\beta_m + \sum_{j=1}^{m-1}\rho_{mj}\varepsilon_j + e_m = Z_m b_m + e_m \end{cases}$ (A12)
with $e_1 = \varepsilon_1$, $e_i \sim N(0, \Sigma)$, and $\Sigma = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_m^2)$. There is no correlation between the error terms $e_i$ and $e_j$ of Equation (A12). $Z_i$ is the design matrix including the predictors, and $b_i$ is the vector of model parameters, where $i$ = 1, 2, 3, and 4. Thus, the likelihood function of the data $D$ becomes
$L(D \mid b, \Sigma) = \prod_{j=1}^{m}\left(2\pi\sigma_j^2\right)^{-n/2}\exp\!\left[-\dfrac{(Y_j - Z_j b_j)^{T}(Y_j - Z_j b_j)}{2\sigma_j^2}\right]$ (A13)
Herein, we set the Jeffreys invariant prior as the prior:
$\pi(b, \Sigma) = \pi(b)\,\pi(\Sigma) \propto \prod_{j=1}^{m}(\sigma_j)^{-1}$ (A14)
then the conditional normal and inverse-gamma posteriors are
$\pi(b_j \mid b_{j-1}, \ldots, b_1, \sigma_j^2, D) = N\!\left((Z_j^{T}Z_j)^{-1}Z_j^{T}Y_j,\ \sigma_j^2 (Z_j^{T}Z_j)^{-1}\right)$ (A15)
$\pi(\sigma_j^2 \mid b_{j-1}, \ldots, b_1, D) = IG\!\left(\tfrac{1}{2}(Y_j - Z_j b_j)^{T}(Y_j - Z_j b_j),\ \tfrac{1}{2}(n - p_j - j + 1)\right)$ (A16)
The sampling procedure of m-dimensional SUR models can be determined using the following steps:
  • Step 1. Set j = 1 and generate σ1²(k) (k = 1, 2, 3, …, N), where N is the sample size. Draw b1(k) from Equation (A15).
  • Step 2. Increase j by 1 (j = j + 1), draw σj²(k) using Equation (A16), and then draw bj(k) from Equation (A15).
  • Step 3. Repeat step 2 sequentially until j = m is reached.
  • Step 4. Transform {b, Σ} into {β, Ω}.
More detailed information on the DMC approach can be found in Zellner [39].
Predicting with Bayesian method
Given the posterior samples of the model parameters, the predictive density of the model's dependent variables can be approximated as follows:
$f(y_{new} \mid x_{new}, D) = \int f(y_{new} \mid x_{new}, \beta, \Sigma)\, g(\beta, \Sigma \mid D)\, d\beta\, d\Sigma \approx \dfrac{1}{n}\sum_{k=1}^{n} f(y_{new} \mid x_{new}, \beta_k, \Sigma_k)$ (A17)
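A minimal sketch of this approximation for a single equation is given below; beta.draws (one row per posterior draw, as returned by the sampler sketched above) and sigma.draws (the corresponding error standard deviations of that equation) are assumed inputs, and x.new is the design row of a new tree.

predict_bayes <- function(x.new, beta.draws, sigma.draws) {
  n.draws <- nrow(beta.draws)
  y.rep <- numeric(n.draws)
  for (k in seq_len(n.draws)) {
    mu <- sum(x.new * beta.draws[k, ])                     # linear predictor for draw k
    y.rep[k] <- rnorm(1, mean = mu, sd = sigma.draws[k])   # simulate from the predictive density (Equation (A17))
  }
  c(mean = mean(y.rep), quantile(y.rep, c(0.025, 0.975)))  # posterior predictive mean and 95% PI
}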

Appendix B

Figure A1. Simulation results for the parameter estimations of SURM2 using the DMC, Gs-MN, Gs-MN1, Gs-MN2, and FGLS. The dashed and solid horizontal lines represent the 95% confidence intervals and means of the simulation using the entire dataset (174 trees), respectively.
Figure A2. The MAE (A) and MAE% (B) of the SURM2 using different sample sizes. The solid line represents the value of MAE and MAE% calculated using the entire dataset (174 trees).
Table A1. The correlation matrices of SURM1 and SURM2 fitted by the DMC method.

SURM1        Ws         Wb         Wf         Wr
Ws         1.0000    −0.4292    −0.2428     0.4487
Wb        −0.4292     1.0000     0.5836    −0.2208
Wf        −0.2428     0.5836     1.0000    −0.1062
Wr         0.4487    −0.2208    −0.1062     1.0000

SURM2        Ws         Wb         Wf         Wr
Ws         1.0000     0.0026     0.0084     0.2311
Wb         0.0026     1.0000     0.5342    −0.0362
Wf         0.0084     0.5342     1.0000     0.0042
Wr         0.2311    −0.0362     0.0042     1.0000
Table A2. The average of the mean absolute error (MAE, kg) and the average of the mean absolute percent error (MAE%) of the six modeling methods with different sample sizes in the simulations for the four tree biomass components.

Method   Comp.  Index   10      15      20      30      40      60      80      110     140     170
SURM1
DMC      Wr     MAE     4.89    4.62    4.50    4.42    4.38    4.37    4.37    4.36    4.36    4.37
DMC      Wr     MAE%    22.08   21.23   20.86   20.50   20.32   20.16   20.07   20.01   19.98   19.97
DMC      Ws     MAE     19.36   18.48   18.14   17.80   17.63   17.47   17.39   17.31   17.28   17.27
DMC      Ws     MAE%    21.16   20.41   20.09   19.79   19.64   19.50   19.43   19.38   19.36   19.34
DMC      Wb     MAE     2.80    2.66    2.59    2.53    2.49    2.46    2.45    2.43    2.43    2.42
DMC      Wb     MAE%    31.94   30.16   29.35   28.59   28.25   27.90   27.74   27.60   27.51   27.47
DMC      Wf     MAE     0.81    0.77    0.76    0.75    0.74    0.73    0.73    0.73    0.72    0.72
DMC      Wf     MAE%    26.62   25.12   24.50   23.91   23.64   23.35   23.23   23.13   23.06   23.02
Gs-MN    Wr     MAE     4.90    4.63    4.51    4.42    4.39    4.37    4.37    4.36    4.36    4.36
Gs-MN    Wr     MAE%    22.09   21.24   20.87   20.50   20.33   20.15   20.08   20.02   19.98   19.97
Gs-MN    Ws     MAE     19.33   18.47   18.14   17.80   17.63   17.47   17.39   17.32   17.28   17.27
Gs-MN    Ws     MAE%    21.14   20.41   20.09   19.79   19.64   19.50   19.43   19.38   19.35   19.34
Gs-MN    Wb     MAE     2.80    2.66    2.59    2.53    2.49    2.46    2.45    2.43    2.43    2.42
Gs-MN    Wb     MAE%    31.97   30.18   29.36   28.59   28.25   27.89   27.73   27.59   27.52   27.47
Gs-MN    Wf     MAE     0.81    0.77    0.76    0.75    0.74    0.73    0.73    0.73    0.72    0.72
Gs-MN    Wf     MAE%    26.63   25.13   24.51   23.92   23.64   23.37   23.24   23.13   23.06   23.02
Gs-MN1   Wr     MAE     4.30    4.31    4.32    4.33    4.33    4.34    4.35    4.35    4.35    4.36
Gs-MN1   Wr     MAE%    19.89   19.92   19.95   19.97   19.98   19.97   19.96   19.95   19.95   19.95
Gs-MN1   Ws     MAE     17.28   17.28   17.28   17.29   17.29   17.28   17.28   17.27   17.27   17.26
Gs-MN1   Ws     MAE%    19.37   19.37   19.38   19.38   19.37   19.36   19.36   19.35   19.34   19.34
Gs-MN1   Wb     MAE     2.43    2.43    2.43    2.43    2.43    2.43    2.43    2.42    2.42    2.42
Gs-MN1   Wb     MAE%    27.65   27.65   27.65   27.62   27.61   27.59   27.57   27.53   27.49   27.47
Gs-MN1   Wf     MAE     0.72    0.72    0.72    0.72    0.72    0.72    0.72    0.72    0.72    0.72
Gs-MN1   Wf     MAE%    23.17   23.17   23.17   23.15   23.14   23.12   23.10   23.07   23.04   23.02
Gs-MN2   Wr     MAE     6.28    5.73    5.37    4.98    4.78    4.60    4.53    4.47    4.44    4.43
Gs-MN2   Wr     MAE%    22.72   22.10   21.66   21.10   20.79   20.46   20.29   20.17   20.10   20.07
Gs-MN2   Ws     MAE     19.76   18.91   18.45   17.99   17.78   17.57   17.47   17.37   17.33   17.30
Gs-MN2   Ws     MAE%    19.54   19.51   19.51   19.49   19.48   19.42   19.39   19.36   19.34   19.33
Gs-MN2   Wb     MAE     3.60    3.19    2.96    2.73    2.62    2.53    2.49    2.46    2.45    2.44
Gs-MN2   Wb     MAE%    34.22   31.79   30.50   29.18   28.58   28.05   27.82   27.64   27.53   27.48
Gs-MN2   Wf     MAE     1.08    0.96    0.89    0.81    0.78    0.75    0.74    0.73    0.73    0.73
Gs-MN2   Wf     MAE%    29.76   27.16   25.62   24.23   23.69   23.32   23.19   23.10   23.03   22.99
FGLS     Wr     MAE     4.89    4.62    4.50    4.42    4.38    4.36    4.37    4.36    4.36    4.36
FGLS     Wr     MAE%    22.08   21.23   20.86   20.50   20.32   20.15   20.08   20.02   19.98   19.97
FGLS     Ws     MAE     19.36   18.48   18.14   17.80   17.63   17.47   17.39   17.32   17.28   17.27
FGLS     Ws     MAE%    21.16   20.41   20.09   19.79   19.64   19.50   19.43   19.38   19.35   19.34
FGLS     Wb     MAE     2.80    2.66    2.59    2.53    2.49    2.46    2.45    2.43    2.43    2.42
FGLS     Wb     MAE%    31.94   30.16   29.35   28.59   28.25   27.89   27.73   27.59   27.52   27.47
FGLS     Wf     MAE     0.81    0.77    0.76    0.75    0.74    0.73    0.73    0.73    0.72    0.72
FGLS     Wf     MAE%    26.61   25.12   24.50   23.91   23.64   23.37   23.24   23.13   23.06   23.02
SURM2
DMC      Wr     MAE     5.18    4.76    4.61    4.46    4.40    4.34    4.31    4.28    4.27    4.27
DMC      Wr     MAE%    21.67   20.14   19.55   18.97   18.71   18.41   18.26   18.15   18.09   18.04
DMC      Ws     MAE     9.49    8.88    8.60    8.33    8.20    8.08    8.02    7.96    7.93    7.91
DMC      Ws     MAE%    10.05   9.39    9.10    8.83    8.70    8.57    8.50    8.45    8.41    8.38
DMC      Wb     MAE     2.72    2.53    2.45    2.37    2.33    2.29    2.27    2.26    2.26    2.25
DMC      Wb     MAE%    29.39   27.21   26.34   25.38   24.97   24.56   24.38   24.23   24.11   24.05
DMC      Wf     MAE     0.86    0.80    0.78    0.76    0.75    0.74    0.74    0.73    0.73    0.73
DMC      Wf     MAE%    27.61   25.49   24.63   23.80   23.47   23.15   23.02   22.94   22.88   22.86
Gs-MN    Wr     MAE     5.16    4.76    4.61    4.46    4.40    4.34    4.31    4.28    4.27    4.27
Gs-MN    Wr     MAE%    21.62   20.13   19.55   18.97   18.71   18.41   18.26   18.15   18.09   18.04
Gs-MN    Ws     MAE     9.49    8.88    8.60    8.33    8.20    8.08    8.02    7.96    7.93    7.91
Gs-MN    Ws     MAE%    10.05   9.39    9.11    8.83    8.70    8.57    8.51    8.45    8.41    8.38
Gs-MN    Wb     MAE     2.70    2.53    2.45    2.37    2.33    2.29    2.27    2.26    2.26    2.25
Gs-MN    Wb     MAE%    29.34   27.21   26.34   25.38   24.97   24.56   24.38   24.23   24.11   24.05
Gs-MN    Wf     MAE     0.85    0.80    0.78    0.76    0.75    0.74    0.74    0.73    0.73    0.73
Gs-MN    Wf     MAE%    27.57   25.49   24.63   23.80   23.47   23.15   23.02   22.94   22.88   22.86
Gs-MN1   Wr     MAE     4.24    4.25    4.25    4.26    4.26    4.26    4.26    4.26    4.26    4.26
Gs-MN1   Wr     MAE%    18.07   18.09   18.11   18.12   18.13   18.11   18.09   18.07   18.05   18.02
Gs-MN1   Ws     MAE     7.96    7.96    7.96    7.96    7.96    7.95    7.94    7.93    7.92    7.91
Gs-MN1   Ws     MAE%    8.44    8.45    8.45    8.45    8.45    8.44    8.43    8.41    8.40    8.38
Gs-MN1   Wb     MAE     2.26    2.26    2.26    2.26    2.26    2.26    2.26    2.26    2.25    2.25
Gs-MN1   Wb     MAE%    24.30   24.30   24.31   24.28   24.26   24.22   24.19   24.14   24.08   24.05
Gs-MN1   Wf     MAE     0.73    0.73    0.73    0.73    0.73    0.73    0.73    0.73    0.73    0.73
Gs-MN1   Wf     MAE%    22.98   22.98   22.98   22.97   22.96   22.94   22.92   22.90   22.88   22.87
Gs-MN2   Wr     MAE     6.45    6.04    5.67    5.18    4.92    4.67    4.55    4.47    4.41    4.38
Gs-MN2   Wr     MAE%    24.50   22.72   21.52   20.10   19.41   18.82   18.57   18.39   18.29   18.21
Gs-MN2   Ws     MAE     13.68   12.59   11.71   10.59   10.02   9.48    9.17    8.85    8.62    8.46
Gs-MN2   Ws     MAE%    14.08   12.97   12.14   11.15   10.69   10.25   9.98    9.65    9.38    9.17
Gs-MN2   Wb     MAE     3.47    3.30    3.19    3.05    2.95    2.78    2.65    2.49    2.38    2.32
Gs-MN2   Wb     MAE%    36.87   35.73   35.04   33.87   32.94   31.31   29.94   28.24   26.97   26.11
Gs-MN2   Wf     MAE     1.08    0.97    0.91    0.85    0.82    0.78    0.76    0.75    0.73    0.72
Gs-MN2   Wf     MAE%    30.93   28.73   27.60   26.49   25.86   25.04   24.47   23.84   23.40   23.10
FGLS     Wr     MAE     5.18    4.76    4.61    4.46    4.40    4.34    4.31    4.28    4.27    4.27
FGLS     Wr     MAE%    21.67   20.14   19.55   18.97   18.71   18.41   18.26   18.15   18.09   18.04
FGLS     Ws     MAE     9.49    8.88    8.60    8.33    8.20    8.08    8.02    7.96    7.93    7.91
FGLS     Ws     MAE%    10.05   9.39    9.10    8.83    8.70    8.57    8.50    8.45    8.41    8.38
FGLS     Wb     MAE     2.72    2.53    2.45    2.37    2.33    2.29    2.27    2.26    2.26    2.25
FGLS     Wb     MAE%    29.39   27.21   26.34   25.38   24.97   24.56   24.38   24.23   24.11   24.05
FGLS     Wf     MAE     0.86    0.80    0.78    0.76    0.75    0.74    0.74    0.73    0.73    0.73
FGLS     Wf     MAE%    27.61   25.49   24.63   23.80   23.47   23.15   23.02   22.94   22.88   22.86
Table A3. The autocorrelation function of the posterior samples for the parameter βr0 of the Bayesian methods, calculated using 10,000 samples. The maximum lag was set to 50.

SURM1
Lag   DMC       Gs-J      Gs-MN     Gs-MN1    Gs-MN2
0     1.0000    1.0000    1.0000    1.0000    1.0000
1     −0.0005   0.0120    −0.0060   0.0033    0.0062
5     −0.0108   0.0059    0.0296    0.0078    0.0002
10    0.0149    −0.0011   −0.0043   −0.0180   −0.0011
50    −0.0020   −0.0011   0.0075    0.0015    0.0160

SURM2
Lag   DMC       Gs-J      Gs-MN     Gs-MN1    Gs-MN2
0     1.0000    1.0000    1.0000    1.0000    1.0000
1     0.0022    −0.0039   0.0051    0.0007    −0.0040
5     −0.0104   0.0001    0.0037    −0.0048   −0.0181
10    0.0033    −0.0087   −0.0048   0.0021    −0.0057
50    −0.0018   0.0014    0.0043    −0.0108   −0.0129
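The lag-0/1/5/10/50 autocorrelations reported in Table A3, together with the convergence diagnostics of Heidelberger and Welch [41] and Geweke [42], can be computed with the coda package [43]. A brief sketch, using a placeholder chain in place of the actual 10,000 retained draws of βr0:

library(coda)

set.seed(123)
beta_r0_draws <- rnorm(10000)                 # placeholder chain; use the sampler output in practice
chain <- mcmc(beta_r0_draws)

autocorr(chain, lags = c(0, 1, 5, 10, 50))    # autocorrelations at the lags shown in Table A3
effectiveSize(chain)                          # effective sample size
heidel.diag(chain)                            # Heidelberger-Welch stationarity test
geweke.diag(chain)                            # Geweke convergence diagnostic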

References

  1. Clark, J.; Murphy, G. Estimating forest biomass components with hemispherical photography for Douglas-fir stands in northwest Oregon. Can. J. For. Res. 2011, 41, 1060–1074. [Google Scholar] [CrossRef]
  2. Dong, L.; Zhang, L.; Li, F. A compatible system of biomass equations for three conifer species in Northeast, China. For. Ecol. Manag. 2014, 329, 306–317. [Google Scholar] [CrossRef]
  3. Weiskittel, A.R.; MacFarlane, D.W.; Radtke, P.J.; Affleck, D.L.; Hailemariam, T.; Woodall, C.W.; Westfall, J.A.; Coulston, J.W. A call to improve methods for estimating tree biomass for regional and national assessments. J. For. 2015, 113, 414–424. [Google Scholar] [CrossRef] [Green Version]
  4. Zhao, D.; Kane, M.; Markewitz, D.; Teskey, R.; Clutter, M. Additive tree biomass equations for midrotation Loblolly pine plantations. For. Sci. 2015, 61, 613–623. [Google Scholar] [CrossRef] [Green Version]
  5. Ter-Mikaelian, M.T.; Korzukhin, M.D. Biomass equations for sixty-five North American tree species. For. Ecol. Manag. 1997, 97, 1–24. [Google Scholar] [CrossRef] [Green Version]
  6. Jenkins, J.C.; Chojnacky, D.C.; Heath, L.S.; Birdsey, R.A. National-scale biomass estimators for United States tree species. For. Sci. 2003, 49, 12–35. [Google Scholar]
  7. Zianis, D.; Muukkonen, P.; Makipaa, R.; Mencuccini, M. Biomass and stem volume equations for tree species in Europe. Silva Fenn. Monogr. 2005, 4, 63. [Google Scholar]
  8. Wang, C. Biomass allometric equations for 10 co-occurring tree species in Chinese temperate forests. For. Ecol. Manag. 2006, 222, 9–16. [Google Scholar] [CrossRef]
  9. Chave, J.; Réjou-Méchain, M.; Burquez, A.; Chidumayo, E.N.; Colgan, M.S.; Delitti, W.B.C.; Duque, A.; Eid, T.; Fearnside, P.M.; Goodman, R.C. Improved allometric models to estimate the aboveground biomass of tropical trees. Glob. Chang. Biol. 2014, 20, 3177–3190. [Google Scholar] [CrossRef]
  10. Luo, Y.; Wang, X.; Ouyang, Z.; Lu, F.; Feng, L.; Tao, J. A review of biomass equations for China’s tree species. Earth Syst. Sci. Data 2020, 12, 21–40. [Google Scholar] [CrossRef] [Green Version]
  11. Bi, H.; Murphy, S.; Volkova, L.; Weston, C.J.; Fairman, T.A.; Li, Y.; Law, R.; Norris, J.; Lei, X.; Caccamo, G. Additive biomass equations based on complete weighing of sample trees for open eucalypt forest species in south-eastern Australia. For. Ecol. Manag. 2015, 349, 106–121. [Google Scholar] [CrossRef]
  12. Kralicek, K.; Huy, B.; Poudel, K.P.; Temesgen, H.; Salas, C. Simultaneous estimation of above- and below-ground biomass in tropical forests of Viet Nam. For. Ecol. Manag. 2017, 390, 147–156. [Google Scholar] [CrossRef]
  13. Zapatacuartas, M.; Sierra, C.A.; Alleman, L. Probability distribution of allometric coefficients and Bayesian estimation of aboveground tree biomass. For. Ecol. Manag. 2012, 277, 173–179. [Google Scholar] [CrossRef]
  14. Dong, L.; Zhang, Y.; Xie, L.; Li, F. Comparison of tree biomass modeling approaches for larch (Larix olgensis Henry) trees in Northeast China. Forests 2020, 11, 202. [Google Scholar] [CrossRef] [Green Version]
  15. Parresol, B.R. Assessing tree and stand biomass: A review with examples and critical comparisons. For. Sci. 1999, 45, 573–593. [Google Scholar]
  16. Zellner, A. An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J. Am. Stat. Assoc. 1962, 57, 348–368. [Google Scholar] [CrossRef]
  17. Mehtätalo, L.; Lappi, J. Biometry for Forestry and Environmental Data: With Examples in R; Chapman and Hall; CRC Press: New York, NY, USA, 2020; p. 426. [Google Scholar]
  18. Dong, L.; Zhang, L.; Li, F. Additive biomass equations based on different dendrometric variables for two dominant species (Larix gmelini Rupr. and Betula platyphylla Suk.) in natural forests in the Eastern Daxing’an Mountains, Northeast China. Forests 2018, 9, 261. [Google Scholar] [CrossRef] [Green Version]
  19. Parresol, B.R. Additivity of nonlinear biomass equations. Can. J. For. Res. 2001, 31, 865–878. [Google Scholar] [CrossRef]
  20. Zellner, A. An Introduction to Bayesian Inference in Econometrics; Wiley: New York, NY, USA, 1971; p. 448. [Google Scholar]
  21. Lu, L.; Wang, H.; Chhin, S.; Duan, A.; Zhang, J.; Zhang, X. A Bayesian model averaging approach for modelling tree mortality in relation to site, competition and climatic factors for Chinese fir plantations. For. Ecol. Manag. 2019, 440, 169–177. [Google Scholar] [CrossRef]
  22. Griffiffiths, W.E. Bayesian Inference in the Seemingly Unrelated Regressions Model; Department of Economics, The University of Melbourne: Melbourne, Australia, 2003; p. 520. [Google Scholar]
  23. Bayes, T. An essay towards solving a problem in the doctrine of chances. M.D. Comput. 1991, 8, 157. [Google Scholar] [CrossRef]
  24. Li, R.; Stewart, B.; Weiskittel, A.R. A Bayesian approach for modelling non-linear longitudinal/hierarchical data with random effects in forestry. Forestry 2012, 85, 17–25. [Google Scholar] [CrossRef] [Green Version]
  25. Rossi, P.E.; Allenby, G.M. Bayesian Statistics and Marketing; John Wiley & Sons, Ltd.: Chichester, UK, 2005; p. 368. [Google Scholar]
  26. Huelsenbeck, J.P.; Ronquist, F.; Nielsen, R.; Bollback, J.P. Bayesian inference of phylogeny and its impact on evolutionary biology. Science 2001, 294, 2310–2314. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  27. Portinale, L.; Raiteri, D.C.; Montani, S. Supporting reliability engineers in exploiting the power of dynamic Bayesian networks. Int. J. Approx. Reason. 2010, 51, 179–195. [Google Scholar] [CrossRef] [Green Version]
  28. Reich, B.J.; Ghosh, S.K. Bayesian Statistical Methods; CRC Press: Boca Raton, FL, USA, 2019; p. 275. [Google Scholar]
  29. Berger, J.O. Bayesian analysis: A look at today and thoughts of tomorrow. J. Am. Stat. Assoc. 2000, 95, 1269–1276. [Google Scholar] [CrossRef]
  30. Samuel, K.; Wu, X. Modern Bayesian Statistics; China Statistics Press: Beijing, China, 2000; p. 244.
  31. Van Ravenzwaaij, D.; Cassey, P.; Brown, S.D. A simple introduction to Markov Chain Monte–Carlo sampling. Psychon. Bull. Rev. 2018, 25, 143–154. [Google Scholar] [CrossRef] [Green Version]
  32. Zhang, X.; Duan, A.; Zhang, J. Tree biomass estimation of Chinese fir (Cunninghamia lanceolata) based on Bayesian method. PLoS ONE 2013, 8, e79868. [Google Scholar] [CrossRef] [Green Version]
  33. State Forestry and Grassland Administration. The Ninth Forest Resource Survey Report (2014–2018); China Forestry Press: Beijing, China, 2018; p. 451.
  34. Zeng, W.; Duo, H.; Lei, X.; Chen, X.; Wang, X.; Pu, Y.; Zou, W. Individual tree biomass equations and growth models sensitive to climate variables for Larix spp. in China. Eur. J. For. Res. 2017, 136, 233–249. [Google Scholar] [CrossRef]
  35. Xiao, X.; White, E.P.; Hooten, M.B.; Durham, S.L. On the use of log-transformation vs. nonlinear regression for analyzing biological power laws. Ecology 2011, 92, 1887–1894. [Google Scholar] [CrossRef] [Green Version]
  36. SAS Institute Inc. SAS/ETS® 14.1 User’s Guide; SAS Institute Inc.: Cary, NC, USA, 2015; p. 4100. [Google Scholar]
  37. Henningsen, A.; Hamann, J.D. Systemfit: A package for estimating systems of simultaneous equations in R. J. Stat. Softw. 2007, 23, 1–40. [Google Scholar] [CrossRef] [Green Version]
  38. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2019. [Google Scholar]
  39. Zellner, A.; Ando, T. A direct Monte Carlo approach for Bayesian analysis of the seemingly unrelated regression model. J. Econom. 2010, 159, 33–45. [Google Scholar] [CrossRef]
  40. Rossi, P. Bayesm: Bayesian Inference for Marketing/Micro-Econometrics; R package version 3.1-4. 2019. Available online: https://CRAN.R-project.org/package=bayesm (accessed on 20 April 2020).
  41. Heidelberger, P.; Welch, P.D. Simulation run length control in the presence of an initial transient. Oper. Res. 1983, 31, 1109–1144. [Google Scholar] [CrossRef]
  42. Geweke, J. Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In Bayesian Statistics 4; Bernado, J.M., Berger, J.O., Dawid, A.P., Smith, A.F.M., Eds.; Clarendon Press: Oxford, UK, 1992; pp. 169–193. [Google Scholar]
  43. Plummer, M.; Best, N.; Cowles, K.; Vines, K. Coda: Convergence diagnosis and output analysis for MCMC. R News 2006, 6, 7–11. [Google Scholar]
  44. Fu, L.; Sharma, R.P.; Hao, K.; Tang, S. A generalized interregional nonlinear mixed-effects crown width model for prince rupprecht larch in northern China. For. Ecol. Manag. 2017, 389, 364–373. [Google Scholar] [CrossRef]
  45. Cawley, G.C.; Talbot, N.L.C. Efficient leave-one-out cross-validation of kernel fisher discriminant classifiers. Pattern Recognit. 2003, 36, 2585–2592. [Google Scholar] [CrossRef]
  46. Finney, D.J. On the distribution of a variate whose logarithm is normally distributed. J. R. Statist. Soc. B 1941, 7, 155–161. [Google Scholar] [CrossRef]
  47. Baskerville, G.L. Use of logarithmic regression in the estimation of plant biomass. Can. J. For. Res. 1972, 4, 149. [Google Scholar] [CrossRef]
  48. Wiant, H.V.; Harner, E.J. Percent bias and standard error in logarithmic regression. For. Sci. 1979, 25, 167–168. [Google Scholar]
  49. Yandle, D.O.; Wiant, H.V. Estimation of plant biomass based on the allometric equation. Can. J. For. Res. 1981, 11, 833–834. [Google Scholar] [CrossRef]
  50. Bond-Lamberty, B.; Wang, C.; Gower, S.T. Net primary production and net ecosystem production of a boreal black spruce wildfire chronosequence. Glob. Chang. Biol. 2004, 10, 473–487. [Google Scholar] [CrossRef]
  51. Pregitzer, K.S.; Euskirchen, E.S. Carbon cycling and storage in world forests: Biome patterns related to forest age. Glob. Chang. Biol. 2004, 10, 2052–2077. [Google Scholar] [CrossRef]
  52. Wang, X.; Zhao, D.; Liu, G.; Yang, C.; Teskey, R.O. Additive tree biomass equations for Betula platyphylla Suk. plantations in Northeast China. Ann. For. Sci. 2018, 75, 60. [Google Scholar] [CrossRef] [Green Version]
  53. Zhao, D.; Westfall, J.A.; Coulston, J.W.; Lynch, T.B.; Bullock, B.P.; Montes, C.R. Additive biomass equations for slash pine trees: Comparing three modeling approaches. Can. J. For. Res. 2019, 49, 27–40. [Google Scholar] [CrossRef]
  54. Widagdo, F.R.A.; Li, F.; Zhang, L.; Dong, L. Aggregated biomass model systems and carbon concentration variations for tree carbon quantification of natural Mongolian Oak in Northeast China. Forests 2020, 11, 397. [Google Scholar] [CrossRef] [Green Version]
  55. Affleck, D.L.R.; Dieguez-Aranda, U. Additive nonlinear biomass equations: A likelihood-based approach. For. Sci. 2016, 62, 129–140. [Google Scholar] [CrossRef]
  56. Dong, L.; Zhang, L.; Li, F. Developing additive systems of biomass equations for nine hardwood species in Northeast China. Trees-Struct. Funct. 2015, 29, 1149–1163. [Google Scholar] [CrossRef]
  57. Kusmana, C.; Hidayat, T.; Tiryana, T.; Rusdiana, O.; Istomo. Allometric models for above- and below-ground biomass of Sonneratia spp. Glob. Ecol. Conserv. 2018, 15, 10. [Google Scholar]
  58. Ando, T. Bayesian variable selection for the seemingly unrelated regression models with a large number of predictors. J. Jpn. Stat. Soc. 2012, 41, 187–203. [Google Scholar] [CrossRef] [Green Version]
  59. Tang, S. Bias correction in logarithmic regression and comparison with weighted regression for non-linear models. For. Res. 2011, 24, 137–143. [Google Scholar]
  60. Mascaro, J.; Litton, C.M.; Hughes, R.F.; Uowolo, A.; Schnitzer, S.A. Is logarithmic transformation necessary in allometry? Ten, one-hundred, one-thousand-times yes. Biol. J. Linn. Soc. 2014, 111, 230–233. [Google Scholar] [CrossRef] [Green Version]
  61. Madgwick, H.A.I.; Satoo, T. On estimating the aboveground weights of tree stands. Ecology 1975, 56, 1446–1450. [Google Scholar] [CrossRef] [Green Version]
  62. Zianis, D.; Xanthopoulos, G.; Kalabokidis, K.; Kazakis, G.; Ghosn, D.; Roussou, O. Allometric equations for aboveground biomass estimation by size class for Pinus brutia Ten. trees growing in North and South Aegean Islands, Greece. Eur. J. For. Res. 2011, 130, 145–160. [Google Scholar] [CrossRef]
Figure 1. The distribution of the sampling plots in this study across Heilongjiang province, Northeast China.
Figure 2. Posterior distributions of the model parameters estimated by the Bayesian methods. DMC represents the direct Monte Carlo approach with Jeffreys' invariant prior; Gs-J is the Gibbs sampler with Jeffreys' invariant prior; and Gs-MN represents the Gibbs sampler with a multivariate normal prior with a zero mean vector, variances of 1000, and covariances of 0. Gs-MN1 and Gs-MN2 are Gibbs samplers with informative multivariate normal priors with different means, variances, and covariances.
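As an illustration of how a Gibbs sampler of this kind can be configured, the sketch below uses bayesm::rsurGibbs with a zero-mean multivariate normal prior whose variances equal 1000 (the Gs-MN setting). The simulated data, the coefficient values (which only roughly mimic Table 3), and the MCMC settings are placeholders, not the authors' script:

library(bayesm)
set.seed(2020)

# Simulate a small illustrative dataset (NOT the study data) on the log scale.
n    <- 174
D    <- runif(n, 2, 36); H <- runif(n, 4, 27)
logD <- log(D); logH <- log(H)
logWr <- -5.08 + 2.19 * logD + 0.74 * logH + rnorm(n, 0, 0.25)
logWs <- -4.45 + 1.67 * logD + 1.49 * logH + rnorm(n, 0, 0.10)
logWb <- -2.66 + 2.93 * logD - 1.19 * logH + rnorm(n, 0, 0.30)
logWf <- -2.53 + 1.89 * logD - 0.57 * logH + rnorm(n, 0, 0.30)

# One regression per biomass component, as in the SUR system (SURM2 form).
regdata <- list(list(y = logWr, X = cbind(1, logD, logH)),
                list(y = logWs, X = cbind(1, logD, logH)),
                list(y = logWb, X = cbind(1, logD, logH)),
                list(y = logWf, X = cbind(1, logD, logH)))

p <- 12                                         # 3 coefficients x 4 equations
prior <- list(betabar = rep(0, p),              # Gs-MN: zero prior means
              A = diag(1 / 1000, p))            # prior precision = 1/variance (variance 1000)

out <- rsurGibbs(Data = list(regdata = regdata),
                 Prior = prior,
                 Mcmc  = list(R = 2000, keep = 1))   # run length here is only illustrative

dim(out$betadraw)     # retained draws of the 12 coefficients
dim(out$Sigmadraw)    # retained draws of the vectorized 4 x 4 error covariance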
Figure 3. The 95% prediction intervals of the Wr, Ws, Wb, and Wf predictions from the six approaches.
Figure 4. Simulation results for the parameter estimates of SURM1 using the DMC, Gs-MN, Gs-MN1, Gs-MN2, and FGLS. The dashed and solid horizontal lines represent the 95% confidence intervals and means of the simulation using the entire dataset (174 trees), respectively.
Figure 5. The MAE (A) and MAE% (B) of the SURM1 using different sample sizes. The solid horizontal line represents the values of MAE and MAE% calculated using the entire dataset (174 trees).
Table 1. Summary statistics of the sample trees in this study.

Attributes             Mean     Std      Minimum   Maximum
DBH (cm)               16.3     7.0      2.0       35.7
H (m)                  15.4     5.5      3.8       27.0
Root biomass (kg)      31.66    33.83    0.13      203.60
Stem biomass (kg)      106.94   107.50   0.27      510.19
Branch biomass (kg)    11.37    9.05     0.10      42.30
Foliage biomass (kg)   3.56     2.08     0.11      8.73
Note: Std is the standard deviation.
Table 2. The posterior means, standard deviations (Std), and 95% posterior intervals (using the 2.5th and 97.5th percentiles of the posterior samples) of the SURM1 (DBH-only) parameter estimates obtained by the Bayesian methods. The parameter means and standard errors (SEs) estimated by FGLS are also listed.

Method    Stats     βr0       βr1      βs0       βs1      βb0       βb1      βf0       βf1
DMC       Mean      −4.5406   2.7194   −3.3655   2.7386   −3.5304   2.0650   −2.9406   1.4810
          Std       0.0277    0.0102   0.0253    0.0093   0.0508    0.0187   0.0418    0.0153
          2.50th    −4.5945   2.6995   −3.4146   2.7201   −3.6301   2.0291   −3.0217   1.4512
          97.50th   −4.4868   2.7391   −3.3157   2.7567   −3.4325   2.1018   −2.8585   1.5109
Gs-J      Mean      −4.5404   2.7192   −3.3651   2.7384   −3.5319   2.0656   −2.9424   1.4819
          Std       0.1079    0.0396   0.0971    0.0356   0.1437    0.0526   0.1192    0.0436
          2.50th    −4.7519   2.6420   −3.5549   2.6688   −3.8137   1.9600   −3.1756   1.3953
          97.50th   −4.3302   2.7984   −3.1728   2.8092   −3.2475   2.1692   −2.7081   1.5677
Gs-MN     Mean      −4.5397   2.7191   −3.3662   2.7390   −3.5277   2.0640   −2.9393   1.4806
          Std       0.1346    0.0493   0.1267    0.0465   0.1627    0.0597   0.1433    0.0525
          2.50th    −4.8036   2.6217   −3.6166   2.6490   −3.8446   1.9470   −3.2192   1.3778
          97.50th   −4.2763   2.8155   −3.1188   2.8302   −3.2162   2.1815   −2.6585   1.5827
Gs-MN1    Mean      −4.5450   2.7210   −3.3687   2.7397   −3.5304   2.0650   −2.9380   1.4803
          Std       0.0944    0.0339   0.0779    0.0283   0.1202    0.0433   0.1008    0.0365
          2.50th    −4.7309   2.6540   −3.5170   2.6838   −3.7696   1.9801   −3.1332   1.4079
          97.50th   −4.3590   2.7875   −3.2142   2.7942   −3.2959   2.1517   −2.7393   1.5505
Gs-MN2    Mean      −4.5209   2.7124   −3.3441   2.7310   −3.5850   2.0848   −2.9947   1.5008
          Std       0.1339    0.0493   0.1238    0.0455   0.1588    0.0582   0.1382    0.0507
          2.50th    −4.7838   2.6145   −3.5829   2.6418   −3.8963   1.9705   −3.2654   1.4017
          97.50th   −4.2546   2.8078   −3.0999   2.8184   −3.2729   2.1992   −2.7241   1.5996
FGLS      Value     −4.5407   2.7195   −3.3655   2.7386   −3.5305   2.0650   −2.9402   1.4810
          SE        0.1067    0.0391   0.0961    0.0353   0.1409    0.0517   0.1173    0.0430
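The posterior summaries in Table 2 can be reproduced from the retained draws with a few lines of R; the sketch below assumes a draws-by-parameters matrix (here filled with placeholder normal draws, not the study output):

summarize_posterior <- function(betadraw) {
  # One row per parameter: posterior mean, std, and 2.5th/97.5th percentiles.
  t(apply(betadraw, 2, function(b)
    c(Mean = mean(b), Std = sd(b), quantile(b, c(0.025, 0.975)))))
}

# Example with placeholder draws for two parameters:
set.seed(1)
draws <- cbind(br0 = rnorm(10000, -4.54, 0.11), br1 = rnorm(10000, 2.72, 0.04))
round(summarize_posterior(draws), 4)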
Table 3. The parameter estimates of SURM2 (DBH–H). The symbols are the same as in Table 2.

Method    Stats     βr0       βr1      βr2      βs0       βs1      βs2      βb0       βb1      βb2       βf0       βf1      βf2
DMC       Mean      −5.0773   2.1886   0.7354   −4.4510   1.6653   1.4869   −2.6606   2.9253   −1.1918   −2.5255   1.8912   −0.5683
          Std       0.0329    0.0241   0.0313   0.0079    0.0058   0.0075   0.0522    0.0381   0.0495    0.0501    0.0371   0.0479
          2.50th    −5.1418   2.1407   0.6747   −4.4666   1.6535   1.4717   −2.7640   2.8507   −1.2886   −2.6239   1.8189   −0.6635
          97.50th   −5.0132   2.2360   0.7973   −4.4353   1.6768   1.5017   −2.5594   2.9998   −1.0945   −2.4265   1.9645   −0.4767
Gs-J      Mean      −5.0756   2.1889   0.7344   −4.4504   1.6651   1.4868   −2.6622   2.9240   −1.1896   −2.5266   1.8901   −0.5667
          Std       0.1382    0.1022   0.1318   0.0628    0.0460   0.0598   0.1772    0.1294   0.1684    0.1600    0.1166   0.1511
          2.50th    −5.3515   1.9858   0.4751   −4.5730   1.5743   1.3690   −3.0063   2.6692   −1.5186   −2.8420   1.6595   −0.8624
          97.50th   −4.8045   2.3888   0.9943   −4.3249   1.7560   1.6041   −2.3100   3.1811   −0.8569   −2.2116   2.1221   −0.2764
Gs-MN     Mean      −5.0750   2.1876   0.7354   −4.4496   1.6642   1.4874   −2.6599   2.9266   −1.1934   −2.5245   1.8934   −0.5708
          Std       0.1798    0.1312   0.1694   0.1321    0.0960   0.1245   0.2062    0.1515   0.1964    0.1925    0.1432   0.1846
          2.50th    −5.4231   1.9314   0.3979   −4.7091   1.4758   1.2382   −3.0647   2.6302   −1.5807   −2.9043   1.6134   −0.9359
          97.50th   −4.7202   2.4472   1.0676   −4.1913   1.8536   1.7293   −2.2546   3.2223   −0.8115   −2.1470   2.1742   −0.2104
Gs-MN1    Mean      −5.0835   2.1907   0.7353   −4.4515   1.6653   1.4870   −2.6610   2.9246   −1.1908   −2.5249   1.8902   −0.5672
          Std       0.1148    0.0867   0.1100   0.0536    0.0395   0.0516   0.1418    0.1088   0.1372    0.1327    0.0986   0.1271
          2.50th    −5.3080   2.0222   0.5170   −4.5562   1.5889   1.3860   −2.9347   2.7129   −1.4570   −2.7884   1.6964   −0.8195
          97.50th   −4.8589   2.3608   0.9504   −4.3455   1.7420   1.5856   −2.3849   3.1356   −0.9240   −2.2669   2.0862   −0.3151
Gs-MN2    Mean      −5.0810   2.1356   0.7902   −4.2369   1.8690   1.2016   −3.3420   2.2326   −0.2395   −2.9663   1.4610   0.0299
          Std       0.1524    0.0971   0.1190   0.1182    0.0756   0.0941   0.2009    0.1313   0.1670    0.1898    0.1296   0.1653
          2.50th    −5.3795   1.9463   0.5590   −4.4704   1.7196   1.0191   −3.7383   1.9677   −0.5602   −3.3426   1.2026   −0.2873
          97.50th   −4.7805   2.3257   1.0196   −4.0046   2.0177   1.3876   −2.9520   2.4869   0.0949    −2.5913   1.7106   0.3584
FGLS      Mean      −5.0777   2.1885   0.7355   −4.4511   1.6652   1.4870   −2.6605   2.9251   −1.1915   −2.5252   1.8913   −0.5684
          SE        0.1370    0.1010   0.1307   0.0620    0.0457   0.0592   0.1722    0.1270   0.1643    0.1575    0.1161   0.1503
Table 4. The model fitting statistics of the six modeling methods for the four tree biomass components.

SURM1
Method    DIC        R² (Wr)   R² (Ws)   R² (Wb)   R² (Wf)   RMSE Wr (kg)   RMSE Ws (kg)   RMSE Wb (kg)   RMSE Wf (kg)
DMC       −1894.25   0.9604    0.9385    0.8201    0.7625    6.7588         26.6776        3.8632         1.0172
Gs-J      −1890.65   0.9604    0.9385    0.8200    0.7619    6.7630         26.6763        3.8655         1.0188
Gs-MN     −719.567   0.9604    0.9385    0.8204    0.7626    6.7582         26.6846        3.8601         1.0171
Gs-MN1    −1939.09   0.9606    0.9384    0.8200    0.7625    6.7453         26.6914        3.8643         1.0175
Gs-MN2    −719.14    0.9598    0.9389    0.8152    0.7547    6.8180         26.5947        3.9231         1.0353
FGLS      -          0.9605    0.9385    0.8201    0.7625    6.7552         26.6785        3.8636         1.0175

SURM2
Method    DIC        R² (Wr)   R² (Ws)   R² (Wb)   R² (Wf)   RMSE Wr (kg)   RMSE Ws (kg)   RMSE Wb (kg)   RMSE Wf (kg)
DMC       −3004.07   0.9575    0.9840    0.8478    0.7692    6.9882         13.6087        3.5441         1.0022
Gs-J      −2993.10   0.9575    0.9840    0.8476    0.7691    6.9882         13.6063        3.5481         1.0023
Gs-MN     −1274.25   0.9574    0.9840    0.8478    0.7688    7.0005         13.6026        3.5440         1.0030
Gs-MN1    −3046.41   0.9577    0.9840    0.8478    0.7689    6.9736         13.6096        3.5448         1.0028
Gs-MN2    −1163.85   0.9551    0.9824    0.8368    0.7600    7.1937         14.3112        3.6719         1.0231
FGLS      -          0.9575    0.9840    0.8478    0.7691    6.9885         13.6088        3.5448         1.0023
Table 5. The model validation statistics of the six modeling methods for the four tree biomass components.

SURM1
Method    Wr MAE (kg)   Wr MAE%   Ws MAE (kg)   Ws MAE%   Wb MAE (kg)   Wb MAE%   Wf MAE (kg)   Wf MAE%
DMC       4.2952        20.2532   17.4227       19.6344   2.5116        29.0780   0.7296        23.5099
Gs-J      4.2864        20.2547   17.4221       19.6383   2.5145        29.1252   0.7299        23.5313
Gs-MN     4.0020        20.6052   17.3580       20.1764   2.5846        30.1746   0.7438        24.4397
Gs-MN1    3.9844        20.5028   17.2887       20.0967   2.5735        30.0245   0.7408        24.3333
Gs-MN2    4.0425        20.7010   17.3555       20.1893   2.6148        30.1029   0.7532        24.3886
FGLS      4.0163        20.5847   17.3484       20.1571   2.5791        30.0815   0.7425        24.3796

SURM2
Method    Wr MAE (kg)   Wr MAE%   Ws MAE (kg)   Ws MAE%   Wb MAE (kg)   Wb MAE%   Wf MAE (kg)   Wf MAE%
DMC       4.3199        18.5282   8.0317        8.5308    2.3198        25.2545   0.7465        23.4813
Gs-J      4.3205        18.5531   8.0323        8.5320    2.3232        25.3086   0.7472        23.5177
Gs-MN     4.3194        19.2793   7.9945        8.5978    2.3673        26.1265   0.7555        24.3034
Gs-MN1    4.2878        19.1338   7.9593        8.5451    2.3530        25.9476   0.7511        24.1500
Gs-MN2    4.2764        19.2940   8.0025        8.6385    2.4473        25.9014   0.7967        24.3156
FGLS      4.3155        19.2164   7.9995        8.5863    2.3615        26.0279   0.7542        24.2290
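The validation statistics are simple functions of the observed and back-transformed predicted biomass. A minimal sketch, assuming MAE% is the mean absolute error expressed as a percentage of the observed values (one common definition; the paper's exact formula is given in the main text), with placeholder numbers:

mae  <- function(obs, pred) mean(abs(obs - pred))
maep <- function(obs, pred) 100 * mean(abs(obs - pred) / obs)

# Example with placeholder observed and predicted stem biomass (kg):
obs  <- c(12.4, 85.0, 240.7)
pred <- c(10.9, 91.2, 228.3)
c(MAE = mae(obs, pred), "MAE%" = maep(obs, pred))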
Table 6. The percent bias (B, %) and percent standard error (G, %) of the three correction factors (CF1, CF2, and CF3), with the mean percentage differences (MPD, %) of CF0, CF1, CF2, and CF3 calculated from the error terms obtained using Gs-MN1. MPD0 was calculated without using any correction factor (i.e., CF0 = 1).

SURM1
Component   CF1      B1       G1        CF2      B2       G2        CF3      B3       G3        MPD0      MPD1      MPD2      MPD3
Wr          1.0342   3.3069   18.4932   1.0336   3.2508   18.3303   1.0342   3.3069   18.4932   19.9470   20.3233   20.3114   20.3232
Ws          1.0277   2.6953   16.6433   1.0273   2.6575   16.5227   1.0277   2.6953   16.6433   19.3369   19.9458   19.9364   19.9457
Wb          1.0604   5.6960   24.5764   1.0586   5.5356   24.2074   1.0604   5.6960   24.5764   27.4698   29.6666   29.5909   29.6656
Wf          1.0415   3.9846   20.3715   1.0406   3.9016   20.1494   1.0415   3.9846   20.3715   23.0224   24.0755   24.0451   24.0752

SURM2
Component   CF1      B1       G1        CF2      B2       G2        CF3      B3       G3        MPD0      MPD1      MPD2      MPD3
Wr          1.0289   2.8088   17.0000   1.0285   2.7710   16.8819   1.0289   2.8088   17.0000   18.0167   18.8596   18.8457   18.8595
Ws          1.0059   0.5865   7.6811    1.0058   0.5767   7.6158    1.0059   0.5865   7.6811    8.3797    8.4365    8.4363    8.4365
Wb          1.0461   4.4068   21.4709   1.0451   4.3154   21.2368   1.0461   4.4068   21.4709   24.0403   25.5298   25.4932   25.5293
Wf          1.0384   3.6980   19.5959   1.0377   3.6330   19.4165   1.0384   3.6980   19.5959   22.8702   23.7830   23.7636   23.7827
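The exact formulas for CF1–CF3 are defined in the main text; as a minimal illustration of the general idea, the widely used Baskerville-type correction [47] multiplies the back-transformed prediction by exp(σ̂²/2), where σ̂² is the residual variance on the log scale. All values in the R sketch below are placeholders:

set.seed(7)
log_resid <- rnorm(174, 0, 0.18)                          # placeholder residuals on the log scale
sigma2    <- sum(log_resid^2) / (length(log_resid) - 3)   # residual variance (3 coefficients assumed)

CF       <- exp(sigma2 / 2)                               # Baskerville-type correction factor
pred_log <- 3.2                                           # an illustrative predicted log-biomass
exp(pred_log) * CF                                        # bias-corrected back-transformed prediction (kg)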
