Analysis of Error Structure for Additive Biomass Equations on the Use of Multivariate Likelihood Function

Research Highlights: this study developed additive biomass equations by nonlinear seemingly unrelated regression (NSUR), using nonlinear regression (NLR) on the original data and linear regression (LR) on a log-transformed scale. To choose the appropriate regression form, the error structure (additive vs. multiplicative) of the compatible biomass equations was determined with the multivariate likelihood function, which extends likelihood analysis to the general case of a contemporaneously correlated set of equations. Background and Objectives: both NLR and LR can yield the expected predictions for an allometric scaling relationship, and recent studies have vigorously debated which regression (NLR or LR) should be applied. The main aim of this paper is to analyze the error structure of a compatible system of biomass equations in order to choose the more appropriate regression. Materials and Methods: based on biomass data from 270 trees of three tree species, additive biomass equations were developed for NLR and LR by NSUR. Multivariate likelihood functions were computed from the multivariate probability density function to determine the error structure. An anti-log correction factor that preserves the additive property was obtained as the arithmetic average and as the weighted average of the basic correction factors of the individual equations, so that the two model specifications could be assessed on the same original scale. Results: the joint likelihood function clearly favored the additive error structure for the additive systems of all three species. However, the error structure of an individual component equation, calculated from the conditional likelihood function, could differ from that of the system. Additive equations corrected by the weighted average of the basic correction factors performed better than those corrected by the arithmetic average and remained compatible after correction.
Conclusions: NLR provided a better fit for the additive biomass equations of the three tree species. Additive equations that conformed to the corresponding assumption about error structure performed better. The joint likelihood function, based on the multivariate likelihood function, can be used to analyze the error structure of an additive system, which results from a tradeoff among the component equations. Correcting the bias of additive equations with an average of the correction factors of the component equations is feasible and preserves the additive property, but it may correct some component equations poorly.


Introduction
Allometric research characterizes the scaling relationship between response variables and measures of body size, and it has been prominent for many years in areas such as physiology, numerical ecology, and morphology [1][2][3]. Kittredge (1944) [4] described the biomass of tree components as a function of tree dimension variables using an allometric equation of the form $Y = aX^{b}$, where Y is tree component biomass, X is a tree dimension variable, and a and b are the allometric coefficient and exponent, respectively. To date, thousands of biomass equations have been developed for various tree species and regions all over the world to quantify forest biomass accurately in the context of carbon reduction and climate change [5][6][7]. Recently, however, the choice of fitting method has become a heated issue: linear regression on log-transformed data (hereafter, LR), which assumes a multiplicative error in the arithmetic domain, or nonlinear regression on the original scale (hereafter, NLR), which assumes an additive error.
For decades, LR was the most commonly adopted approach in allometric research. The conventional practice is to fit a straight line to log-transformed data using ordinary least squares and then to back-transform the resulting equation to obtain estimates on the arithmetic scale [8][9][10][11]. Nonetheless, the effectiveness and accuracy of LR have been criticized, mainly on the following grounds: (1) Direct back-transformation from a straight line fitted to logarithms yields the geometric mean of the predicted values rather than the arithmetic mean, which lowers the estimates on the original scale [12][13][14]. Although this anti-logarithm bias can be adjusted with some form of correction factor [8,11,15], some research has argued that an anti-log correction factor might cause overestimation [16,17]. (2) While log-transformation can stabilize the variance, it produces an insidious rotational distortion of allometric equations, creating a new distribution that differs in a fundamental way from that on the original scale [18,19]. (3) This nonlinear distortion unduly emphasizes small values and compresses the values of large individuals, which leads to a visibly poor fit at the upper end of the curve [20][21][22]. (4) The artificial transformation might leave outliers undetected, making the data appear more favorable than they are [19,22,23]. Overall, the controversy over allometric equations fitted by LR centers on the injudicious use of log-transformation [14,22,24].
NLR, which fits allometric equations directly to the original data by an iterative method, has been used by more and more researchers because convenient and user-friendly statistical software is now available [25][26][27]. However, arithmetic values fitted directly by NLR are commonly heteroscedastic and may therefore fail to satisfy statistical assumptions [13,24]. Nonetheless, research has shown that heteroscedasticity of the observations does not necessarily invalidate the deterministic equation fitted by NLR [28], and that even when the variance is not constant, NLR can perform better than LR and yield more accurate estimates on the original scale [22,29]. It is worth noting that a weight factor applied through generalized least squares can address heteroscedasticity just as the log-transformation of LR does [26]. Even so, the debate on which fitting method (NLR or LR) performs better, and which error structure conforms more appropriately to the statistical assumptions, has not subsided.
Xiao et al. [30] and Ballantyne [31] proposed likelihood analysis to determine the error structure (multiplicative vs. additive) of allometric equations so that the suitable fitting procedure (LR or NLR) can be adopted. Recently, likelihood analysis has come to be applied in forestry and ecology. Lai et al. [32] used likelihood analysis to compare the allometry of coarse root biomass from LR and NLR for Castanopsis eyrei (Champ. ex Benth.) Tutch., Schima superba Gardn. et Champ., Pinus massoniana Lamb., and mixed species, and concluded that the empirical data supported a multiplicative error. Ma and Jiang [33] applied likelihood analysis to determine the error structure of individual tree volume models for Larix gmelinii (Ruprecht) Kuzeneva and Pinus sylvestris Linn. var. mongolica Litv.; the analysis supported the multiplicative error, but the comparison of model assessment indicated that NLR performed better than LR. Dong et al. [34] adopted likelihood analysis to determine the error structure of compatible (additive) biomass equations for three conifer species in Northeast China, which favored the multiplicative error. However, the approach proposed by Xiao et al. [30] and its subsequent applications, including the additive equations developed by Dong et al. [34], were all based on the one-dimensional likelihood function, which is appropriate only for a single allometric equation. In a compatible system of several equations that are estimated simultaneously, there are significant contemporaneous correlations among the equations. Therefore, analysis based on the one-dimensional likelihood function seems unreasonable when it is applied to determine the error structure of additive biomass equations.
An additive system of biomass equations ensures the logical relationship that the predictions for the components sum to the prediction from the total equation. Several methods exist for developing compatible equations with this additivity property. Initially, the total prediction was obtained simply as the sum of component equations developed independently [35][36][37]. More recently, simultaneous estimation of a system of equations, widely known as seemingly unrelated regression (SUR), has been broadly used for compatible systems of biomass equations [26,34,37,38]. Back-transformation from a straight line (LR) fitted to logarithms to the original scale can introduce systematic bias. To remove or reduce this bias, researchers have computed different forms of correction factors and compared their effects. However, little information is available for the case in which the additive biomass equations are developed by LR. Dong et al. [34] applied the correction factor of each equation separately to correct the bias of each component, which did not take the additivity property into account. To our knowledge, no correction factor for additive biomass equations has been reported that both corrects the bias from anti-log transformation and ensures the additivity property.
Cinnamomum camphora (L.) Presl, Schima superba Gardn. et Champ. and Liquidambar formosana Hance are widely distributed in Southeastern China and are dominant broad-leaved tree species in Guangdong province. Broad-leaved and conifer tree species differ in many aspects of morphology and physiology, but research on biomass equations has centered mostly on conifer species, and studies on broad-leaved tree species are limited [6,27,34,39]. The purposes of our study are (1) to develop a compatible system of biomass equations for branch, foliage, stem wood, stem bark and total aboveground biomass for each of the three broad-leaved tree species, based on NLR and on LR, by SUR; (2) to compute the multivariate likelihood function for determining the error structure of additive biomass equations, which extends likelihood analysis to the general situation of a contemporaneously correlated set of equations; (3) to formulate a correction factor for a compatible system of biomass equations that corrects the bias introduced by anti-logarithm transformation and ensures the additivity property at the same time; and (4) to compare the fitting results of the two procedures (NLR and LR) on the same arithmetic scale and evaluate how the assumed error structure affects the result of model fitting.

Data Collection
Tree dimension variables and biomass data for Cinnamomum camphora, Schima superba and Liquidambar formosana, with 90 trees for each species, were sampled across Guangdong province in Southeastern China in 2013 by the Guangdong Forestry Survey and Planning Institute (Figure 1). The sample trees were classified into diameter classes of 2 cm, 4 cm, 6 cm, 8 cm, 12 cm, 16 cm, 20 cm, 26 cm, 32 cm and 38 cm (38 cm and above). Of the 90 trees, 60 were distributed evenly over these 10 diameter classes, with six trees in each class, and the remaining 30 were chosen according to the actual distribution of diameter classes and numbers of trees in the 8th National Forest Inventory in Guangdong province.
Destructive sampling was carried out on living sample trees without severe defects. Before a tree was felled at ground level, the diameter at breast height (D, at 1.3 m aboveground) was measured. After felling, the living crown was divided evenly into three parts (top, middle, and bottom), each of which was weighed separately; branches and leaves amounting to about 500-1000 g of fresh mass were then randomly sampled from each part and placed in a labeled bag for moisture content determination. The stem was likewise divided into three sections (0-2/10, 2/10-5/10 and above 5/10 of tree height), which were weighed separately. In each stem section, a 2-3 cm thick disk was cut from the upper part and from the lower part, weighed, and taken to the laboratory for moisture content determination. All samples were dried at 85 °C to constant weight. The dry biomass of each component was calculated by multiplying the fresh weight of the component by the dry/fresh ratio of the corresponding sample. The total foliage biomass was the sum of the foliage dry biomass of the crown parts. The total stem wood biomass was the sum of the dry mass of all stem wood sections. The total stem bark biomass was the sum of the dry mass of all stem bark sections. The aboveground biomass was the sum of branch, foliage, stem wood, and stem bark dry biomass. Moisture content was determined by the laboratory center of the College of Forestry and Landscape Architecture, South China Agricultural University, according to the related technical regulations [40]. Summary statistics for the 90 sample trees of each broad-leaved species are given in Table 1.

Model Specification and Estimation
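To fit the allometric equation, either NLR on the arithmetic scale or LR on the logarithmic scale can be used to obtain estimates. The fundamental difference between the two approaches lies in the assumption of how the error term enters the equation, which is known as the error structure (Xiao et al., 2011) [30]. NLR assumes a normally distributed, additive error on the arithmetic scale, such that:

$$Y = aX^{b} + \varepsilon, \quad \varepsilon \sim N(0, \sigma^{2}) \qquad (1)$$

In contrast, LR assumes that the error is normally distributed and additive on the logarithmic scale, such that:

$$\log Y = \log a + b \log X + \varepsilon, \quad \varepsilon \sim N(0, \sigma^{2}) \qquad (2)$$

which corresponds to a log-normally distributed, multiplicative error on the arithmetic scale:

$$Y = aX^{b} e^{\varepsilon}, \quad \varepsilon \sim N(0, \sigma^{2}) \qquad (3)$$

To determine which model specification is the most appropriate for a compatible system of biomass equations for the three broad-leaved tree species in this study, two model forms were specified. Both enforce additivity among the four component biomass equations and the total aboveground biomass equation through cross-equation constraints on the structural parameters.

(1) The first model specification assumes that the error structure is additive (Equation (1)) and gives a compatible system of five biomass equations:

$$
\begin{aligned}
W_{BR} &= a_{1} D^{b_{1}} + \varepsilon_{1} \\
W_{FL} &= a_{2} D^{b_{2}} + \varepsilon_{2} \\
W_{SW} &= a_{3} D^{b_{3}} + \varepsilon_{3} \\
W_{SB} &= a_{4} D^{b_{4}} + \varepsilon_{4} \\
W_{AB} &= a_{1} D^{b_{1}} + a_{2} D^{b_{2}} + a_{3} D^{b_{3}} + a_{4} D^{b_{4}} + \varepsilon_{5}
\end{aligned}
\qquad (4)
$$

(2) The second model specification assumes that the error structure is multiplicative on the arithmetic scale (Equation (3)), with the logarithmic transformation taken on both sides of the equations (Equation (2)), such that:

$$
\begin{aligned}
\log W_{BR} &= \log a_{1} + b_{1} \log D + \varepsilon'_{1} \\
\log W_{FL} &= \log a_{2} + b_{2} \log D + \varepsilon'_{2} \\
\log W_{SW} &= \log a_{3} + b_{3} \log D + \varepsilon'_{3} \\
\log W_{SB} &= \log a_{4} + b_{4} \log D + \varepsilon'_{4} \\
\log W_{AB} &= \log \left( a_{1} D^{b_{1}} + a_{2} D^{b_{2}} + a_{3} D^{b_{3}} + a_{4} D^{b_{4}} \right) + \varepsilon'_{5}
\end{aligned}
\qquad (5)
$$

where $W_{BR}$, $W_{FL}$, $W_{SW}$, $W_{SB}$ and $W_{AB}$ are branch biomass, foliage biomass, stem wood biomass, stem bark biomass and aboveground biomass in kg, respectively; D is the diameter at breast height in cm; log denotes the natural logarithm; $a_{i}$ and $b_{i}$ are the regression coefficients in Equation (4); $\log a_{i}$ is the intercept and $b_{i}$ is the regression coefficient in Equation (5); and $\varepsilon_{i}$ and $\varepsilon'_{i}$ are the equation error terms of the NLR and LR additive models, respectively.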
The two model specifications above for additive biomass equations were estimated using nonlinear seemingly unrelated regression (NSUR). The logarithmic transformation tends to stabilize the heteroscedastic variance. For comparison, Equation (4) was fitted using weighted NSUR, as demonstrated by Parresol (2001) [26], to stabilize the variance. The weight of each component equation was obtained from the weight function $w = f(D)^{-1}$, where f(D) is the value predicted by the estimated equation [26,41,42].
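The additivity constraint and the weight function can be illustrated with a minimal sketch. It assumes the component equations have the power form $a_i D^{b_i}$ and that the total is the sum of the component predictions. The parameter values, the names `PARAMS`, `predict_additive` and `weights`, and the choice of D = 20 cm are hypothetical and are not estimates from this study; the sketch does not perform NSUR estimation.

```python
# Hypothetical (a_i, b_i) values for illustration only; not estimates from the study.
PARAMS = {
    "branch": (0.020, 2.30),
    "foliage": (0.030, 1.80),
    "stem_wood": (0.060, 2.40),
    "stem_bark": (0.015, 2.20),
}


def predict_additive(d_cm, params=PARAMS):
    """Predict each component as a_i * D**b_i and the total as their sum."""
    pred = {name: a * d_cm ** b for name, (a, b) in params.items()}
    # The total shares the component parameters, so additivity holds by construction.
    pred["aboveground"] = sum(pred.values())
    return pred


def weights(pred):
    """Weight w = 1 / f(D), taking f(D) to be the predicted value of each equation."""
    return {name: 1.0 / value for name, value in pred.items()}


pred = predict_additive(20.0)
w = weights(pred)
for name in pred:
    print(f"{name:12s} prediction = {pred[name]:8.2f} kg, weight = {w[name]:.4f}")
```

Because the total equation reuses the component parameters, no separate coefficients are estimated for aboveground biomass, which is what the cross-equation constraints express.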

Multivariate Likelihood Function to Analyze Error Structure
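Xiao et al. [30] outlined a likelihood analysis that allows the error structure to be determined objectively on the basis of a single one-dimensional likelihood function. When applied to an additive biomass model with cross-equation correlation, this approach seems unreasonable. Therefore, in this study we computed the multivariate likelihood function, comprising the joint likelihood function and the conditional likelihood function, to analyze the correlated error structure of the additive biomass equations for the model system and for each component equation, respectively. Based on the joint probability density function, the joint likelihood function is calculated as follows.

(1) For the p-component system of NLR (Equation (4)), the joint likelihood that the data are generated from a normal distribution with additive error is:

$$L_{norm} = \prod_{j=1}^{n} (2\pi)^{-p/2} \, |\Sigma_{norm}|^{-1/2} \exp\left[ -\tfrac{1}{2} \, e_{j}^{T} \Sigma_{norm}^{-1} e_{j} \right] \qquad (6)$$

(2) For the p-component system of LR (Equation (5)), the joint likelihood that the data are generated from a lognormal distribution with multiplicative error on the arithmetic scale is:

$$L_{logn} = \prod_{j=1}^{n} (2\pi)^{-p/2} \, |\Sigma_{logn}|^{-1/2} \left( \prod_{i=1}^{p} y_{ij} \right)^{-1} \exp\left[ -\tfrac{1}{2} \, e_{j}^{*T} \Sigma_{logn}^{-1} e_{j}^{*} \right] \qquad (7)$$

where n is the number of sample trees, $y_{ij}$ is the jth observed value of the ith component, $e_{j}$ is the vector of the p residuals of the jth tree on the arithmetic scale, $e_{j}^{*}$ is the corresponding vector of residuals on the logarithmic scale, and $\Sigma_{norm}$ and $\Sigma_{logn}$ are the error variance-covariance matrices of the NLR and LR systems, respectively.

According to the definition of the conditional distribution for the multivariate probability density function [43], the conditional likelihood function for the ith component equation is defined as follows:

$$L_{i|norm} = \frac{L_{norm}}{L_{norm(-i)}} \qquad (8)$$

$$L_{i|logn} = \frac{L_{logn}}{L_{logn(-i)}} \qquad (9)$$

where $L_{i|norm}$ and $L_{i|logn}$ are the conditional likelihood functions for the ith component equation of NLR and LR, respectively; $L_{norm}$ and $L_{logn}$ are the likelihood functions of NLR and LR from Equations (6) and (7), respectively; and $L_{norm(-i)}$ and $L_{logn(-i)}$ are the values of the joint likelihood function calculated from the remaining components, without the ith component equation, for NLR and LR, respectively.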
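A minimal sketch of a joint log-likelihood comparison is given below for a two-component system, so that the 2x2 covariance matrix can be inverted by hand. It assumes zero-mean residuals, a maximum-likelihood covariance estimate, and the standard multivariate normal and lognormal densities. The data and the function names (`cov2`, `joint_loglik_normal`, `joint_loglik_lognormal`) are hypothetical, and the sketch is not the study's implementation.

```python
import math


def cov2(res):
    """ML estimate of the 2x2 error covariance from residual pairs (zero mean assumed)."""
    n = len(res)
    s11 = sum(r1 * r1 for r1, _ in res) / n
    s22 = sum(r2 * r2 for _, r2 in res) / n
    s12 = sum(r1 * r2 for r1, r2 in res) / n
    return s11, s12, s22


def joint_loglik_normal(res):
    """Log of the bivariate normal joint likelihood of the residual pairs."""
    s11, s12, s22 = cov2(res)
    det = s11 * s22 - s12 * s12
    total = 0.0
    for r1, r2 in res:
        # Quadratic form r' * inverse(Sigma) * r for a 2x2 matrix.
        quad = (s22 * r1 * r1 - 2.0 * s12 * r1 * r2 + s11 * r2 * r2) / det
        total += -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad
    return total


def joint_loglik_lognormal(obs, pred):
    """Log joint likelihood under multiplicative error, evaluated on the arithmetic scale."""
    log_res = [(math.log(y1 / p1), math.log(y2 / p2))
               for (y1, y2), (p1, p2) in zip(obs, pred)]
    # Jacobian of the log transformation: subtract the sum of log(y_ij).
    jacobian = sum(math.log(y1) + math.log(y2) for y1, y2 in obs)
    return joint_loglik_normal(log_res) - jacobian


# Hypothetical observed and predicted biomass (kg) for two components of four trees.
obs = [(1.2, 0.7), (3.9, 1.6), (9.5, 2.9), (15.8, 4.7)]
pred = [(1.0, 0.6), (4.2, 1.5), (9.0, 3.1), (16.5, 5.0)]

res = [(y1 - p1, y2 - p2) for (y1, y2), (p1, p2) in zip(obs, pred)]
ll_norm = joint_loglik_normal(res)
ll_logn = joint_loglik_lognormal(obs, pred)
print(ll_norm, ll_logn)
```

The Jacobian term is what puts the lognormal likelihood on the same arithmetic scale as the normal one, so the two values can be compared directly. In this sketch the same predictions are reused for both error structures for brevity; in practice each specification would have its own fitted values.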
To compare candidate models fitted to the same dataset, Akaike's Information Criterion (AIC) can be used to evaluate goodness-of-fit by combining the likelihood with a penalty for extra parameters. The lowest AIC identifies the candidate model that conveys the most information about the relationship between predictor and response. $AIC_{c}$, a second-order variant of AIC for small sample sizes, is computed as

$$AIC_{c} = -2 \log L + 2k + \frac{2k(k+1)}{n - k - 1} \qquad (10)$$

where k is the number of parameters, n is the sample size, and L is the joint likelihood function for the model system (Equation (6) for NLR and Equation (7) for LR) or the conditional likelihood function for each component equation (Equation (8) for NLR and Equation (9) for LR). If $AIC_{c-norm} - AIC_{c-logn} < -2$, the assumption of additive error is favored and the result from Equation (4) should be used. If $AIC_{c-norm} - AIC_{c-logn} > 2$, the assumption of multiplicative error is favored and the result from Equation (5) should be used [44]. If $|AIC_{c-norm} - AIC_{c-logn}| \le 2$, neither error structure is clearly favored and model averaging is suggested. In addition to the difference in $AIC_{c}$ between NLR and LR, the evidence ratio (ER) (see Appendix A) was used as evidence for model selection [44].
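The decision rule can be written as a short sketch. The small-sample AICc formula is the standard one; the evidence ratio is assumed here to be exp(-Δ/2) with Δ = AICc-norm − AICc-logn, so that values above 1 favor the additive error. Appendix A is not reproduced in this section, so that definition is an assumption, and the function names are hypothetical.

```python
import math


def aicc(log_lik, k, n):
    """Second-order AIC from a log-likelihood, k parameters and n observations."""
    return -2.0 * log_lik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)


def favored_error_structure(aicc_norm, aicc_logn, threshold=2.0):
    """Apply the +/- 2 rule to the AICc difference (normal minus lognormal)."""
    diff = aicc_norm - aicc_logn
    if diff < -threshold:
        return "additive"
    if diff > threshold:
        return "multiplicative"
    return "undetermined"


def evidence_ratio(aicc_norm, aicc_logn):
    """Assumed definition: relative support for the additive over the multiplicative model."""
    return math.exp(-0.5 * (aicc_norm - aicc_logn))


# Differences are all that matter, so one value can be set to zero for illustration.
print(favored_error_structure(-7.0, 0.0), evidence_ratio(-7.0, 0.0))
print(favored_error_structure(38.3, 0.0), evidence_ratio(38.3, 0.0))
```

Under this assumed definition, a difference of −7.0 gives an evidence ratio of exp(3.5), and a difference of 38.3 gives a ratio far below 0.01.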

Back-Transformed Correction Factor for Additive Equations
To obtain predictions on the arithmetic scale, a correction factor (hereafter, CF) is commonly used to correct the systematic bias introduced by the anti-log transformation of a straight line (Equation (2)) fitted to logarithmic data. For the additive log-transformed biomass equations (Equation (5)), not only must the systematic bias be corrected, but the back-transformed predictions must also satisfy the additivity property. Thus, based on the basic CFs, we formulated a specific correction factor for a compatible system of biomass equations. Two basic CFs are calculated for the ith component following [8,15] (Equations (11) and (12)), where $\delta_{ii}^{2}$ is the ith diagonal element of the error variance-covariance matrix, $y_{ij}$ is the jth observed value for the ith component, and $\hat{y}_{ij}$ is the corresponding predicted value. The arithmetic and weighted average CFs for the compatible system are then obtained from the basic CFs (Equations (13) and (14)), where $CF_{at}$ and $CF_{wt}$ are the arithmetic and weighted averages of the tth (t = 1, 2) basic correction factor over the component equations, $CF_{it}$ is the tth (t = 1, 2) basic correction factor for the ith component equation, and $W_{i}$ is the proportion of the total aboveground biomass accounted for by the ith component.
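The averaging step can be sketched as follows. The two basic correction factors shown are common choices that match the symbols defined above: exp(δ²/2), and the ratio of summed observations to summed back-transformed predictions. Whether they coincide with the source's Equations (11) and (12) is an assumption. All numbers and function names are hypothetical.

```python
import math


def cf_variance_based(sigma2):
    """Basic CF of the form exp(sigma^2 / 2) (assumed form of the first basic CF)."""
    return math.exp(sigma2 / 2.0)


def cf_ratio(observed, back_transformed):
    """Basic CF as sum(observed) / sum(back-transformed predictions) (assumed form)."""
    return sum(observed) / sum(back_transformed)


def arithmetic_average(cfs):
    """Unweighted mean of the component correction factors."""
    return sum(cfs) / len(cfs)


def weighted_average(cfs, proportions):
    """Mean of the component CFs weighted by each component's share of total biomass."""
    return sum(w * cf for w, cf in zip(proportions, cfs))


# Hypothetical basic CFs and biomass proportions: branch, foliage, stem wood, stem bark.
component_cfs = [1.02, 1.04, 1.10, 1.06]
proportions = [0.20, 0.05, 0.65, 0.10]

cf_a = arithmetic_average(component_cfs)
cf_w = weighted_average(component_cfs, proportions)

# Applying one common CF to every equation keeps components summing to the total.
uncorrected = [12.0, 3.0, 39.0, 6.0]  # back-transformed component predictions, kg
corrected = [cf_w * value for value in uncorrected]
corrected_total = cf_w * sum(uncorrected)
print(cf_a, cf_w, sum(corrected), corrected_total)
```

A single system-wide factor preserves additivity because it multiplies both sides of the sum equally. Separate factors for each equation would generally break it, which is the tradeoff noted above: the common factor may fit some components less well.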

Model Assessment
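This study used the entire empirical dataset to fit the additive biomass equations [45]. Model fitting and prediction were assessed with the following statistics.

Coefficient of determination:

$$R^{2} = 1 - \frac{\sum_{j=1}^{n} (y_{j} - \hat{y}_{j})^{2}}{\sum_{j=1}^{n} (y_{j} - \bar{y})^{2}} \qquad (15)$$

Standard error of estimate:

$$SEE = \sqrt{\frac{\sum_{j=1}^{n} (y_{j} - \hat{y}_{j})^{2}}{n - k}} \qquad (16)$$

Total relative error:

$$TRE = \frac{\sum_{j=1}^{n} (y_{j} - \hat{y}_{j})}{\sum_{j=1}^{n} \hat{y}_{j}} \times 100 \qquad (17)$$

Average system error:

$$ASE = \frac{1}{n} \sum_{j=1}^{n} \frac{y_{j} - \hat{y}_{j}}{\hat{y}_{j}} \times 100 \qquad (18)$$

Relative mean absolute error:

$$RMA = \frac{1}{n} \sum_{j=1}^{n} \left| \frac{y_{j} - \hat{y}_{j}}{\hat{y}_{j}} \right| \times 100 \qquad (19)$$

Mean prediction error:

$$MPE = t_{\alpha} \cdot \frac{SEE / \bar{y}}{\sqrt{n}} \times 100 \qquad (20)$$

where $y_{j}$ and $\hat{y}_{j}$ are the jth observed value and the corresponding predicted value, $\bar{y}$ is the mean of the observed values, n is the number of sample trees, k is the number of parameters, and $t_{\alpha}$ is the t value at confidence level α (usually 95%).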
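The six statistics can be computed with a short sketch. The formulas below are the forms commonly used in the biomass modelling literature: residuals are observed minus predicted, and relative errors are taken against predicted values. That they match the source's Equations (15) to (20) exactly is an assumption. The function name and data are hypothetical, and the t value is passed in by the caller rather than looked up.

```python
import math


def assessment_statistics(observed, predicted, k, t_alpha):
    """Return R2, SEE, TRE, ASE, RMA and MPE for one fitted equation."""
    n = len(observed)
    mean_obs = sum(observed) / n
    resid = [y - p for y, p in zip(observed, predicted)]
    sse = sum(e * e for e in resid)
    sst = sum((y - mean_obs) ** 2 for y in observed)
    see = math.sqrt(sse / (n - k))
    return {
        "R2": 1.0 - sse / sst,
        "SEE": see,
        # Relative errors are expressed in percent of the predicted values.
        "TRE": 100.0 * sum(resid) / sum(predicted),
        "ASE": 100.0 * sum(e / p for e, p in zip(resid, predicted)) / n,
        "RMA": 100.0 * sum(abs(e / p) for e, p in zip(resid, predicted)) / n,
        "MPE": 100.0 * t_alpha * (see / mean_obs) / math.sqrt(n),
    }


# Hypothetical observed and predicted biomass (kg) for four trees.
stats = assessment_statistics([2.0, 4.0, 6.0, 8.0], [1.0, 4.0, 6.0, 9.0], k=2, t_alpha=2.0)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With this sign convention, a positive TRE or ASE means the model underestimates on average.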
To ensure that the estimated mean function captures the dominant pattern on the arithmetic scale, a fitted model should be validated graphically as well as assessed with several statistics; graphical validation has been overlooked by many researchers [23,46]. In this study, the additive biomass equations based on the two assumptions about error structure (additive vs. multiplicative) were validated graphically.

Error Structure for Each Component Equation and Additive System
Additive biomass equations were fitted to the original and log-transformed data (Equations (4) and (5), respectively) to obtain parameter estimates, and $AIC_{c}$ was then calculated for each, namely $AIC_{c-norm}$ for Equation (4) and $AIC_{c-logn}$ for Equation (5), based on the conditional likelihood function derived from the conditional probability density function. The parameter estimates and the differences in $AIC_{c}$ are given in Table 2. For Schima superba and Liquidambar formosana, $AIC_{c-norm}$ was clearly lower than $AIC_{c-logn}$, with differences between −731.2 and −221.5, supporting the additive error for each component equation. For Cinnamomum camphora, however, the component equations had different error structures. For branch and foliage, $AIC_{c-norm}$ was larger than $AIC_{c-logn}$, with differences of 38.3 and 263.2, favoring the multiplicative error structure, whereas for the other component equations $AIC_{c-norm}$ was lower than $AIC_{c-logn}$, with differences between −105.9 and −15.9, favoring the additive error structure. The evidence ratio (ER) agreed with these results: the components of the other two species had large ERs (more than 100), supporting the additive error, while branch and foliage of Cinnamomum camphora had small ERs (less than 0.01), supporting the multiplicative error. The joint likelihood function was calculated for the whole model system based on the joint probability density function, and the resulting analysis of error structure for the additive model system is shown in Table 3. For all three tree species, $AIC_{c-norm}$ was lower than $AIC_{c-logn}$, with differences of −7.0, −810.2 and −846.7, and the ERs were relatively large, supporting the additive error. Thus NLR was the appropriate approach for the additive biomass equations of the three broad-leaved tree species in this study, especially for Schima superba and Liquidambar formosana.

Assessment of Anti-Log Correction Factor for Additive System
A log-transformed equation predicts the logarithm of the response variable. To obtain an unbiased value on the original scale, an anti-log correction factor is necessary. The arithmetic and weighted averages ($CF_{at}$ and $CF_{wt}$) of the basic correction factors of the individual equations, together with the corresponding evaluation statistics for total aboveground biomass, are listed in Table 4. $CF_{at}$ (t = 1, 2) denotes the arithmetic average of the basic CFs of the component equations, $CF_{wt}$ (t = 1, 2) denotes the weighted average, and $CF_{0}$ indicates that the model was not corrected. In Table 4, $R^{2}$ is the coefficient of determination (see Equation (15)). SEE is the standard error of estimate (see Equation (16)). TRE is total relative error (see Equation (17)). ASE is average system error (see Equation (18)). RMA is relative mean absolute error (see Equation (19)). MPE is mean prediction error (see Equation (20)).
The uncorrected LR model performed worse than NLR for all three tree species, with a lower $R^{2}$, a larger standard error of estimate (SEE, hereafter) and a larger mean prediction error (MPE, hereafter). Thus, the NLR model yielded relatively better predictions than the uncorrected LR model. For Cinnamomum camphora, applying the CFs improved the $R^{2}$ of the LR model by 0.018 to 0.025 and decreased SEE by 6.07 kg to 8.66 kg, while total relative error (TRE, hereafter), relative mean absolute error (RMA, hereafter) and MPE dropped to varying degrees. Notably, although $R^{2}$ and SEE were not the best with $CF_{w2}$, the remaining statistics, namely TRE, average system error (ASE, hereafter), RMA, and MPE, were even better than those of NLR. For Schima superba, correction worsened the fitting and prediction accuracy of the LR model, with $R^{2}$ decreasing by 0.006 to 0.041 and SEE increasing by 1.73 kg to 10.91 kg; among the four CFs, however, $CF_{w2}$ corrected relatively better than the others, and its TRE, ASE and RMA were better than those of the NLR model. For Liquidambar formosana, the assessment statistics of the corrected LR model rose or fell to different degrees. Using $CF_{w2}$ increased $R^{2}$ by 0.003 and decreased SEE by 1.43 kg compared with $LR_{0}$, and its ASE and RMA were better than those of the NLR model, but its $R^{2}$ and SEE were slightly worse.
$CF_{wt}$ (t = 1, 2) corrected better for Schima superba and Liquidambar formosana, especially the CF based on the second basic correction factor (Equation (12)), that is, $CF_{w2}$. Although $CF_{at}$ yielded a higher $R^{2}$, its TRE and ASE were relatively large in magnitude, reaching −3.82%, −1.35% and −8.42%, −6.06%, respectively. In general, the value of $CF_{wt}$ was larger than that of $CF_{at}$. The weighted average clearly reduced TRE and ASE for the additive biomass equations. Taking Liquidambar formosana as an example, the weighted average of the two basic CFs reduced TRE by 8.08% and 8.12% and ASE by 7.99% and 8.02%. Considering all the evaluation statistics, $CF_{w2}$ gave the best correction for the aboveground biomass of the additive equations for the three broad-leaved tree species. The additive system corrected by $CF_{w2}$ did not perform better than the NLR model, except for Cinnamomum camphora, for which it was slightly better, but the difference between the NLR and corrected LR models in total aboveground biomass was small for all three species.

Comparison of Model Fitting and Error Structure
A major violation of the assumed error structure indicates that the model is inappropriate and the result potentially invalid. To assess the correction effect of $CF_{w2}$ on each component, the evaluation statistics are listed in Table 5, where $LR_{0}$ denotes the uncorrected model and $LR_{w2}$ the model corrected by $CF_{w2}$. For Schima superba and Liquidambar formosana, the NLR model gave better estimates for every component, including total aboveground biomass, supporting the additive error, consistent with the determination in Tables 2 and 3. For Cinnamomum camphora, the NLR model for stem wood and stem bark yielded relatively better predictions, with higher $R^{2}$ and smaller SEE, TRE, RMA and MPE, favoring the additive error, consistent with the determination in Table 2. For the branch, however, $R^{2}$ of the $LR_{w2}$ model was slightly higher than that of NLR, although its TRE and ASE were the worst, while the NLR model performed better for the foliage component. In addition, the LR model for total aboveground biomass corrected by $CF_{w2}$ gave a relatively better fit (Table 4), favoring the multiplicative error, which was slightly inconsistent with the determination (Table 2). The error structure thus differed among the component equations of Cinnamomum camphora, which is discussed in detail below.
Observed values of the biomass components, together with the fitted curves, were plotted against diameter at breast height for each of the three tree species (Figures 2-4). All models fitted the small untransformed observations well. There was no visually apparent difference between the $LR_{0}$ and $LR_{w2}$ models for any of the three species. Except for Schima superba branch and foliage and Liquidambar formosana foliage, the three fitted curves lay close together and followed the path of the data. Nonetheless, most NLR models gave slightly larger estimates than the $LR_{0}$ model, especially at large diameters, and the larger the diameter, the clearer this pattern. The mean function from the NLR model captured the dominant pattern and followed the path of the data, especially for the larger individuals. Graphically, the additive error structure was therefore the more appropriate basis for specifying the model of the compatible system and fitting it on the original scale, consistent with the determination in Table 3.

Discussion
The one-dimension likelihood function was derived from univariate normal distribution and solved the estimate issue of a single function called maximum likelihood estimates (MLE), which was used later to determine the error structure of allometric equation by Xiao et al. (2011) [30] and Ballantyne (2013) [31].In this study, each equation of a compatible system of additive equations was estimated simultaneously to ensure the significantly contemporaneous correlations by NSUR.However, the one-dimension likelihood function only considered the single variate but ignored the correlations of multiple variates in the additive equations, which might be inappropriate to determine the error structure of additive equations.In contrast, the multivariate likelihood function took the relationships of multiple variates into account and reflected the multivariate error distribution for additive equations more accurately.
This study computed the joint and conditional multivariate likelihood function for additive biomass equation of three broad-leaved tree species respectively based on the joint and conditional probability density function to analyze the error structure (additive vs. multiplicative) of each component equation and model system.The model satisfying the responding error structure fitted better and major violation indicated the inappropriateness of the model and potential invalidity of the result.The NLR model for Schima superba and Liquidambar formosana indeed yielded better estimation than uncorrected and corrected LR model statistically and graphically, which verified our determination on additive error structure properly in this study.However, for Cinnamomum camphora, the corrected model of total aboveground obtained more accurate estimated value than NLR, especially the total aboveground model corrected by CF w2 which had six evaluation statistics relatively better than that of NLR while for the foliage component the NLR model performed better.It indicated that the error of total equation might be the additive, but the error of components was not necessarily the same.This is mainly due to the different error structures for each component equation determined by the conditional likelihood function (see Table 2), but either NLR or LR could be taken to estimate additive biomass equations.Nonetheless, to hold the property of compatibility for each component NSUR compromised the error among component equations [26,37].The error structure for additive system based on the joint likelihood function was the result of a tradeoff for component equations and might cause the inconsistence of error structure between additive system and each component equation, leading to the determination on error structure and model assessment for aboveground and foliage component was inconsistent just like Cinnamomum camphora in this study.
Likelihood analysis based on AIC provides a method for analyzing the error structure to determine the more appropriate regression (NLR or LR), especially for a compatible system of biomass equations [30]. Nonetheless, using AIC as a direct indicator to compare candidate regression equations (NLR or LR) has been criticized by some researchers [46]. Based on graphical validation of NLR and LR equations, Packard (2013) argued that AIC alone was not sufficient for choosing between NLR and LR models. In addition, some researchers considered that individual AIC, AICc, or BIC values are not interpretable in absolute terms because they contain arbitrary constants and are strongly affected by sample size [46]. The evidence ratio rescales these information criteria and provides good evidence for comparing candidate models, which overcomes the shortcomings of comparing AIC values directly. It is noteworthy that the larger the difference in AIC, the larger the evidence ratio. The evidence ratio might be more appropriate for comparing models whose AIC values are close and cannot be directly differentiated in absolute terms. Moreover, as Packard (2014) [23] stated, a good fit must capture the dominant pattern in the untransformed data; Figures 2-4 in this study clearly indicate that the lower the AIC of an equation, the better it captures the pattern over the whole range of the data. When considering candidate equations, a statistical test alone might not be enough to assess the appropriateness of fit; several criteria, as well as graphical validation, are necessary.
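The relation between an AIC difference and the evidence ratio can be sketched as follows. The sketch assumes the standard Akaike-weight definition, in which the evidence ratio of the better model over the other is exp(ΔAIC/2); the function names and the numerical values are hypothetical and are not taken from the tables of this study.

```python
import math

def akaike_weights(aic_values):
    """Akaike weights: relative likelihoods rescaled to sum to one."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

def evidence_ratio(aic_a, aic_b):
    """Evidence for the lower-AIC model relative to the other model."""
    return math.exp(0.5 * abs(aic_a - aic_b))

# Hypothetical AICc values for an NLR and an LR fit of the same data
aicc_nlr, aicc_lr = 512.3, 516.9
w_nlr, w_lr = akaike_weights([aicc_nlr, aicc_lr])
print(evidence_ratio(aicc_nlr, aicc_lr), w_nlr / w_lr)
```

Because arbitrary additive constants cancel in the difference, the ratio depends only on ΔAIC, which is why it remains interpretable when the individual AIC values are not.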
Both log-transformation for the LR model and weighted estimation for the NLR model can correct heteroscedasticity and stabilize the variance. However, log-transformation is considered to create a new logarithmic scale on which the parameters are estimated [10,14]. Thus, the back-transformed predictions would not reflect the real relationship and depend largely on how the variance of the response changes on the arithmetic scale [19,21]. In our study, the uncorrected model LR0 substantially underestimated biomass, especially for large observed values (TRE was much larger than zero and the fitted curve was lower than the others). When the log-transformed observations do not fall on a truly straight line, the shape of the log function makes clear that the back-transformed linear model puts much weight on the predictions for small individuals and compresses the predictions for large individuals. This nonlinear transformation leads to accurate estimates for small values and poor estimates for large values on the arithmetic scale.
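The systematic underestimation of an uncorrected back-transformed model can be reproduced with a small simulation. This is only an illustration under stated assumptions: the data are synthetic, generated from a power function with multiplicative log-normal error, and the coefficients, diameter range, and function name are hypothetical rather than values from this study.

```python
import numpy as np

def backtransform_bias(n=2000, a=0.1, b=2.4, sigma=0.5, seed=0):
    """Ratio of observed to back-transformed predicted totals (no correction)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(5.0, 40.0, size=n)                      # hypothetical diameters
    y = a * x**b * np.exp(rng.normal(0.0, sigma, size=n))   # multiplicative error
    # Ordinary least squares on the log-log scale: ln y = intercept + slope * ln x
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    y_hat = np.exp(intercept) * x**slope                    # uncorrected back-transform
    return y.sum() / y_hat.sum()

print(backtransform_bias())
```

A ratio above 1 means that the back-transformed predictions sum to less than the observations, which is the kind of bias a correction factor is intended to remove.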
To obtain predictions closer to the arithmetic scale, a correction factor (CF) is necessary to correct the systematic bias introduced by log-transformation [8,10,13], but the additive system also requires that the component values sum to the total value. Based on the two basic CFs, the arithmetic and weighted averages over the component equations were computed in this study. The first basic CF (Equation (11)), derived from the log-normal distribution, has been the most widely used, but it yields a perfect correction only when the assumptions are strictly satisfied [8]. Because it can overcompensate for the bias through the standard error of estimate, it might cause overestimation [16,17]. The second basic CF is independent of the model distribution and corrects the bias from the observed and predicted values, so its value might be lower than 1.0 [11,15]. Using the second basic CF to formulate the system correction factor for additive biomass equations performed better in this study, which is consistent with the result reported by Snowdon in 1991 [15]. In addition, the weighted average uses the proportion of the total accounted for by each component as the weight when calculating the CF, which takes the relationship among components into account. Thus, it achieved a better correction than the arithmetic average in this study. However, because the CF for the additive system was calculated as an average over the components, it might correct some specific components poorly. For example, the fit became worse after correction for the Schima superba branch and total aboveground equations.
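The construction of a single system-level CF from component CFs can be sketched as follows. Equations (11) and (12) are not reproduced in this section, so the sketch assumes the commonly used forms: exp(SEE²/2) for the log-normal CF and the ratio of summed observed to summed back-transformed predicted values for the second CF. The weights are assumed to be each component's share of the total biomass, and all function names and numbers are hypothetical.

```python
import math

def cf_lognormal(see):
    """First basic CF (assumed form): exp(SEE^2 / 2) on the natural-log scale."""
    return math.exp(see ** 2 / 2.0)

def cf_ratio(observed, predicted_backtransformed):
    """Second basic CF (assumed form): sum of observed / sum of predicted."""
    return sum(observed) / sum(predicted_backtransformed)

def system_cf(component_cfs, component_totals=None):
    """Arithmetic average of component CFs, or weighted by component share."""
    if component_totals is None:
        return sum(component_cfs) / len(component_cfs)
    total = sum(component_totals)
    return sum(cf * t / total for cf, t in zip(component_cfs, component_totals))

# Hypothetical component CFs and biomass totals (branch, foliage, stem wood, stem bark)
cfs = [1.10, 1.20, 1.02, 1.05]
totals = [20.0, 5.0, 60.0, 15.0]
cf_arithmetic = system_cf(cfs)
cf_weighted = system_cf(cfs, totals)

# Applying one common CF to every component keeps the system additive
components = [12.0, 3.0, 40.0, 9.0]   # hypothetical uncorrected predictions for one tree
corrected = [cf_weighted * c for c in components]
print(cf_arithmetic, cf_weighted, sum(corrected))
```

Because the same factor multiplies every component, the corrected components still sum to the corrected total, but a component whose own CF differs strongly from the average may be corrected poorly.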
NLR has become a commonly used approach in allometric studies, helped by inexpensive, user-friendly software. Does this imply that NLR definitely performs better than the LR model, and that the conventional log-transformed model is unnecessary [13,19,24]? The debate over whether the NLR (additive error) or the LR (multiplicative error) model is better suited for allometric research has never subsided. Because the LR model puts unbalanced weight on predictions on the original scale, whereas the NLR model fits large values better, it is suggested that the LR model might be appropriate for small individuals, such as those in young forests, while the NLR model might be appropriate for large individuals, such as those in mature forests. However, to choose the better model, both statistical analysis and graphical validation with real empirical data are needed. This research provided a statistical analysis for determining the error structure of additive biomass equations. For a compatible model system, especially when the component equations are found to have different error structures, describing the error structure accurately and improving the fitting accuracy could be an interesting area for future research.

Conclusions
In this study, we developed the multivariate likelihood function to analyze the error structure of additive biomass equations for three broad-leaved tree species, which extends the likelihood function proposed by Xiao [30] and Ballantyne [31] to the general case of a contemporaneously correlated set of equations. To compare NLR and LR on the original scale, correction factors specific to additive equations were developed from the arithmetic and weighted averages of two basic correction factors from each component equation so as to retain the additive property. The main conclusions are as follows: (1) the multivariate likelihood function can be used to analyze the error structure of additive biomass equations, and the model assessment confirmed our determination. The conditional likelihood function can be used for the component equations and the joint likelihood function for the additive system. The determination of error structure for additive biomass equations is the result of a tradeoff: the error of the total equation might be additive, but the errors of the components are not necessarily the same; (2) the correction factors developed in this study corrected well, especially the weighted average based on the second CF (Equation (12)), which can be used for additive equations to retain compatibility after correction; (3) additive equations that conform to the corresponding error structure fit more accurately, whereas violating the corresponding assumption causes a loss of accuracy. In this study, NLR gave relatively better goodness-of-fit for the additive biomass equations of the three broad-leaved tree species.

Figure 1. The locations of trees for three broad-leaved tree species.

$L_{NLR}$ and $L_{LR}$ are the joint likelihood functions of NLR and LR for p component equations, respectively, where n is the sample size. $X_{ij}$ and $Y_{ij}$ (i = 1, ..., p; j = 1, ..., n) are the jth values of the predictor and response variable of the ith component equation, respectively. $\Sigma_{NLR}$ and $\Sigma_{LR}$ are the error variance-covariance matrices of NLR and LR, respectively, and $|\Sigma_{NLR}|$ and $|\Sigma_{LR}|$ are the determinants of the corresponding matrices. $w_{ij}$ is the weight of the jth predicted value of the ith component equation. $a_{NLR}$, $b_{NLR}$, $a_{LR}$ and $b_{LR}$ are the corresponding regression coefficients for NLR and LR, respectively.

Figure 2. Curves fitted by nonlinear regression (NLR), uncorrected linear regression (LR0), and linear regression corrected by CFw2, the weighted average of the second basic correction factor from each component equation (LRw2), for Cinnamomum camphora (Cinnamomum camphora (L.) Presl). The scattered points are the observations. (A)-(D) represent branch, foliage, stem wood, and stem bark, respectively. The solid line represents NLR, the dashed line LR0, and the dotted line LRw2.


Figure 3. Curves fitted by nonlinear regression (NLR), uncorrected linear regression (LR0), and linear regression corrected by CFw2, the weighted average of the second basic correction factor from each component equation (LRw2), for Schima superba (Schima superba Gardn. et Champ.). The scattered points are the observations. (A)-(D) represent branch, foliage, stem wood, and stem bark, respectively. The solid line represents NLR, the dashed line LR0, and the dotted line LRw2.


Figure 4. Curves fitted by nonlinear regression (NLR), uncorrected linear regression (LR0), and linear regression corrected by CFw2, the weighted average of the second basic correction factor from each component equation (LRw2), for Liquidambar formosana (Liquidambar formosana Hance). The scattered points are the observations. (A)-(D) represent branch, foliage, stem wood, and stem bark, respectively. The solid line represents NLR, the dashed line LR0, and the dotted line LRw2.


Table 1. Descriptive statistics for the 90 sample trees of each broad-leaved tree species.

Table 2. Results of parameter estimates and likelihood analysis based on the conditional likelihood function. Note: Values in parentheses are the standard error of the mean. ΔAICc = AICc-norm − AICc-logn, where AICc-norm is the value calculated from the nonlinear regression model and AICc-logn is the value calculated from the linear regression model. NLR and LR represent nonlinear and linear regression, respectively. ER represents the evidence ratio. The symbols "<<" and ">>" denote far less than and far greater than, respectively.

Table 3. Results of likelihood analysis based on the joint likelihood function for the additive system of three tree species.

Table 4. Evaluation statistics of aboveground biomass applying different correction factors for three tree species.

Table 5. Evaluation statistics for each component equation from the nonlinear model (NLR), the uncorrected linear model (LR0), and the linear model corrected by CFw2 (LRw2) for three tree species.