Airborne LIDAR-Derived Aboveground Biomass Estimates Using a Hierarchical Bayesian Approach

: Conventional ground survey data are very accurate, but expensive. Airborne lidar data can reduce the costs and e ﬀ ort required to conduct large-scale forest surveys. It is critical to improve biomass estimation and evaluate carbon stock when we use lidar data. Bayesian methods integrate prior information about unknown parameters, reduce the parameter estimation uncertainty, and improve model performance. This study focused on predicting the independent tree aboveground biomass ( AGB ) with a hierarchical Bayesian model using airborne LIDAR data and comparing the hierarchical Bayesian model with classical methods (nonlinear mixed e ﬀ ect model, NLME). Firstly, we chose the best diameter at breast height ( DBH ) model from several widely used models through a hierarchical Bayesian method. Secondly, we used the DBH predictions together with the tree height (LH) and canopy projection area (CPA) derived by airborne lidar as independent variables to develop the AGB model through a hierarchical Bayesian method with parameter priors from the NLME method. We then compared the hierarchical Bayesian method with the NLME method. The results showed that the two methods performed similarly when pooling the data, while for small sample sizes, the Bayesian method was much better than the classical method. The results of this study imply that the Bayesian method has the potential to improve the estimations of both DBH and AGB using LIDAR data, which reduces costs compared with conventional measurements.


Introduction
On the world's land surface, forests cover about 4200 million hectares, and the carbon stock of forests accounts for about 45% of the world's terrestrial carbon reserves [1]. Aboveground biomass is an important component of forest ecosystems and accounts for a large proportion of forest carbon stock; thus, quantifying forest aboveground biomass is important for forest managers [2].
Diameter at breast height (DBH) and aboveground biomass (AGB) are two important measurements of individual trees, widely used in yield estimations and forest growth [3]. Tree DBH is an easy measuring factor with high accuracy-all DBH data are measured. However, AGB is difficult to measure and has less accuracy than ground-observed DBH-only limited tree biomass can be measured in the sample [4]. Consequently, in the past few decades, many studies have proposed numerous equations to estimate aboveground biomass [5][6][7][8][9][10]. Developing an AGB model according to ground-based DBH

Study Area and Data
The study area from which data were taken is located at the Xishui forest farm of Su'nan Yuguzu Autonomous County, China (38 • 290-38 • 350N, 100 • 120-100 • 200E) and lies within the temperate alpine cold semi-humid and semi-arid zone. The altitude is in the 2550 to 3680 m range, with a mean value of 2993 m in this area. This area is occupied by mountainous forests and steppes, and one of the main functions of these forests is to protect the water resources in the Dayekou Basin of the Qilian Mountains. Grass and natural mature secondary pure forests dominate the sunny and shady slopes, respectively. The dominant tree species in the forest in this area is Picea crassifolia Kom.
We established a single permanent sample plot (PSP) of 100 m × 100 m along the hillside and divided it into sixteen subplots of 25 m × 25 m. DBH, tree height (H), crown base height, and crown diameters in two perpendicular directions of all standing trees were measured in each subplot. We used a differential global positioning system unit to measure the corners and centers of the PSP in each subplot, and a total station to measure the positions of individual trees.
Airborne LIDAR data were acquired using a Lite Mapper 5600 system on 23 June 2008. The mean flight speed was 230 km h −1 and the mean flight height was 3699 m above sea level. The point elevation varied from 2725 to 3193 m, and the LIDAR point density was 4.34 points m −2 . We can obtain individual LIDAR tree heights and crown projection areas on the basis of the airborne LIDAR point cloud from a canopy height model (CHM), and the CHM was derived as the difference Remote Sens. 2019, 11, 1050 3 of 12 between the digital surface model (DSM) and the digital elevation model (DEM) of the study area. The digital surface model grid, which contained elevation information including all ground objects, was interpolated from raw LIDAR point cloud data by a maximum height interpolation method and by filling null cells. For obtaining the absolute vegetation heights, we must remove the influence of the terrain. Thus, we obtained the digital elevation model from the ground-classified LIDAR point cloud data using a progressive morphological filter and, in this study, with a grid cell size of 0.5 m [3]. We also eliminated the noise using a Gaussian smoothing filter to obtain a more accurate crown projection area [3]. We used the local maxima algorithm to find the individual tree crown tops, and used the region growing method to obtain the tree crown boundary, the crown projection area was calculated according to the determined tree crown boundary. The details of the algorithms can be found in Fu et al. [3]. A total of 402 individual P. crassifolia tree crowns in the 16 subplots nested in the PSP were delineated; the individual tree LIDAR crown projection area and field measurement data were matched by individual tree locations measured by total station-Fu et al. [3] has drawn a figure to explain this. The biomasses of each component (stem, branch, foliage, and fruit) of the 402 P. crassifolia trees were estimated using the empirical allometric models developed by Wang et al. [25] using the field-measured individual DBH and tree height H. The total tree AGB was obtained by summing the component biomass values. For more information about the study sites and data collection, please see Fu et al. [3].

Random Sampling
In order to compare the LIDAR AGB estimation by the Bayesian method and the classical method in different sample sizes, five samples (Samples 1-4) of different sizes were randomly selected from the population. The independent variables in the population, including LIDAR crown projection area (CPA), LIDAR tree height (LH), ground-measured DBH, and the ground-estimated AGB, were assessed. Sample 1 randomly sampled 10% of the population, and Samples 2-4 randomly sampled 25%, 50%, and 75% of the population, respectively. The statistics of sample sizes 1-4 are presented in (Table 1), and we also put the statistics of the full dataset into this table.

DBH and AGB Models
In this paper, four frequently used candidate model forms [3] and another three models [26][27][28] were used to develop LIDAR DBH estimation models by regarding the LIDAR-derived tree height and the crown projection area as the predictor variables. These models are listed as follows: where β 1 -β 4 are parameters of the models, LH is the LIDAR tree height, CPA is the LIDAR crown projection area, and ε DBH is the error term.
On the basis of the best DBH model, five commonly used AGB model forms were applied to estimate the aboveground biomass. They are listed as follows: where α 1 -α 3 are model parameters, DBH is the LIDAR estimated DBH, and ε AGB is the error term.

Bayesian Method
Bayesian estimation is used to combine new evidence and prior distributions with the Bayesian theorem to get a new probability. Zhang et al. [16,29] presented the Bayesian rules in detail. Let y = (y 1 , y 2 , y 3 , . . . ) represent a vector of data and θ = (θ 1 , θ 2 , θ 3 , . . . ) be a vector of parameters to be estimated. Bayes' expression is where p(θ) is the prior distribution for the parameters, which can be obtained by the classical method; and p(θ|y) is the posterior distribution of the Bayesian frameworks.
An important feature of the Bayesian method is that the parameter will be given a prior distribution before sampling. The choice of prior distribution is critical for the Bayesian method [25]. Non-informative priors with large or infinite variance that reflect prior "ignorance" are generally chosen [29][30][31]. Alternatively, if prior information is available from external knowledge (reported parameters from the literature), the information can be used to construct a prior distribution. Zhang et al. [32] used the parameter estimates of height growth models from classical method as the prior information for the Bayesian method, consistent with Zhang et al. [32]. Here, we used the parameters of NLME estimation as the prior distribution for the Bayesian estimation.

Model Evaluation
The best DBH and AGB model was chosen from the value of the deviance information criterion (DIC) [16], when pooling the data: represents the posterior mean of the deviance; and is the effective number of parameters in the model and represents the complexity of the model. A smaller value of DIC for a model indicates a better fit to the data [31].
In addition, the stationarity test in Heidelberger-Welch Diagnostics [33,34] was regarded as a criterion to test if the model converged in our study. The stationary process is a random process in statistics, and its unconditional joint probability distribution does not change with time. Thus, the mean value and variance do not change with time either [35].
The determination coefficient (R 2 ) and mean absolute deviation (MAD) were used to compare the Bayesian method with the classical method. Higher R 2 and smaller MAD values indicate a better model fit to the data. They are calculated by where y is the observed AGB,ŷ is the LIDAR predicted DBH or AGB, y is the mean value of the observed AGB, and n is the number of samples. This study developed several candidate DBH models and selected the best model. On the basis of the best DBH model, DBH predictions were introduced to the five candidate AGB models as one of the independent variables. Both the DBH and the AGB models were developed through the hierarchical Bayesian approach. In addition, five different random sample sizes were used for comparing the Bayesian method with the classical method.
Considering the random effects of plot, we added a random effect parameter to β 1 in DBH models and to α 1 in AGB models using hierarchical Bayesian and classical methods (nonlinear mixed effects model, NLME). The Bayesian and NLME models were constructed by using SAS procedures MCMC and MODEL, respectively [36]. For the Bayesian estimation, a burn-in period of 10,000 steps and 100,000 iterations was used to estimate parameters. The thinning parameter was set to 5 to reduce the correlation between neighboring iterations.

Model Selection
In this study, deviance information criterion (DIC) statistics and the stationarity test were used to evaluate the Bayesian DBH and AGB models. As shown in Table 2, all the DBH models other than model I.5 passed the stationarity test (Table 2). Model I.3 had the smallest DIC value, followed by I.4, I.7, I.2, I.1, and I.6. Thus, model I.3 was selected as the best model to predict DBH used as an independent variable for modeling AGB. The parameter estimations of model I.3 for different sample sizes are listed in Table 3. In all sample sizes, the standard deviations of all the parameters estimated by the Bayesian method were smaller than those estimated by the classical method, meaning that the parameters estimated by the Bayesian method were more stable than those estimated by the classical method. Simultaneously, we can see from Table 3 that in Bayesian estimation, the range of parameters at sample size 39 was between −0.036 and 49.34, while it was between 0.066 and 111.3 at full data. The ranges of parameter estimates for small sample size were significantly narrower than those of the full data, which confirms that for a small sample size, the Bayesian method shows higher efficiency. In terms of the fitting statistics, AGB models II.1, II.3, and II.4 passed the stationarity test (Table 4), however, the other two models failed. Among the three models, II.1, II.3, and II.4, the DIC in II.3 (4754.906) was much smaller than those in the other two models. Therefore, model II.3 was selected as the best model to predict AGB. The parameter estimations of model II.3 are listed in Table 5. The standard deviation of parameter estimations by Bayesian method in all sample sizes are smaller than classical method.   Figure 1 shows the correlations and residuals of AGB using the Bayesian method and the classical method when pooling the data. Both these plots show that the two methods performed well when modeling AGB. Also, the two plots from the two methods are very similar. However, as sample size decreased, the two methodologies produced large differences in R 2 and MAD. For example, when pooling the data, the R 2 of the Bayesian method was 0.5867, which was 2.587% larger than that of the classical method (0.5719). However, for a sample size of 39, the R 2 of the Bayesian method was 11.238% larger than that of the classical method (Table 6). Similarly, although the MAD value from the Bayesian method was slightly smaller (2.172%) than that from the classical method using the whole data set, it was much smaller (16.62%) than that from the classical method for a sample size of 39 (Table 6). In addition, we found that parameter estimation through the Bayesian method was more stable than that through the classical method when using different sample sizes. For the parameter α 1 , the parameter range was 0.027-0.281 for the Bayesian method and 0.007-0.167 for the classical method. In particular, when the sample size was 39, the estimate changed largely compared with that for the pooled data set using the classical method, which could also be seen in parameters α 2 and α 3 (Figure 2). In addition, we found that the classical approach always produced estimates in a wider range than the Bayesian approach (Figure 2 In addition, we found that parameter estimation through the Bayesian method was more stable than that through the classical method when using different sample sizes. For the parameter α1, the parameter range was 0.027-0.281 for the Bayesian method and 0.007-0.167 for the classical method. In particular, when the sample size was 39, the estimate changed largely compared with that for the pooled data set using the classical method, which could also be seen in parameters α2 and α3 ( Figure  2). In addition, we found that the classical approach always produced estimates in a wider range than the Bayesian approach (Figure 2). In addition, we found that parameter estimation through the Bayesian method was more stable than that through the classical method when using different sample sizes. For the parameter α1, the parameter range was 0.027-0.281 for the Bayesian method and 0.007-0.167 for the classical method. In particular, when the sample size was 39, the estimate changed largely compared with that for the pooled data set using the classical method, which could also be seen in parameters α2 and α3 ( Figure  2). In addition, we found that the classical approach always produced estimates in a wider range than the Bayesian approach ( Figure 2).

Discussion
In this study, an individual tree DBH and AGB model was selected from seven DBH models and five AGB models using the Bayesian method on the basis of airborne LIDAR data. When selecting a model, we ensured that all models were compared at the same level and that the random effects of the parameters were considered in the same position. The criteria of model selection were the value of DIC and the results of the parameter stationarity test. When selecting the AGB model, the independent variable DBH was the LIDAR estimated DBH, which ensured that the selected AGB model took the correlation between DBH and AGB into account.
LIDAR has the ability to measure the forest structure, and it is a useful tool for obtaining canopy information [11]. This study used LIDAR to obtain the image data of our study area, then acquired individual tree information, including individual tree height and crown projection area, for model development. For aboveground biomass estimation, many studies have used LIDAR data. Naesset et al. [12] used LIDAR data for estimating the timber volume of forest stands. They obtained the LIDAR canopy height and canopy cover density for timber volume estimation, and they found that LIDAR data produced equal results to aerial photointerpretation, though different sites had large differences when estimating timber volume. Holmgren et al. [11] studied how to use LIDAR-derived data in regressive models for estimating the mean tree height and stem volume; the R 2 of mean tree height linear regression functions was between 0.89 and 0.91 and that for average

Discussion and Conclusions
In this study, an individual tree DBH and AGB model was selected from seven DBH models and five AGB models using the Bayesian method on the basis of airborne LIDAR data. When selecting a model, we ensured that all models were compared at the same level and that the random effects of the parameters were considered in the same position. The criteria of model selection were the value of DIC and the results of the parameter stationarity test. When selecting the AGB model, the independent variable DBH was the LIDAR estimated DBH, which ensured that the selected AGB model took the correlation between DBH and AGB into account.
LIDAR has the ability to measure the forest structure, and it is a useful tool for obtaining canopy information [11]. This study used LIDAR to obtain the image data of our study area, then acquired individual tree information, including individual tree height and crown projection area, for model development. For aboveground biomass estimation, many studies have used LIDAR data. Naesset et al. [12] used LIDAR data for estimating the timber volume of forest stands. They obtained the LIDAR canopy height and canopy cover density for timber volume estimation, and they found that LIDAR data produced equal results to aerial photointerpretation, though different sites had large differences when estimating timber volume. Holmgren et al. [11] studied how to use LIDAR-derived data in regressive models for estimating the mean tree height and stem volume; the R 2 of mean tree height linear regression functions was between 0.89 and 0.91 and that for average stem volumes was between 0.82 and 0.9 when using the LIDAR mean tree height and LIDAR crown coverage area as variables. LIDAR is widely used in biomass prediction.
The study of forest biomass by the Bayesian method has been relatively less frequent than by the classical method. When the Bayesian method is compared to the classical method, an important difference is that the Bayesian method treats the parameters as random variables, while the classical method treats the parameters as fixed values; that is, the classical method treats the probability as the stable value of the frequency of a large number of repeated trials of events. Zapata-Cuartas et al. [14] used the existing biomass equation reported in the literature as the prior distribution, assuming that the information can be useful to probability distributions, and validated the larger R 2 in the Bayesian method. This study used a different approach to the prior distribution-we used the parameters estimated by NLME for the prior distribution and obtained the same results.
We compared the classical method and the Bayesian method in estimating DBH and AGB; the results show that parameter estimation using the Bayesian method had higher stability than that using the classical method, and the R 2 of the Bayesian method was higher than that of the classical method. Zianis et al. [17] also compared the two methods-they obtained a contrary result that the classical method was superior to the Bayesian method [17], but they did not compare the two methods on different sample sizes. In order to show that the Bayesian method is more accurate in estimating AGB for different sample sizes, we randomly sampled from the complete data set ( Table 6). The data showed that the Bayesian method was more efficient than the classical method for different sizes of samples, which is the same as the result for the full data set. Zapata-Cuartas et al. [14] compared these two methods in the same way, and their results showed that for small sample sizes, the Bayesian method outperforms classical methods of least-square regression. Huang et al. [37] used the same method and showed that when the sample size is less than 50, we should use the Bayesian method to estimate aboveground biomass. Yao et al. [38] also presented a comparison of the Bayesian and classical methods-they found a more stable parameter value when using the Bayesian method. Thus, using the Bayesian method to estimate the aboveground biomass from small sample sizes is a good direction in biomass estimation. If classical methods are applied, especially with a small sample size, violation of statistical assumptions of error can lead to biased point estimates [20,39]. In addition, in this study, we have not taken into account the error propagation, thus a join model is perhaps a good method, and further studies are needed to solve the join model under the Bayesian framework for reducing the model uncertainties.
This study developed LIDAR DBH and LIDAR AGB models using the Bayesian method based on airborne LIDAR data. Previous studies have used the classical method to develop models and applied the classical method to airborne LIDAR data [9,10]. Fu et al. [1] developed a system to predict individual tree diameter and aboveground biomass using the classical method and compared two other widely used model structures (the classical method to estimate AGB depending on DBH and the classical method to estimate AGB not depending on DBH). The current article compared the classical method to the hierarchical Bayesian method, and the results of this study showed that the Bayesian method has higher R 2 and smaller MAD than the classical method for all sample sizes; thus, the Bayesian method was better than the classical method in estimating LIDAR DBH and LIDAR AGB for Picea crassifolia Kom. in northwestern China.
Overall, this study combined the advantages of the Bayesian method and airborne LIDAR data; that is, airborne LIDAR data are easier to obtain than conventional aboveground measurements, and the Bayesian method was proven to have greater accuracy than the classical method. In addition, the present study also provides a reference for modeling with a small sample size.
Author Contributions: X.Z., L.F., Q.L., and G.W. conceived the study. M.W., X.Z., and Q.L. performed the analysis and wrote the draft of the paper. All authors contributed to reviewing and improving the paper.