Estimating Tree Volume Distributions in Subtropical Forests Using Airborne LiDAR Data

Accurate and reliable information on tree volume distributions, which describe tree frequencies in volume classes, plays a key role in guiding timber harvest, managing carbon budgets, and supplying ecosystem services. Airborne Light Detection and Ranging (LiDAR) has the capability of offering reliable estimates of the distributions of structure attributes in forests. In this study, we predicted individual tree volume distributions over a subtropical forest of southeast China using airborne LiDAR data and field measurements. We first estimated the plot-level total volume by LiDAR-derived standard and canopy metrics. Then the performances of three Weibull parameter prediction methods, i.e., parameter prediction method (PPM), percentile-based parameter recover method (PPRM), and moment-based parameter recover method (MPRM) were assessed to estimate the Weibull scale and shape parameters. Stem density for each plot was calculated by dividing the estimated plot total volume using mean tree volume (i.e., mean value of distributions) derived from the LiDAR-estimated Weibull parameters. Finally, the individual tree volume distributions were generated by the predicted scale and shape parameters, and then scaled by the predicted stem density. The results demonstrated that, compared with the general models, the forest type-specific (i.e., coniferous forests, broadleaved forests, and mixed forests) models had relatively higher accuracies for estimating total volume and stem density, as well as predicting Weibull parameters, percentiles, and raw moments. The relationship between the predicted and reference volume distributions showed a relatively high agreement when the predicted frequencies were scaled to the LiDAR-predicted stem density (mean Reynolds error index eR = 31.47–54.07, mean Packalén error index eP = 0.14–0.21). In addition, the predicted individual tree volume distributions predicted by PPRM of (average mean eR = 37.75) performed the best, followed by MPRM (average mean eR = 40.43) and PPM (average mean eR = 41.22). This study demonstrated that the LiDAR can potentially offer improved estimates of the distributions of tree volume in subtropical forests.


Introduction
As biologically diverse terrestrial ecosystems on earth, forests provide a number of essential goods and multiple services to human beings, including timber stocks and other non-timber forest products [1,2] such as biodiversity [3], wildlife habitat [4], carbon sequestration [5], and soil and water cycle [6].Sustainable forest management, which has been supported as an important guiding principle, plays a key role in supplying ecosystem goods and services as well as promoting ecological functions of forest ecosystems [3,7].Traditionally, sustainable forest management requires a wide range of reliable and up-to-date information on forest resources, and this information is of great importance for understanding forest growth and supporting forest management decisions [8].Tree size distribution (TSD) of forest stands has been recognized as one of the most important forest structure attributes [9,10], is essential for forest growth, yield of timber volume, and estimation of biomass [11].Quantitative information on TSDs will be beneficial to inform managers on how to regulate stand timber composition and optimize decisions of timber harvest operations [12], such as allowable harvest size, rotation age, and harvest method (e.g., thinning and selection cuttings) [13,14], thus improving the efficiency and sustainability of operations and reducing logging costs of timber production [15,16].Airborne Light Detection and Ranging (LiDAR) has the capability to directly characterize three-dimensional (3D) forest structure by emitting laser pulses and receiving return signals from forest canopy [17,18].LiDAR technology has a capability to produce highly accurate digital elevation models (DEM) and provide wall-to-wall estimations of forest structure attributes [17,19,20], which is useful and effective in ecosystem-based forest management and operational forest applications [21,22].Due to the flexibility and high accuracy of airborne LiDAR for acquiring spatially explicit information of forest resources, this technology can be incorporated to improve the efficiency of forest operational decision makings and to facilitate the sustainable forest management [23,24].
Currently, two approaches have been commonly used for deriving forest structure attributes by airborne LiDAR, i.e., the individual tree crown-based (ITC) approach and the area-based method (ABM) [25,26].For ITC approach, individual trees are extracted from the point cloud using segmentation algorithms, which requires individual tree tops to be identified and tree crowns to be delineated [27,28].The ITC approach can be used to extracted individual tree attributes [29,30], such as timber volume [31][32][33] and stem density [33,34] in previous studies.The ITC is intuitive because the extracted parameters can provide insights into individual tree-level attributes [35].However, the performance of ITC often depends on the accuracy of tree detection [36], stem density [18], and point cloud density [28,36].Moreover, the systematic errors often inevitably occurred in ITC approaches because the individual tree segmentation algorithms may generate errors of omission and commission [19,37].For ABM approach, a number of forest attributes e.g., tree height, stem density, and volume etc. and their spatial distributions can be generated using LiDAR data for each grid cell which usually in accordance with size of field plots [25,30].The ABM approach allows for predicting forest attributes within forest stand while avoiding the bias often introduced by the ITC approach [25].Breidenbach et al. (2010) [35] proposed a semi-ITC that combined a modified ITC approach with ABM for addressing the problems with tree detection and the estimation bias of ITC.The ITC or semi-ITC can theoretically provide more information on individual tree distributions; however, they both require relatively high pulse density of airborne LiDAR data, which requires an increased cost [36,38].The capability of ITC and ABM for estimating mean forest stand attributes and TSDs in temperate forests were compared by Peuhkurinen et al. (2011) [32].The results showed that the ITC approaches had lower accuracies than the ABM approaches for estimating TSDs due to the systematic errors within ITC.
Previous studies also demonstrated that the ABM could be more operationally advanced than ITC in characterizing tree size distribution (TSD) for forest applications [38].TSD can be characterized by probability density function (PDF) in the ABM approach using airborne LiDAR data.The most common PDF for characterizing TSDs of forest structure attributes (e.g., diameter at breast (DBH) or basal area) is the Weibull distribution due to its flexibility [39].Currently, two main approaches have been used to predict TSDs, i.e., the parameter prediction method (PPM) and parameter recovery method (PRM) [40].Both of the two TSDs approaches predict inherently the fitting parameters of the distributions of individual tree attributes [41].In the case of Weibull distribution, PPM can directly predict Weibull parameters of the TSDs using a maximum likelihood estimation (MLE) [42].PRM uses an indirectly approach that first estimate other distributions-related attributes and then using these estimates to predict Weibull parameters.PRM can be further classified as the percentile-based parameter recover method (PPRM) (i.e., based on percentiles) and the moment-based parameter recover method (MPRM) (i.e., based on moments) [42][43][44].In the case of the diameter distribution, the arithmetic mean diameter (as the first order moment) and quadratic mean diameter (as the second order moment of distributions in diameter size classes) can be estimated for deriving Weibull parameters to model diameter distributions [40].Traditionally, these three approaches (i.e., PPM, PPRM, MPRM) were parameterized by the ground-inventoried forest attributes, e.g., stand age, mean DBH, and dominant height [40,42,[45][46][47], but the acquisition of these attributes over a large area is still labor-intensive and time-consuming.In recent years, only a few studies estimated TSDs using PDFs parameters predicted by LiDAR.Gobakken and Naesset (2004) [48] derived DBH and basal area distributions (calculated based on PPM and PPRM methods) to predict total stem volume from airborne LiDAR data over a boreal forest (e.g., Norway spruce and Scots pine stands) in southeast Norway.Their results showed that the methods based on PPRM (bias: −4.8%-2.7%)were more suitable for estimating the DBH and basal area distribution than the methods based on PPM (bias: −4.7%-6.6%).Thomas et al. (2008) [49] used airborne LiDAR data to perform estimates of forest DBH and basal area distributions using Weibull models based on a PPM approach within the boreal forest stands of central Ontario.Their results suggested that the unimodal two-parameter Weibull model (Adjusted R 2 = 0.65-0.88) is a promising technique for the prediction of DBH class distributions.
Although these studies demonstrated that PPM and PPRM approaches have the potential to predict TSDs by LiDAR data, few studies modeled TSDs by the MPRM approach [50].On the other hand, most of previous studies focused on estimates of DBH and basal area distributions rather than tree volume distributions [50][51][52], and these studies were limited to boreal and temperate forests with relatively simple and uniform forest structures.For the decision making of forest management, reliable estimations of the total volume and tree volume distribution provide a basis to improve the management of forest resources and achieve operational forest production goals [53].Subtropical forests, which has been reported by Food and Agriculture Organization of the United Nations (FAO) (2015) [7] that account for approximately 320.1 million hectares of forest areas in the world, are regarded as an important ecosystem with high species richness, complex structure and high productivity [54,55].Individual tree volume distribution information is fundamental for guiding timber harvest carbon management, and supplying ecosystem services in the subtropical forests [11,56].In this study, we estimated individual tree volume distributions by assessing PPM, PPRM, and MPRM approaches over a study site in the subtropical forests of southeast China.The aims of this study are: (1) To extract two suites of LiDAR metrics (i.e., standard and canopy metrics) and select optimal metrics for estimating the plot-level volume; (2) to estimate Weibull parameters by assessing the approaches of PPM (based on direct estimations of Weibull parameters), PPRM (based on percentiles), and MPRM (based on moments) using LiDAR data across different forest types; and (3) to predict the stem density using estimated total volume and mean individual tree volume (obtained from predicted Weibull parameters) for scaling the predicted volume distributions, and to evaluate the accuracies by field measurements.

Materials and Methods
The workflow of this study is presented in Figure 1.The procedures consist of four main steps: Firstly, above-ground returns of raw point clouds data were filtered to obtain ground returns and then interpolated to generate a digital terrain model (DTM) with 1 m resolution.The point clouds were further normalized by the created DTM data.Secondly, the LiDAR metrics (i.e., standard and canopy metrics) were extracted and regressed against field data to estimate total forest volume.The approaches of PPM, PPRM, and MPRM were implemented to estimate the Weibull parameters, percentile parameters and moment parameters of volume distribution respectively.The percentile and moment parameters were then be used for calculating other two groups of Weibull parameters.Thirdly, the stem density within each plot was estimated by means of the LiDAR-estimated plot total volume and mean individual tree volume generated from the estimated Weibull parameters.Finally, the individual tree volume distributions were generated by predicted Weibull parameters and scaling by the field reference and predicted stem density.

Study Area
This study was in Yushan National Forest Park, located in Changshu city of Jiangsu Province in southeast China (31°40′29.1″N,120°42′2.3″E)(Figure 2).It covers approximately 1502.77hectares of forest, accounting for 96% of the total area.The mountain terrain of this study site has a ridge line of approximately 6500 m that extends towards the southeast orientation, with an elevation in the range of 20-261 m.The climate of this site belongs to north-subtropical monsoon with annual mean temperature, annual mean relative humidity and precipitation of approximately 15.4 °C, 80%, and 1047.7 mm, respectively.Yushan Forest is a secondary forest composed of three main forest types, i.e., coniferous-dominated forests, broadleaved-dominated forests, and mixed forests.The dominant broadleaves include deciduous tree species, e.g., Chinese sweet gum (Liquidambar formosana Hance), Camphorwood (Cinnamomum camphora (L.) Presl.), and Sawtooth oak (Quercus acutissima Carruth.);there are some other evergreen broadleaved tree species, e.g., Chinese holly (Ilex chinensis Sims.), Chinese cork oak (Quercus variabilis), Chinese chestnut (Castanea sequinii), White oak (Quercus fabri Hance) etc.The dominant conifers are Slash pine (Pinus elliottii Engelm.),Chinese fir (Cunninghamia lanceolata (Lamb.)Hook.), and Masson pine (Pinus massoniana Lamb.), mixed with a small number of Loblolly pine (Pinus taeda L.) and Japanese black pine (Pinus thunbergii).

Field Plot Measurements
The field measurements for ground truth data were carried out in the summers of 2012 and 2013, i.e., June to August 2012 and August 2013.A total of 51 square plots (900 m 2 ) within the study site were used in this research.The establishment of these plots was guided by existing historical data of forest resource inventory in 2012, according to the forest types, stand age class, species composition, and site quality [56].These plots consist of conifer-dominated plots (n = 14), broadleaf-dominated

Field Plot Measurements
The field measurements for ground truth data were carried out in the summers of 2012 and 2013, i.e., June to August 2012 and August 2013.A total of 51 square plots (900 m 2 ) within the study site were used in this research.The establishment of these plots was guided by existing historical data of forest resource inventory in 2012, according to the forest types, stand age class, species composition, and site quality [56].These plots consist of conifer-dominated plots (n = 14), broadleaf-dominated plots (n = 14), and mixed plots (n = 23).DBH (> 5 cm) of all live trees were measured using a DBH tape, and tree top height (m), crown-based height (m), and stem density (ha −1 ) were recorded.In this study, we used the Vertex IV hypsometer (Långsele, Sweden) to measure tree height.The center position and each corner of the plots were located by Trimble GeoXH6000 (Trimble, Sunnyvale, CA, USA) handheld GNSS (Global Navigation Satellite System) instrument, with a real-time correction referenced from the Jiangsu Continuously Operating Reference Stations (JSCORS) to obtain a sub-meter accuracy for each position [57].The individual tree volume within each plot was calculated according to the local or neighboring provincial tree volume equations (using DBH and height as predictive variables), which were referenced from Zhao et al. (2015) [58], Zeng (2014) [59], Xu and Bao (1994) [60], Cheng et al. (2017) [61] and Huang (2017) [62] (Table A1).Then, the plot-level volume was summed by the individual tree volume within each plot.Dead wood and trees with DBH < 5 cm were not used for calculating the total volume.Skewness (SK) and kurtosis (KT) can be used to describe the shape and modelling properties of the distribution function [63].Hence, the SK and KT of the volume distributions were also calculated for each plot (Table 1).

LiDAR Data Pre-Processing
The airborne LiDAR data were collected over the entire study area on 17 August 2013.The data acquisition was operated from a fixed-wing aircraft (Y-12) flying at an altitude of 900 m above the ground with a flight speed of 55 m•s −1 , using a Riegl LMS-Q680i LiDAR sensor [64].The side overlap between the two flight strips was ≥60%, and the average ground point distances were 0.48 m (in the scanning direction) and 0.49 m (in the flying direction), respectively.The laser pulses emitted by the scanner were in the near-infrared band (1550 nm) with a pulse repetition frequency of 360 kHz and a scanning frequency of 112 Hz.The sensor was set up with a maximum scanning angle of ±30 • (only ±15 • used for analysis), and the temporal sampling time for recording returned pulses was 1 ns.These configurations produced an average beam footprint size of 0.45 m and a point density of approximately 5.06 pts m −2 (Table A2).The point cloud data were extracted from raw full-waveforms by a decomposition process and were stored in LAS 1.2 format (American Society for Photogrammetry and Remote Sensing, Bethesda, MD, USA) by the data provider.
The LiDAR point clouds were first de-noised by removing the outliers.Then, the above-ground returns of the point cloud data were filtered by a linear least square interpolation algorithm adopted from Kraus and Pfeifer (1998) [65], and then smoothed by a 5 × 5 m 2 median filter.A digital terrain model (DTM) was generated by calculating the average elevation from the ground returns within each rasterized cell grid (1 m spatial resolution).If there were no returns within cell grids, a nearest neighboring interpolation was used to fill these "empty" cell grids.Finally, the normalization of point clouds was calculated by subtracting the DTM from the absolute elevation of each point.The point clouds of 51 plots were extracted by the coordinates of field plots.

Model Fitting Approaches for LiDAR-Based Prediction of Volume Distribution
The two-parameter Weibull distribution is flexible and adaptive; thus, this distribution has the capability to characterize the TSDs in forest stands [39,44].The two-parameter Weibull model, with only the parameters of scale and shape, was used in this study, and the density function was given as follows [39]: where x is the random variable, in this case the individual tree volume, which represents the center of a defined volume class in a plot; and b and c are the scale and shape parameters of the Weibull density function, respectively.

Parameter Prediction Method (PPM)
In the PPM, the two Weibull parameters of Equation (1) were regarded as the field reference values and were first obtained by fitting the field tree volume distributions using the Weibull model.Then, LiDAR-derived metrics, as independent variables, were regressed to predict b and c.
To estimate the Weibull b and c parameters of the field reference tree volume distribution, an MLE method was directly used.The MLE method is a common estimator used for the calculation of Weibull distribution parameters in forestry due to its asymptotic unbiasedness and minimal variance [66].Previous studies have demonstrated that using the MLE algorithm could result in better goodness-of-fit statistics between the predicted distribution and field distribution compared to the other fitting methods [43,48].The values of b and c can be calculated by MLE with the following equations: where n is the total amount of sample data, and in this case, x i is the tree volume of each tree.To obtain the MLE, iterative methods are required.In this work, the MLE was implemented using MATLAB R2014b software (Mathworks, Natick, MA, USA) [67].

Percentile-Based Parameter Recovery Method (PPRM)
The PPRM was used to calculate the Weibull parameters using the LiDAR-estimated percentiles of volume distribution.The approach, which uses percentiles to indirectly estimate b and c using a Weibull cumulative distribution function, was described by Dubey (1967) [68], Shiver (1988) [69].The Weibull cumulative distribution function is given in Equation (4): By converting the above formula, we can obtain the value of x as the following equation: The volume percentile v p is defined as the tree volume value at the p% position for a set of tree volume data arrayed from small to large.We assumed that there were two percentiles, denoted as v p 1 and v p 2 , with exactly p 1 and p 2 % of the randomly selected trees being smaller than or equal to the volume percentiles v p 1 and v p 2 , respectively [40,48].The two percentiles were calculated according to Equations ( 6) and ( 7): Thus, the Weibull parameters (i.e., b and c) were calculated with the following equations [48]: The 24th and 93rd percentiles were used for estimating b and c in a Weibull distribution, see details in Dubey (1967) [68].These percentiles (i.e., 24th and 93rd) were also used in the PBRM in other studies [40,48,50].

Moment-Based Parameter Recovery Method (MPRM)
Previous studies have shown that the utilization of moments to recover parameters for representing the DBH distribution could be useful because the moment estimation method could be calculated from the mean DBH (as the first moment) and the quadratic mean DBH (as the second moment) [70], which have been demonstrated to be highly related to the stem number and basal area [40,42,71,72].In this study, the mean volume v and quadratic mean volume v q of the volume distribution were calculated as the first and second moments.The values of b and c were calculated with the following equations: where v is the mean volume, v q is the quadratic mean volume, and Γ is the gamma function.The c of Equation ( 11) was resolved using an iterative bisection method [47].

Estimating Stem Density and Individual Tree Volume Distributions
By estimating the scaled b and shape c parameters, the tree volume distributions could be predicted, and this distribution shows the overall frequency of individual trees in each volume class.Previous studies have demonstrated that stem density could be predicted using airborne LiDAR-derived metrics as independent variables [48,73].In this study, we stratified the 51 plots into coniferous plots, broadleaved plots, and mixed forest plots and estimated their stem density and volume distributions separately and in combination.
We first employed LiDAR metrics to estimate the plot-level total volume (V LiDAR ) and the Weibull scale and shape parameters (based on the PPM, PPRM, and MPRM).Then, the stem density (N LiDAR ) of each plot was derived by the following equation: where E(v) is the mean tree volume within a plot, namely, the mean value of the Weibull distribution: The final equation used to calculate the stem density of each plot was: This study employed field stem density (N ref ) and LiDAR-predicted stem density (N LiDAR ) as scaling variables, namely, to sample the populations of distributions, to predict the distribution of individual tree volumes using the estimated values of b and c.

LiDAR Metrics
The metrics used as the predictor variables were calculated for each 30 × 30 m plot using procedures similar to Naesset (2002) [74].LiDAR metrics were calculated from first returns which were considered to be sensitive to forest structural parameters [26,74].In this study, two suites of LiDAR metrics were calculated, including standard metrics [75] and canopy metrics [76] The standard metrics included height-related metrics (e.g., height percentile, mean height, SK, and KT of heights) and canopy return density-related metrics (e.g., densities and canopy cover).The canopy metrics included canopy volume (CV) metrics, which represent the total canopy volume and the spatial organization of the components (e.g., trunk and foliage) within the canopy; these metrics were calculated by canopy volume model (CVM) (see details in Lefsky et al. (1999) [77]).In this study, the Filled (Filled), Empty (Empty), Open gap (OG), Closed gap (CG), Euphotic (Eu), and Oligophotic (Oligo) of the total canopy volume classes were extracted.Each plot space was first organized as a 5 × 5 × 0.5 m 3 grid composed of voxels.All voxels within the grid were classified as either "filled" or "empty" zones depending on whether LiDAR points existed within each voxel.If point clouds were situated in the uppermost 65% of all filled voxels of each voxel column, the "filled" voxels were further stratified into "euphotic" zones, while "oligophotic" zones were located below the "euphotic" zone within the "filled" voxels.Similarly, "empty" voxels were divided into "closed gap" (below the canopy) and "open gap" (above the canopy) categories [78].Additionally, the rumple (representing canopy roughness) and the coefficient of variation of leaf area density (LAD) (CvLAD) (providing vertical heterogeneity information on forest structure) were employed as part of the canopy metrics in this study.All metrics were calculated in FUSION Version 3.80 [79] and MATLAB R2014b software (Mathworks, Inc., Natick, MA, USA) [67].LiDAR metrics used in this study are presented in Table A3.

Model Validation
An analysis of the Pearson's correlations (r) between the total LiDAR metrics and parameters (i.e., plot total volume, scale and shape of Weibull parameters estimated by MLE), two percentiles (i.e., 24th and 93rd percentiles) and moments (i.e., mean volume and quadratic mean volume) was performed for selecting optimal metrics.In this study, the metrics with relatively high correlations (r ≥ 0.6) were chosen as candidate metrics for regression analyses.To improve model linearity, all of the dependent variables (i.e., volume, Weibull parameters, two percentiles, mean volume, and quadratic mean volume) and independent variables (i.e., LiDAR-derived metrics) were converted to a form of natural logarithm in the regression analyses because this functional form offers more effective estimates of forest structural attributes [74].Then, all variables were back-transformed from the natural logarithm to original arithmetic units using a bias correction factor (BCF) [80].
According to the stratified results, general models (i.e., all plots) and forest type-specific (i.e., coniferous forest plots, broadleaved forest plots and mixed forest plots) models were established.Variables were left in the model after the F-test with a significance level of p < 0.05 [81].The method of principal component analysis (PCA) based on the correlation matrix was applied to avoid high correlation of independent variables.To reduce the collinearity of regressions, the models with a low condition factor (k < 30) based on PCA were accepted in this study [74].The best-fitting models were eventually selected according to the lowest Akaike information criterion (AIC) value [82].We used the coefficient of determination (R 2 ), root mean square error (RMSE), and relative RMSE (rRMSE), which was transformed back to the original scale to evaluate the performance of predictive models.To examine whether there were significant differences between the field parameters and the estimated parameters in the regression models, a Wilcoxon signed rank test was applied to evaluate whether the observed and predicted data came from a similar population distribution.To examine whether the predictive models differed between various forest types, dummy variables (or class variables) were employed as the dependent variables in selected regression models [22,83].Finally, the predictive accuracies of all selected models were validated by a leave-one-out cross-validation [84].

Goodness-of-Fit for Tree Volume Distribution
The goodness-of-fit between the reference and LiDAR-predicted tree volume distributions on each plot was evaluated by the mean absolute error (MAE), RMSE, rRMSE, Reynolds' error index [85], and Packalén (2008) [86] error index.The calculations of the two error indexes were as follows: where f ref i . is the reference and f LiDAR i is the LiDAR-estimated stem frequency in volume class i, m is the total number of classes, and N ref is the reference of the N LiDAR estimated stem density, which is equal to the total frequencies of all volume classes.In this study, the width of each volume class was 0.05 m 3 .From Equations ( 15) and ( 16), e R and e p are similar; however, the e p employed the relative frequencies for the error in the estimate of stem density (N LiDAR ) [86].In both error indexes, a smaller value indicates a better fit, and the e p value varies from 0 to 1.The test statistics error indexes for assessing the agreement of individual volume distributions were further compared via the Wilcoxon rank sum test (or Mann-Whitney U-test) to find statistically significant differences between the three models (i.e., the PPM, PPRM, and MPRM) regarding their goodness-of-fit values [46].

Development of Predictive Models for Plot Total Volume, Weibull Parameters, Percentiles, and Moments
By using the stepwise multiple regression analysis, the total volume, the PPM, the PPRM, and the MPRM were estimated, and the results of the models were evaluated in different forest types.All the selected models included no more than four variables.The evaluation criteria, including R 2 , RMSE, rRMSE, and Wilcoxon signed rank test, were calculated after developing the models.The parameter estimates and fitting statistics of the predictive models for plot total volume (V) are shown in Table 2.The corresponding statistics for the two Weibull parameters (i.e., b and c) in the PPM, the 24th and 93rd percentiles (i.e., v p 1 and v p 2 ) in the PPRM and the moments (i.e., mean volume v and quadratic mean tree volume v q ) in the MPRM, are shown in Tables 3-5.In comparison, the forest type-specific models (R 2 = 0.56-0.91,rRMSE = 10.65%-39.81%)obtained relatively better performance than the general models (R 2 = 0.35-0.80,rRMSE = 23.60%-49.70%).This result indicated that the stratification method based on forest type improved the accuracies of parameters.For separate models developed by the three forest types, the models performed best in coniferous plots (R 2 = 0.78-0.91,rRMSE = 10.15%-23.16%),which is better than in broadleaved (R 2 = 0.66-0.86,rRMSE = 12.41%-32.38%)and mixed plots (R 2 = 0.56-0.76,rRMSE = 14.13%-39.81%).
From Tables 2-5, the maximum k value of the developed models was 29.03, indicating there were no serious collinearity problems.The differences between the reference and predicted values of all 7 parameters (i.e., V, b, c, v p1 , v p2 , v, and v q ) were evaluated by the Wilcoxon signed rank test, and the results are presented in Tables 2-5.The results based on the level of significance at a p-value of 0.05 showed that the differences between the reference and predicted parameters were not significant (p-value = 0.25-0.95).Notes: Details of the LiDAR metrics selected in the models are displayed in Table A3.Significance level: NS = not significant (p > 0.05) ** p < 0.01; *** p < 0.001.
The total volume models were regressed against h var , h cv , h max , and Filled, and the R 2 values for total volume ranged from 0.60 to 0.79.The RMSE and rRMSE values ranged from 15.05 to 20.12 m 3 /ha and from 11.40% to 25.73%, respectively.A comparison of the results among the PPM, PPRM, and MPRM models found that the MPRM models (R 2 = 0.62-0.91,rRMSE = 14.20%-32.40%)developed for estimating v (R 2 = 0.76-0.91,rRMSE = 14.20%-25.65%)and v q (R 2 = 0.62-0.86,rRMSE = 15.60%-32.40%)performed slightly better than the PPRM models (R 2 = 0.59-0.87,rRMSE = 10.65%-36.82%) in terms of estimating v p1 (R 2 = 0.62-0.87,rRMSE = 10.65%-36.82%) and v p2 (R 2 = 0.57-0.87,rRMSE = 16.56%-35.19%).In the direct estimation of b and c, the PPM models obtained R 2 values of 0.76-0.88 and 0.35-0.78,respectively, indicating that 76%-88% and 35%-88% of the variance in b and c were explained by the regression models.The RMSE values varied between 0.01 and 0.03 for the estimated b and between 0.10 and 0.18 for the estimated c.The rRMSE values of the estimated b models (16.58%-23.60%)were higher than those for the estimated c models (23.16%-49.70%).As can be observed in Tables 2-5, across the seven groups of predictive models for the seven parameters (V, b, c, v p1 , v p2 , v, and v q ), the number and composition of the input LiDAR metrics are different, but all of them had 4 input variables.Each predictive model included the height-related metrics (e.g., h kurtosis , h IQ ), which illustrated that there was generally a significant relationship between the estimated parameters and the metrics.The most frequently selected metrics were h min (selected by 3 out of 7 models) and h kurtosis (selected by 5 out of 7 models), and other height-related metrics (h 25 , h 50 , and h IQ ) were selected by 2 out of 7 models, separately.According to the combination of metrics, 6 suites of parameter models contained simultaneous height-related and density-related metrics (e.g., d 5 and d 7 ), except for the shape c models that contained only height-related metrics.In addition, the plot total volume and 24th percentile models included canopy metrics that were highly related to canopy structure, such as Filled metrics including tree volume, and Rumple including canopy roughness.
The statistical results of the Weibull parameters b and c estimated by the PPM, PPRM, and MPRM are presented in Table A4.In general, the values of b from the PPM, PPRM, and MPRM varied from 0.04 to 0.30, 0.03 to 0.29, and 0.04 to 0.28, respectively, while the values of c varied from 0.11 to 2.07, 0.15 to 1.66, and 0.31 to 2.34, respectively.For the three Weibull parameter prediction methods, the PPRM for b and c had lower deviations (SD = 0.03-0.29)than those of the other two methods (SD = 0.04-0.39).

Estimation of Stem Density
In this study, the stem density of each plot was estimated by the LiDAR-estimated plot total volume and mean tree volume calculated from the predicted Weibull parameters of the individual volume distribution.Generally, the results showed that the predicted forest type-specific models (R 2 = 0.49-0.83,rRMSE = 16.08%-23.07%)had relatively better performance, while the general models had relatively lower estimation accuracies (R 2 = 0.32-0.64,rRMSE = 22.70%-44.42%)(Figure 3).In terms of tree number, the PPRM predictive models performed best (R 2 = 0.55-0.83,rRMSE = 16.08%-22.70%)across all forest types.Compared with the predictive models based on the PPM (R 2 = 0.32-0.76,rRMSE = 18.98%-44.42%),the predictive models based on the MPRM (R 2 = 0.61-0.74,rRMSE = 19.71%-23.49%)had a relatively higher performance, except for the estimation of stem density in broadleaved forest.

Estimating Tree Volume Distribution
The predicted tree volume distributions of each plot were generated from the predicted Weibull parameters, which were further calculated from the PPM, PPRM, and MPRM models.Figure 4 shows the reference volume distributions and the predictive Weibull distributions obtained from the PPM, PPRM, and MPRM using LiDAR-predicted stem density (i.e., N LiDAR was used as scaling variables for 9 sample plots with different stem densities).The corresponding statistics of the reference and LiDAR-predicted scale and shape, SK and KT of volume distributions for the same sample plots are listed in Table A5.From Figure 4 and Table A5, all three methods generally performed well for fitting reference volume distributions of the sample plots, with the scale parameters in the range of 0.05-0.23 and the shape parameters in the range of 0.64-4.21.In terms of SK and KT, the SK and KT of the observed volume distributions varied from 0.20 to 4.12 and 1.56 to 19.84, respectively.The volume distributions of the nine sample plots showed right-skewed and positive kurtosis shapes, which correspond to the SK (0.04-4.41) and KT (1.50-22.51) of the distributions predicted with the PPM, PPRM, and MPRM.The SK and KT values in the plots increased, while the mean volume decreased with increasing stem density, indicating that the shapes tended to skew to the right.Meanwhile, the reference volume distributions became more asymmetric, representing a reversed J-shaped distribution in the high stem density plots of conifer (N = 3167 trees/ha), broadleaved (N = 1633 trees/ha), and mixed forest (N = 2233 trees/ha), suggesting that the high density may result in a strong right-skewed distribution.At the lowest density, the decreased peak of the distributions made the shapes of the distributions more platykurtic, and the peaks were mostly distributed in the volume classes between 0.075 and 0.225 m 3 .The performance of the tree volume distributions predicted through the modelled Weibull parameters by three methods for the 9 plots was further assessed by the two error indexes, e R and e P .As the stem density increased, the values of e R and e P gradually decreased, and it appeared that the reference and predicted (Reynolds' error index e R = 16.59-22.04,Packalén error index e P = 0.07-0.11)volume distributions performed better in high density.Overall, the PPRM (e R = 13.73-39.70,Figure 5 presents the comparison results of the reference and predicted tree volume distributions using the field total stem density as the scaling variable.The results showed that the predicted volume distributions fitted by the Weibull distributions predicted from the PPRM were closer to the reference distributions than other methods.The predictions of distributions based on the PPRM produced relatively lower mean values of MAE (3.73-6.50),RMSE (5.65-9.31ha −1 ) and rRMSE (45.84%-58.26%)as well as lower mean error indexes (mean eR = 29.40-36.69,mean eP = 0.15-0.18).Generally, the volume distribution prediction methods exhibited agreement between the reference and predicted distributions in conifer forests (mean eR = 28.07-29.98,mean eP = 0.14-0.15)than other forest types (mean eR = 29.40-41.92,mean eP = 0.15-0.21).
In comparison, when the LiDAR-predicted stem density was used as the scaling variables to estimate the total distribution frequency in each volume class, modelling volume distributions based on the PPRM (average mean eR = 37.75, average mean eP = 0.17) performed better than the modelling based on the MPRM (average mean eR = 40.43,average eP = 0.18) and PPM (average mean eR = 41.22,average eP = 0.18) in terms of error indexes, which were similar to the results when using field stem density as the scaling variable (Figure 6).The mean values of MAE, RMSE, and rRMSE of the Figure 5 presents the comparison results of the reference and predicted tree volume distributions using the field total stem density as the scaling variable.The results showed that the predicted volume distributions fitted by the Weibull distributions predicted from the PPRM were closer to the reference distributions than other methods.The predictions of distributions based on the PPRM produced relatively lower mean values of MAE (3.73-6.50),RMSE (5.65-9.31ha −1 ) and rRMSE (45.84%-58.26%)as well as lower mean error indexes (mean e R = 29.40-36.69,mean e P = 0.15-0.18).Generally, the volume distribution prediction methods exhibited agreement between the reference and predicted distributions in conifer forests (mean e R = 28.07-29.98,mean e P = 0.14-0.15)than other forest types (mean e R = 29.40-41.92,mean e P = 0.15-0.21).
modelling distributions based on the PPRM obtained values of 2.60-6.81,2.84-11.29 and 51.53-70.61trees/ha, respectively.The volume distributions predicted from the LiDAR data produced relatively higher error indexes (mean eR = 31.47-54.07,mean eP = 0.15-0.21)than those produced using scaling with field-measured stem density (mean eR = 28.07-41.92,mean eP = 0.14-0.21).In comparison, when the LiDAR-predicted stem density was used as the scaling variables to estimate the total distribution frequency in each volume class, modelling volume distributions based on the PPRM (average mean e R = 37.75, average mean e P = 0.17) performed better than the modelling based on the MPRM (average mean e R = 40.43,average e P = 0.18) and PPM (average mean e R = 41.22,average e P = 0.18) in terms of error indexes, which were similar to the results when using field stem density as the scaling variable (Figure 6).The mean values of MAE, RMSE, and rRMSE of the modelling distributions based on the PPRM obtained values of 2.60-6.81,2.84-11.29 and 51.53-70.61trees/ha, respectively.The volume distributions predicted from the LiDAR data produced relatively higher error indexes (mean e R = 31.47-54.07,mean e P = 0.15-0.21)than those produced using scaling with field-measured stem density (mean e R = 28.07-41.92,mean e P = 0.14-0.21).
distributions in different Weibull parameter predictive models (PPM, PPRM, and MPRM) using field reference stem density Nref as the scaling variable.The goodness-of-fit for predicted volume distributions was evaluated by MAE (a), RMSE (b), rRMSE (c), eR (d), eP (e).The number beside each box represents the mean value of the relative statistics.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method; AP: all plots; CF: coniferous forest plots; BF: broadleaved forest plots; MF: mixed forest plots.To evaluate the statistically significant differences of the Packalén error index e P for the Weibull parameter prediction methods (PPM, PPRM, and MPRM), the Wilcoxon rank sum test was performed to compare the difference between the e P and other methods (Figure 7).The z statistics results represented deviations between two compared samples, where a higher absolute value suggests a significant deviation.From Figure 7, the results for e P revealed that in the comparisons between the PPRM and PPM for all plots, broadleaved forests and mixed forests were significant at p < 0.01 with relatively high deviations, while the comparisons between the PPRM and MPRM were significant at p < 0.05 level for all forest types.Moreover, the comparisons between the PPM and MPRM were not significant at either p < 0.05 or p < 0.01.The results for e P suggested that the PPRM produced significantly more accurate fitness for all forest types than other methods at p < 0.05.suggests a significant deviation.From Figure 7, the results for eP revealed that in the comparisons between the PPRM and PPM for all plots, broadleaved forests and mixed forests were significant at p < 0.01 with relatively high deviations, while the comparisons between the PPRM and MPRM were significant at p < 0.05 level for all forest types.Moreover, the comparisons between the PPM and MPRM were not significant at either p < 0.05 or p < 0.01.The results for eP suggested that the PPRM produced significantly more accurate fitness for all forest types than other methods at p < 0.05.

Estimation of Plot Total Volume, Weibull Parameters, Percentiles and Moments
In this study, we proposed a framework for providing an enhanced ABM to generate predictions of individual tree volume distributions from PPM, PPRM, and MPRM approaches using airborne LiDAR data in relatively complex forest stands within a subtropical forest.The results demonstrated that airborne LiDAR could be timely and effectively applied to derive the volume distributions in such forest conditions.Holmgren et al. (2004) [87] estimated the stand volume (RMSE = 31 m 3 /ha, rRMSE = 11.00%) using the 90th height percentile metrics derived from LiDAR data in a boreal forest over southwestern Sweden.Montealegre et al. (2016) [88] estimated management-related forest stand-level attributes, including total volume, using airborne LiDAR data in Mediterranean Aleppo pine forests.The results showed that timber volume (R 2 = 0.90, RMSE = 11.01 m 3 /ha) was well predicted by the height-related metrics (e.g., 95th percentile of height and kurtosis of height) and the density-related metrics (i.e., percentage of all returns above 1 m).Other studies have also demonstrated that the height and density metrics were effective predictors of stand volume [74,[89][90][91].In our study, volume models were developed by standard metrics (i.e., hvar, hcv, hmax), indicating these height metrics were sensitive to volume and other regressed models also included the standard metrics (e.g., h25, h50, hkurtosis), however, the standard metrics reported by their studies are not well

Estimation of Plot Total Volume, Weibull Parameters, Percentiles and Moments
In this study, we proposed a framework for providing an enhanced ABM to generate predictions of individual tree volume distributions from PPM, PPRM, and MPRM approaches using airborne LiDAR data in relatively complex forest stands within a subtropical forest.The results demonstrated that airborne LiDAR could be timely and effectively applied to derive the volume distributions in such forest conditions.Holmgren et al. (2004) [87] estimated the stand volume (RMSE = 31 m 3 /ha, rRMSE = 11.00%) using the 90th height percentile metrics derived from LiDAR data in a boreal forest over southwestern Sweden.Montealegre et al. (2016) [88] estimated management-related forest stand-level attributes, including total volume, using airborne LiDAR data in Mediterranean Aleppo pine forests.The results showed that timber volume (R 2 = 0.90, RMSE = 11.01 m 3 /ha) was well predicted by the height-related metrics (e.g., 95th percentile of height and kurtosis of height) and the density-related metrics (i.e., percentage of all returns above 1 m).Other studies have also demonstrated that the height and density metrics were effective predictors of stand volume [74,[89][90][91].In our study, volume models were developed by standard metrics (i.e., h var , h cv , h max ), indicating these height metrics were sensitive to volume and other regressed models also included the standard metrics (e.g., h 25 , h 50 , h kurtosis ), however, the standard metrics reported by their studies are not well explained by biophysical mechanisms [56].In addition, standard metrics, such as height-related percentile metrics, usually have inter-related relationships, and these metrics rely on plot sizes and LiDAR point density [10,92].Compared to boreal forests with a much higher homogeneous composition and more discernible canopy architecture, the subtropical forests are commonly multi-layered and encompass some stands with considerable variability in tree height and stem density, which may lead that these traditional standard metrics have a relatively low transferability and are hard to capture forest vertical structural heterogeneity [56,93].
Canopy is a vital constituent of forest structure, and canopy structure is critical for estimates of forest structural parameters [76,77].Therefore, a few metrics based on canopy structure were developed to estimate forest structural attributes because these metrics take into account the canopy geometry and tree architecture [37].As one of the means to quantify and analyze complex forest canopy structure, canopy vertical profile can be used to extract relevant metrics for further characterizing the potential heterogeneity of forest spatial structure [66].For instance, Lefsky et al.(1999) [77] developed canopy volume metrics (i.e., open gap, closed gap, euphotic, oligophotic, filled, and empty) derived from canopy volume profiles (CVPs) to characterize the 3D canopy structure by quantifying the total canopy volume component and the spatial arrangement of the canopy material (e.g., branch, stem, and foliage) in a grid unit.The canopy volume metrics have proven to be useful for structural attributes, including the estimation of volume.The rumple index, referred to as the canopy surface roughness, has been proposed to capture the heterogeneity of the outer canopy surface with respect to the plot area.This metric has shown to be highly sensitive to variation within a heterogeneous canopy structure [94].Bouvier et al. (2015) [37] proposed that the coefficient of variation of the LAD (CvLAD) metric contained more distribution information on the forest vertical structure of a stand and found that the CvLAD was more sensitive to volume.Canopy cover is also useful for developing a better characterization of the 3D canopy structures [17, 95,96].The utilization of canopy metrics could provide more heterogeneous 3D information on canopy structure than can standard metrics [37].Thus, these canopy metrics, including canopy volume, coefficient of variation of vertical LAD, rumple and canopy cover, were also employed for the regression models in this study.Overall, the results showed that most of the developed models included the canopy metrics, such as Filled of the total volume (V) models, Rumple of the scale (b) models, CC 2m of the 24th percentile v p 1 models, and mean volume (v) models; exceptions included the shape (c) models, 93rd percentile v p 2 models, and quadratic mean volume (v q ) models.The results indicated that the inclusion of canopy metrics could effectively improve the estimations of forest structural attributes, which was concluded in some previous studies [22,56,97].Zhang et al. (2017) [56] demonstrated the usability of canopy volume metrics derived from CVPs for enhancing the estimates of forest structural parameters (e.g., volume) in same study area.Filled metric actually represents overall canopy volume and was highly related with stem volume [78,98], which may be a reason for explaining that Filled metric was selected by volume models.Our results found that canopy volume metrics only used for volume models (V) whereas most of developed models selected other canopy metrics (Rumple and CC 2m ).Actually, canopy volume metrics described in previous literatures have the potential to provide information on vertical heterogeneity because they were directly derived from vertical profiles.Rumple metrics used in our study actually provided more horizontal heterogeneity information on canopy structure, which may explain that this metric was selected by scale models.In addition, as Yushan forest is in secondary succession, the forest canopy surface became more uneven, which may lead Rumple (representing canopy surface roughness) was more sensitive to scale parameters.Prediction models for estimating Weibull parameters, percentiles, and moments were derived from airborne LiDAR data.Indeed, all of these predicted parameter LiDAR-derived metrics were based on stand-level volume information.For instance, the Weibull parameters and percentiles were derived from fitted volume distributions and field reference volume distributions, respectively; therefore, they provided information on individual tree volume distributions, while two moments represented stand-level information on the mean and variance of volume, which provides an empirical linkage into the management of stand density and construction of yield tables.In terms of the performance of predictive models, the PPM models (b and c) (R 2 = 0.57-0.87rRMSE = 16.56%-35.19%)performed slightly lower than did the MPRM (R 2 = 0.62-0.91,rRMSE = 14.20%-32.40%)(v and v q ) and PPRM (R 2 = 0.59-0.87,rRMSE = 10.65%-36.82%)(v p 1 and v p 2 ) models (Tables 3-5).This result is likely because the two Weibull parameters were directly derived from the continuous curves of the fitted volume distributions, which could compound systematic error distributions fitted to LiDAR data in complex forest stands (e.g., an uneven-sized volume composition stand).
Additionally, b (i.e., Weibull scale parameter) and c (i.e., Weibull shape parameter) performed relatively worse than percentiles, which was in accordance with the results of Gobakken and Naesset (2004) [48].Similar to Thomas et al. (2008) [49], Tompalski et al. (2015) [99] and Mulverhill et al. (2018) [52], the b models (R 2 = 0.76-0.88,rRMSE = 16.58%-23.60%)obtained relatively better performance than did the c models (R 2 = 0.35-0.78,rRMSE = 23.16%-49.70%) in this study.The c value can control the shape of the distribution, which could be an exponential, normal, or inverse-J distribution [39], while the change in b involved a scale size of the distribution.However, the shape of the distribution is more difficult to precisely characterize than is the scale parameter.In addition, shape of the volume distribution actually represents more horizontal information on tree volume distributions within a stand, but most of LiDAR-derived metrics used in this study only provided more vertical structure information rather than horizontal structure information, which may explain that the lower performance of shape models.The LiDAR metrics, i.e., h 20 , h 50 , h kurtosis , and CC 2m , were selected in the final b estimation models (Table 3), indicating that these metrics were sensitive to b.Similarly, Mulverhill et al. (2018) [52] reported that h 90 , h kurtosis , and CC 2 were especially suitable for predicting b.In our study, the inclusion of h kurtosis and CC 2 in model b is likely explained by the kurtosis of the height and canopy cover involving the scale range of forest canopy structure (e.g., volume and biomass) within a stand in certain conditions.The significance of using the h 20 and h 50 metrics to predict b may be explained by more information on tree volume with increasing tree height.
In this study, the estimated general models of all plots (R 2 = 0.35-0.80,rRMSE = 23.60%-49.70%)produced relatively lower estimation accuracies than did the type-specific models (R 2 = 0.56-0.91,rRMSE = 10.65%-39.81%) for the seven parameters (i.e., V, b, c, v p1 , v p2 , v, and v q ), indicating that the use of separate models based on stratified plot data according to different forest types can improve the estimates of forest structural attributes.Tonolli et al. (2011) [90] combined LiDAR and multispectral data to estimate timber volume in subtropical forests (located in the southern Italian Alps).They found that the stratification of various forest types can improve the estimated results (rRMSE = 15.5%-21.3%).Naesset (2004) [100] and Eskelson et al. (2008) [101] reported that the method of stratifying field plots according to forest type or species composition did not have any significant impact on the estimations of forest structural attributes in boreal forests.However, due to the more complex forest structure (e.g., multi-layered and uneven canopies, shrubby understory) and the greater species diversity of subtropical forests in local study sites, the effects of forest types (tree-species composition) became more significant for estimating forest structural attributes.In addition, the metrics derived from the relatively low point density of the LiDAR data may be difficult to indirectly distinguish between unimodal and multimodal volume distributions in plots.Hence, stratifying the field plot data according to different forest types could refine the information on unimodal and multimodal volume distributions and improve the performance of predictive models [102].In general, coniferous models were relatively more accurate in all predictive models than were broadleaved forest plots and mixed forest plots.The relationships between forest structure and estimated structural attributes are usually species-dependent, and coniferous forests are easily characterized by relatively simple and even conical crown structures [37,56].

Estimation of Stem Density
Stem density, defined as the number of trees per unit area, and information on the spatial arrangement of individual tree attributes (e.g., DBH, tree height, basal area, and timber volume), is fundamental for forest growth and yield [103].Therefore, reliable estimates of stem densities can also be profound for guiding forest management operations and informing decision making in managed forest ecosystems [104].Operationally, stem density management (e.g., selective thinning) indeed affects the rate of volume production.Drew and Flewelling (1979) [105] proposed the empirical relationship between mean volume and stem density using stem density management diagrams (SDMDs) to estimate thinning yields.Previous studies have demonstrated that stand total volume could be calculated based on this relationship between mean volume and stem density [103,106].In this study, we also predicted stem density by dividing the estimated plot total volume by the mean tree volume (i.e., mean value of distributions) generated from the two LiDAR-estimated Weibull parameters based on the PPM, PPRM, and MPRM.
Traditionally, the correlation between airborne LiDAR-derived metrics and stem density was poorer than that between other structural attributes (i.e., tree height and biomass), which was demonstrated by a few studies [74,91,107].However, the accuracy of this method in which stem density was predicted directly using metrics as explanatory normally depends on forest structure, and the results are commonly underestimated, likely because the LiDAR signals for detecting information on understory structure were suppressed by the higher frequency of larger trees or the clustered crown structure [108].Thus, we investigated the utilization of the established relationships between the LiDAR-estimated plot total volume and calculated mean tree volume (generated from the predicted Weibull parameters of individual tree volume distributions) for more accurate estimates of stem density.In our case, most of the stem density models (except for the general model based on the PPM) (R 2 = 0.49-0.83,rRMSE = 16.08%-44.42%)performed better than did the results (R 2 = 0.45-0.64,rRMSE = 20.07%-28.90%) reported by Zhang et al. (2017) [56] at the same study site.The results indicated that stem density could be effectively estimated by converting the predicted plot total volume and mean tree volume.Moreover, the results revealed that predictive forest-type-specific models (R 2 = 0.49-0.83,rRMSE = 16.08%-23.07%)performed relatively better than did predictive general models at all plots (R 2 = 0.32-0.64,rRMSE = 22.70%-44.42%),which was similar to the results reported by Naesset (2002) [74].However, as Table 6 shows, the results of cross-validation reveal that most of the stem density models slightly underestimated the ground truth by 1.00-27.03ha −1 , with relatively low standard deviations (195.35-453.75 ha −1 ), except for the general model based on the PPM and the broadleaved models based on the PPM and PPRM.As a comparison, a larger bias (−103-15 ha −1 ) and standard deviation (128-400 ha −1 ) of stem density were found in a boreal forest by Naesset (2002) [74], suggesting that our approach based on volume could result in minimally biased predictions of stem density.One reason the bias was not completely eliminated may still be that individual volume distributions predicted by LiDAR-derived Weibull parameters had difficulty fully capturing the small trees suppressed in multi-layered forests.On the other hand, the errors may originate from the predicted total volume and mean tree volume, and the errors did not fully cancel each other out to a certain extent.Although the biases and deviations of the predicted total volume and mean tree volume could be quite small, uncertainties of accumulative errors in this model would also result in an overestimation of stem density, such as that in the general model based on the PPM (Table A4).

Estimation of Tree Volume Distribution
In this study, three parameter predictions (i.e., PPM, PPRM, and MPRM) for characterizing tree volume distributions with different stem densities (i.e., low, moderate, and high) are exhibited in Figure 4.The stem densities of forest stands are varied in successive cuttings [109], resulting in dynamic and changing volume distributions.The reference volume distributions of the 9 sample plots selected according to forest type and stem density were relatively well fit (e R = 16.59-65.48,e P = 0.07-0.33),with the scale parameters in the range of 0.05-0.23 and the shape parameters in the range of 0.64-4.21(Table A5).SK has been recognized to measure asymmetry.Negative SK values indicate a left-skewed distribution with a long tail to the left, while positive values represent a right-skewed distribution with a long tail to the right.KT is generally considered a relative measure of the peak of a distribution, with KT values being larger under distributions with a larger peak [70].Siipilehto and Mehtätalo (2013) [110] compared the PPM and MPRM for fitting the diameter distributions with various stem densities over a boreal forest, and they found that the SK and KT of predicted diameter distributions increased with increasing stem density.Because there was a positive correlation between volume and DBH, consistent results were also reported in our study.In this study, we summarized the mean value of the Packalén error index e p across tree volume classes of all plots for each forest type using predicted tree volume distributions based on the PPM, PPRM, and MPRM.Similar to other studies that focused on DBH or basal area distributions [47,48], the highest values of the mean e P were observed in the smallest tree volume class of the volume distributions (Figure 8).However, the three approaches generally provided relatively consistent results in terms of e P for each individual tree volume class, always tending to decrease with the increasing tree volume class size, indicating decreasing variations in the fitted volume distributions.This result may explain why the best performance of fitting initially occurred in the smallest volume class (Figure 4).
Similar to other studies that focused on DBH or basal area distributions [47,48], the highest values of the mean eP were observed in the smallest tree volume class of the volume distributions (Figure 8).However, the three approaches generally provided relatively consistent results in terms of eP for each individual tree volume class, always tending to decrease with the increasing tree volume class size, indicating decreasing variations in the fitted volume distributions.This result may explain why the best performance of fitting initially occurred in the smallest volume class (Figure 4).In this study, we employed field-measured stem density Nref and LiDAR-estimated NLiDAR as scaling variables to predict the distribution of individual tree volumes using estimated b and c.In terms of error indexes, the relationship between the predicted and reference volume distributions when the predicted frequencies of the volume distributions were scaled to ground-truth stem density was relatively stronger (mean eR = 28.07-41.92,mean eP = 0.14-0.21)than that when the predicted frequencies were scaled to stem density predicted from LiDAR data (mean eR = 31.47-54.07,mean eP = 0.15-0.21)(Figures 5 and 6).This result was especially pronounced in all of the general models (i.e., all plots), where the mean eR and eP were 35.16-41.24and 0.18-0.21(Figure 5), respectively, and the In this study, we employed field-measured stem density N ref and LiDAR-estimated N LiDAR as scaling variables to predict the distribution of individual tree volumes using estimated b and c.In terms of error indexes, the relationship between the predicted and reference volume distributions when the predicted frequencies of the volume distributions were scaled to ground-truth stem density was relatively stronger (mean e R = 28.07-41.92,mean e P = 0.14-0.21)than that when the predicted frequencies were scaled to stem density predicted from LiDAR data (mean e R = 31.47-54.07,mean e P = 0.15-0.21)(Figures 5 and 6).This result was especially pronounced in all of the general models (i.e., all plots), where the mean e R and e P were 35.16-41.24and 0.18-0.21(Figure 5), respectively, and the corresponding values for the predicted distributions using N LiDAR as the scaling variable were 41.17-54.07 and 0.18-0.21,respectively (Figure 6).These large discrepancies may partially be due to the relatively poor performance of the predicted stem density of the general models (R 2 = 0.32-0.64,rRMSE = 22.70%-44.42%)(Figure 3a-c).The underestimation in most stem density models could be explained by the different performances regarding the predicted volume distributions using N ref and N LiDAR as scaling variables.
Although field-measured stem density (as scaling variables) provides relatively more accurate estimates of tree volume distribution, the use of estimated stem density (as scaling variables) is available for the purpose of remote sensing at the stand-level by airborne LiDAR technology.From Figures 5 and 6, the results revealed that the volume distributions modelled based on the PPRM performed relatively better than did those based on the MPRM and PPM in terms of predicting errors.Liu et al. (2004) [43] used the PPM, PPRM, and MPRM to estimate the Weibull parameters to estimate the DBH distributions in unthinned black spruce plantations.They found that the PPRM (mean e R = 80.98) was the most suitable for estimating the distributions of DBH compared with the MPRM (mean e R = 82.73)and PPM (mean e R = 83.98),which was similar to our results.The main advantage of using the PPRM is that the volume percentiles derived from the ground-truth distributions and LiDAR-derived metrics (related to stand attributes) can be estimated with more confidence than when the PPM directly predicts the Weibull parameters [48].Siipilehto (2009) [40] noted that one frequent disadvantage of the PPM is that the parameters of the PDFs were not closely related to the forest structural attributes.Moreover, the PPM was based on the MLE algorithm in this study.The PPM usually had an overreliance on assumed distribution models when MLE was applied, and difficulties may also arise when there are observation uncertainties in the data [111].Both the PPM and the MPRM had an iterative process in solving the Weibull parameters, either by an MLE or by a bisection method, which may easily be trapped in divergence if the initial value was improperly selected [112].The MPRM performed slightly worse than the PPRM in this study, which may be due to the overfit of the volume distribution, such as the performance of the MPRM presented in Figure 4a,e.Moreover, the small number of trees in the large volume classes made them difficult to capture by the predicted distributions (Figure 4d,e,h).

Implications and Future Work
The management modes of Yushan forests are multipurpose sustainable forest management.Thus, tree volume distributions are useful for optimizing timber production, stand compositions, silvicultural practices, as well as ecological monitoring purposes of the subtropical forests.Tree volume distributions can also be used directly to describe stand attributes such as structure, age, and volume, or used as inputs to estimate biomass(by converting volume with a biomass expansion factor) and carbon budget [11].To support these needs, airborne LiDAR technology was used to estimate tree volume distributions by LiDAR-estimated Weibull parameters in this study.While this remote sensing technology for data acquisition may produce relatively high costs, the overall cost is still lower than most of the traditional field surveys in large scale, and the wall-to-wall LiDAR data could also be used to derive and share multiple products (e.g., DTM, volume and fire risk maps etc.) for extrapolating to larger scales, thus promoting sustainable forest management [52,88].In addition, the ITC approach requires relatively high pulse density of airborne LiDAR data to provide more information on individual tree distributions.In contrast, our methodology effectively estimated individual tree volume distributions with relative low pulse density and costs.As the emergence of Unmanned Aerial Vehicles (UAVs) platform technology in recent years, UAVs have the capability of acquiring high density LiDAR point clouds, which has advantages of cost-effectiveness and convenience.Thus, our future work could use UAV-LiDAR data to reduce costs.
Airborne LiDAR technology is increasingly being used to map forested terrain.In this study, LiDAR data were used to generate a DTM by calculating the average elevation from the ground returns within each rasterized cell grid (1 m spatial resolution).Actually, LiDAR metrics derived from normalized point clouds are influenced by DTM quality to some extent, thus requiring a high quality DTM is essential for minimizing uncertain impacts on estimating volume distributions [93].The accuracy assessments of LiDAR-derived DTMs, over different forest areas have been the focus of previous studies [113][114][115][116][117]. The various factors including forest structure, slope, off-nadir angle, understory vegetation, and interpolation algorithms have proven to influence the accuracy of DTM [113][114][115][116][117]. Thus, the accuracy assessments of DTM should also be considered in our future work.In addition, a unimodal Weibull distribution form in this case can be modeled to fit simple unimodal volume distributions.Thus, our future work should consider whether other models, e.g., discrete percentile-based methods [118] or finite mixture models [52], or other data sources could be applied to more accurately estimate multimodal tree volume distributions using airborne LiDAR data [119].

Conclusions
In this study, we investigated an enhanced area-based method (ABM) to estimate individual tree volume distributions using LiDAR-derived Weibull parameters predicted by three Weibull parameter prediction methods, i.e., parameter prediction method (PPM), percentile-based parameter recovery method (PPRM), and moment-based parameter recovery method (MPRM), over a subtropical forest within southeast China.The results demonstrated that, compared with the general models, the forest type-specific models obtained relatively higher accuracies for estimating total volume Weibull parameters, percentiles, raw moments, and stem density.The relationship between the predicted and reference volume distributions was stronger when the predicted frequencies of the volume distributions were scaled to the ground-truth stem density (mean e R = 28.07-41.92,mean e P = 0.14-0.21)than when the predicted frequencies were scaled to the stem density predicted from LiDAR data (mean e R = 31.47-54.07,mean e P = 0.14-0.21).In addition, the predicted individual tree volume distributions based on the PPRM (average mean e R = 37.75) performed relatively better than those based on the MPRM (average mean e R = 40.43)and PPM (average mean e R = 41.22).This study demonstrated that this enhanced ABM can potentially improve the accuracy of estimates of the tree volume distributions in subtropical forests.However, this study has a limitation on describing multimodal distributions, thus future works should focus on other models such as parametric finite mixture model or non-parametric KNN model for estimating multimodal distributions.

Figure 1 .
Figure 1.Overview of workflow for modelling individual volume distributions and predicting stem density.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MRPM: moment-based parameter recovery method; DTM: digital terrain model.(): mean volume predicted by LiDAR data; NLiDAR: stem density predicted by LiDAR data; VLiDAR: total volume predicted by LiDAR data; NPPM: LiDAR-predicted stem density based on the PPM; NPPRM: LiDARpredicted stem density based on the PPRM; NMPRM: LiDAR-predicted stem density based on the MPRM; MLE: maximum likelihood estimation; bref and cref: scale and shape of the Weibull distribution, respectively;  _ and  _ : two percentiles of the field reference volume distribution within a plot; vq_ref: field reference mean volume within a plot; ̅ : quadratic mean volume of the individual trees within a plot.

Figure 1 .
Figure 1.Overview of workflow for modelling individual volume distributions and predicting stem density.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MRPM: moment-based parameter recovery method; DTM: digital terrain model.E(v): mean volume predicted by LiDAR data; N LiDAR : stem density predicted by LiDAR data; V LiDAR : total volume predicted by LiDAR data; N PPM : LiDAR-predicted stem density based on the PPM; N PPRM : LiDAR-predicted stem density based on the PPRM; N MPRM : LiDAR-predicted stem density based on the MPRM; MLE: maximum likelihood estimation; b ref and c ref : scale and shape of the Weibull distribution, respectively; v p 1_ref and v p 2_ref : two percentiles of the field reference volume distribution within a plot; v q_ref : field reference mean volume within a plot; v ref : quadratic mean volume of the individual trees within a plot.

Figure 2 .
Figure 2. Aerial photograph of Yushan Forest and the distribution of field plots, as well as the field photos of the three typical forest types in the study site.

Figure 2 .
Figure 2. Aerial photograph of Yushan Forest and the distribution of field plots, as well as the field photos of the three typical forest types in the study site.

Figure 3 .
Figure 3. Field-measured stem density versus LiDAR-estimated stem density based on the prediction methods of PPM (left panel), PPRM (middle panel), and MPRM (right panel) in all forest plots (a-c), coniferous forests (d-f), broadleaved forests (g-i), and mixed forests (j-l).Light grey shade represents the 95% confidence interval for predictions.PPM: parameter prediction method; PPRM: percentilebased parameter recovery method; MPRM: moment-based parameter recovery method.

Figure 3 .
Figure 3. Field-measured stem density versus LiDAR-estimated stem density based on the prediction methods of PPM (left panel), PPRM (middle panel), and MPRM (right panel) in all forest plots (a-c), coniferous forests (d-f), broadleaved forests (g-i), and mixed forests (j-l).Light grey shade represents the 95% confidence interval for predictions.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method.

e 35 Figure 4 .
Figure 4.The reference tree volume distribution (green bars) fitted with the Weibull distribution based on the Weibull parameters derived from the PPM, PPRM, and MPRM (using LiDAR-predicted stem density NLiDAR as scaling variables) for nine sample plots of coniferous forests (a-c), broadleaved forests (d-f), and mixed forests (g-i) in low (left panel), medium (median panel) and high (right panel) stem densities.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method; Nref: field reference stem density; vref: field mean tree volume.

Figure 4 .
Figure 4.The reference tree volume distribution (green bars) fitted with the Weibull distribution based on the Weibull parameters derived from the PPM, PPRM, and MPRM (using LiDAR-predicted stem density N LiDAR as scaling variables) for nine sample plots of coniferous forests (a-c), broadleaved forests (d-f), and mixed forests (g-i) in low (left panel), medium (median panel) and high (right panel) stem densities.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method; N ref : field reference stem density; v ref : field mean tree volume.

Figure 5 .
Figure 5.The box plots for the comparison of reference and predicted individual tree volume distributions in different Weibull parameter predictive models (PPM, PPRM, and MPRM) using field reference stem density Nref as the scaling variable.The goodness-of-fit for predicted volume distributions was evaluated by MAE (a), RMSE (b), rRMSE (c), eR (d), eP (e).The number beside each box represents the mean value of the relative statistics.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method; AP: all plots; CF: coniferous forest plots; BF: broadleaved forest plots; MF: mixed forest plots.

Figure 6 .
Figure 6.The box plots for the comparison of reference and predicted individual tree volume distributions in different Weibull parameter predictive models (PPM, PPRM, and MPRM) using

Figure 5 .
Figure 5.The box plots for the comparison of reference and predicted individual tree volume distributions in different Weibull parameter predictive models (PPM, PPRM, and MPRM) using field reference stem density N ref as the scaling variable.The goodness-of-fit for predicted volume distributions was evaluated by MAE (a), RMSE (b), rRMSE (c), e R (d), e P (e).The number beside each box represents the mean value of the relative statistics.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method; AP: all plots; CF: coniferous forest plots; BF: broadleaved forest plots; MF: mixed forest plots.

Figure 6 .
Figure 6.The box plots for the comparison of reference and predicted individual tree volume distributions in different Weibull parameter predictive models (PPM, PPRM, and MPRM) using

Figure 6 .
Figure 6.The box plots for the comparison of reference and predicted individual tree volume distributions in different Weibull parameter predictive models (PPM, PPRM, and MPRM) using LiDAR-predicted stem density N LiDAR as the scaling variable.The goodness-of-fit for predicted volume distributions was evaluated by MAE (a), RMSE (b), rRMSE (c), e R (d), e P (e).The number beside each box represents the mean value of the relative statistics.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method; AP: all plots; CF: coniferous forest plots; BF: broadleaved forest plots; MF: mixed forest plots.

Figure 7 .
Figure 7.The statistical values of the Wilcoxon rank sum test (α = 0.05) for the Packalén error index eP values from the reference and predicted volume distributions using LiDAR-predicted stem density NLiDAR as the scaling variable by the PPM, PPRM, and MPRM.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method; significance level: * p < 0.05; ** p < 0.01.

Figure 7 .
Figure 7.The statistical values of the Wilcoxon rank sum test (α = 0.05) for the Packalén error index e P values from the reference and predicted volume distributions using LiDAR-predicted stem density N LiDAR as the scaling variable by the PPM, PPRM, and MPRM.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method; significance level: * p < 0.05; ** p < 0.01.

Figure 8 .
Figure 8. Mean value of the Packalén error index ep by each tree volume class of predicted tree volume distributions based on the PPM, PPRM, and MPRM.(a): all plots; (b): coniferous forests; (c): broadleaved forests; (d): mixed forests.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method.

Figure 8 .
Figure 8. Mean value of the Packalén error index e p by each tree volume class of predicted tree volume distributions based on the PPM, PPRM, and MPRM.(a): all plots; (b): coniferous forests; (c): broadleaved forests; (d): mixed forests.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method.

Table 1 .
Summary of field measured plot-level forest structure parameters and the skewness and kurtosis of field reference tree volume distribution data.

Table 6 .
The statistical results of the cross-validation for stem density estimations based on the methods of PPM, PPRM, and MPRM.

Table A2 .
Properties of LiDAR sensor and parameters of data acquisition.

Table A3 .
Summary of LiDAR metrics used for fitting models in this study.25 , h 50 , h 75 , h 95 The percentiles of the canopy height distributions (25th, 50th, 75th and 95th) of first returns h min , h max , h range , The minimum height, maximum height, and difference of maximum and minimum of first returns h mean , h var , h cv The mean height, variance and coefficient of variation above ground of first returns h skewness , h kurtosis The skewness and kurtosis of the heights of first returns h MAD , h IQ The median of the absolute deviations from the overall mode and the interquartile range of first returns h SQRT , h CURT The quadratic and cubic mean of heights of first returns d 1 , d 3 , d 5 , d 7 , d 9 The proportion of returns above the quantiles (10th, 30th, 50th, 70th and 90th) to total number of first returns Filled and Empty zones of canopy volume model (CVM), i.e., the voxels that contained point clouds and the voxels that contained no point clouds within canopy spaces OG, CG Open and Closed gap zones of canopy volume model (CVM), i.e., the empty voxels located above and below the canopy, respectively Eu, Oligo Euphotic and Oligophotic zones of canopy volume model (CVM), i.e., the voxels located within an uppermost percentile (65%) of all filled grid cells of that column, and the voxels located below, respectively Rumple Rumple is the ratio of canopy outer surface area to ground surface area, which represents the canopy surface roughness CvLAD Coefficient of variation within the vertical leaf area density (LAD) profile CC 2m Canopy cover, represents percentages of first returns above 2 m

Table A4 .
Summary statistics of predicted b and c by three Weibull parameter prediction methods.PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method; b: Weibull scale parameter; c: Weibull shape parameter.

Table A5 .
The error indexes, Weibull parameters, skewness, and kurtosis of reference and predicted individual tree volume distributions for 9 sample plots.: Ref: field reference individual volume distribution; PPM: parameter prediction method; PPRM: percentile-based parameter recovery method; MPRM: moment-based parameter recovery method; e R : Reynolds' error index; e P : Packalén error index; b: Weibull scale parameter; c: Weibull shape parameter; SK: skewness of volume distributions; KT: kurtosis of distributions. Notes