Estimating Forest Structural Parameters Using Canopy Metrics Derived from Airborne LiDAR Data in Subtropical Forests

: Accurate and timely estimation of forest structural parameters plays a key role in the management of forest resources, as well as studies on the carbon cycle and biodiversity. Light Detection and Ranging (LiDAR) is a promising active remote sensing technology capable of providing highly accurate three dimensional and wall-to-wall forest structural characteristics. In this study, we evaluated the utility of standard metrics and canopy metrics derived from airborne LiDAR data for estimating plot-level forest structural parameters individually and in combination, over a subtropical forest in Yushan forest farm, southeastern China. Standard metrics, i.e., height-based and density-based metrics, and canopy metrics extracted from canopy vertical proﬁles, i.e., canopy volume proﬁle (CVP), canopy height distribution (CHD), and foliage proﬁle (FP), were extracted from LiDAR point clouds. Then the standard metrics and canopy metrics were used for estimating forest structural parameters individually and in combination by multiple regression models, including forest type-speciﬁc (coniferous forest, broad-leaved forest, mixed forest) models and general models. Additionally, the synergy of standard metrics and canopy metrics for estimating structural parameters was evaluated using ﬁeld measured data. Finally, the sensitivity of vertical and horizontal resolution of voxel size for estimating forest structural parameters was assessed. The results showed that, in general, the accuracies of forest type-speciﬁc models ( Adj-R 2 = 0.44–0.88) were relatively higher than general models ( Adj-R 2 = 0.39–0.77). For forest structural parameters, the estimation accuracies of Lorey’s mean height ( Adj-R 2 = 0.61–0.88) and aboveground biomass ( Adj-R 2 = 0.54–0.81) models were the highest, followed by volume ( Adj-R 2 = 0.42–0.78), DBH ( Adj-R 2 = 0.48–0.74), basal area ( Adj-R 2 = 0.41–0.69), whereas stem density ( Adj-R 2 = 0.39–0.64) models were relatively lower. The combination models ( Adj-R 2 = 0.45–0.88) had higher performance compared with models developed using standard metrics (only) ( Adj-R 2 = 0.42–0.84) and canopy metrics (only) ( Adj-R 2 = 0.39–0.83). The results also demonstrated that the optimal voxel size was 5 × 5 × 0.5 m 3 for estimating most of the parameters. This study demonstrated that canopy metrics based on canopy vertical proﬁles can be effectively used to enhance the estimation accuracies of forest structural parameters in subtropical forests.


Introduction
Forested ecosystems are spatially dynamic and continuously changing and therefore comprise complex and heterogeneous forest structures [1,2]. Forest structure, defined as the spatiotemporal arrangement of structural components in specific vertical and horizontal spatial patterns within a forest stand [3][4][5], is recognized as both a product and driver of forest biophysical processes [6]

Study Area
This study was conducted in Yushan Forest, a state-operated forest and national park located near the town of Changshu in Jiangsu Province, southeastern China (120°42′9.4″E, 31°40′4.1″N). The total site area is approximately 1260 ha, which covers approximately 1140 ha of forests. Topographically, the site's mountain terrain extends from northwest to southeast and the ridge line is more than 6500 m, with the elevation range between approximately 20 and 261 m above sea level. This site is situated in the north-subtropical monsoon climatic region with an annual mean temperature of 15.4 °C, and precipitation of 1047.7 mm, and annual mean relative humidity of approximately 80%. The highest monthly precipitation occurs from June to September. The soil type in Yushan is composed mainly of mountain yellow-brown earth. The forest in Yushan belongs to the north-subtropical mixed secondary forest with three main forest types: conifer-dominated, broadleaved dominated and mixed forests. The dominant broad-leaved tree species include Oriental oak (Quercus variabilis Bl.), Chinese sweet gum (Liquidambar formosana Hance) and Sawtooth oak (Quercus acutissima Carruth.) of deciduous broad-leaved trees species, mixed with other evergreen broadleaved tree species including Camphorwood (Cinnamomum camphora (L.) Presl.) and Chinese holly (Ilex chinensis Sims.). The primary coniferous forests are dominated by evergreen coniferous tree species, including Masson pine (Pinus massoniana Lamb.), Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), slash pine (Pinus elliottii Engelm.) and Japanese Blackbark Pine (Pinus thunbergii Parl.). Figure 2 shows an overview of the study site and distribution of sample plots and Figure 3 shows the field photos of three forest types.

Study Area
This study was conducted in Yushan Forest, a state-operated forest and national park located near the town of Changshu in Jiangsu Province, southeastern China (120 • 42 9.4"E, 31 • 40 4.1"N). The total site area is approximately 1260 ha, which covers approximately 1140 ha of forests. Topographically, the site's mountain terrain extends from northwest to southeast and the ridge line is more than 6500 m, with the elevation range between approximately 20 and 261 m above sea level. This site is situated in the north-subtropical monsoon climatic region with an annual mean temperature of 15.4 • C, and precipitation of 1047.7 mm, and annual mean relative humidity of approximately 80%. The highest monthly precipitation occurs from June to September. The soil type in Yushan is composed mainly of mountain yellow-brown earth. The forest in Yushan belongs to the north-subtropical mixed secondary forest with three main forest types: conifer-dominated, broad-leaved dominated and mixed forests. The dominant broad-leaved tree species include Oriental oak (Quercus variabilis Bl.), Chinese sweet gum (Liquidambar formosana Hance) and Sawtooth oak (Quercus acutissima Carruth.) of deciduous broad-leaved trees species, mixed with other evergreen broad-leaved tree species including Camphorwood (Cinnamomum camphora (L.) Presl.) and Chinese holly (Ilex chinensis Sims.). The primary coniferous forests are dominated by evergreen coniferous tree species, including Masson pine (Pinus massoniana Lamb.), Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), slash pine (Pinus elliottii Engelm.) and Japanese Blackbark Pine (Pinus thunbergii Parl.). Figure 2 shows an overview of the study site and distribution of sample plots and Figure 3 shows the field photos of three forest types.   In order to obtain the relative height of trees, raw point cloud data were first filtered by removing outliers. The data were filtered to remove non-ground points using an algorithm adapted from Kraus

LiDAR Data
Small footprint airborne LiDAR data were acquired on 17 August 2013 using a Riegl LMS-Q680i sensor flown at 900 m above ground level, with a flight speed of 55 m·s −1 and a flight line side-lap of ≥60%. The sensor recorded returned waveforms of laser pulse with a temporal sample spacing of 1 ns (approximately 15 cm). The LiDAR system was configured to emit laser pulses in the near-infrared band (1550 nm) at a 360 kHz pulse repetition frequency and a 112 Hz scanning frequency, with a scanning angle of ±30° from nadir and a swath of 1040 m. The dataset had an average beam footprint size of 0.45 m (nadir) in diameter. The average ground point distances of the dataset were 0.49 m (flying direction) and 0.48 m (scanning direction) in a single strip, with pulse density of approximately 5.06 pulse m −2 . The final extracted point clouds and associated waveforms were stored in LAS 1.3 format (American Society for Photogrammetry and Remote Sensing, Bethesda, MD, USA).
In order to obtain the relative height of trees, raw point cloud data were first filtered by removing outliers. The data were filtered to remove non-ground points using an algorithm adapted from Kraus

LiDAR Data
Small footprint airborne LiDAR data were acquired on 17 August 2013 using a Riegl LMS-Q680i sensor flown at 900 m above ground level, with a flight speed of 55 m·s −1 and a flight line side-lap of ≥60%. The sensor recorded returned waveforms of laser pulse with a temporal sample spacing of 1 ns (approximately 15 cm). The LiDAR system was configured to emit laser pulses in the near-infrared band (1550 nm) at a 360 kHz pulse repetition frequency and a 112 Hz scanning frequency, with a scanning angle of ±30 • from nadir and a swath of 1040 m. The dataset had an average beam footprint size of 0.45 m (nadir) in diameter. The average ground point distances of the dataset were 0.49 m (flying direction) and 0.48 m (scanning direction) in a single strip, with pulse density of approximately In order to obtain the relative height of trees, raw point cloud data were first filtered by removing outliers. The data were filtered to remove non-ground points using an algorithm adapted from Kraus and Pfeifer (1998) [40], which was based on a method of linear least-squares interpolation, and then the data were smoothed by the median filter (moving square windows of size 5 × 5 m). After filtering the non-ground points, a 1-m Digital terrain model (DTM) was created by calculating the average elevation from the ground points within a cell (cells that contain no points were filled by interpolation using neighboring cells). Then, the point cloud was then normalized against the ground surface height and extracted for each plot. Point clouds for all plots (n = 51) were finally extracted using the coordinates of the lower left and upper right corners.

Field Data
The field data for the study site were collected from June to August in 2012 and in August of 2013. Throughout the Yushan study region, a total of 51 square sample plots (30 × 30 m) were established, covering the forest type, dominant species compositions, age classes, and site indices, according to an historical forest resource inventory data (2012). All plots were divided into broad-leaved forest (n = 14), coniferous forest (n = 14), and mixed forest (n = 23). The centers of each plot and plot corners were located using Trimble GeoXH6000 Handheld GPS (Trimble, Sunnyvale, CA, USA) units equipped with a dual frequency GNSS antenna, and corrected with high precision real-time differential signals received from the Jiangsu Continuously Operating Reference Stations (JSCORS), resulting in a submeter positional of accuracy of less than 0.5 m [41]. The plot directions and inclined angles were recorded by forest compass, and the border lengths were measured by PI tape. For all live trees with a diameter at breast height (DBH) over 5 cm, tree type, diameter, height, height to crown base, crown width in both cardinal directions, crown class, and crown transparency were measured. DBH was measured on all trees using a diameter tape. Heights of all trees were measured using a Vertex IV hypsometer (Haglöf, Långsele, Sweden). Crown widths were obtained by measuring the average of two values measured along two perpendicular directions from the location of the tree top. In addition, small trees (DBH < 5 cm) and dead wood were also tallied for total stem density, but not used in biomass calculations.
Several forest structural parameters were assessed in this study, including mean DBH, Lorey's mean height (i.e., the basal area weighted height), stem density, basal area, volume and aboveground biomass. In addition, aboveground biomass of each tree was calculated by means of the species specific allometric equations from local or nearby province [42][43][44][45][46][47] (Appendix A (Table A1)), and the tree-based calculation results were summed within each plot to determine plot-level forest aboveground biomass. Plot-level volume was similarly calculated using provincial species specific volume equations of individual trees, which were based on DBH as predictor variables. A summary of plot-level forest structural parameters data is presented in Table 1.

Canopy Volume Model Approach
A voxel-based CVM approach was applied for point cloud data to derive metrics in this study. The canopy spaces were first organized as a matrix composed of voxels (5 × 5 × 0.5 m 3 ), and these voxels were classified as either "filled" or "empty" volume depending on the presence or absence of LiDAR points within each voxel. "Filled" voxels were further classified as either "euphotic" zone, if they were located in the uppermost 65% of all filled voxels, or as "oligophotic" zone if they were located below the point, whereas "empty" voxels were located either below ("closed gap") or above the canopy ("open gap") [38]. Open gap, euphotic, oligophotic and closed gap were determined as four canopy structure classes, with units defined as the volume of each class per unit area. All volume elements (Open gap, Oligophotic, Euphotic, Closed gap, Filled, Empty) were derived as canopy volume (CV) metrics using the CVM method and canopy volume profile (CVP) was visualized. Figure 4 shows the illustration of voxel-based CVM approach. Point clouds of a plot (30 × 30 m 2 ) were voxelized, and divided into 36 vertical columns of voxels, and each column was further stratified with four canopy structure classes. All columns of a plot were expanded in a panel and the canopy volume distribution (CVD) was presented ( Figure 4c). Finally, the volume percentages of canopy structure classes of each height interval (0.5 m) were calculated, resulting in CVP ( Figure 4d). LiDAR points within each voxel. "Filled" voxels were further classified as either "euphotic" zone, if they were located in the uppermost 65% of all filled voxels, or as "oligophotic" zone if they were located below the point, whereas "empty" voxels were located either below ("closed gap") or above the canopy ("open gap") [38]. Open gap, euphotic, oligophotic and closed gap were determined as four canopy structure classes, with units defined as the volume of each class per unit area. All volume elements (Open gap, Oligophotic, Euphotic, Closed gap, Filled, Empty) were derived as canopy volume (CV) metrics using the CVM method and canopy volume profile (CVP) was visualized. Figure 4 shows the illustration of voxel-based CVM approach. Point clouds of a plot (30 × 30 m 2 ) were voxelized, and divided into 36 vertical columns of voxels, and each column was further stratified with four canopy structure classes. All columns of a plot were expanded in a panel and the canopy volume distribution (CVD) was presented ( Figure 4c). Finally, the volume percentages of canopy structure classes of each height interval (0.5 m) were calculated, resulting in CVP ( Figure 4d). Notably, an appropriate voxel volume size for CVM in this study was been considered because various voxel sizes likely change the distributions and proportions of canopy structure classes. Thus, this study also investigated the influence of various voxel sizes on the accuracies of the models.

Weibull Fitting Approach
Canopy height distributions (CHD), which describe vertical distributions of foliage elements and non-photosynthetic tissues within canopy spaces, were used to measure the distribution of laser returns within the 0.3-m bins (i.e., a 30 × 30 × 0.3 m 3 rectangular section) from the ground to canopy top [48,49]. In this study, a two-parameter Weibull density function (PDF) was used to describe CHD on each plot. As a Weibull model is highly adaptive, ranging from an inversed J-shape to unimodal skewed and unimodal symmetrical curve, the Weibull model has flexibility in characterizing distributions of a range of forest attributes [50,51]. The two parameters, i.e., Weibull scale (α1) and Notably, an appropriate voxel volume size for CVM in this study was been considered because various voxel sizes likely change the distributions and proportions of canopy structure classes. Thus, this study also investigated the influence of various voxel sizes on the accuracies of the models. Given the average beam footprint size of 0.45 m, average ground point distances of 0.49 m (flying direction) and 0.48 m (scanning direction) and pulse density of approximately 5.06 pulse·m −2 , horizontal resolutions of 1 m to 10 m were chosen (which were multiples of the footprint size and average ground point distances). Vertical resolutions of 0.5 m and 1 m were chosen to correspond to roughly three and six sampling intervals of the returned waveform. A sensitivity analysis was performed using CV-metrics (i.e., Open gap, Oligophotic, Euphotic, Closed gap, Filled, Empty).

Weibull Fitting Approach
Canopy height distributions (CHD), which describe vertical distributions of foliage elements and non-photosynthetic tissues within canopy spaces, were used to measure the distribution of laser returns within the 0.3-m bins (i.e., a 30 × 30 × 0.3 m 3 rectangular section) from the ground to canopy top [48,49]. In this study, a two-parameter Weibull density function (PDF) was used to describe CHD on each plot. As a Weibull model is highly adaptive, ranging from an inversed J-shape to unimodal skewed and unimodal symmetrical curve, the Weibull model has flexibility in characterizing distributions of a range of forest attributes [50,51]. The two parameters, i.e., Weibull scale (α 1 ) and Weibull shape (β 1 ), were derived by the maximum likelihood estimation method. Weibull scale determines the basic shape of the distribution density curve and Weibull shape controls the breadth of the distribution [52]. Foliage profile (FP) can delineate the vertical distribution of canopy phytoelement (e.g., leaf, stem, twig, etc.) density above the ground within a forest stand [37]. FP is defined as the total one-sided leaf area that is involved in photosynthesis per unit canopy volume at canopy height z, and describes changes in the leaf area distribution with increasing height [53]. FP is highly related to leaf area index (LAI), which was demonstrated in previous studies [35,54], and the relationship between FP and LAI is: where L(z) is the cumulative leaf area index (LAI c ) from the ground to a given height z; FP(z) represents the foliage area volume density at height z (is the vertical foliage profile in a thin layer or "slice" through a canopy as a function of height z); z 1 and z 2 are different canopy height. A height interval or each vertical "slice" was 0.3 m. Meanwhile, we assumed that foliage elements in a thin "slice" were very small so that occlusion can be neglected, and leaves presented Poisson random distribution. Because airborne LiDAR is incapable of resolving foliage angle distribution, clumping and non-foliage elements, the foliage profiles derived from airborne LiDAR are referred to here as "apparent" foliage profiles and effective LAI [37]. In this study, LAI can be indirectly determined from LiDAR by estimating the derived gap probability in the canopy [37,38], and the gap probability be estimated as the total number of laser hits up to a height z relative to the total number of LiDAR shots as follows: where P gap (z) is a gap probability measurement at height z, #z is the number of hits down to a height z above the ground, and N is the total number of shots emitted up to the sky. Previous studies have showed that Weibull distribution function can also delineate vertical foliage profiles distributions [37,55]. In this study, the Weibull fitted scale parameter (α 2 ) and shape parameter (β 2 ) were derived from the apparent FP by linking Weibull cumulative function to cumulative projected foliage area index [37,38]: where α 2 and β 2 are fitted parameters, z is the height, and H is the maximum height in a plot. Moreover, another suite of standard metrics were calculated, including height-based (HD) metrics (h 25 , h 50 , h 75 , h 95 , h mean , h cv , h skewness and h kurtosis ) and density-based (DB) metrics (d 1 , d 3 , d 5 , d 7 , d 9 , CC 2m ). A summary of these metrics with corresponding descriptions is shown in Table 2. The skewness and kurtosis of the heights of all points.

Density-based
Canopy return density (d 1 , d 3 , d 5 , d 7 and d 9 ) The proportion of points above the quantiles (10th, 30th, 50th, 70th and 80th) to total number of points.
Canopy cover above 2 m (CC 2m ) Percentages of first returns above 2 m.

Canopy metrics
Canopy volume Filled and Empty zones of CVM (i.e., Filled and Empty) The voxels contained point clouds and voxels contained no point clouds within canopy spaces.
Open and Closed gap zones of CVM (i.e., Open gap (OG) and Closed gap (CG)) The empty voxels located above and below the canopy respectively.
Euphotic and Oligophotic zones of CVM (i.e., Euphotic (Eu) and Oligophotic (Oligo)) The voxels located within an uppermost percentile (65%) of all filled grid cells of that column, and voxels located below the point in the profile Weibull-fitted α 1 and β 1 parameter of Weibull distribution The scale parameter α and shape parameter β of the Weibull density distribution fitted to CHD. α 2 and β 2 parameter of Weibull distribution The scale parameter α and shape parameter β of the Weibull density distribution fitted to FP.

Metrics Selection and Statistical Analysis
All of the LiDAR metrics in Table 2 were used to analyze pair-wise relationships among different forest structural parameters (DBH, Lorey's mean height, stem density, basal area, volume and AGB) by Pearson's correlations (r). Then the metrics with low correlations (r < 0.2) were excluded and candidate metrics were used in the regression analysis. In the multiple regression analysis, all of the dependent variables and independent variables were transformed using the natural logarithm to improve linearity and corrected for bias using a bias correction factor (BCF) [56]. Some studies have applied log transformations to both dependent variables and independent variables for estimations of forest parameters [57,58]. Multiple regression models including forest type-specific (coniferous forest, broad-leaved forest, and mixed forest) models and general models of all plots were then established. Both stepwise variable selection and the maximum coefficient of determination (R 2 ) improvement variable selection techniques were applied to select the metrics to be included in the models [59]. Independent variables were left in the model using an F-test with a p < 0.05 significance level. The standard least-squares method was used [60].
To ensure that the independent variables were not highly correlated, multicollinearity was evaluated using Principal Component Analysis (PCA) based on the correlation matrix. Models with condition number (k) lower than 30 were accepted to ensure that there was no serious multicollinearity in the selected models [57]. The best fitting models were then selected based on the lowest Akaike information criterion value [61]. The accuracies of predictive models were evaluated using adjusted coefficient of determination (Adj-R 2 ), Root-Mean-Square Error (RMSE), which has been transformed back to original scale, and relative RMSE (rRMSE), which are defined as the percentage of the ratio of RMSE and the observed mean values. In this study, dummy variables (or class variables) were added to the selected models as the dependent variables to assess whether these models differ between forest types [62]. Once the best models were chosen, leave-one-out cross-validation was performed to evaluate the predictive accuracies of the models [63].
where x i is the observed value for plot i, x is the observed mean value for plot i,x i is the estimated value for plot i, n is the number of plots i, and p is the number of variables.

Profile Analysis
The plots of each forest type were stratified into three groups (low, medium, and high), according to the Lorey's mean height from low to high. In each group, three plots were selected, and a total of nine typical plots were selected. For the typical plots, CVD, CVP and FP were extracted, as shown in  In addition, Figure 8 shows the mean LAI c for plots in different forest types and mean CVD.
Remote Sens. 2017, 9,940 10 of 26 where i x is the observed value for plot i, x is the observed mean value for plot i, i x is the estimated value for plot i, n is the number of plots i, and p is the number of variables.

Profile Analysis
The plots of each forest type were stratified into three groups (low, medium, and high), according to the Lorey's mean height from low to high. In each group, three plots were selected, and a total of nine typical plots were selected. For the typical plots, CVD, CVP and FP were extracted, as shown in Figures 5-7. In addition, Figure 8 shows the mean LAIc for plots in different forest types and mean CVD. Figure 5 shows the spatial arrangements of four canopy structure classes for coniferous, broadleaved, and mixed forest plots. Generally, Oligophotic zones were larger than euphotic zone in filled volume; coniferous forests had the largest open gap zone and the smallest closed gap zone, whereas broad-leaved forests plots had a larger and wider spread of closed gap zone than mixed forest. Similarly, the percentage of closed gap volume was larger in broad-leaved forests than in mixed forests, and the lowest percentage of closed gap volume was in coniferous forests ( Figure 6). The mean CVPs (Figure 6d,h,l) show that the percentages of open gap volume were the highest in coniferous forests, and the differences were not significant between the percentages of open gap volume in broad-leaved forests and mixed forests. The percentages of filled volume in coniferous and mixed forests were significantly higher than in broad-leaved forest, and the differences for the percentage of filled volume between coniferous and mixed forest were not significant.  relatively high for mixed forests, followed by coniferous forests and broad-leaved forests (Figure 8a). Above the tree height of 12 m, the increasing slope of the mean LAIc of the broad-leaved forests with increasing tree height was higher than that of coniferous forests, and maintained a relative high increasing trend, whereas the increasing trend of coniferous forests and mixed forests gradually tended to saturate. As a result, the mean LAIc value of broad-leaved forests was eventually higher than that of mixed forests, and lowest for coniferous forests.

Accuracy Assessments
The selected metrics and accuracy assessment results of all the multi-regression models (i.e., SM models, CM models, and combination models) are shown in Tables A1-A3 and Table 3 summarizes their accuracies. All of the forest structural parameters were generally well estimated (Adj-R 2 = 0.39-0.88, rRMSE = 5.13-29.86%). Overall, Lorey's mean height (Adj-R 2 = 0.61-0.88, rRMSE = 5.13-12.79%) and AGB (Adj-R 2 = 0.54-0.81, rRMSE = 12.19-28.42%) was predicted most accurately. For volume,  Figure 5 shows the spatial arrangements of four canopy structure classes for coniferous, broad-leaved, and mixed forest plots. Generally, Oligophotic zones were larger than euphotic zone in filled volume; coniferous forests had the largest open gap zone and the smallest closed gap zone, whereas broad-leaved forests plots had a larger and wider spread of closed gap zone than mixed forest. Similarly, the percentage of closed gap volume was larger in broad-leaved forests than in mixed forests, and the lowest percentage of closed gap volume was in coniferous forests ( Figure 6). The mean CVPs (Figure 6d,h,l) show that the percentages of open gap volume were the highest in coniferous forests, and the differences were not significant between the percentages of open gap volume in broad-leaved forests and mixed forests. The percentages of filled volume in coniferous and mixed forests were significantly higher than in broad-leaved forest, and the differences for the percentage of filled volume between coniferous and mixed forest were not significant.
Weibull models were fitted to canopy foliage distribution and matched the shape of foliage profile relatively well (Figure 7). In general, the FP profiles first exhibited a strong increasing trend, followed by a decreasing trend. Particularly, the peaks of FP in coniferous and mixed forests occurred in the lower or middle portions of the canopies whereas the peaks of broad-leaved forests were distributed more toward middle and upper portions of the canopies. Comparing with the mean foliage profiles and Weibull curves of three forest types, the curve showing the spatial distribution of FP values was smoother in broad-leaved forests than those of coniferous or mixed forests. The Weibull shapes of mixed forest canopy were slightly steeper than those of coniferous forest stands, indicating a wider spread of foliage within the canopy (Figure 7d,h,l). This same trend can be seen in the mean CHDs (Figure 8b-d).
The mean LAIc values below the threshold of 12 m (approximately middle canopy) were relatively high for mixed forests, followed by coniferous forests and broad-leaved forests (Figure 8a). Above the tree height of 12 m, the increasing slope of the mean LAIc of the broad-leaved forests with increasing tree height was higher than that of coniferous forests, and maintained a relative high increasing trend, whereas the increasing trend of coniferous forests and mixed forests gradually tended to saturate. As a result, the mean LAIc value of broad-leaved forests was eventually higher than that of mixed forests, and lowest for coniferous forests.
For all of the general SM models, the standard metrics that were regressed against for fitting models included most of the standard metrics, indicating those had a relatively strong correlation with forest structural parameters. Overall, h 95 (selected by four out of six models), d 7 (selected by four out of six models), d 3 , h cv and d 9 (each of them was selected by three out of six models) were the most frequently selected, indicating these metrics are more sensitive and representative to the forest structural parameters. For general CM models, all of CV metrics and WF metrics were selected for estimating forest structural parameters. Within CV metrics, the statistic of Oligophotic (all selected by six models), Empty (selected by four out of six models) and Open (selected by four out of six models) were sensitive to forest structural parameters and these metrics were selected both in the general models and forest type-specific models, suggesting that the three metrics have a strong ability to explain variations. Within WF metrics, α 1 was relatively sensitive to structural parameters (selected two out of six models). In six general combination models, most of standard metrics (nine out of 14) and canopy metrics (four out of total 10) were used in combination for parameter estimations. The metrics of Oligophotic, Empty, h 95 remained sensitive to structural parameters (selected by 2-4 out of six general combo models). Moreover, h 75 , d 1 and β 1 (selected by 2-3 out of 6) became more sensitive for DBH, Lorey's mean height, and stem density in combination models than SM models. Figure 9 shows the LiDAR estimated versus the field measured forest structural parameters as well as the results for cross-validation in all plots models based on standard metrics and canopy metrics. As indicated, Lorey's mean height and AGB models were fitted best and resulted in R 2 values of 0.79 and 0.66, followed by DBH (R 2 = 0.60), volume (R 2 = 0.60) and basal area (R 2 = 0.52), whereas the accuracy of stem density model was the lowest (R 2 = 0.49). For Lorey's mean height, AGB, DBH, and volume estimations, their relationships were close to the 1:1 line whereas basal area and stem density had a relationship that deviated from the 1:1 line, with a slightly larger deviation in broad-leaved forests.

The Selection of Voxel Sizes
In this study, a sensitivity analysis was performed using different voxel sizes to derive CV metrics based on CVM approach and to quantify their influence on the results. As shown in Figure  10, a quantitative comparison of estimation accuracy for four main forest structural parameters (i.e., DBH, Lorey's mean height, stem density, and basal area) was performed. In general, the R 2 values of the models showed a trend of first increasing and then decreasing when horizontal resolutions of voxels were varied from 1 m to 10 m (Figure 10a,b), and the voxels in horizontal resolution of 5 m had the best performance. Figure 10a was subtracted from Figure 10b to calculate the result of Figure  10c, which demonstrated the difference of rRMSE values of forest structural parameters for various vertical resolutions (0.5 m and 1 m). The values presented were mostly positive, except for some of the differences were negative (e.g., G at 3 m horizontal resolution) (Figure 10c). In particular, DBH and stem density models had all positive values across 1 m to 10 m of horizontal resolutions, indicating the two parameters were strongly influenced by the vertical resolution of the voxels. As a result, the suitable voxel size in this study was 5 × 5 × 0.5 m 3 .

The Selection of Voxel Sizes
In this study, a sensitivity analysis was performed using different voxel sizes to derive CV metrics based on CVM approach and to quantify their influence on the results. As shown in Figure 10, a quantitative comparison of estimation accuracy for four main forest structural parameters (i.e., DBH, Lorey's mean height, stem density, and basal area) was performed. In general, the R 2 values of the models showed a trend of first increasing and then decreasing when horizontal resolutions of voxels were varied from 1 m to 10 m (Figure 10a,b), and the voxels in horizontal resolution of 5 m had the best performance. Figure 10a was subtracted from Figure 10b to calculate the result of Figure 10c, which demonstrated the difference of rRMSE values of forest structural parameters for various vertical resolutions (0.5 m and 1 m). The values presented were mostly positive, except for some of the differences were negative (e.g., G at 3 m horizontal resolution) (Figure 10c). In particular, DBH and stem density models had all positive values across 1 m to 10 m of horizontal resolutions, indicating the two parameters were strongly influenced by the vertical resolution of the voxels. As a result, the suitable voxel size in this study was 5 × 5 × 0.5 m 3 .

Canopy Vertical Profiles
Canopy is an important constituent of forest structure [64], and canopy structure is critical for estimation of forest structural parameters [65]. Canopy vertical profile is one of the means to quantify and analyze complex forest canopy structure and further characterize the potential heterogeneity of forest spatial structure [66]. A wide range of forest structural parameters can be directly quantified from canopy vertical profiles such as canopy height and canopy vertical distribution [67]. Also, a set of forest structural parameters (aboveground biomass, basal area, volume, LAI, canopy cover, etc.) can be predicted by establishing empirical models from LiDAR data [68]. In this study, a voxel-based CVM and Weibull fitting approach were conducted to extract two key suites of metrics for estimating forest structural parameters and derive correlative canopy vertical profiles including CVD, CVP, CHD, FP, and LAIc. As mentioned above, the CVM approach provides a broad classification approach to categorize the canopy into photosynthetically active and less active zones [39]. Therefore, it can better reflect the spatial heterogeneity of forest structure, which is caused by the difference of light environment in the canopy. Furthermore, the CVP explicitly presented variation in the spatial arrangement of elements (i.e., open gap, euphotic, oligophotic, closed gap) within the vertical forest canopy [38]. As shown in Figures 5 and 6, the broad-leaved forests had the largest closed gap volume and the smallest open gap volume when compared to coniferous forests and mixed forests. The explanations of these phenomena need to take into account the canopy geometry and tree architecture [36]. At our research site, coniferous forests are dominated by Masson pine and slash pine; these species usually consist of a regular and conical crown, demonstrating a heavily thinned upper canopy and a dense sub-canopy ( Figure 3). Furthermore, more open upper canopies in coniferous stands allow more light to pass through to the lower canopy strata [69,70], so a shrubby understory may incrementally emerge, resulting in the most open gap and the lowest closed gap zones in coniferous forests. Conversely and notably, broadleaves with elliptical or spherical crown are very tall and have

Canopy Vertical Profiles
Canopy is an important constituent of forest structure [64], and canopy structure is critical for estimation of forest structural parameters [65]. Canopy vertical profile is one of the means to quantify and analyze complex forest canopy structure and further characterize the potential heterogeneity of forest spatial structure [66]. A wide range of forest structural parameters can be directly quantified from canopy vertical profiles such as canopy height and canopy vertical distribution [67]. Also, a set of forest structural parameters (aboveground biomass, basal area, volume, LAI, canopy cover, etc.) can be predicted by establishing empirical models from LiDAR data [68]. In this study, a voxel-based CVM and Weibull fitting approach were conducted to extract two key suites of metrics for estimating forest structural parameters and derive correlative canopy vertical profiles including CVD, CVP, CHD, FP, and LAI c . As mentioned above, the CVM approach provides a broad classification approach to categorize the canopy into photosynthetically active and less active zones [39]. Therefore, it can better reflect the spatial heterogeneity of forest structure, which is caused by the difference of light environment in the canopy. Furthermore, the CVP explicitly presented variation in the spatial arrangement of elements (i.e., open gap, euphotic, oligophotic, closed gap) within the vertical forest canopy [38]. As shown in Figures 5 and 6, the broad-leaved forests had the largest closed gap volume and the smallest open gap volume when compared to coniferous forests and mixed forests. The explanations of these phenomena need to take into account the canopy geometry and tree architecture [36]. At our research site, coniferous forests are dominated by Masson pine and slash pine; these species usually consist of a regular and conical crown, demonstrating a heavily thinned upper canopy and a dense sub-canopy ( Figure 3). Furthermore, more open upper canopies in coniferous stands allow more light to pass through to the lower canopy strata [69,70], so a shrubby understory may incrementally emerge, resulting in the most open gap and the lowest closed gap zones in coniferous forests. Conversely and notably, broadleaves with elliptical or spherical crown are very tall and have positively skewed canopies with a lower canopy transparency in this study area, as indicated by the large decrease in open gap zones. Additionally, the closed canopy volume generally increased with decreasing stand density [55], hence the broad-leaved forests with a lower stem density (1126.00 ha −1 ) also had a more closed canopy gap. Although with a much more shrubby understory, mixed conifer-broadleaf forests generally encompass median height broadleaved trees [65] with a high stem density (1431.78 ha −1 ) and canopy transparency, resulting in a higher amount of closed gap volume than coniferous forests and a slightly higher amount of open gap volume. On the other hand, as Yushan forest is in secondary succession, the forest canopy surface became more uneven, and the competitions among shade-intolerant species (e.g., Masson pine, Chinese sweet gum) were accelerated and further inhibited the establishment and growth of these species [71,72]. As a result, in late-successional stage, the shade-tolerant species (e.g., Oriental oak, camphorwood and Chinese holly) eventually dominated the canopy [69,72,73] and coexisted with other species. This process could cause the transmittance of light through the canopy to decline [74], which may result in an increase the spatial heterogeneity of the light environment [75,76] and a further enhancement of more microsite light availability in lower canopies [70,[76][77][78][79]. Thus, for each forest type, the oligophotic zone, which represented a larger proportion of the total filled volume compared to the euphotic zone that represented photosynthetically active tissues ( Figure 6). As mentioned above, the canopy architectures of the three forest types can help explain why the distributions of FP and CHD in coniferous forests and mixed forests inclined to the under canopy, whereas the curves of broad-leaved forests were distributed more towards the middle or upper canopy (Figures 7 and 8b-d).
In general, due to a thinner upper canopy and dense under canopy for each forest type, the mean LAI increased rapidly and shifted to an infinitesimal increment from the ground up to the top of the canopy (Figure 8a). Below the threshold of 12 m (approximately middle canopy), dense foliage accumulated in the lower canopy of mixed forests and coniferous forests but mixed forests had more understory shrubs and slightly denser canopies than coniferous forests whereas broad-leaved forest had less shrubbery; therefore, there was a dramatically increased LAIc in mixed forests, followed by coniferous forests and broad-leaved forests. Along with still moderate density of foliage near the upper canopy in broad-leaved forests as well as thinned density of foliage in mixed forests and coniferous forests, the mean LAIc increased trend remained relatively stable in broad-leaved forests compared to other forest types. Eventually, broad-leaved forests had the highest mean LAIc, followed by mixed forests and coniferous forests, which is consistent with the findings of previous studies [80,81].

Predictive Models
In comparison, the forest type-specific models had higher accuracies (Adj-R 2 = 0.44-0.88, rRMSE = 5.13-28.42%) than the general models (Adj-R 2 = 0.39-0.77, rRMSE = 8.54-29.86%). Bouvier et al. (2015) [14] developed a separate model for coniferous, deciduous and mixed stands to estimate forest structural parameters in the Lorraine forests. The results demonstrated that the separate models reduce estimation errors (2.0-5.3%) compared to general models in some complex forests conditions, which was confirmed by our research results. Fu et al. (2011) [82] reported R 2 values for AGB of 0.37 of the general model and 0.43-0.68 of forest type-specific models in subtropical forests (located in southern Yunnan province, China). In our study site, the multi-layered forest conditions in subtropical forests contained greater species diversity, making the effects of tree-species composition (classified as forest types) significant. Overall, the models of the forest structural parameters were relatively more accurate for coniferous forests than broad-leaved forests and mixed forests. The relationships between stand structure and the forest structural parameters are species-dependent, and coniferous forests are usually characterized by relatively simple stand structures when compared with broad-leaved or mixed stands. So it is likely that the model prediction accuracy may decrease in multispecies stands [14]. Xu et al. (2015) [83] estimated forest structural parameters (i.e., Lorey's mean height, stem density, basal area and volume) in the subtropical deciduous mixed forests (on Purple Mountain, located in eastern Nanjing), using canopy height metrics (i.e., height percentile, mean height, maximum height and minimal height) and canopy density metrics. Compared with our results (rRMSE = 5.13-22.28%), theirs showed a relatively lower rRMSE for Lorey's mean height (6.47%), stem density (27.04%), basal area (16.38%), and volume (6.93%). Compared with canopy metrics-based models (Adj-R 2 = 0.39-0.83, rRMSE = 6.94-29.26%), standard metrics-based models had relatively higher performance (Adj-R 2 = 0.42-0.84, rRMSE = 5.60-29.86%), except for the volume and AGB (both in forest type-specific models). The combination models performed best (Adj-R 2 = 0.45-0.88, rRMSE = 5.13-28.96%), explaining a large amount of the variability for all forest structural parameters and indicating the increased utility of canopy metrics in capturing spatially explicit information describing a heterogeneous forest structure. For DBH, stem density, basal area, and AGB,   [36] reported adjusted R 2 values of 0.61, 0.52, 0.87, and 0.91 in boreal forests, markedly higher than reported in this study. The cause of the lower performance in this research is likely the complex structure of the subtropical forests, which are typified by multi-layered forests which that encompass some stands with considerable variability in tree height and stem density, especially in old-growth stands, whereas boreal forests have a much higher homogeneous composition and more discernible canopy architecture.
When considering the selected frequency of metrics in the fitted models, for CV metrics, Oligophotic, Empty metrics were mostly selected by the combination models. This may be explained by the higher proportions of canopy elements ( Figures 5 and 6), which were oligophotic, empty volume zones, revealing the strong sensitivity and representativeness of these two metrics to local forest structures. Previous studies [38,84] found that the Weibull scale and shape parameter were related to canopy attributes (e.g., crown depth and crown length), hence two Weibull parameters were both selected by the CM models of structural parameters (e.g., mean diameter, Lorey's mean height, basal area, and volume models) linked to canopy attributes. In combination models, the most selected WF metrics were α 1 and β 1 , indicating that both of them are suitable for estimating structure parameters in local forests. The capacity of the Weibull parameters to represent the key attributes of mean crown dimension is important, as it provides a mechanism to summarize complex canopy characteristics into simple parameters that can be empirically analyzed in relation to various forest stand characteristics. The two-parameter Weibull model was applied for characterizing many types of FPs and CHDs in this study. In general, these profiles of single layer canopies corresponded well ( Figure 7). However, the unimodal Weibull distribution function applied to the profile is inadequate to describe properly multimodal structure, which may occur in multi-layered, multi-age, complex forest stands [52]. Thus, the relatively poor fit for multi-layered forests could result in errors in estimates of structural parameters, which may explain why the Weibull parameters are not statistically more significant predictors than CV metrics. In this regard, future work could focus on how to apply a multi curve fitting approach in order to further capture the full distributions of canopy vertical profiles. On the other hand, different plot selection strategies could influence the performance of predictive models [85]. The plot selection in this study was only according to forest type, thus our future work could also examine different plot selection strategies of field training plots (e.g., using LiDAR data or geographical factors as a prior information, etc.) and utilize a suitable strategy to improve the estimation accuracy of forest structural parameters.

The Selection of Voxel Sizes
Voxels representing canopy elements such as trunks and branches were abstracted by a volume grid and placed in a 3D grid [86]. As a method of volume visualization of LiDAR points, voxels have already been applied to airborne LiDAR data for improving calculations of forest attributes [87,88]. Voxel size is a key parameter pertaining to the scale of forest structural parameter estimates to the physical dimension of canopy components [89]. Thus, a sensitivity analysis was conducted to investigate the influence of various voxel sizes on forest structural estimations. As shown in Figure 10, in very low (i.e., 1 × 1 × 0.5 m 3 ) or high (i.e., 10 × 10 × 1 m 3 ) resolution conditions, the R 2 values showed a relatively lower performance. If voxels are too small, a voxel-based CVM approach may produce redundant unfilled voxels of empty volume containing few tree canopy elements, which may lead to the underestimation of forest structural parameters; however, too large voxels may lead to too few voxels and result in statistically insignificant descriptions of canopy features [90]. In these conditions, the voxel approach could become ineffective at characterizing the vertical distribution of various canopy structures and the capability to capture 3D heterogeneity of canopy structure for CV metrics could be constrained, hence resulting in relatively lower performances of the models. After taking into account factors of plot size (30 × 30 m 2 ), point cloud densities (3.74 pts·m −2 ), etc., Hilker et al. (2010) [39] used a voxel size of 6 × 6 × 1 m 3 for discrete airborne LiDAR data to estimate the tree height and LAI in Douglas-fir-dominated forest stands with relatively high tree heights (30-35 m). Concerning a much higher point cloud density (5.06 pts·m −2 ) of LiDAR data and relatively lower tree heights (4.47-18.52 m) in this study site, a 1 m vertical resolution produced more coarse data than the vertical resolution of 0.5 m (approximately treble the temporal sample spacing of 1 ns (15 cm)), thus, constraining the ability of canopy volume metrics to describe the vertical variability of the forest canopy structures. Moreover, potential tree movement due to wind between laser acquisitions is also considered a source of uncertainty, as laser returns from the same target can be located in different voxels for different laser acquisitions. By using a voxel size larger than the pulse diameter, this issue can be slightly reduced [91]. Overall, the optimal voxel size is a key parameter to determine in order to improve characterizations of forest structure [92,93]. Consequently, the optimal voxel spatial resolution should be determined based on plot size, the characteristics of the LiDAR instrument used (e.g., beam diameter, footprint size, average point density and temporal sample spacing, etc.), and forest structure attributes (e.g., tree height, crown diameter, crown depth, etc.)

Conclusions
In this study, a set of canopy metrics derived from canopy vertical profiles, which has the potential to aid in our understanding of the physical characteristics of forest structure, was extracted. The capability of the standard metrics (extracted from the point cloud data) and canopy metrics for estimating forest structural parameters (i.e., DBH, Lorey's mean height, stem density, basal area, volume, and AGB) was assessed, individually and in combination, over a subtropical forest in southeastern China. Moreover, a sensitive analysis of different voxel sizes was performed to investigate the optimal voxel size for estimating forest structural parameters.
The results demonstrated that the forest type-specific models had relatively higher accuracies (Adj-R 2 = 0.44-0.88, rRMSE = 5.13-28.42%) compared with the general models (Adj-R 2 = 0.39-0.77, rRMSE = 8.54-29.86%). The estimation accuracies of Lorey's mean height and AGB were the highest, followed by volume, DBH and basal area, whereas stem density was relatively lower. Overall, metrics of Oligophotic, Empty, Open, α 1 were the most frequently selected, indicating their potential capability for predicting forest structural parameters in the forest stands within the study site. The results demonstrated the synergistic use of standard metrics and canopy metrics for better predicting forest structural parameters (∆Adj-R 2 = 0.01-0.20, ∆rRMSE = −5.71-1.39%), compared with models developed using standard metrics (only) and canopy metrics (only). In addition, the optimal voxel size for estimating forest structural parameters in this study is 5 × 5 × 0.5 m 3 , and the voxel vertical and horizontal resolutions should be determined based on plot size, the characteristics of the acquired LiDAR data (i.e., beam diameter, footprint size, average point density, and temporal sample spacing) and forest structure attributes (i.e., tree height, crown diameter, and crown depth).
Acknowledgments: The project was funded by the Natural Science Foundation of Jiangsu Province (No. BK20151515) and the National Natural Science Foundation of China (No. 31400492). This research was also supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD). Special thanks to Xin Shen, Kun Liu, and Ting Xu for field works. The authors gratefully acknowledge the foresters in Yushan forest for their assistance with data collection and sharing their rich knowledge and working experience of the local forest ecosystems.
Author Contributions: Zhengnan Zhang analyzed the data and wrote the paper. Lin Cao helped in project and study design, paper writing, and analysis. Guanghui She helped with data analysis and paper writing.