Forests 2014, 5(2), 363-383; doi:10.3390/f5020363
Published: 24 February 2014
Abstract: The objective of this study was to evaluate the applicability of using a low-density (1–3 points m−2) discrete-return LiDAR (Light Detection and Ranging) for predicting maximum tree height, stem density, basal area, quadratic mean diameter and total volume. The research was conducted at the Penobscot Experimental Forest in central Maine, where a range of stand structures and species composition is present and generally representative of northern Maine’s forests. Prediction models were developed utilizing the random forest algorithm that was calibrated using reference data collected in fixed radius circular plots. For comparison, the volume model used two sets of reference data, with one being fixed radius circular plots and the other variable radius plots. Prediction biases were evaluated with respect to five silvicultural treatments and softwood species composition based on the coefficient of determination (R2), root mean square error and mean bias, as well as residual scatter plots. Overall, this study found that LiDAR tended to underestimate maximum tree height and volume. The maximum tree height and volume models had R2 values of 86.9% and 72.1%, respectively. The accuracy of volume prediction was also sensitive to the plot type used. While it was difficult to develop models with a high R2, due to the complexities of Maine’s forest structures and species composition, the results suggest that low density LiDAR can be used as a supporting tool in forest management for this region.
Data on forest structure, such as stem density, basal area and timber volume, are used in both strategic and tactical forest management plans. To achieve the goals of sustainable forest management, managers need accurate information on forest structure and condition at a variety of spatial scales, including the stand, landscape or regional levels, depending on management objectives. Conventionally, information acquired at the ground plot level is collected and expanded for estimates of total tree volume per stand, per county or even larger areas. However, conventional field measurements generally rely on a limited number of sampling plots per stand, so structural variability within and between stands may not be fully accounted for.
In contrast, airborne discrete-return LiDAR (Light Detection and Ranging), a type of remote sensing, has been widely accepted as an appropriate technology and supporting tool in ecosystem studies and sustainable forest management [2,3,4,5]. Using an airborne discrete-return LiDAR system, forest managers can deploy a robust and reliable data sampling approach to complement conventional field measurements for estimating volume or other forest inventory attributes from the plot to landscape level [5,6]. The ability of LiDAR to inform forest management decisions has been demonstrated in a range of forest types, including boreal forest , mixed softwood , mixed hardwood  and single-species softwood plantations .
However, there are three major issues associated with airborne discrete-return LiDAR system-based forest inventory estimations. First, LiDAR tends to underestimate tree heights, because the pulse hits directly on treetops are generally insufficient [10,11]. Furthermore, pulse returns are difficult to discriminate, either from other nearby treetops or objects other than treetops (e.g., from bare ground, understory vegetation and sides of crowns), a problem which has been avoided in some studies by arbitrarily defining a fixed threshold height [9,10,12,13]. For instance, bare-ground is presumably 1 m below the lowest pulse return to account for the height of understory vegetation [10,13], which can differ among forest ecosystems and silvicultural regimes. Thus, certain preliminary information is necessary to define threshold heights, particularly in northern Maine, where the forests have extensive advance regeneration, due to past and present silvicultural treatments . Finally, extracting height information accurately at the individual tree level may not be possible from LiDAR data, despite a number of studies that have pursued such a goal [15,16]. Thus, LiDAR-based predictions need to be carried out with a different approach, because conventional volume or biomass equations often require both individual tree diameter and height information.
Regarding the second issue, LiDAR pulse footprint sizes and pulse densities may strongly affect prediction accuracy levels in forest inventory estimations. Nilsson  reported that three different footprint sizes (0.75, 1.50 and 3.00 m in diameter) did not affect mean tree height estimations. However, Thomas et al.  suggested that smaller pulse footprint sizes might be suitable for acquiring subdominant canopy information, while Zimble et al.  reported that a low pulse density LiDAR (0.5 pulses m−2) resulted in insufficient pulse direct hits on treetops; thus, height estimations at the stand-level were significantly underestimated compared to field measured tree heights. Popescu and Wynne , as well as Falkowski et al.  suggested that individual tree-based estimation needs a rather high LiDAR pulse density (6–8 pulses m−2) to provide a sufficient number of pulse hits at treetops. However, some prior studies established strong correlations between LiDAR metrics and forest inventory attributes on plot-level based on low pulse density LiDAR (<2 pulse m−2) [18,21,22,23,24]. For example, Treitz et al.  reported that a low pulse density, such as 0.5 pulses m−2, was sufficient for forest inventory attribute prediction regarding tactical forest management. Thus, if the objective is to predict forest attributes at the plot and stand level (instead of the individual tree level), relatively low pulse density LiDAR should be sufficient.
Regarding the third issue, few LiDAR studies have been reported for relatively complex forest structures, such as those that are predominant in northern Maine. While a number of studies have reported that low pulse density LiDAR metrics and field measured forest inventory attributes, particularly volume-, height- and biomass-related attributes, on plot- and stand-levels showed a relatively high coefficient of determination (R2) in other forest ecosystems, limited work has been done in regions dominated by mixed species and multi-canopy stands. Recently, Anderson and Bolstad  evaluated the use of LiDAR in various forest types in the Great Lakes, which are quite similar to those found in Maine, and found a strong relationship between ground-based measurements, regardless of whether the LiDAR data was collected with hardwood species leaf-on or leaf-off.
We assessed the feasibility of predicting various plot- and stand-level forest inventory attributes based on airborne low-density discrete-return LiDAR in a range of stand structures and species composition that are representative of northern Maine’s forest. The primary objectives of this analysis were to: (1) establish empirical relationships between LiDAR data and forest inventory attributes, such as maximum tree height, stem density, quadratic mean diameter (QMD), basal area and stem volume; (2) assess prediction accuracy across a range of silvicultural treatments and species compositions; and (3) evaluate the influence of reference data acquired from research- and operational-grade sampling protocols on attribute predictions.
2. Experimental Section
2.1. Study Area
The study was conducted on the Penobscot Experimental Forest (PEF) near Orono, Maine, USA (44°49′30′′ N, 68°39′00′′ W) (Figure 1). The PEF was established in 1952 by the U.S. Forest Service, and a number of studies regarding timber management, stand dynamics, productivity, biological diversity and more have been conducted within the PEF . The total area of the PEF is 1619 ha, and various silvicultural treatments (e.g., natural area, clearcut, shelterwood and diameter-limit cutting) have been twice replicated for long-term observations. The treatments generally range in a size of 0.5 to 22.4 ha and are representative of typical northern Maine’s silvicultural practices (Table 1). With a few exceptions, most treatments are replicated in the PEF, and field data (e.g., diameter at breast height (DBH)) for each of the replicated treatments are collected at about 600 permanent sampling plots on a 10-year cycle.
Overall, the PEF is defined as a mixed northern conifer dominant forest as a part of the Acadian ecosystem . The major hardwood species in the PEF are red maple (Acer rubrum L.), birches (Betula spp.) and aspens (Populus spp.), while the major softwood species are spruces (Picea spp.), balsam fir (Abies balsamea L. (Mill.)), northern white cedar (Thuja occidentalis L.) and eastern white pine (Pinus strobus L.). The range of elevation above sea level is between 20 and 70 m.
2.2. Inventory Attributes Data
For this study, eleven replicated management units (a total of 22 silvicultural treatment units) that varied from 2.86 to 19.58 ha in size were selected (Figure 1 and Table 1). Within these 22 management units, a total of 117 fixed-radius, nested, circular permanent sampling plots were used, with 3–7 plots in each management unit. On each 0.02-ha (1/20th-acre) permanent sampling plot, diameter at breast height (DBH) was measured between 2003 and 2010, depending on the management unit, on all trees with a DBH greater than 6.35 cm (2.5 inches). On each 0.08-ha (1/5th-acre) permanent sampling plot, DBH was measured on all trees with a DBH greater than 11.43 cm (4.5 inches). On a subsample of permanent sampling plots (n = 117), the total height (HT) and height to crown base were measured on all trees within the 0.08-ha plot.
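As a rough illustration of how per-hectare attributes follow from a nested fixed-radius tally, the sketch below weights each tree by the inverse of the plot area on which it was measured. The function name, the input layout and the assumption that trees between the two DBH thresholds are tallied only on the 0.02-ha plot are illustrative choices, not the study's processing code.

```python
import math

SMALL_PLOT_HA = 0.02          # inner plot: trees above 2.5 in DBH
LARGE_PLOT_HA = 0.08          # outer plot: trees above 4.5 in DBH
SMALL_MIN_CM = 2.5 * 2.54     # 2.5 inches expressed in cm
LARGE_MIN_CM = 4.5 * 2.54     # 4.5 inches expressed in cm


def plot_attributes(dbh_cm):
    """Return (stems per ha, basal area m2 per ha, QMD cm) for one plot.

    Assumption: trees above the larger threshold are tallied on the
    0.08-ha plot and smaller trees only on the 0.02-ha plot, so each
    tree is weighted by 1 / (area of the plot it was tallied on).
    """
    stems = 0.0
    basal_area = 0.0
    for d in dbh_cm:
        if d <= SMALL_MIN_CM:
            continue  # below the measurement threshold
        area = LARGE_PLOT_HA if d > LARGE_MIN_CM else SMALL_PLOT_HA
        weight = 1.0 / area
        stems += weight
        basal_area += weight * math.pi / 4.0 * (d / 100.0) ** 2
    if stems == 0:
        return 0.0, 0.0, 0.0
    # QMD is the diameter of the tree of mean basal area.
    qmd = math.sqrt(basal_area / stems * 4.0 / math.pi) * 100.0
    return stems, basal_area, qmd
```

Under these assumptions, two 20-cm trees on the 0.08-ha plot expand to 25 stems ha−1 with a QMD of 20 cm, while a single 10-cm tree on the 0.02-ha plot expands to 50 stems ha−1.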
|Table 1. Description of silvicultural treatments in management units (MUs) in the study area. DBH, diameter at breast height.|
|MU||Area (ha)||Treatment year||Inventory year||Plot (n)||Description of silvicultural treatment||Treatment group|
|4||10.1||1994||2009||4||Fixed diameter-limit cutting. Thresholds are 14.0 cm for balsam fir, 24.1 cm for spruce and hemlock, 26.7 cm for white pine, 19.1 cm for cedar and paper birch and 14.0 cm for other hardwoods.||Diameter-limit (DL)|
|24||9.4||1996||2005||4||Modified diameter-limit cutting. The third modified diameter-limit cut was applied in 1995. Portions of the stand are in the stem exclusion and understory reinitiation stages of development.||Diameter-limit (DL)|
|9||12.2||2003||2003||4||Five-year cutting cycle. The structural goal is to retain 24.1 m2 ha−1 (trees >11.4 cm).||Selection (SEL)|
|12||12.5||1994||2004||5||Ten-year cutting cycle. The structural goal is to retain 20.7 m2 ha−1 (trees >11.4 cm).||Selection (SEL)|
|17||10.9||1994||2005||5||Twenty-year cutting cycle. The structural goal is to retain 16.1 m2 ha−1 (trees >11.4 cm).||Selection (SEL)|
|13||13.2||1995||2009||8||Crop tree selection.||Selection (SEL)|
|7A||10.6||1979||2003||7||Two-stage uniform shelterwood. Overstory was removed in two harvests; unmerchantable trees >5.08 cm in DBH felled after final overstory removal.||Shelterwood (SHW)|
|23A||5.3||2007||2007||3||Three-stage uniform shelterwood with precommercial thinning (PCT). Manual PCT to a residual spacing of 2 × 3 m was applied in 1983. The canopy is not closed, and volunteer growth has occurred between crop trees.||Shelterwood (SHW)|
|6||19.6||1995||2010||7||Multi-stage shelterwood with retention. The overstory will be removed in a series of harvests at 10-year intervals, and approximately 2 overstory trees acre−1 will be retained through the next rotation.||Shelterwood (SHW)|
|8||17.6||1983||2008||7||Unregulated harvest/commercial clearcutting. This compartment was initially cut with unregulated (“loggers choice”) harvests. The second harvest was a commercial clearcut in 1982. The stands are in the stand initiation and stem exclusion phases of development.||Clearcut (CC)|
|32A||5.2||–||2009||3||Unmanaged natural area (partial cutting had been practiced prior to 1900).||Unmanaged (NAT)|
Based on DBH and HT, stem volume was calculated using a species-specific taper equation [27,28]. Given the time lag between plot measurement and the acquisition of the LiDAR data in the fall of 2010, the Acadian Variant of the Forest Vegetation Simulator was used to project DBH and HT to a common year, with the number of projections ranging from 1 to 7 annual cycles. Preliminary results indicated that projected inventory data improved the prediction models in comparison to using data that was not projected. Hereafter, this sampling method and the resulting data are called “research-grade”. All inventory attributes (maximum tree height, stem density, QMD, basal area and stem volume) were expressed in metric units at the plot level, and a total volume prediction was scaled to the management unit level (e.g., m3 management unit−1) from the mean of the plot-level data and the total area of the unit. Thus, a total of 117 plot-level and 22 management-unit data were available for analysis (Table 2).
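The scaling from plot level to management unit described above amounts to multiplying the mean per-hectare plot volume by the unit area. A minimal sketch, in which the function name and the example numbers are hypothetical:

```python
def unit_total_volume(plot_volumes_m3_ha, unit_area_ha):
    """Scale mean plot volume (m3 per ha) to a management-unit total (m3)."""
    if not plot_volumes_m3_ha:
        raise ValueError("at least one plot is required")
    mean_volume = sum(plot_volumes_m3_ha) / len(plot_volumes_m3_ha)
    return mean_volume * unit_area_ha


# Hypothetical example: four plots in a 10.1-ha unit.
example_total = unit_total_volume([100.0, 150.0, 200.0, 150.0], 10.1)
```

With these made-up plot values, the mean is 150 m3 ha−1, so the unit total is about 1515 m3.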
|Table 2. Examined attributes (mean ± standard deviation) by management unit (MU).|
|MU||Max. tree height (m)||Stem density (Trees ha−1)||QMD (cm)||Basal area (m2 ha−1)||Stem volume (m3 ha−1)||Proportion of softwood basal area|
In addition to the research-grade plots, a total of forty-four variable-radius plots with a basal area factor (BAF) of 20 were established in nine management units between 2010 and 2011. The plots were located at the same points as the research-grade plots. At each plot, DBH was measured for all tallied trees, a local height equation fitted with multi-level mixed effects was used to impute height values, and volume was estimated using the same equations as described above. Hereafter, this sampling approach and its data are called “operational-grade”.
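Variable-radius (horizontal point) samples expand differently from fixed-radius plots: every tallied tree represents the same basal area per unit area, so its per-hectare expansion factor depends on its own size. The sketch below illustrates this under the assumption that the BAF is supplied in m2 ha−1; the text does not state the units of the BAF of 20, so a conversion constant from ft2 acre−1 is included. All names are illustrative.

```python
import math

# 1 ft2 per acre expressed in m2 per ha (about 0.2296).
FT2_PER_ACRE_TO_M2_PER_HA = 0.09290304 / 0.40468564224


def point_sample_attributes(dbh_cm, baf_m2_ha):
    """Return (stems per ha, basal area m2 per ha) from one point sample.

    Every tallied tree stands for baf_m2_ha of basal area per hectare,
    so a tree's expansion factor is the BAF divided by its basal area.
    """
    basal_area = baf_m2_ha * len(dbh_cm)
    stems = sum(
        baf_m2_ha / (math.pi / 4.0 * (d / 100.0) ** 2) for d in dbh_cm
    )
    return stems, basal_area
```

Because small trees carry large expansion factors in this design, stem density estimates from point samples are usually more variable than basal area estimates, which is one reason the two plot types can calibrate different models.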
2.3. LiDAR System Specifications
The LiDAR data were acquired following the U.S. Geological Survey National Geospatial Program LiDAR Base Specification Version 1.0. Airborne discrete-return laser scanner data were acquired using an Optech Gemini 246 instrument in late October 2010, at a mean flying altitude of about 1982 m above sea level. The data were intended to be collected under leaf-off conditions, but most deciduous trees in the PEF still held their leaves at that time, due to an abnormally prolonged summer in 2010. The sensor operated at a pulse repetition frequency of 50 kHz and a laser wavelength of 1064 nm, with a scan angle of <20° from nadir. The mean laser point density was 1.1 pulses m−2 with a footprint of 30 cm, and the sensor recorded up to 4 returns per pulse.
2.4. LiDAR Data Processing and Model Calibration Predictions
All LiDAR data processing, including the creation of a digital terrain model and LiDAR metrics, was performed in FUSION v2.90, developed by the U.S. Forest Service Pacific Northwest Research Station. The software has been used in previous LiDAR research (e.g., [33,34]) and is publicly available. The digital terrain model was used to normalize heights within FUSION. The software summarized the raw LiDAR data into metrics containing a number of potential predictor variables of inventory attributes. In our case, 97 potential predictor variables were created. To calibrate prediction models, FUSION extracted raw LiDAR data from 117 0.08-ha circular plots coincident with the research-grade plots in the management units. For prediction models based on operational-grade sampling, raw LiDAR data were likewise extracted from 44 0.08-ha circular plots coincident with the research-grade plots, because variable-radius plots have no fixed plot size.
Although understory vegetation heights varied widely depending on the silvicultural treatment in each management unit, we disregarded pulse returns within 2 m above ground during LiDAR data extraction, as preliminary results indicated a better model fit (greater R2 values) with this threshold. Example predictor variables in the LiDAR metrics were maximum height, the number of first return pulses in the 90th percentile height and the standard deviation of first return pulses. Consequently, two sets of LiDAR metrics were generated, one for the research-grade and one for the operational-grade samples.
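The height normalization and 2 m cutoff can be sketched as follows. This is not FUSION's implementation: the percentile rule (linear interpolation), the strict "greater than" cutoff and all names are assumptions made for illustration, and only a handful of the 97 metrics are shown.

```python
import statistics


def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0-100) of a sorted list."""
    if not sorted_values:
        raise ValueError("no values")
    rank = (len(sorted_values) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    low_value = sorted_values[lower]
    return low_value + fraction * (sorted_values[upper] - low_value)


def height_metrics(return_elevations_m, ground_elevations_m, min_height_m=2.0):
    """Summarize returns higher than min_height_m above the terrain model."""
    heights = sorted(
        z - g
        for z, g in zip(return_elevations_m, ground_elevations_m)
        if z - g > min_height_m
    )
    if len(heights) < 2:
        raise ValueError("too few canopy returns for summary statistics")
    return {
        "max": heights[-1],
        "mean": statistics.mean(heights),
        "sd": statistics.stdev(heights),
        "p25": percentile(heights, 25),
        "p90": percentile(heights, 90),
    }
```

Passing a different `min_height_m` makes it straightforward to repeat the extraction for several candidate thresholds and compare the resulting model fits.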
However, predictor variables in LiDAR metrics tend to be highly correlated with one another [1,35,36]. In our LiDAR metrics, about 40 predictor variables were highly correlated with others, and some of them were not normally distributed. These issues violate the assumptions inherent to linear regression models. In addition, variable selection with high-dimensional metrics is not a simple process, and typical data transformations might not be effective for highly skewed or bimodal data. Although Akaike’s Information Criterion (AIC) is a popular approach for variable selection in stepwise regression, the resulting regression models tend to overfit and are generally not stable outside of the calibration data. Therefore, inventory prediction models based on simple and multiple linear regressions would not be suitable for this type of dataset.
Alternatively, the random forest technique proposed by Breiman, a nonparametric approach, may be more effective. Random forest builds on the regression tree algorithm, in which the data are split repeatedly on predictor variables to grow a tree of nodes. In the random forest approach, the regression tree process is repeated many times on bootstrapped samples, and each tree is evaluated against the observations left out of its bootstrap sample. A key advantage of random forest is that a large number of predictor variables of various types (categorical, continuous, binary) can be handled, and the relative importance of each predictor variable can be estimated during the model calibration process. In this analysis, the random forest algorithm was run iteratively: the model initially included all covariates, the least influential covariate was dropped and the model was rerun until only 5 covariates remained, a number that preliminary analysis had suggested was most effective for prediction accuracy.
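The iterative covariate elimination can be sketched as a simple loop. The sketch is deliberately model-agnostic: `importance_fn` is a hypothetical callback that would refit a random forest on the remaining covariates and return their importance scores, and it is not a function of any particular library. It is called again in every round because importances change once a covariate is removed.

```python
def backward_eliminate(covariates, importance_fn, keep=5):
    """Drop the least important covariate until `keep` remain.

    importance_fn(names) must refit the model on `names` and return a
    dict mapping each name to its importance (higher is more useful).
    """
    selected = list(covariates)
    while len(selected) > keep:
        scores = importance_fn(selected)
        weakest = min(selected, key=lambda name: scores[name])
        selected.remove(weakest)
    return selected
```

Keeping the model fit behind a callback leaves the choice of importance measure (for example, the increase in mean square error when a variable is permuted) open to the analyst.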
Stone et al. reported that inventory prediction models based on random forest, such as a volume prediction model, had significantly lower R2 values than prediction models based on other methods, such as regression trees. However, the developed models were based on a small number of reference plot data, and some variables in this study required data transformations to meet a normal distribution, which random forest might handle more effectively. The “randomForest” package, available in R v2.15, was used to calibrate the inventory attribute prediction models in this analysis. Each of the calibrated models was evaluated using the coefficient of determination (R2), mean bias and root mean square error (RMSE) between field-measured and LiDAR-predicted inventory attributes at the plot and management unit levels. Negative and positive values of mean bias indicate overestimation and underestimation of inventory attributes by LiDAR, respectively. For each inventory attribute, prediction models were calibrated in random forest based on the 117 research-grade plots. Furthermore, to evaluate the influence of the reference data, two stem volume prediction models were developed in random forest, one calibrated on 44 research-grade plots and the other on the 44 operational-grade plots.
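The evaluation statistics can be written out explicitly. In the sketch below, bias is taken as field-observed minus LiDAR-predicted, so positive values mean underestimation by LiDAR. The R2 definition used here (one minus the ratio of residual to total sum of squares) is an assumption, since the text does not state which definition was applied.

```python
import math
import statistics


def evaluate(observed, predicted):
    """Return R2, mean bias, SD of bias and RMSE for paired plot values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    residuals = [o - p for o, p in zip(observed, predicted)]  # field - LiDAR
    ss_resid = sum(r * r for r in residuals)
    mean_obs = statistics.mean(observed)
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    if ss_total == 0:
        raise ValueError("observed values have no variance")
    return {
        "r2": 1.0 - ss_resid / ss_total,
        "mean_bias": statistics.mean(residuals),
        "sd": statistics.stdev(residuals),
        "rmse": math.sqrt(ss_resid / len(residuals)),
    }
```

These quantities are linked by RMSE² = MB² + SD² (n − 1)/n when SD is the sample standard deviation. For the all-plot stem volume figures reported in Table 3 (MB = 1.81, SD = 66.96, n = 117), this identity gives an RMSE of about 66.70 m3 ha−1, in line with the tabulated value.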
To examine the performance of the various models, the bias (field observed-LiDAR predicted) was examined graphically with the use of Lowess regression splines. To simplify the interpretation of the differences between the original eleven different silvicultural treatments, the treatments were narrowed down to five broad categories, which included diameter-limit, selection, shelterwood, clearcut and unmanaged (Table 1). To examine the influence of species composition, the percent of softwood vs. hardwood basal area was computed, and the plots were typed as either softwood-dominant (percent of softwood species ≥70) or mixedwood (percent of softwood species <70).
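The species composition typing reduces to a threshold on the softwood share of plot basal area. A minimal sketch, with hypothetical function and argument names:

```python
def classify_plot(softwood_basal_area, hardwood_basal_area, threshold_pct=70.0):
    """Label a plot by the softwood share of its total basal area."""
    total = softwood_basal_area + hardwood_basal_area
    if total <= 0:
        raise ValueError("plot has no basal area")
    softwood_pct = 100.0 * softwood_basal_area / total
    if softwood_pct >= threshold_pct:
        label = "softwood-dominant"
    else:
        label = "mixedwood"
    return label, softwood_pct
```

A plot with 21 m2 ha−1 of softwood and 9 m2 ha−1 of hardwood basal area sits exactly at 70% and is therefore typed as softwood-dominant under the "≥70" rule.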
Finally, to produce a map of the spatial distribution of volume, a wall-to-wall grid of 900-m2 cells was overlaid on the PEF area. This cell size was chosen because it is similar to the size of the research-grade plots. Total volumes in each management unit (m3 MU−1) were derived using the following equation:
3. Results
Overall, the random forest technique satisfactorily produced a volume prediction model, but the rest of the inventory prediction models had notably lower accuracy levels compared to previously reported studies (Table 3). We only report the results of stem density, QMD and basal area predictions in Table 3 and Table 4, while maximum tree height and stem volume are described in detail below. In general, the three most important variables were LiDAR-measured height variables rather than pulse return counts.
|Table 3. Developed prediction models with the three most important predictor variables (with respect to mean square error in random forest), the coefficient of determination (R2), mean bias (MB) with standard deviation (SD) and root mean square error (RMSE). Negative and positive values in MB indicate overestimation and underestimation by LiDAR, respectively.|
|Attributes||Key variables (mean square error)||R2 (Adj R2)||MB (SD)||RMSE|
|Maximum tree height (m)||Maximum height||0.869 (0.867)||1.89 (± 2.06)||2.80|
|Stem density (trees ha−1)||Fifth percentile height (3.302); Height kurtosis (5.982); Height L-skewness (6.198)||0.287 (0.280)||9 (± 5013)||4993|
|QMD (cm)||Percent of the first return above mean (6.5909); Percent of the first return above 1 m (7.8544); Twenty-fifth percentile height (8.3618)||0.489 (0.434)||−0.05 (± 3.69)||3.68|
|Basal area (m2 ha−1)||Percent all returns above 1 m (7.262); Height L-kurtosis (7.564); Ninety-ninth percentile height (7.614)||0.344 (0.339)||0.03 (± 13.07)||13.01|
|Stem volume (m3 ha−1)||Ninetieth percentile height (7.795); Twentieth percentile height (8.724); Seventy-fifth percentile height (9.757)||0.721 (0.719)||1.81 (± 66.96)||66.70|
3.1. Maximum Tree Height
Our preliminary analysis indicated that a LiDAR-derived maximum height, a variable in the LiDAR metrics, was strongly correlated to field measured maximum height. Thus, we did not develop a maximum height prediction model through random forest.
In general, LiDAR underestimated the maximum tree height by 1.89 ± 2.06 m, regardless of silvicultural treatments and species composition, while an agreement between field- and LiDAR-measured maximum height was strong (Table 3). In particular, the diameter limit and shelterwood units had a constant trend over the LiDAR measured maximum heights, as both RMSEs were relatively small (Table 4 and Figure 2a). The unmanaged units had the largest mean bias and RMSE and the largest variation between underestimation and overestimation. Furthermore, LiDAR tended to greatly underestimate heights in softwood plots (Figure 2b) with greater mean bias and RMSE than mixedwood plots.
|Table 4. Mean bias (MB) with standard deviation (SD) and root mean square error (RMSE) by silvicultural treatments and species composition. Mixedwood plots had a percent of basal area of softwood <70, and softwood-dominant plots had a percent of basal area of softwood ≥70.|
|Silvicultural treatments/Species composition||Plot (n)||Max. height MB ± SD (RMSE) (m)||Stem density MB ± SD (RMSE) (trees ha−1)||QMD MB ± SD (RMSE) (cm)||Basal area MB ± SD (RMSE) (m2 ha−1)||Stem volume MB ± SD (RMSE) (m3 ha−1)|
|Diameter-limit||20||2.27 ± 1.19 (2.55)||−1415 ± 2843 (3111)||0.05 ± 1.95 (1.90)||−3.63 ± 7.08 (7.80)||−1.12 ± 36.04 (35.14)|
|Selection||49||2.73 ± 1.81 (3.26)||2119 ± 5755 (6078)||−0.96 ± 3.21 (3.32)||4.40 ± 15.86 (16.30)||2.70 ± 40.76 (40.43)|
|Shelterwood||30||0.81 ± 2.15 (2.26)||−2712 ± 4132 (4884)||1.61 ± 4.51 (4.71)||−7.60 ± 8.88 (11.58)||5.56 ± 108.05 (106.37)|
|Clearcut||12||1.00 ± 1.08 (1.44)||1028 ± 3377 (3392)||−1.80 ± 1.36 (2.22)||5.61 ± 8.24 (9.68)||−26.63 ± 32.46 (40.93)|
|Unmanaged||6||3.15 ± 3.78 (4.67)||−925 ± 3287 (3140)||2.33 ± 6.48 (6.36)||3.62 ± 8.38 (8.46)||42.47 ± 95.22 (96.75)|
|Mixedwood||31||1.59 ± 2.02 (2.55)||−589 ± 4496 (4462)||−0.65 ± 2.89 (2.91)||0.72 ± 12.02 (11.85)||−12.34 ± 51.53 (52.17)|
|Softwood||86||2.15 ± 2.06 (2.97)||224 ± 5196||0.17 ± 3.93 (3.91)||−0.22 ± 13.49 (13.41)||6.91 ± 71.30 (71.22)|
|All plots||117||2.00 ± 2.05 (2.87)||8.06 ± 5013 (4993)||−0.05 ± 3.69 (3.68)||0.03 ± 13.07 (13.01)||1.81 ± 66.96 (66.70)|
3.2. Stem Volume
In general, LiDAR underestimated the stem volume by 1.81 ± 66.96 m3 ha−1 across silvicultural treatments and species composition, while the plot-level volume prediction model based on the 117 research-grade plots achieved a relatively strong agreement between field-measured and LiDAR-predicted volume (Table 3). The prediction bias in the clearcut and diameter limit units was fairly constant, as those RMSEs were relatively small, while predictions, particularly in the shelterwood and unmanaged units, were varied over the predicted volume, as those RMSEs were large (Table 4 and Figure 3a). In general, the model underestimated the volume in the selection and unmanaged units, while it overestimated in the diameter limit and clearcut units. The prediction in the shelterwood units varied between underestimation and overestimation with increasing predicted volume. Except for the selection units, prediction biases tended to increase with greater softwood species composition (Figure 3b).
Agreement between field-measured and LiDAR-predicted volume was relatively high both for the model based on the 44 operational-grade sampling plots and for the model based on the 44 research-grade sampling plots at the matched locations in the nine management units (Table 5). The difference between the two R2 values was about 0.07, with an RMSE difference of 14.48 m3 ha−1. The operational-grade model had prediction biases between overestimation and underestimation in the diameter limit and selection units (Figure 4a). The research-grade model had prediction biases from underestimation to overestimation in the selection units and from overestimation to underestimation in the diameter limit units (Figure 4b). In general, the model based on the research-grade plots showed better accuracy and precision in the diameter limit and selection units (Table 6). Furthermore, the research-grade model had a smaller mean bias in the mixedwood and softwood plots, although its RMSE for mixedwood plots was larger than that of the operational-grade model.
|Table 5. Developed stem volume prediction models based on research- and operational-grade plot data with the three most important predictor variables, the coefficient of determination (R2), adjusted R2, mean bias (MB) with standard deviation (SD) and root mean square error (RMSE).|
|Sampling type||Key variables (mean square error)||R2 (Adj. R2)||MB (SD) (m3 ha−1)||RMSE (m3 ha−1)|
|Research-grade||Mean height (6.777); Seventy-fifth percentile height (6.784); Fortieth percentile height (6.873)||0.828 (0.824)||0.20 (±36.74)||36.33|
|Operational-grade||Thirtieth percentile height (6.349); Twenty-fifth percentile height (6.397); Eightieth percentile height (7.344)||0.755 (0.749)||−4.21 (±51.22)||50.81|
|Table 6. Mean bias (MB) with standard deviation (SD) and root mean square error (RMSE) by silvicultural treatments and species composition. The prediction models were calibrated based on 44 research- and 44 operational-grade plot data. Mixedwood plots had a percent of basal area of softwood <70, and softwood-dominant plots had a percent of basal area of softwood ≥70.|
|Silvicultural treatments/Species composition||Plot (n)||Research-grade stem volume MB ± SD (RMSE) (m3 ha−1)||Operational-grade stem volume MB ± SD (RMSE) (m3 ha−1)|
|Diameter-limit||18||−3.09 ± 26.36 (25.88)||−9.90 ± 53.04 (52.49)|
|Selection||26||2.73 ± 43.42 (42.67)||−0.28 ± 50.60 (49.62)|
|Mixedwood||8||−3.96 ± 37.80 (35.58)||13.43 ± 26.10 (27.86)|
|Softwood||36||1.07 ± 36.97 (36.49)||−8.13 ± 54.77 (54.62)|
|All plots||44||0.20 ± 36.74 (36.33)||−4.21 ± 51.22 (50.81)|
Finally, agreement between field and model estimates of total volume at the management unit level was strong (R2 = 0.92). A volume distribution map based on the model calibrated with the research-grade plots is presented in Figure 5.
4. Discussion
4.1. Predictor Variables in LiDAR Metrics
Overall, maximum tree height and stem volume prediction models showed relatively high correlation between field measured values and LiDAR metrics (Table 3). Although some previous studies only used the first return and the last return data [13,21,22,40,41,42] for inventory attribute predictions, random forest allowed for the use of all return information in this study. To explain the complex vertical structures observed at the PEF, we expected that the first and the last returns information would not be sufficient, as the second, third and fourth returns would sense variability under overstory canopy structures. For the volume prediction model, certain percentile heights were necessary to account for multiple canopy layers in plots, and it would be important to acquire not only overstory canopy height distribution, but also lower height (e.g., the 20th percentile height) data to distinguish between ground and understory vegetation. However, we do not have stem volume data specific to trees in subcanopy layers, so we could not investigate how multiple returns associate with the volume distribution in the subcanopy layer.
When producing the LiDAR metrics, this study disregarded pulse returns within 2 m above ground. Such a threshold height depends on forest structures in the management area. For example, García et al. assigned a threshold height of 30 cm, while Næsset assigned a threshold height of 1 m. We compared prediction model fits (R2) based on different threshold values (0–5 m) during the preliminary analysis, and the prediction models based on the 2 m threshold had the highest model fit. This preliminary result also suggested that the pulse returns from within 2 m above ground tended to be background noise, due to thick understory vegetation in the PEF. While Su and Bork successfully predicted tree heights in a Populus tremuloides forest in Alberta, Canada, they could not attain the same accuracy level for understory vegetation heights, because LiDAR pulses could not sufficiently penetrate to sense shrubs and herbs under the overstory.
Although LiDAR intensity-related variables in our LiDAR metrics were included while deploying the random forest for model calibrations, these variables did not contribute to improve model fits. The likely reason was that LiDAR intensity values were available in this study, but we did not have an appropriate tool and other auxiliary data to calibrate for flying altitudes, terrain conditions and atmospheric conditions for the intensity values. However, while the intensity values have the potential to discriminate between hardwood and softwood species  or live and dead standing trees , they may not improve accuracy levels for the forest inventory attributes examined in this analysis .
4.2. Silvicultural Treatments and Species Composition
The unmanaged units tended to result in large prediction errors (Table 4). For instance, the unmanaged units had the highest bias in the maximum height and volume predictions. Although the total area of the unmanaged units is smaller than that of the other four treatment groups, they tend to have the highest variability in vertical structure and species composition. Furthermore, management units with softwood species composition greater than 80% tended to result in large prediction errors. For example, the volume prediction tended towards underestimation in the softwood-dominant plots. On the other hand, the prediction errors were fairly constant in mixedwood plots when compared with softwood plots. This suggests that the low pulse density LiDAR tended to hit the sides of the conical crowns of softwood trees rather than the treetops. However, the number of mixedwood plots was small in this study. The PEF is a relatively complex forest, and descriptive statistics (e.g., mean and standard deviation, Table 2) indicated high variability between plots in each of the management units. In general, the plots with the highest softwood composition had multiple-layer canopy structures, which can be problematic for prediction using LiDAR metrics. One reason is that multiple canopy layers make it difficult for pulses to reach the ground, which would result in inaccurate digital elevation models. In particular, balsam fir is a prolific species and tends to establish a number of advance seedlings under a range of overstory conditions in the PEF. Thus, this creates a rather complex vertical structure and can make it quite difficult to develop forest inventory prediction models based solely on remotely sensed attributes.
4.3. Maximum Tree Height
The maximum tree height in plots was generally underestimated, and this result is consistent with findings from other studies [10,11,45,46]. A number of laser pulses likely returned from below the treetops because of the conical crown shape of softwood trees; thus, predictions in the softwood-dominant plots had a larger underestimation than those in the mixedwood plots. Furthermore, when these LiDAR data were acquired, most hardwood trees still had their leaves; thus, the ellipsoidal crowns of hardwood trees would have intercepted and returned laser pulses better than under a leaf-off condition. The RMSE of 2.75 m between the field-measured and LiDAR-measured maximum heights in this study is similar to those observed by Means et al. and Jensen et al., who also used low-pulse density LiDAR. In contrast, Persson et al. achieved an RMSE of 0.63 m when a relatively higher pulse density LiDAR was used. In general, higher pulse density LiDAR is necessary to achieve better accuracy levels for maximum height predictions [19,43]. Magnusson et al. pointed out that achievable accuracy levels in tree height predictions also depend on canopy structure. For example, stands with a uniform canopy height structure may not require the use of high-pulse density LiDAR. In addition, the creation of digital terrain models is a difficult task where understory vegetation grows thick, such as in the stands examined in this study. For example, Clark et al. had a high RMSE for tree height estimations in a tropical rainforest, despite using high-pulse density LiDAR. This is likely the reason that LiDAR largely underestimated the maximum tree height under a closed canopy with thick understory vegetation in the unmanaged units: LiDAR pulses might not penetrate through canopy layers and understory vegetation to reach and return from the ground surface.
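For readers who want to reproduce this kind of comparison on their own plots, the following is a minimal sketch of how RMSE and mean bias between field-measured and LiDAR-derived maximum heights could be computed. The sign convention (LiDAR minus field, so that negative values indicate underestimation) is an assumption, and the height values are hypothetical rather than taken from this study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_bias(observed, predicted):
    """Mean of (predicted - observed); negative values mean underestimation."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical plot-level maximum heights in metres, for illustration only.
field_height = [18.2, 21.5, 24.0, 16.8, 27.3]
lidar_height = [16.9, 20.1, 23.2, 15.0, 24.8]

if __name__ == "__main__":
    print(f"RMSE: {rmse(field_height, lidar_height):.2f} m")
    print(f"Mean bias: {mean_bias(field_height, lidar_height):.2f} m")
```

Reporting both statistics is useful because RMSE alone does not reveal the direction of the error, whereas a negative mean bias points to systematic underestimation.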
4.4. Stem Volume
The plot-level stem volume (m3 ha−1) model had the highest R2 value (0.72) of the various equations evaluated in this study, which was similar to values reported in other studies, such as Van Aardt et al. and Hawbaker et al. Like this analysis, both of these studies were based on low-pulse density LiDAR. Magnusson et al. indicated that relative RMSE in volume predictions increased as pulse density decreased. However, the accuracy of volume prediction models is likely influenced not only by pulse density, but also by the stand types examined. For example, Jaskierniak et al. developed models with R2 values of 0.59–0.80 based on 2 pulses m−2 in a eucalyptus forest in Australia, while Means et al. developed models with high R2 values based on a low pulse density in a Douglas-fir (Pseudotsuga menziesii (M.) Franco.)-dominated forest in Oregon. Similarly, Magnusson et al. developed models with an R2 greater than 0.90 in Norway spruce- and Scots pine-dominated forests in southern Sweden. When compared to the PEF, the stand structures in these aforementioned studies are relatively simple. Like this study, Van Aardt et al. and Hawbaker et al. conducted their studies in mixed softwood-hardwood forests in Virginia and Wisconsin, respectively, which would have stand structures similar to the PEF. Woods et al. also worked in mixed softwood-hardwood forests in Ontario, Canada, and were able to achieve a much lower RMSE than our study. Woods et al. did this by stratifying their study area into four broad forest types based on species composition rather than past silvicultural treatments. Likewise, Anderson and Bolstad found that stratification of models by forest type was necessary to improve prediction accuracy.
In this study, the prediction of volume, as well as of the other inventory attributes, was particularly problematic in the shelterwood and unmanaged units. Although twenty-nine and six research-grade 0.08-ha plots were established in these management units, respectively, the high variability between plots suggests that this might be an inadequate sample. Shelterwood systems tend to leave a small number of large trees in the overstory with the intent of promoting a great number of young trees and seedlings in the understory. A greater number of field plots or larger plots would likely be needed to account for this large variability [8,49].
In general, the mixedwood plots had smaller prediction biases than the softwood plots for all inventory attributes. However, Anderson and Bolstad, who predicted biomass in a mixed softwood-hardwood forest in Wisconsin, reported the opposite result: prediction bias was smaller in the softwood forests than in the mixedwood forests. The complexities of stand structure and species composition at their site were somewhat similar to those at our study site, but the number of mixedwood plots was small in this study; thus, further investigation is necessary to resolve this disagreement.
When comparing the research- and operational-grade plots, overall prediction errors were smaller based on the research-grade sampling plots. Therefore, although such a comparison has not been reported previously to our knowledge, this study suggests that reference data for model calibrations be based on fixed radius plots with a subsample of measured tree heights rather than using variable radius plots with limited or no height measurements.
Although a comparison between the field- and LiDAR-based total volume predictions (the latter from the model calibrated with research-grade plot data) at the management unit level showed general agreement (R2 = 0.92), the two methods are quite different. Given the ability to better account for within-stand variability, the LiDAR-based volume estimates should be considered superior to the volume estimates based on conventional field measurements.
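A high R2 between two sets of estimates does not by itself mean that the estimates coincide. The sketch below illustrates this under the assumption that R2 is computed as the squared Pearson correlation (the definition is not stated here); the unit-level volumes are hypothetical, and the constant 10% offset is chosen only to make the point visible.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def mean_difference(x, y):
    """Mean of (y - x) across paired values."""
    return sum(b - a for a, b in zip(x, y)) / len(x)


# Hypothetical unit-level volumes (m3 per ha), for illustration only.
field_volume = [120.0, 150.0, 180.0, 210.0, 240.0]
# Assumption for the example: LiDAR estimates are uniformly 10% lower.
lidar_volume = [0.9 * v for v in field_volume]

if __name__ == "__main__":
    print(f"R2: {pearson_r2(field_volume, lidar_volume):.3f}")
    print(f"Mean difference: {mean_difference(field_volume, lidar_volume):.1f} m3/ha")
```

In this constructed case the correlation is essentially perfect even though every LiDAR value is lower than its field counterpart, which is why a measure of mean difference is worth reporting alongside R2.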
Developing the inventory attribute prediction models with a nonparametric regression technique allowed us to explore all potential LiDAR predictor variables and account for highly nonlinear relationships. In general, the low-density LiDAR used in this study was able to capture the variability in inventory attributes, despite the wide range of stand structures and species composition mixtures examined. However, there were certain stand structures and species composition mixtures where low-density LiDAR was ineffective. Although the costs of LiDAR data acquisition for large areas are still relatively high, this study highlights that the use of LiDAR-based inventory attribute predictions is a valuable option for achieving efficient and effective forest assessment at a variety of spatial scales, even in regions dominated by naturally regenerated, mixed-species stands.
Acknowledgments
The University of Maine Agriculture and Forest Experimental Station and Cooperative Forestry Research Unit financially supported Rei Hayashi’s graduate research assistantship. LiDAR data were provided by the Maine State Planning Office. We thank the U.S. Forest Service for allowing us to use the PEF data. Furthermore, we would like to thank the two anonymous reviewers for their comments that improved the paper.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Stone, C.; Penman, T.; Turner, R. Determining an optimal model for processing lidar data at the plot level: Results for a Pinus radiata plantation in New South Wales, Australia. N. Z. J. For. Sci. 2011, 41, 191–205.
- Akay, A.E.; Oğuz, H.; Karas, I.R.; Aruga, K. Using LiDAR technology in forestry activities. Environ. Monit. Assess. 2009, 151, 117–125, doi:10.1007/s10661-008-0254-1.
- Hudak, A.T.; Evans, J.S.; Smith, A.M.S. LiDAR utility for natural resource managers. Remote Sens. 2009, 1, 934–951, doi:10.3390/rs1040934.
- Lefsky, M.A.; Cohen, W.B.; Parker, G.G.; Harding, D.J. Lidar remote sensing for ecosystem studies. BioScience 2002, 52, 19–30, doi:10.1641/0006-3568(2002)052[0019:LRSFES]2.0.CO;2.
- Woods, M.; Pitt, D.; Penner, M.; Lim, K.; Nesbitt, D.; Etheridge, D.; Treitz, P. Operational implementation of a LiDAR inventory in boreal Ontario. For. Chron. 2011, 87, 512–528.
- Evans, J.; Hudak, A.; Faux, R.; Smith, A.M. Discrete return lidar in natural resources: Recommendations for project planning, data processing, and deliverables. Remote Sens. 2009, 1, 776–794, doi:10.3390/rs1040776.
- Hummel, S.; Hudak, A.T.; Uebler, E.H.; Falkowski, M.J.; Megown, K.A. A comparison of accuracy and cost of LiDAR vs. stand exam data for landscape management on the Malheur National Forest. J. For. 2011, 109, 267.
- Anderson, R.S.; Bolstad, P.V. Estimating aboveground biomass and average annual wood biomass increment with airborne leaf-on and leaf-off LiDAR in Great Lakes forest types. North. J. Appl. For. 2013, 30, 16–22, doi:10.5849/njaf.12-015.
- Goerndt, M.E.; Monleon, V.J.; Temesgen, H. Relating forest attributes with area- and tree-based light detection and ranging metrics for western Oregon. West. J. Appl. For. 2010, 25, 105–111.
- Næsset, E. Determination of mean tree height of forest stands using airborne laser scanner data. ISPRS J. Photogramm. Remote Sens. 1997, 52, 49–56, doi:10.1016/S0924-2716(97)83000-6.
- Clark, M.L.; Clark, D.B.; Roberts, D.A. Small-footprint lidar estimation of sub-canopy elevation and tree height in a tropical rain forest landscape. Remote Sens. Environ. 2004, 91, 68–89, doi:10.1016/j.rse.2004.02.008.
- Jaskierniak, D.; Lane, P.N.; Robinson, A.; Lucieer, A. Extracting LiDAR indices to characterise multilayered forest structure using mixture distribution functions. Remote Sens. Environ. 2011, 115, 573–585.
- García, M.; Riaño, D.; Chuvieco, E.; Danson, F.M. Estimating biomass carbon stocks for a Mediterranean forest in central Spain using LiDAR height and intensity data. Remote Sens. Environ. 2010, 114, 816–830, doi:10.1016/j.rse.2009.11.021.
- McWilliams, W.H.; Butler, B.J.; Caldwell, L.E.; Griffith, D.M.; Hoppos, M.L.; Laustsen, K.M. The Forests of Maine; U.S. Department of Agriculture, Forest Service, Northeastern Research Station: Newtown Square, PA, USA, 2005; p. 188.
- Falkowski, M.J.; Smith, A.M.S.; Hudak, A.T.; Gessler, P.E.; Vierling, L.A.; Crookston, N.L. Automated estimation of individual conifer tree height and crown diameter via two-dimensional spatial wavelet analysis of lidar data. Can. J. Remote Sens. 2006, 32, 153–161, doi:10.5589/m06-005.
- Popescu, S.C. Estimating biomass of individual pine trees using airborne lidar. Biomass Bioenergy 2007, 31, 646–655, doi:10.1016/j.biombioe.2007.06.022.
- Nilsson, M. Estimation of tree heights and stand volume using an airborne lidar system. Remote Sens. Environ. 1996, 56, 1–7, doi:10.1016/0034-4257(95)00224-3.
- Thomas, V.; Treitz, P.; McCaughey, J.H.; Morrison, I. Mapping stand-level forest biophysical variables for a mixedwood boreal forest using lidar: An examination of scanning density. Can. J. For. Res. 2006, 36, 34–47, doi:10.1139/x05-230.
- Zimble, D.A.; Evans, D.L.; Carlson, G.C.; Parker, R.C.; Grado, S.C.; Gerard, P.D. Characterizing vertical forest structure using small-footprint airborne LiDAR. Remote Sens. Environ. 2003, 87, 171–182, doi:10.1016/S0034-4257(03)00139-1.
- Popescu, S.C.; Wynne, R.H. Seeing the trees in the forest: Using lidar and multispectral data fusion with local filtering and variable window size for estimating tree height. Photogramm. Eng. Remote Sens. 2004, 70, 589–604, doi:10.14358/PERS.70.5.589.
- Hawbaker, T.J.; Gobakken, T.; Lesak, A.; Trømborg, E.; Contrucci, K.; Radeloff, V. Light detection and ranging-based measures of mixed hardwood forest structure. For. Sci. 2010, 56, 313–326.
- Means, J.E.; Acker, S.A.; Fitt, B.J.; Renslow, M.; Emerson, L.; Hendrix, C.J. Predicting forest stand characteristics with airborne scanning lidar. Photogramm. Eng. Remote Sens. 2000, 66, 1367–1372.
- Jensen, J.L.R.; Williams, C.J.; DeGroot, J.; Humes, K.S.; Conner, T. Estimation of biophysical characteristics for highly variable mixed-conifer stands using small-footprint lidar. Can. J. For. Res. 2006, 36, 1129–1138, doi:10.1139/x06-007.
- Næsset, E. Practical large-scale forest stand inventory using a small-footprint airborne scanning laser. Scand. J. For. Res. 2004, 19, 164–179, doi:10.1080/02827580310019257.
- Treitz, P.; Lim, K.; Woods, M.; Pitt, D.; Nesbitt, D.; Etheridge, D. LiDAR sampling density for forest resource inventories in Ontario, Canada. Remote Sens. 2012, 4, 830–848, doi:10.3390/rs4040830.
- Sendak, P.E.; Brissette, J.C.; Frank, R.M. Silviculture affects composition, growth, and yield in mixed northern conifers: 40-year results from the Penobscot Experimental Forest. Can. J. For. Res. 2003, 33, 2116–2128, doi:10.1139/x03-140.
- Li, R.; Weiskittel, A.; Dick, A.R.; Kershaw, J.A.; Seymour, R.S. Regional stem taper equations for eleven conifer species in the Acadian Region of North America: Development and assessment. North. J. Appl. For. 2012, 29, 5–14, doi:10.5849/njaf.10-037.
- Weiskittel, A.; Li, R. Development of Regional Taper and Volume Equations: Hardwood Species; University of Maine, School of Forest Resources: Orono, ME, USA, 2012; pp. 87–95.
- Weiskittel, A.; Russell, M.; Wagner, R.; Seymour, R. Refinement of the Forest Vegetation Simulator Northeast Variant Growth and Yield Model: Phase III; University of Maine, School of Forest Resources: Orono, ME, USA, 2012; pp. 96–104.
- Robinson, A.P.; Wykoff, W.R. Imputing missing height measures using a mixed-effects modeling strategy. Can. J. For. Res. 2004, 34, 2492–2500, doi:10.1139/x04-137.
- Heidemann, H.K. Lidar Base Specification Version 1.0: U.S. Geological Survey Techniques and Methods. Book 11, Collection and Delineation of Spatial Data; US Department of the Interior, US Geological Survey: Sioux Falls, SD, USA, 2012; p. 63.
- McGaughey, R.J. FUSION/LDV: Software for LIDAR Data Analysis and Visualization, 3.30; USDA Forest Service, Pacific Northwest Research Station: Portland, OR, USA, 2013.
- Gonzalez-Ferreiro, E.; Dieguez-Aranda, U.; Barreiro-Fernandez, L.; Bujan, S.; Barbosa, M.; Suarez, J.C.; Bye, I.J.; Miranda, D. A mixed pixel- and region-based approach for using airborne laser scanning data for individual tree crown delineation in Pinus radiata D. Don plantations. Int. J. Remote Sens. 2013, 34, 7671–7690, doi:10.1080/01431161.2013.823523.
- Bolton, D.K.; Coops, N.C.; Wulder, M.A. Measuring forest structure along productivity gradients in the Canadian boreal with small-footprint Lidar. Environ. Monit. Assess. 2013, 185, 6617–6634, doi:10.1007/s10661-012-3051-9.
- Hudak, A.T.; Crookston, N.L.; Evans, J.S.; Hall, D.E.; Falkowski, M.J. Nearest neighbor imputation of species-level, plot-scale forest structure attributes from LiDAR data. Remote Sens. Environ. 2008, 112, 2232–2245, doi:10.1016/j.rse.2007.10.009.
- Li, Y.Z.; Andersen, H.E.; McGaughey, R. A comparison of statistical methods for estimating forest biomass from light detection and ranging data. West. J. Appl. For. 2008, 23, 223–231.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32, doi:10.1023/A:1010933404324.
- Liaw, A.; Wiener, M. Classification and regression by randomForest. R News 2002, 2, 18–22.
- R Development Core Team. A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2013.
- Kim, Y.S.; Yang, Z.Q.; Cohen, W.B.; Pflugmacher, D.; Lauver, C.L.; Vankat, J.L. Distinguishing between live and dead standing tree biomass on the North Rim of Grand Canyon National Park, USA using small-footprint lidar data. Remote Sens. Environ. 2009, 113, 2499–2510, doi:10.1016/j.rse.2009.07.010.
- Parker, R.C.; Glass, P.A. High- vs. low-density LiDAR in a double-sample forest inventory. South. J. Appl. For. 2004, 28, 205–210.
- Næsset, E.; Økland, T. Estimating tree height and tree crown properties using airborne scanning laser in a boreal nature reserve. Remote Sens. Environ. 2002, 79, 105–115, doi:10.1016/S0034-4257(01)00243-7.
- Su, J.G.; Bork, E.W. Characterization of diverse plant communities in Aspen Parkland rangeland using LiDAR data. Appl. Veg. Sci. 2007, 10, 407–416.
- Olson, M.G.; Wagner, R.G. Long-term compositional dynamics of Acadian mixedwood stands under different silvicultural regimes. Can. J. For. Res. 2010, 40, 1993–2002, doi:10.1139/X10-145.
- Magnusson, M.; Fransson, J.E.; Holmgren, J. Effects on estimation accuracy of forest variables using different pulse density of laser data. For. Sci. 2007, 53, 619–626.
- Magnussen, S.; Boudewyn, P. Derivations of stand heights from airborne laser scanner data with canopy-based quantile estimators. Can. J. For. Res. 1998, 28, 1016–1031, doi:10.1139/x98-078.
- Persson, Å.; Holmgren, J.; Söderman, U. Detecting and measuring individual trees using an airborne laser scanner. Photogramm. Eng. Remote Sens. 2002, 68, 925–932.
- Van Aardt, J.A.N.; Wynne, R.H.; Oderwald, R.G. Forest volume and biomass estimation using small-footprint lidar-distributional parameters on a per-segment basis. For. Sci. 2006, 52, 636–649.
- Gobakken, T.; Næsset, E. Assessing effects of positioning errors and sample plot size on biophysical stand properties derived from airborne laser scanner data. Can. J. For. Res. 2009, 39, 1036–1052, doi:10.1139/X09-025.
© 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).