1. Introduction
Accurate information on forest composition and structure is critical to ensure the effective sustainable management of forest ecosystems [1]. This information is used both for the accurate estimation of attributes describing the amount and type of forest resource, principally done through forest inventories, as well as for the mapping of the forest resource, which provides information on the areal extent [2]. These pieces of information are then used to develop a comprehensive understanding and inform management of forest ecosystems [3] including, for example, silvicultural practices [4,5], volume and biomass estimation [6], assessment of biodiversity [7] and other ecosystem goods and services [8], carbon management (e.g., [9]), and forest health assessment [10]. Whilst there is a broad range of applications that utilize forest inventory data, information needs vary across the forest planning sector [1]. At the operational level, information is required to support activities such as harvest and road layout. At the strategic level, information is required to support long-term sustainable management, growth and yield forecasting, timber supply analysis, and a multitude of resource decisions relevant to forest protection and wildlife management [11].
Airborne laser scanning (ALS) data can support both strategic and operational forest information needs. ALS measures the three-dimensional distribution of vegetation within forest canopies and, as a result, is particularly well suited for describing structural vegetation attributes [12]. ALS has a demonstrated utility for forest inventory across a range of forest environments [13,14,15,16], and can provide a direct measurement of a range of forest inventory attributes such as tree height, location, and canopy cover, as well as providing wall-to-wall predictor variables in the form of ALS metrics that allow for the development of models to estimate, for example, volume, biomass, and basal area [17]. A multitude of studies have found a high correlation between field- and ALS-measured tree heights in a variety of forest environments (e.g., [6,17,18]). In addition, by examining the number and height of the return pulses within a given area, information on the vertical profile of light penetrating the plant canopy can be derived, providing additional information on forest structure, such as crown shape and density.
Analysis of ALS data for use in forestry follows two basic approaches: an individual tree detection (ITD) approach [19], or an area-based approach (ABA) [20]. Whilst the area-based approach is now operationally used in a range of forest environments [13,17], ITD approaches are still in research mode [21,22,23]. Using ITD approaches, individual treetops are located either directly from the raw ALS point clouds [24], or from a canopy height model [25,26]. As discussed by Breidenbach et al. [21], ITD approaches are inherently intuitive, with trees often clearly visible in ALS point clouds, especially when pulse density is high relative to individual crown size. However, ITD approaches are prone to bias as a result of over- or under-segmentation of tree crowns, whereby some trees go undetected whilst others are split into multiple trees. These issues result in omission and commission errors, which can have a significant impact on overall forest inventory estimates [27].
Area-based approaches (ABA), rather than explicitly extracting or identifying each individual tree crown, are based on aggregations of ALS point clouds to develop a series of canopy density and height metrics [20]. These metrics become independent variables that are used to predict the desired forest inventory attributes. The fact that the variables are modelled, rather than directly derived using ITD approaches, increases the number, and types, of models that can be developed. Past research has demonstrated the successful application of a wide range of modelling approaches (regression, nearest neighbor, decision trees, random forests, and model-based) to estimate a range of inventory attributes [1,28,29]. Most importantly, by developing statistical models, bias can be minimized and, as a result, the ABA is widely used operationally [30].
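As an illustration of how a point cloud is aggregated into area-based metrics, the following Python sketch summarizes the return heights of a single plot or grid cell. The function name, the 2 m height break, and the particular metrics are assumptions chosen for illustration; they are not the metric set used in any of the cited studies.

```python
import numpy as np

def area_based_metrics(heights, height_break=2.0):
    """Summarize ALS return heights (m above ground) for one plot or grid cell.

    height_break is an assumed threshold separating canopy returns from
    ground and low vegetation; it is illustrative, not taken from the text.
    """
    z = np.asarray(heights, dtype=float)
    if z.size == 0:
        raise ValueError("at least one return is required")
    mean_height = z.mean()
    return {
        # Height metrics
        "mean_height": float(mean_height),
        "p90": float(np.percentile(z, 90)),
        # Density (cover) metrics, as percentages of all returns
        "cover_above_break": 100.0 * np.count_nonzero(z > height_break) / z.size,
        "cover_above_mean": 100.0 * np.count_nonzero(z > mean_height) / z.size,
    }

# Hypothetical return heights for one small plot
metrics = area_based_metrics([0.0, 0.0, 1.0, 5.0, 10.0, 15.0, 20.0, 25.0])
```

Computed for every field plot and for every cell of a wall-to-wall grid, metrics of this kind serve as the independent variables of the predictive models.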
The individual tree and area-based approaches therefore differ in their outputs, errors, and costs [31]. While area-based techniques have now become commonplace, advances in forest growth and dynamics models, as well as industry needs (for example, for the individual stem and piece sizes of a stand), require the development of tree lists or stand tables for effective management [1]. Bergseng et al. [31] concluded that large economic losses associated with poor inventory information could be avoided if area-based techniques are used to derive information on diameter distributions in addition to plot-level averages. They also concluded that predictions of mean tree attributes often did not describe the stand sufficiently for decisions on the timing of harvest. In addition, average tree models derived from the ABA are even less well suited to the uneven-aged stands that result from ecosystem-based management approaches, which often promote selective harvesting. Tree lists containing unbiased estimates of individual tree specifications arguably allow the greatest flexibility, as they can be varied, manipulated, or aggregated as needed to suit different objectives [1].
As a result of the need for individual tree-based attributes and finer-scale descriptions of stands, beyond an area-based statistical method, a number of new approaches that relate ITD and ABA have been developed. For example, Breidenbach et al. [21] proposed a semi-ITC (individual tree crown) method that imputes field data from the nearest neighboring crown segment. The approach uses most similar neighbor (MSN) inference to predict the attributes of the individual crown segments, allowing unbiased estimates of volume and volume by species at the plot level. Lindberg et al. [32] developed an approach to produce individual tree lists that were consistent with unbiased estimates produced with ABA. Vastaranta et al. [33] improved stem volume predictions by integrating both methods and acquiring plot-level training data for ABA with ITD. Xu et al. [34] demonstrated a method to calibrate the ABA-derived diameter distribution by combining ABA and ITD. They presented two approaches, based on replacement and histogram matching, with more accurate results found with the second approach. Key to all these approaches, however, is the requirement to perform individual tree detection in order to build distributions prior to modifying them by area-based estimates, or pre-determined tree lists available for the site. A number of studies have demonstrated that information on individual trees, specifically DBH (diameter at breast height), can be extracted exclusively using ABA [35,36,37]. Typically unimodal in managed stands [38], DBH distributions are usually modelled by predicting the parameters of a Weibull probability density function, with approaches including percentile-based modelling, non-parametric imputation, and parameter recovery [39,40,41]. The density distributions are then scaled to match the predicted total number of trees or total basal area, often yielding unbiased estimates at the plot level [37]. Existing studies that have predicted DBH distribution with ALS data are summarized in Table 1. Since we focus on ABA only, approaches that used or incorporated ITD results [21,32,34] as part of the derivation were not included in the table.
Previous studies have largely focused on describing DBH distributions because of the importance of DBH as an informative descriptor of stand structure [39]. Fewer studies have modelled the distribution of tree heights [27,28], or derived individual tree volume [32]. Knowledge of within-stand volume distribution can inform on product mix and log allocation opportunities, allowing for improved pre-harvest planning.
In this study, our objective was to demonstrate the application of an approach to estimating individual tree volumes for highly productive, multi-species, multi-age, temperate coastal forest stands, based on an ABA to model fitting and plot-level optimization. The method presented builds on existing studies [37,42,43] and uses the within-plot tree size distribution as the main vehicle to downscale from area-based to individual tree-based predictions of volume. Previous studies have primarily tested the approach in managed boreal forest stands of northern Europe (Table 1); the complexity and high productivity of the coastal, temperate forest environment of our study area provide a novel context for testing the downscaling approach.
4. Discussion
Tree size distributions and individual tree lists are important attributes in forest inventories, especially in an operational context. The capacity to make accurate predictions of not only the total stand volume, but also of the frequency distribution of individual tree volumes, provides valuable information that can be used in the management of forest resources. While ALS data are becoming increasingly popular for estimating a number of forest stand attributes, the predictions are, in the majority of cases, made using an ABA and provide information on total or average values of stand attributes (e.g., volume per ha, mean height). The alternative, the ITD approach, has not yet reached operational status, as it usually provides biased results and requires more complex data processing routines [70,71].
In this study we evaluated the use of ALS point clouds to predict individual tree volume distributions, providing an enhanced attribute set for traditional ABA in complex, high productivity forest stands in the Pacific Northwest. Our results suggest that ALS data can be used to downscale plot-level information and estimate attributes of individual trees. To do this, we used parameter prediction and then retrieved the tree count on each plot (n). Distinct from existing approaches, our generated tree lists for each plot consisted of individual tree volumes rather than DBH or basal area values [34,36,42].
Our approach was based on three steps, starting with modelling of the total plot volume. We found that the approach presented by Bouvier et al. [55] provided an accurate total volume estimate, although the metrics we used to characterize stand height, height heterogeneity, and canopy cover differed from those used by Bouvier et al. [55]. Furthermore, we used a different metric to characterize the vertical complexity of the plot: vertical rumple. In our case, the achieved model accuracy for estimating plot volume (R² = 0.867; RMSE% = 34.8%, leave-one-out cross validation) was comparable with results reported by others [55,72].
Estimating the Weibull distribution parameters was the second step in the presented workflow. Similar to results presented in previous studies, we achieved higher prediction accuracy for λ (scale parameter) than for k (shape parameter) [36]. However, in contrast to previous studies [36,42], we used a single explanatory variable in the model predicting each parameter: a measure of canopy cover was used to estimate k, whilst a metric describing stand height was used to estimate λ. The two Weibull parameters modify the form of the PDF in different ways, which gives the function its great flexibility. The k parameter controls the overall form of the distribution, which can resemble an exponential, Rayleigh, or normal distribution as the parameter value changes from 1 to 4. The λ parameter determines the range of the distribution. Of particular interest is the relationship between canopy cover (percentage of points above the mean height of the points) and the k parameter. Our analysis showed that as the percentage of points above the mean height increases, the form of the Weibull distribution function changes from an inverse J-shape to normal-like. The significance of the 90th percentile of ALS point heights for predicting the scale parameter can likely be explained by the increasing range of individual tree volumes with increasing stand height.
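The effect of the two parameters can be explored with a short Python sketch of the standard two-parameter Weibull density. The function names are illustrative assumptions; the density and its mode are textbook results rather than anything specific to this study.

```python
import math

def weibull_pdf(v, k, lam):
    """Two-parameter Weibull density with shape k and scale lam.

    Assumes k >= 1 when v == 0 (for k < 1 the density is unbounded at zero).
    """
    if v < 0:
        return 0.0
    return (k / lam) * (v / lam) ** (k - 1) * math.exp(-((v / lam) ** k))

def weibull_mode(k, lam):
    """Location of the density peak: zero for k <= 1 (inverse J-shape)."""
    if k <= 1:
        return 0.0
    return lam * ((k - 1) / k) ** (1 / k)

# As k grows from 1 towards 4 the peak moves away from zero and the curve
# becomes more symmetric; lam stretches the same shape along the volume axis.
modes = {k: weibull_mode(k, lam=1.0) for k in (1.0, 2.0, 3.6)}
```

With k = 1 the peak sits at zero volume, which corresponds to the inverse J-shape; for larger k the peak shifts towards λ, giving the Rayleigh-like and normal-like forms described above.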
The comparison of the predicted and reference Weibull density functions, generated with the same sequence of individual tree volumes, allowed us to assess the accuracy of the modelled k and λ parameters. We observed that even with relatively strong modelling results and statistical test results indicating agreement between the compared distributions, the bias and RMSE values can be large for some plots. This demonstrates how small discrepancies in model forms can lead to extreme relative differences. For example, for Plot 10 (Figure 4), the overall shapes of the distributions match; however, the predicted distribution is unrealistic for low volumes, leading to large relative differences (Table 5).
In the final step of the workflow, we predicted tree count (n) for each plot by using the parameter retrieval method with predicted total volume (V_ALS) and mean tree volume (E(v)) derived using the predicted Weibull distribution parameters. This approach produced estimates of n with an absolute bias of −1.6 trees and a relative RMSE of 24.4% (or 149 trees·ha⁻¹). In other studies using the Weibull PDF to model the distribution of tree size (typically via basal area), the number of trees was predicted directly from models with ALS-based metrics as explanatory variables [32,36]. Although existing studies have shown that such prediction of stem count is often accurate [36,37], the accuracy varies depending on forest structure and can result in relative RMSE exceeding 50% [73]. Initial tests at an early stage of our analysis indicated that such direct predictions are also prone to error in these complex forest systems (RMSE% = 32.7% or 200.5 trees·ha⁻¹).
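The retrieval step can be sketched as follows, assuming that the tree count is obtained by dividing the predicted plot volume by the mean tree volume implied by the predicted Weibull parameters, n = V_ALS / E(v), with E(v) = λ·Γ(1 + 1/k) for a two-parameter Weibull distribution. The function names and the input values are hypothetical; the exact formulation used in the study is defined in its methods and is not reproduced here.

```python
import math

def weibull_mean(k, lam):
    """Mean of a two-parameter Weibull distribution: lam * Gamma(1 + 1/k)."""
    return lam * math.gamma(1.0 + 1.0 / k)

def retrieve_tree_count(total_volume, k, lam):
    """Assumed retrieval: predicted plot volume divided by mean tree volume.

    total_volume and lam must share the same unit (e.g., m^3).
    """
    return total_volume / weibull_mean(k, lam)

# Hypothetical plot: 20 m^3 predicted volume, k = 1.0, lam = 0.5 m^3
n_hat = retrieve_tree_count(20.0, k=1.0, lam=0.5)
```

Because n is derived from the volume model and the two parameter models, no separate stem count model is needed; any error in the predicted volume or in the predicted parameters carries through to the retrieved count.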
As a comparison, Maltamo et al. [37] used the total number of trees, basal area, and volume to generate DBH distributions. All three variables were first predicted with ALS-based models using ABA. In our case, the analyses were based on volume only and resulted not only in frequency distributions but also in minimally biased predictions of tree count. This allowed us to avoid using inaccurate tree count estimates to scale the density functions. As a result, this approach presents an alternative for estimating tree count with ABA when direct modelling does not provide accurate results.
The final estimates of tree volume distributions were, for the majority of plots, similar to the reference distributions (Table 5). The values of the error indices (eR and eP) were similar to or lower than those reported for studies focusing on modelling DBH distributions [47,48], and the mean value of eP was almost identical to the value reported by Packalén and Maltamo [43]. Our results are also similar to those of studies that integrate ITD and ABA directly [34]; however, as discussed, previous studies focused on modelling the distribution of DBH, not volume.
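For readers unfamiliar with such indices, the sketch below shows two forms that are common in the diameter distribution literature: a Reynolds-type index that sums absolute differences in class frequencies, and a Packalén and Maltamo-type index that sums half the absolute differences in class proportions and so ranges from 0 to 1. That eR and eP correspond to exactly these forms is an assumption, since their definitions are not given in this section; the function names and class counts are hypothetical.

```python
def reynolds_type_index(ref_counts, pred_counts):
    """Sum of absolute differences in tree counts per size class."""
    return sum(abs(r - p) for r, p in zip(ref_counts, pred_counts))

def proportion_type_index(ref_counts, pred_counts):
    """Half the sum of absolute differences in class proportions (0 to 1)."""
    n_ref, n_pred = sum(ref_counts), sum(pred_counts)
    return sum(
        0.5 * abs(r / n_ref - p / n_pred)
        for r, p in zip(ref_counts, pred_counts)
    )

# Hypothetical tree counts in three volume classes
reference = [10, 5, 5]
predicted = [8, 6, 2]
e_r = reynolds_type_index(reference, predicted)
e_p = proportion_type_index(reference, predicted)
```

The frequency-based form is sensitive to errors in the predicted tree count, whereas the proportion-based form compares only the shapes of the two distributions.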
Recent trends integrating ITD and ABA to improve the accuracy of modelling forest attributes [29,32,33,34] will likely continue to provide promising results, especially as the density of lidar datasets increases. However, as shown by Xu et al. [34], the modelling of tree size distributions based on ABA does not always benefit from the additional information provided by ITD. Xu et al. [34] presented two integration methods: (1) the ABA tail of large trees in the DBH distribution is substituted with trees detected with ITD, which depends principally on the accuracy of the ITD-derived diameter distribution; and (2) the histograms of DBH distributions derived with ABA and ITD are matched, which relies on the accuracy of the ABA-derived diameter distribution. Xu et al. [34] showed that the second approach, based on ABA, obtained better accuracy than the first and that the influence of the ITD-derived diameter distribution during histogram matching was only slight.
While we readily acknowledge that the number of sample plots in our study was small, our results show that even with simple distribution modelling based on the Weibull probability density function, the predicted distributions were accurate, with accuracies comparable to those of more complex methodologies, including those based on integrations of ABA and ITD. However, by regressing the differences in estimates of n, we confirmed that the standard Weibull PDF is not suitable for characterizing multimodal distributions. Such multimodal distributions were in fact observed on a few of our plots, with the majority of trees representing smaller volume classes, interspersed with a small number of trees of larger volume. Such volume structure is typical for multi-layered stands, where the top layer is composed of old trees, often of several cubic meters in volume. Modelling of such multimodal distributions may be improved with finite mixture modelling (i.e., a combination of two Weibull distributions) [36] or k-MSN imputations [48].
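A two-component mixture of this kind can be written down in a few lines. The Python sketch below is only an illustration of the functional form: the weight and the component parameters are hypothetical, and fitting them to plot data (for example by maximum likelihood) is not shown.

```python
import math

def weibull_pdf(v, k, lam):
    """Two-parameter Weibull density (shape k >= 1, scale lam)."""
    if v < 0:
        return 0.0
    return (k / lam) * (v / lam) ** (k - 1) * math.exp(-((v / lam) ** k))

def mixture_pdf(v, weight, small, large):
    """Weighted sum of two Weibull densities; small and large are (k, lam)."""
    return weight * weibull_pdf(v, *small) + (1.0 - weight) * weibull_pdf(v, *large)

# Hypothetical two-layered plot: 80% small trees, 20% large old trees (m^3)
small, large = (1.0, 0.5), (3.0, 4.0)
dx = 0.01
# Midpoint-rule check that the mixture still integrates to roughly one
area = sum(mixture_pdf((i + 0.5) * dx, 0.8, small, large) for i in range(5000)) * dx
```

The first component captures the many small stems and the second the few large trees of the upper layer, at the cost of five parameters (two per component plus the weight) instead of two.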