Enriching Als-derived Area-based Estimates of Volume through Tree-level Downscaling

Information on individual tree attributes is important for sustainable management of forest stands. Airborne Laser Scanning (ALS) point clouds are an excellent source of information for predicting a range of forest stand attributes, with plot and single tree volume being among the most important. Two approaches exist for estimating volume: area-based approach (ABA) and individual tree detection (ITD). The ABA is now routinely applied in operational forestry applications, and results in generalized plot-or stand-level attribute predictions. Alternatively, ITD-based estimates provide detailed information for individual trees, but are typically biased due to challenges associated with individual tree detection. In this study, we applied an ABA to estimate tree counts and individual tree volumes by downscaling plot-level predictions of total volume derived using ALS data in a highly productive and complex coastal temperate forest environment in British Columbia, Canada, characterized by large volumes and multi-species and multi-age stand structures. To do so, a two-parameter Weibull probability density function (PDF) was used to describe the within-plot tree volume distribution. The ABA approach was then used to model the total plot volume and the two Weibull PDF parameters. Next, the parameters were used to calculate mean tree volume and derive the number of trees and the individual tree volume distribution. 2609 Tree count estimates were minimally biased with RMSE of 149 trees· ha −1 or 24.4%. The volume distributions showed good agreement with reference data (mean Reynold's error index = 71.7). We conclude that the approach was suitable for enriching ABA-derived forest stand attributes in the majority of the studied forest stands; however the accuracy was lower in multi-layered stands that had a multimodal individual tree volume distribution.


Introduction
Accurate information on forest composition and structure is critical to ensure the effective sustainable management of forest ecosystems [1].This information is used both for the accurate estimation of attributes describing the amount and type of forest resource, principally done through forest inventories, as well as for the mapping of the forest resource, which provides information on the areal extent [2].These pieces of information are then used to develop a comprehensive understanding and inform management of forest ecosystems [3] including for example silvicultural practices [4,5], volume and biomass estimation [6], assessment of biodiversity [7] and other ecosystem goods and services [8], carbon management (e.g., [9]), and forest health assessment [10].Whilst there is a broad range of applications that utilize forest inventory data, information needs vary across the forest planning sector [1].At the operational level, information is required to support activities such as harvest and road layout.At the strategic level, information is required to support long-term sustainable management, growth and yield forecasting, timber supply analysis, and support a multitude of resource decisions relevant to forest protection and wildlife management [11].
Airborne laser scanning (ALS) data can support both strategic and operational forest information needs.ALS measures the three-dimensional distribution of vegetation within forest canopies and, as a result, is particularly well suited for describing structural vegetation attributes [12].ALS has a demonstrated utility for forest inventory across a range of forest environments [13][14][15][16], and can provide a direct measurement of a range of forest inventory attributes such as tree height, location, and canopy cover, as well as providing wall-to-wall predictor variables in the form of ALS metrics that allow for the development of models to estimate for example volume, biomass, and basal area [17].A multitude of studies have found a high correlation between field-and ALS-measured tree heights in a variety of forest environments (e.g., [6,17,18]).In addition, by examining the number and height of the return pulses within a given area, information on the vertical profile of light penetrating the plant canopy can be derived, providing additional information on forest structure, such as crown shape and density.
Analysis of ALS data for use in forestry follows two basic approaches: an individual tree detection (ITD) approach [19], or an area-based approach (ABA) [20].Whilst the area-based approach is now operationally used in a range of forest environments [13,17], ITD approaches are still in research mode [21][22][23].Using ITD approaches, individual treetops are located from either the ALS raw data point clouds directly [24], or from a canopy height model [25,26].As discussed by Breidenbach et al. [21], ITD approaches are inherently intuitive, with trees often clearly seen in ALS point clouds, especially when the density of pulses is much greater than individual crown sizes.However, ITD approaches are prone to bias, as a result of over or under segmentation of tree crowns, whereby some trees are undetected, whilst others are split into multiple trees.These issues result in omission and commission errors, which can have a significant impact on overall estimate for forest inventory [27].
Area-based approaches (ABA), rather than explicitly extracting or identifying each individual tree crown, are based on aggregations of ALS point clouds to develop a series of canopy density and height metrics [20].These metrics become independent variables that are used to predict the desired forest inventory attributes.The fact that the variables are modelled, rather than directly derived using ITD approaches, increases the number, and types, of models that can be developed.Past research has demonstrated the successful application of a wide range of modelling approaches (regression, nearest neighbor, decision trees, random forests, and model-based) to estimate a range of inventory attributes [1,28,29].Most importantly, by developing statistical models, bias can be minimized and, as a result, the ABA is widely used operationally [30].
Both individual tree and area based approaches therefore have different outputs, errors and costs [31].While the area-based techniques have now become common place, advances in forest growth and dynamics models, as well as the needs of industry for example, for individual stem and piece sizes of a stand, is requiring the development of tree lists or stand tables for effective management [1].Bergseng et al. [31] concluded that large economic losses associated with poor inventory information could be avoided if area-based techniques are used to derive information on diameter distributions in addition to plot-level averages.They concluded that prediction of mean tree attributes often did not provide a sufficient enough description of the stand for decision-making regarding timing of harvest.In addition, the increasing use of ecosystem-based management approaches which often promote selective harvesting resulting in uneven-aged forest stands is even less well suited to average tree models derived from the ABA method.Tree lists containing unbiased estimates of individual tree specifications arguable allows the greatest flexibility as it can be varied, manipulated or aggregated as needed, to suit different objectives [1].
As a result of the need for individual tree-based attributes and finer scale descriptions of stands, beyond an area based statistical method, a number of new approaches to relate the ITD and ABA have been developed.For example, Breidenbach et al. [21] proposed a semi-ITC (individual tree crown) method that imputes field data within from the nearest neighboring crown segment.The approach uses a most similar neighbor inference (MSN) to predict the attributes of the individual crown segments allowing unbiased estimates of volume and volume by species at the plot level.Lindberg et al. [32] developed an approach to produce individual tree lists which were consistent with unbiased estimates that were produced with ABA.Vastaranta et al. [33] improved the stem volume predictions by integrating both methods and acquiring plot-level training data for ABA with ITD.Xu et al. [34] demonstrated a method to calibrate the ABA-derived diameter distribution by combining ABA and ITD.They present two approaches based on replacement and histogram matching, with more accurate results found with the second approach.Key to all these approaches however, is the requirement to perform individual tree detection, in order to build distributions prior to modifying them by area-based estimates, or pre-determined tree lists available for the site.A number of studies have demonstrated that information on individual trees, specifically DBH (diameter at breast height), can be extracted exclusively using ABA [35][36][37].Typically unimodal in managed stands [38], DBH distributions are usually modelled by predicting parameters from a Weibull probability density function including percentile-based modelling, non-parametric imputations, or parameter recovery [39][40][41].The density distributions are then scaled to match the predicted total number of trees or total basal area often with unbiased estimates at the plot level [37].Existing studies that have predicted DBH distribution with ALS data are summarized in Table 1.Since we focus on ABA only, approaches that used or incorporated ITD results [21,32,34] as part of the derivation were not included in the table.
Previous studies have largely focused on describing DBH distributions because of the importance of DBH as an informative descriptor of stand structure [39].Fewer studies have modelled the distribution of tree heights [27,28], or derived individual tree volume [32].Knowledge of within-stand volume distribution can inform on product mix and log allocation opportunities, allowing for improved pre-harvest planning.
In this study, our objective was to demonstrate the application of an approach to estimating individual tree volumes for highly productive, multi-species, multi-age, temperate coastal forest stands, based on an ABA to model fitting and plot level optimization.The method presented builds on existing studies [37,42,43] and uses the within-plot tree size distribution as the main vehicle to downscale from area-based to individual tree-based predictions of volume.Previous studies have primarily tested the approach in managed boreal forest stands of northern Europe (Table 1); the complexity and high productivity of the coastal, temperate forest environment of our study area provides a novel context for testing of the downscaling approach.

Study Area
The focus of the study are actively managed forest stands on Vancouver Island, British Columbia, Canada.The stands are dominated by coastal Douglas-fir (Pseudotsuga menziesii var.menziesii) on the drier mesic areas, while western hemlock (Tsuga heterophylla) and western redcedar (Thuja plicata) are more common in the wetter sites, with red alder (Alnus rubra) also present within certain site conditions [44].The focus of the area is an 8 × 8 km area located on Oyster River.The site has a mean elevation of 240 m above sea level and is located within the dry maritime Coastal Western Hemlock biogeoclimatic subzone (CWHxm), characterized by mild winters and cool summers with an annual precipitation of 1500 mm and a mean annual temperature of 9.1 °C [45] (Figure 1).The region has been actively managed for timber production since the early 1900's with both harvesting and fire occurring at the site resulting in a patchwork of second growth stands at different successional stages ranging between 0 and 90 years [46].Stands are generally managed on a 50-90 year harvest rotation [44]; however, stands are typically left to develop naturally, with minimal silvicultural intervention or stand tending after successful stand establishment.Abbreviations: NS-Norway spruce (Picea abies (L.) Karst.);SP-Scots pine (Pinus sylvestris L.).Error indices (e R and e P ) are calculated with Equations ( 14) and (15).* Natural hardwood stands (sugar maple (Acer saccharum Marsh.), red oak (Quercus rubra L.), white birch (Betula papyrifera Marsh.), yellow birch (Betula alleghaniensis Britt.), beech (Fagus grandifoli) and trembling aspen (Populus tremuloides Michx.), natural conifer stands (northern white cedar (Thuja occidentalis (L.)), white spruce (Picea glauca (Moench) Voss), black spruce (Picea mariana (Mill.)BSP), red pine (Pinus resinosa Ait.), white pine (Pinus strobes L.), jack pine (Pinus banksiana Lamb.), and balsam fir (Abies balsamea (L.)).** Aerial photograph-based variables were also included in modelling.

Plot Selection and Inventory Measurements
Field data from 15 forest stands representing a range of structural stages were used as response data, and for validation [6,46].Within each stand, a single 30 × 30 m square plot was positioned and located using differential GPS (dGPS).A range of attributes were measured including diameter at breast height (DBH), height, and tree species, for all trees larger than 10 cm in DBH.All height measurements were determined using a laser hypsometer.Individual tree volume was calculated using DBH, height, and species information in species-specific allometric equations [49] and biomass-to-volume conversion factors [50].Plot level summaries were then calculated that included total plot volume (Table 2).

ALS Point Clouds
ALS data were acquired in June of 2004 and August of 2008 with a maximum of 4 returns recorded per laser pulse using similar systems [6,51].The average point density at the plots was similar with 3.31 and 4.62 points per meter respectively.As discussed in previous research, given the age of the stands, we assume height and diameter growth to be relatively minor over the 4 year span between ALS acquisitions [6,51].ALS point clouds were filtered to separate ground and non-ground returns [52], normalized to aboveground heights, and clipped to the spatial extents of the field plots (30 × 30 m).For each clipped point cloud, a number of metrics were calculated using FUSION [53] and custom R scripts [54].The metrics included measures of central tendency (e.g., mean, median, mode), measures of dispersion (variance, standard deviation, interquartile range), percentiles, proportions, and densities.All metrics were calculated for first, last, and all returns with a vertical threshold of 2 m.

Modelling Approach
Our modelling approach consisted of three major steps, as summarized in Figure 2. First we modelled total plot volume using the ABA and following existing methods described by [55][56][57].Second, we fitted a Weibull density function to the field-measured tree volumes on each plot and modelled the distribution parameters using a parameter prediction method and ALS-derived metrics as independent variables [36,42].Finally, we retrieved the tree count on each plot and derived the individual tree volume list.The parameter retrieval was based on two inputs: total volume (VALS) and mean tree volume (E(vALS)).

Figure 2.
An outline of the method used in this study.

Modelling Total Plot Volume Using ABA
The large number of ALS metrics that can be calculated from the ALS point cloud provides flexibility in choosing the best combination of explanatory variables during modelling.However, due to our limited number of plots, rather than using all of the metrics as input to a non-parametric modelling approach (e.g., Random Forest), we used ordinary least squares regression and a parsimonious set of meaningful, independent ALS metrics.Recently Bouvier et al. [55] proposed a modelling approach whereby only four variables are used to model stand volume.These variables characterized stand height (μCH), heterogeneity of stand height (σCH), canopy cover (Pf) and LAI (Leaf Area Index) profile.We followed the approach of Bouvier et al. [55]; however, instead of a priori choosing which ALS metric to use to characterize height, height heterogeneity, and canopy cover, we chose the most significant variable for the model, based on the correlations to total plot volume (VREF) and overall model performance described with percent of explained variance.Furthermore, instead of using the LAI profile to describe the vertical structure of the stand, we used vertical rumple (VR) [58].VR is based on the same principle as the original rumple metric [59], which was designed to measure the horizontal complexity of the canopy surface [60].However, VR measures the complexity of the vertical structure of the stand, not the top of modelling the canopy.To calculate VR, the normalized point cloud is first converted into voxel space.After that, the VR is calculated as follows: where hMAX is the maximum height of voxels (m), c is the number of voxels at height interval i and h is the height value of the height interval i (m).
As shown in other studies, the multiplicative power model form has been often adopted for stand biomass or volume predictions [56,61].In our case the model had following form (with notations adapted from Bouvier et al. [55]): where VALS is the fitted value of total volume per plot, βs are regression model parameters and εV is the residual error (N(0,σ)).Model parameters were estimated with the linear form of the equation using log-log transformation: Using this transformation allows model linearity, which markedly simplifies the analysis allowing the use of standard least-squares regression techniques.However, during the back-transformation, a systematic bias is introduced by the log transformation and has to be corrected [62].The prediction result (VALS) was therefore multiplied by a correction factor (CF) that is based on the standard error of estimate (SEE) of the regression [62]: where N is the number of plots, p is the number of model parameters.
For each of the model explanatory variables we examined the best available ALS-derived metric applying stepwise regression techniques.Height percentiles and maximum height were considered to represent stand height (μCH), whereas standard deviation, variance, interquartile distance were considered to represent height heterogeneity (σCH).As a representation of canopy cover (Pf) percentage of points above a 2 m threshold, percentage of points above mean and percentage of points above mode were considered.We assessed if collinearity between the variables existed by calculating correlation coefficients as well as the contribution for the model of each individual variable.

Modelling Individual Tree Volume Distribution
We used a two-parameter Weibull probability density function (PDF) to describe the individual tree volume distribution on each plot.A Weibull model was chosen as it is highly adaptive, ranging from an inversed J-shape, to unimodal skewed and unimodal symmetrical curve.Because of this flexibility Weibull models have often been used to characterize distributions of a range of forest attributes in particular individual tree size distributions, which can have various shape depending on the stand structure and age [36,42,63,64].Weibull PDF has a following form: where k is the shape parameter and λ is the scale parameter.
We first used a maximum likelihood estimation method [65] to estimate Weibull shape (k) and scale (λ) parameters of individual tree volumes on each plot.We then used linear regression to model the parameters (k; λ) using ALS-derived metrics as independent variables; including percentiles and densities [35,36,42]; together with metrics describing canopy cover (Pf) and measures of vertical heterogeneity (σCH).

Estimating Tree Count and Individual Tree Volumes
By estimating k and λ parameters, the shape of the PDF can be predicted.However, to show the overall frequency of individual tree volumes, tree counts are also required as the basic form of the Weibull PDF describes density and not absolute stem numbers.As demonstrated by Gobakken and Naesset [35] and Thomas et al. [36], tree count can be predicted using ALS-derived metrics as independent variables.However, such predictions do not always result with desirable accuracy, often require stratification, or cannot be applied in forest stands of complex structure.Therefore, we retrieved tree count (n) using other predicted stand attributes.
To do so we utilized the predicted total tree volume per plot (VALS) and Weibull distribution parameters (kALS, λALS).We then derived tree count (nALS) on each plot by converting and then solving the formula: where E(v) is the mean tree volume of the stand, given by the Weibull mean: The final formula to calculate tree count per plot becomes: The output of these calculations was then used to scale the density and derive individual tree volumes (vALS) on each plot.
This approach is similar to that of Maltamo et al. [37], Kangas and Maltamo [66], and Deville and Sarndal [67].Maltamo et al. [37] first predicted Weibull PDF parameters as well as stand volume, total basal area, and stem count.They then scaled the predicted Weibull distribution according to the predicted stem count.The distribution was further modified to ensure compatibility with estimates of basal area and volume.We adapted this approach however instead of using estimates of tree count for initial scaling, we predict both tree count and individual tree volume list by solving a set of equations listed above.

Validation
Validation was performed against field data for all four estimated variables (V, k, λ and n) in our analysis.We calculated the bias (absolute and relative) and RMSE (absolute and relative) and used a paired Wilcoxon signed rank test to evaluate the null hypothesis that the median difference between the compared values was zero (at α = 0.05).Bias and RMSE were calculated as follows: where N is the number of validation plots, yi is the reference value for plot i, ŷi is the predicted value for plot i, and ȳ is the mean of the reference variable.
The modelling results of V, k and λ were assessed using leave-one-out cross validation.Agreement between the reference and the predicted Weibull distributions was assessed by comparing the PDFs and calculating bias, RMSE (both absolute and relative), and p-value of the paired Wilcoxon signed rank test.Density functions were generated for a sequence of individual tree volumes, ranging from 0.1 to 3 m 3 , with an increment of 0.1.To assess the agreement between reference and predicted volume distributions we compared the individual tree volumes using Mann-Whitney rank sum test (also called Wilcoxon rank sum test).In this case we tested the null hypothesis that the two population distribution functions are identical against the alternative hypothesis that they differ by location (i.e., average value of the distribution).
To compare the distributions of reference and predicted tree volumes on each plot (vi) we calculated the Reynolds error index (eR) [68] and error index (eP) proposed by Packalé n and Maltamo [43]: where fREFi is the reference and fALSi is the predicted stem count in volume class i; m is the total number of classes and nREF is the true and nALS predicted stem count of all volume classes.The used class width was 0.1 m 3 .Both error indices are similar, however relative frequencies are used in eP, which allows to ignore the error in stem number estimate (nALS) [43].In case of both indices a value of 0 indicates a perfect fit.The eP values can range between 0 and 1, with 1 indicating the distributions do not overlap at all.
In order to assess the sources of errors in predicted tree count (nALS) we regressed the differences in tree count for each plot (∆n) against variables that could potentially influence the tree count prediction accuracy.Since the final result of the analysis relies on the accuracy of the previous steps (i.e., prediction of VALS, kALS, λALS) a number of explanatory variables were included.First, variables describing the stand (DBH, height, volume, species composition) were selected to examine if any relation exists between ∆n and properties of the stand.Second characteristics describing the individual tree volume distribution and the quality of the reference Weibull distribution fit were added.We used Hartigans' dip test for unimodality [69] to characterize the complexity of the individual tree volume distribution.The log-likelihood value of the reference Weibull curve fit [65] was used to provide a metric describing the overall quality of the reference k and λ parameters.Finally differences between the reference and predicted Weibull parameters (∆k and ∆λ) were used.

Total Plot Volume
A number of explanatory variables describing stand height (μCH), height heterogeneity (σCH) and canopy cover (Pf), as well as the usefulness of VR for modelling plot volume were investigated.VR was calculated using voxel size of 1 × 1 × 1 m and height interval (i, Equation ( 1)) of 1 m.The final model explained 86.7% of variance and was significant at α = 0.05.The 95th percentile of first return heights (p95FE) was selected to represent μCH, interquartile distance of first returns (IQDFE) was chosen as σCH, and percentage of first returns above mode as Pf.Together with VR all of these variables were significant in the model (Table 3).Comparison of the predicted total plot volume values with the field data showed a small bias (17.2 m 3 • ha -1 ) and relative RMSE of 34.8% (Table 4, Figure 3A).The Shapiro-Wilk test confirmed that residual values were normally distributed (p = 0.416).Results of the paired Wilcoxon test indicated that we could not reject the null hypothesis that the median value of the difference between field and predicted total plot volume was 0 (p = 0.89).Values for total volume (VALS) and Weibull parameters (kALS and λALS) predicted using leave-one-out cross validation.Tree count (n) predicted with parameter retrieval.

Weibull Parameters
Despite many candidate ALS metrics available for predicting the Weibull distribution parameters (k and λ), in both cases only single explanatory variables were chosen with stepwise regression (Table 3).A canopy cover metric (percentage of all points above mean) was used to predict k, whereas the 90th percentile of height of all returns was used to predict λ.The R 2 was higher for λ (0.77) than for k (0.56).Similarly, the bias and RMSE values were lower for λ (Table 4, Figure 3B,C).In both cases, an exponential model form was chosen.The agreement between reference and predicted Weibull parameters varied across the plots (Figure 4).Likewise agreement between the reference and modelled distribution parameters, estimated using the sample density distribution, showed large variability.The results of the paired Wilcoxon signed rank test indicated that for 5 of the plots the null hypothesis could be rejected (Table 5), however the values of relative bias and RMSE were often very large and exceeded 100% (with one extreme case for plot 10).Plot: Forests 2015, 6 2622

Individual Tree Volume Frequency Distribution
Predictions of tree count for each plot, based on parameter retrieval with total plot volume (VALS) and mean Weibull value derived with predicted k and λ, were slightly overestimated (relative bias of 6.81%) with relative RMSE of 24.4% (Table 4, Figure 3D).In this case, we did not reject the null hypothesis of the Wilcoxon test that the median value of the differences was zero (α = 0.05).For comparison, the best multiple linear regression model predicted tree count with absolute and relative bias of 0 and 14.64% respectively, and absolute and relative RMSE of 200.5 trees per ha and 32.7% respectively.
The agreement between frequency distributions on each plot varied.The mean values of the error indices were 71.7 and 0.36, with standard deviations of 22.40 and 0.11 for eR and eP respectively (Table 6).The regression analysis of the ∆n showed that there were only two significant variables for the model: Hartigans' test statistic for unimodality and ∆λ (differences between reference and predicted values of Weibull's scale parameter).The model had an R 2 = 0.41 and the positive value of the parameter for Hartigans' statistic (614.5)indicated a positive relationship between ∆n and the explanatory variable.The parameter for ∆λ had value of −33.3 indicating a negative relationship between ∆n and ∆λ.We found no other variables significant for the model.Table 6.Results of the comparison of the reference and predicted tree volumes.Error indices (eR and eP) calculated with Equations ( 12) and (13).

Discussion
Tree size distributions and individual tree lists are important attributes in forest inventories, especially in an operational context.The capacity to make accurate predictions of not only the total stand volume, but also of the frequency distribution of individual tree volumes, provides valuable information that can be used in management of forest resources.While ALS data is becoming increasingly popular for estimating a number of forest stand attributes, the predictions are, in the majority of cases, undertaken using an ABA and providing information on total or average values of stand attributes (e.g., volume per ha, mean height).The alternative-the ITD approach-has not yet reach operational status as it usually provides biased results and requires more complex data processing routines [70,71].
In this study we evaluated the use of ALS point clouds to predict individual tree volume distributions, providing an enhanced attribute set for traditional ABA in complex, high productivity forest stands in Pacific Northwest.Our results suggest that ALS data can be used to downscale plot-level information and estimate attributes of individual trees.To do this, we used parameter prediction and then retrieved tree count on each plot (n).Distinct from existing approaches, our generated tree lists for each plot consisted of individual tree volumes rather than DBH or basal area values [34,36,42].
Our approach was based on three steps, starting with modelling of the total plot volume.We found that the approach presented by Bouvier et al. [55] provided an accurate total volume estimate, although the metrics we used to characterize stand height, height heterogeneity, and canopy cover were different than those used by Bouvier et al. [55].Furthermore, we used a different metric to characterize the vertical complexity of the plot: vertical rumple.In our case, the achieved model accuracy for estimating plot volume (R 2 = 0.867; RMSE% = 34.8%,leave-one-out cross validation) was comparable with results reported by others [55,72].
Estimating the Weibull distribution parameters was the second step in the presented workflow.Similar to results presented in previous studies, we achieved higher prediction accuracy for λ (scale parameter) than for k (shape parameter) [36].However, in contrast to previous studies [36,42], we used a single explanatory variable in models predicting each parameters: a measure of canopy cover was used to estimate k, whilst a metric describing stand height was used to estimate λ.The two Weibull parameters modify the form of the PDF in a different way, allowing for its great flexibility.The k parameter is responsible for the overall form of the distribution, which can resemble exponential, Rayleigh, or normal distribution while the parameter value changes from 1 to 4. The change in λ determines the range of the distribution.Interesting to note is the relationship between canopy cover (percentage of points above mean height of the points) and the k parameter.Our analysis showed that with the increase of the percentage of points above the mean height, the form of the Weibull distribution function changes from inverse J-shape to normal-like.The significance of the 90th percentile of ALS points to predict the scale parameter can likely be explained with the increasing range of individual tree volumes with increasing stand height.
The comparison of the predicted and reference Weibull density functions, generated with the same sequence of individual tree volumes, allowed us to assess the accuracy of modelled k and λ parameters.We observed that even with relatively strong modelling results and statistical test results indicating agreement between compared distributions, the bias and RMSE values can be large for some plots.This demonstrates how small discrepancies in model forms can lead to extreme relative differences.For example, for Plot 10 (Figure 4), the overall shape of the distributions match, however the predicted distribution is unrealistic for low volumes, leading to large relative differences (Table 5).
In the final step of the workflow, we predicted tree count (n) for each plot by using the parameter retrieval method with predicted total volume (VALS) and mean tree volume (E(v)) derived using the predicted Weibull distribution parameters.This approach produced estimates of n with absolute bias of −1.6 trees and a relative RMSE of 24.4% (or 149 tree• ha −1 ).In other studies using Weibull PDF to model the distribution of tree size (typically via basal area), the number of trees was predicted directly using modelling with ALS-based metrics as explanatory variables [32,36].Although existing studies have shown that such prediction of stem count is often accurate [36,37], the accuracy varies depending on forest structure and can result in relative RMSE exceeding 50% [73].Initial tests implemented at an early stage of our analysis indicated that these direct predictions in these complex forest systems are also prone to error (RMSE% = 32.7% or 200.5 trees• ha −1 ).
As a comparison, Maltamo et al. [37] used the total number of trees, basal area, and volume to generate DBH distributions.All three variables were first predicted with ALS-based models using ABA.In our case the analysis were based on volume only and resulted not only with frequency distributions but also with minimally biased predictions of tree count.This allowed us to exclude the usage of the inaccurate tree count estimates to scale the density functions.As a result this approach presents an alternative for estimating tree count with ABA, when direct modelling does not provide accurate results.
The final estimates of tree volume distributions were-for the majority of plots-similar to the reference distributions (Table 5).The values of the error indices (eR and eP) were similar or lower to those reported for studies focusing on modeling DBH distributions [47,48] and mean value of eP was almost identical to the value reported by Packalé n and Maltamo [43].Our results are also similar to studies that integrate ITD and ABA directly [34], however as discussed previous studies focused on modelling distribution of DBH, not volume.
Recent trends integrating ITD and ABA to improve the accuracy of modelling forest attributes [29,[32][33][34] will likely continue to provide promising results especially as density of lidar datasets increase.However, as showed by Xu et al. [34], the modelling of the tree size distributions based on ABA does not always benefit from additional information provided with ITD.Xu et al. [34] presented two integration methods-(1) one where the ABA-tail of large trees in DBH distribution is substituted with trees detected with ITD, which depends principally on the accuracy of the ITD-derived diameter distribution; and (2) the histograms of DBH distributions derived with ABA and ITD are matched, which relies on the accuracy of the ABA-derived diameter distribution.Xu et al. [34] showed that the second approach, based on ABA, obtained better accuracy than the first and that the influence of the ITD-derived diameter distribution during histogram matching was only slight.
While we readily acknowledge that the number of sample plots in our study was small, our results show that even with simple distribution modelling based on Weibull probability density function, the predicted distributions were accurate, with accuracies comparable to more complex methodologies, including those based on integrations of ABA and ITD.However, by regressing the differences in estimates of n we confirmed that standard Weibull PDF is not suitable for characterizing multimodal distributions.Such multimodal distributions were in fact observed on a few of our plots, with the majority of trees representing smaller volume classes, interspersed with a small number of trees of larger volume.Such volume structure is typical for multi-layered stands, where the top layer is composed of old trees, often of several cubic meters in volume.Modelling of such multimodal distributions may be improved with finite mixture modelling (i.e., a combination of two Weibull distributions) [36] or k-MSN imputations [48].

Conclusions
In this paper, we demonstrated a method to enhance an ABA estimate of plot volume, by predicting within-stand tree count and individual tree volume distribution in complex, highly productive forest stands composed mainly of Douglas-fir and western redcedar.We applied an approach to extract individual tree attributes, which was based on estimating the parameters of Weibull probability density function.The approach can be applied to ALS data without having to isolate and attribute individual trees within the stand.The main advantage of the presented approach is that the sum of individual tree volume equals the plot-level volume estimate modelled with ABA.Furthermore, the within-stand tree count we estimate has minimal bias, in contrast to what is typically achieved with an ITD approach [71].However, we have demonstrated that the unimodal density distribution function is not always suitable for multi-layered, multi-age, complex forest stands.The approach provides additional information for forest stands by enriching standard ABA estimates, offering improved details on stand structure and tree counts by volume class-Information that is of great importance for sustainable forest management.

Figure 1 .
Figure 1.ALS data extent (2004 and 2008) and field plots locations.Coordinates in Universal Transverse Mercator zone 10.

Figure 3 .
Figure 3. Scatterplots of observed and predicted total volume (V, panel A), Weibull shape (k, panel B) and scale (λ, panel C) parameters and tree count (n, panel D).Values for total volume (VALS) and Weibull parameters (kALS and λALS) predicted using leave-one-out cross validation.Tree count (n) predicted with parameter retrieval.

Figure 4 .
Figure 4. Individual tree volume distribution (grey bars) overlaid with Weibull PDF curvesshown for four exemplar plots.Parameters for the reference curves were derived using field measured data, whereas the predicted parameters were modeled using ALS data.Individual tree volume (v) shown on x-axis, density shown on y-axis.

Table 1 .
Summary of existing studies that have demonstrated different DBH (diameter at breast height) distribution extraction methods with ABA only.

Table 2 .
Characteristics of the field plot data.

Table 3 .
Developed predictive models for total plot volume (V), Weibull shape (k) and scale (λ) parameters.Adjusted R 2 value and p-value indicating significance of the model are reported.
* model performance assessed with leave-one-out cross validation.

Table 5 .
Results of the comparison of reference and predicted density functions.