This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license.

This study compares methods to estimate stem volume, stem number and basal area from Airborne Laser Scanning (ALS) data for 68 field plots in a hemi-boreal, spruce dominated forest (Lat. 58°N, Long. 13°E). The stem volume was estimated with five different regression models: one model based on height and density metrics from the ALS data derived from the whole field plot, two models based on similar combinations derived from 0.5 m raster cells, and two models based on canopy volumes from the ALS data. The best result was achieved with a model based on height and density metrics derived from 0.5 m raster cells (Root Mean Square Error or RMSE 37.3%) and the worst with a model based on height and density metrics derived from the whole field plot (RMSE 41.9%). The stem number and the basal area were estimated with: (i) area-based regression models using height and density metrics from the ALS data; and (ii) single tree-based information derived from local maxima in a normalized digital surface model (nDSM) mean filtered with different conditions. The estimates from the regression model were more accurate (RMSE 52.7% for stem number and 21.5% for basal area) than those derived from the nDSM (RMSE 63.4%–91.9% and 57.0%–175.5%, respectively). The accuracy of the estimates from the nDSM varied depending on the filter size and the conditions of the applied filter. This suggests that conditional filtering is useful but sensitive to the conditions.

During the last decade, airborne laser scanning (ALS) data have been established as a standard data source for high precision topographic data acquisition and have also been used for estimation of forest variables [^{2}, and use the measures as independent variables in regression models to estimate forest variables such as mean tree height and stem volume [

The measures derived from the ALS data may be height percentiles and measures of the density of the vegetation as the fraction of ALS reflections from vegetation relative to the total amount of ALS reflections [

If the ALS data are dense enough, individual tree crowns may be identified from the data [

The analyses based on nDSMs are faster and more robust than those based directly on ALS returns. However, the nDSMs still provide information about local variations in the forest that are related to individual trees. As demonstrated in [^{2} raster cell, these measures are largely influenced by the parts of the raster cell with the highest pulse density.

The purpose of this study is to compare methods to estimate stem volume, stem number and basal area. The first comparison is between measures derived from ALS data in 0.5 m raster cells and variables derived from larger raster cells corresponding to the size of the field plots for estimation of forest variables. The second comparison is between the canopy volume model [

The study area is located in the southwest of Sweden (Lat. 58°N, Long. 13°E). The most common tree species are Norway spruce (

In total sixty-eight circular field plots with 12 m radius were allocated during July and August 2009 (

The stem number

The root mean square error (RMSE) at tree level of the regression models was 137 dm^{3} (19.1%) for pine, 102 dm^{3} (15.9%) for spruce, 389 dm^{3} (7.7%) for oak, and 90 dm^{3} (25.9%) for other species. The stem volume of all trees was estimated with the respective regression models.

The stem volume

The ALS data were acquired on 4 September 2008 using a TopEye MKII ALS system with a wavelength of 1064 nm carried by a helicopter. The flying altitude was 250 m above ground and the average emitted pulse density was 7 m^{−2}. The first and last returns were saved for each laser pulse and the average return density was 11 m^{−2} (

ALS returns were classified as ground or non-ground using the progressive Triangular Irregular Network (TIN) densification method [

The

The following measures were derived from the ALS returns in each circular field plot with 12 m radius.

The 10th, 20th, ..., 100th percentiles of the normalized z-values: h_{10}, h_{20}, ..., h_{100}.

The total number of ALS returns: N_{tot}.

The number of ALS returns in intervals h_{1}, h_{2}, h_{3}, and h_{4}: N_{1}, N_{2}, N_{3}, and N_{4}.

The number of ALS returns ≥ 2 m and < 34 m above the DTM: N_{veg}.

The total number of first ALS returns: N_{f,tot}.

The number of first ALS returns in intervals h_{1}, h_{2}, h_{3}, and h_{4}: N_{f,1}, N_{f,2}, N_{f,3}, and N_{f,4}.

The lower limits of the intervals h_{1}, h_{2}, and h_{3} were 2 m, 10 m, and 18 m above the DTM, respectively.

The vegetation ratio in each field plot was calculated as N_{veg}/N_{tot}, and the proportions of ALS returns in the intervals h_{j} were calculated as N_{j}/N_{tot}.
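The plot-level measures listed above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names, the dictionary keys, the linear-interpolation percentile, the choice to compute percentiles from the vegetation returns only, and the 26 m boundary between the two upper intervals are all assumptions made here (the source gives only the 2 m, 10 m and 18 m lower limits and the 34 m maximum).

```python
import math

def percentile(sorted_vals, p):
    """Percentile p (0..100) of a pre-sorted list, with linear interpolation."""
    if not sorted_vals:
        raise ValueError("percentile of an empty list is undefined")
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def plot_metrics(z, bounds=(2.0, 10.0, 18.0, 26.0, 34.0)):
    """Height and density measures from normalized z-values (m above the DTM).

    bounds holds the interval limits; the 26 m limit is an assumption.
    """
    n_tot = len(z)  # all returns in the plot, including ground returns
    # Vegetation returns: >= 2 m and < 34 m above the DTM
    veg = sorted(v for v in z if bounds[0] <= v < bounds[-1])
    metrics = {"n_tot": n_tot, "n_veg": len(veg), "veg_ratio": len(veg) / n_tot}
    # Height percentiles (assumed here to be taken over the vegetation returns)
    for p in range(10, 101, 10):
        metrics[f"h{p}"] = percentile(veg, p)
    # Proportion of all returns that fall in each height interval
    for j in range(1, len(bounds)):
        n_j = sum(1 for v in veg if bounds[j - 1] <= v < bounds[j])
        metrics[f"ratio_{j}"] = n_j / n_tot
    return metrics
```

Calling `plot_metrics` on the normalized z-values of one field plot returns the counts, the vegetation ratio, the ten height percentiles and the four interval proportions in a single dictionary.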

The following measures were derived from the ALS returns in raster cells of 0.5 m.

The mean normalized _{mean}_{mean}

The mean normalized _{1}_{2}_{3}_{4}_{f,mn,1}_{f,mn,2}_{f,mn,3}_{f,mn,4}_{f,mn,1}_{f,mn,2}_{f,mn,3}_{f,mn,4}

The maximum normalized z-value of all first ALS returns < 34 m above the DTM (_{f,max}

The maximum normalized _{max}_{99}

To calculate the canopy volume for each interval _{j}_{f,j,raster}/_{f,tot,raster}. The maximum height of 34 m was chosen based on the maximum tree height in the field data and on the observation that ALS returns ≥ 34 m above the DTM were all erroneous returns high above the tree tops, found in a few field plots. Raster cells without ALS returns were excluded when calculating mean values and percentiles.
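As a rough illustration of the rasterization step, the sketch below bins returns into 0.5 m cells, keeps the maximum normalized z-value per cell, and averages only over cells that contain returns. The function names and the dictionary-based raster are assumptions made for this example; the source does not describe its data structures.

```python
import math

def rasterize_max(points, cell=0.5, z_limit=34.0):
    """Map (col, row) -> maximum normalized z of the returns in that cell.

    points is an iterable of (x, y, z) tuples with z in m above the DTM.
    Returns at or above z_limit are skipped, following the text's treatment
    of such returns as erroneous. Cells without returns are simply absent.
    """
    cells = {}
    for x, y, z in points:
        if z >= z_limit:
            continue
        key = (math.floor(x / cell), math.floor(y / cell))
        if key not in cells or z > cells[key]:
            cells[key] = z
    return cells

def mean_over_cells(cells):
    """Mean of the cell values; empty cells never enter the average."""
    return sum(cells.values()) / len(cells)
```

Because every occupied cell contributes exactly one value, a cell hit by many returns has no more weight than a cell hit by one, which is one way to read the text's remark that raster cells can compensate for varying return density.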

The canopy volume was calculated for four different height classes _{j}_{f,j}_{f,tot} and _{f,tot}_{j,raster}_{f,j,raster}/N_{f,tot,raster}
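The exact canopy volume formula did not survive in this copy of the text. The sketch below therefore assumes one common formulation: for each height class, the mean normalized height of the first returns in that class is multiplied by the proportion of first returns falling in the class and by the reference area. The function name, the 26 m class limit and the `area` parameter are assumptions for illustration only.

```python
def canopy_volumes(first_z, bounds=(2.0, 10.0, 18.0, 26.0, 34.0), area=1.0):
    """Assumed canopy volume per height class from first-return heights.

    first_z: normalized z-values (m) of all first returns in the unit.
    Returns one value per class: mean height * proportion of returns * area.
    """
    n_tot = len(first_z)
    volumes = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        in_class = [z for z in first_z if lo <= z < hi]
        if in_class:
            mean_height = sum(in_class) / len(in_class)
            proportion = len(in_class) / n_tot
            volumes.append(mean_height * proportion * area)
        else:
            volumes.append(0.0)  # no first returns in this class
    return volumes
```

In a canopy volume regression, the per-class values would then serve as the independent variables, one coefficient per height class.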

Local maxima detection was used to find individual tree tops in the raster of _{max}_{max}_{99}_{lim}_{max}_{lim}_{max}_{lim}_{loc}
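A minimal sketch of tree top detection in a mean filtered nDSM follows. It covers only the unconditional case (the raster is always filtered); the conditional filters compared in the study are not reproduced because their conditions are not recoverable here. The strict 8-neighbour maximum test, the truncated windows at the raster edge and the 2 m minimum height are assumptions of this example.

```python
def mean_filter(grid, size=3):
    """Mean filter with a size x size window; windows are truncated at edges."""
    rows, cols = len(grid), len(grid[0])
    r = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [grid[a][b]
                      for a in range(max(0, i - r), min(rows, i + r + 1))
                      for b in range(max(0, j - r), min(cols, j + r + 1))]
            out[i][j] = sum(window) / len(window)
    return out

def local_maxima(grid, min_height=2.0):
    """Cells at least min_height high and strictly above all 8 neighbours."""
    rows, cols = len(grid), len(grid[0])
    tops = []
    for i in range(rows):
        for j in range(cols):
            h = grid[i][j]
            if h < min_height:
                continue
            neighbours = [grid[a][b]
                          for a in range(max(0, i - 1), min(rows, i + 2))
                          for b in range(max(0, j - 1), min(cols, j + 2))
                          if (a, b) != (i, j)]
            if all(h > n for n in neighbours):
                tops.append((i, j, h))
    return tops
```

On a small raster with two peaks of 10 m separated by an 8 m cell, `local_maxima` finds two tree tops in the raw data but only one after `mean_filter(grid, 3)`, which mirrors the trade-off between filter size and the number of detected trees.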

The stem volume was estimated with five different regression models. The independent variables of the regression models were derived from the ALS returns in each circular field plot with 12 m radius in two cases (

The models in

The stem number and the basal area were estimated with two methods: (i) an area-based approach and (ii) an individual tree-based approach.

In the area-based approach, the stem number (

The models were selected with best subset regression. The models in

In the individual tree-based approach, values of DBH were calculated using a relationship between DBH and tree height based on a regression model for the subsample of trees where the heights were measured (_{j}_{j}_{loc}
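The individual tree-based estimates can be sketched as follows: each detected tree top height is converted to a DBH, and the tree list is scaled from the 12 m radius plot to per-hectare values. The DBH-height model and its coefficients are not recoverable from this copy, so `dbh_from_height` below is a purely hypothetical linear placeholder, and all names are assumptions of this example.

```python
import math

PLOT_RADIUS_M = 12.0  # field plot radius stated in the text

def per_hectare_factor(radius_m=PLOT_RADIUS_M):
    """Scale factor from one circular plot to one hectare (10,000 m^2)."""
    return 10000.0 / (math.pi * radius_m ** 2)

def dbh_from_height(height_m, a=0.0, b=0.015):
    """Hypothetical linear DBH (m) from tree height (m); not the source model."""
    return a + b * height_m

def plot_estimates(tree_heights_m):
    """Stem number (ha^-1) and basal area (m^2 ha^-1) from detected tree tops."""
    dbh = [dbh_from_height(h) for h in tree_heights_m]
    factor = per_hectare_factor()
    stems_per_ha = len(dbh) * factor
    # Basal area of one tree is the cross-sectional area at breast height
    basal_area_per_ha = sum(math.pi * (d / 2.0) ** 2 for d in dbh) * factor
    return stems_per_ha, basal_area_per_ha, dbh
```

The returned DBH list is the by-product the text highlights as an advantage of the individual tree-based approach.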

The accuracy of the estimates from ALS data was validated using leave-one-out cross-validation for one field plot at a time: one field plot was excluded; the parameters of the models were estimated based on the remaining field plots and then applied to the excluded field plot to estimate forest variables. The accuracy was validated with the field-measured values using the RMSE and the bias (_{i}
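The validation procedure can be illustrated with a single-predictor linear model. This is a simplified sketch: the study's models have several independent variables, and the RMSE and bias formulas are not legible in this copy, so the definitions below (mean of squared errors with denominator n, bias as estimated minus field-measured) are assumptions. The sign convention is at least consistent with the discussion, where overestimated plots contribute positively to the bias.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def loocv_predictions(x, y):
    """Predict each plot from a model fitted to all other plots."""
    predictions = []
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(b0 + b1 * x[i])
    return predictions

def rmse_and_bias(estimated, observed):
    """Return (RMSE, bias, RMSE in % of mean, bias in % of mean)."""
    n = len(observed)
    errors = [e - o for e, o in zip(estimated, observed)]
    rmse = math.sqrt(sum(err * err for err in errors) / n)
    bias = sum(errors) / n
    mean_obs = sum(observed) / n
    return rmse, bias, 100.0 * rmse / mean_obs, 100.0 * bias / mean_obs
```

Passing the output of `loocv_predictions` together with the field-measured values to `rmse_and_bias` yields absolute and relative figures of the kind reported in the result tables.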

The RMSE of the estimated stem volume was largest for the regression model in

The RMSE of the estimated stem volume was smaller when excluding field plots with > 80% basal area from oak (

The RMSE of the estimated stem number was smallest for the regression model in _{max}_{lim}_{max}

The RMSE of the estimated stem number was slightly larger for the regression model when excluding field plots with > 80% basal area from oak (_{max}_{99}

The RMSE and bias of the estimated basal area were smallest for the regression model in _{max}_{lim}_{max}

The RMSE of the estimated basal area was smaller for the regression model in

The most accurate estimates of stem volume, stem number and basal area were achieved with regression models that used rasterized (0.5 m raster cells) ALS data as input instead of 3D point cloud data directly. This suggests that the raster cells can compensate for the varying density of the ALS data and the variability of the forest properties within the field plots. For the two stem volume models that used input measures calculated at plot level from the normalized 3D point cloud directly, the canopy volume regression model was more accurate than the log-log regression model including the vegetation ratio and measures of the height of the ALS returns. However, the most accurate estimate was achieved with a log-log regression model including the vegetation ratio and a measure of the maximum height of the ALS returns derived from 0.5 m raster cells. Apart from the canopy volume models, the final models were selected with best subset regression, which means that the selection of independent variables was based on the reference data. Since the parameters of the regression model are also estimated based on the reference data, the model can be fitted very well to the reference data. However, it requires that the local reference dataset is large enough to base the models on. The canopy volume model is stable in the sense that the independent variables are not selected based on the local reference data, which might have advantages for estimation of stem volume for large areas. The stem volume used as ground truth was estimated with regression models with a comparatively high RMSE, which was around 20% for most of the trees. This makes the validation more uncertain. Excluding field plots with > 80% basal area from oak resulted in fewer outliers since most of the outliers were oak dominated field plots. Previous studies have reported larger errors for estimation of stem volume and basal area in mixed forest than in coniferous dominated forest [

The estimation of the basal area showed fewer outliers for large field-measured values than the estimation of stem volume. The reason may be that the outliers for stem volume were mostly oak dominated field plots and that the relationship between DBH and tree height is more similar for oak and other tree species than the relationship between stem volume and tree height. The RMSE of the regression estimates decreased slightly when excluding oak dominated field plots in the same way as for the estimation of stem volume. However, the RMSE and the bias of the basal area derived from local maxima increased when excluding oak dominated field plots. This may be because the regression model used for calculating DBH from the heights of the local maxima underestimated DBH for tall trees and the oaks were taller than the average tree. This negative contribution to the bias disappeared when excluding the oak dominated field plots and the result was a higher bias.

The bias of the estimated stem number changed in the negative direction when excluding oak dominated field plots. The reason was that the excluded field plots were outliers with an overestimated stem number. This may be due to the canopy of oak having more small variations than other species, which gives rise to several local maxima within the same tree crown. A few other field plots were outliers with an underestimated stem number for large field-measured values. This is expected when identifying trees from local maxima since trees below the dominant tree layer are not visible in the nDSM.

The estimates of stem number and basal area were more accurate for the regression models than when identifying tree tops from local maxima in the nDSM. The advantage of the latter is that a list of DBH estimates is produced at the same time. Distributions of DBH have previously been estimated from height and density measures from ALS data and theoretical diameter distribution models [

When tree tops were identified from local maxima in the nDSM, the RMSE of the estimated stem number was smallest when the nDSM was mean filtered for local _{max}_{max}_{max}_{max}_{99}_{max}_{99}_{max}_{99}_{99}

For the estimated basal area derived from local maxima, the RMSE was smallest when the nDSM was always mean filtered or mean filtered if _{99}_{max}

The RMSE of the estimated stem number was smallest, and the bias highest, when the nDSM was mean filtered with a 3 × 3 filter. With a 5 × 5 or 7 × 7 filter, the RMSE was larger and the bias lower. This suggests that the larger filter sizes removed small variations in the nDSM that would otherwise have given rise to local maxima. However, the RMSE of the estimated basal area was smallest, and the bias lowest, when a 7 × 7 filter was used. The bias was large and positive for the filter size 3 × 3 (

This study has compared estimation of forest variables from regression models based on measures derived from ALS data in small (0.5 m) raster cells with regression models based on measures derived from the 3D point cloud. The relative RMSE achieved with the models based on 0.5 m raster cells was approximately 2–5 percentage points lower than that achieved with the 3D point cloud, which suggests that the smaller raster cells can compensate for the varying density of the ALS data. Once the ALS data have been rasterized, the raster cells may be aggregated to any area unit suitable for the application, which means that the approach is easy to integrate into operational workflows and may have computational advantages. Only one raster cell size was used in the study. The size of the raster cells could be optimized, or the varying density of the ALS data could be compensated for in other ways, for example, by weighting the ALS returns depending on local density.

This study has also compared a canopy volume model for estimation of stem volume in hemi-boreal forest with a model based on height percentiles and density of ALS data selected with best subset regression. The most accurate estimate was achieved with a log-log regression model including the vegetation ratio and a measure of the maximum height of the ALS returns, although its RMSE was only about 1 percentage point lower. Hence, both model types may be used for estimation of stem volume. The choice of model type can be based on many considerations, for example, whether the local reference dataset is large enough for best subset regression.

Finally, the study has compared area-based regression models and individual tree-based models with different filtering conditions for estimation of stem number and basal area. The most accurate results were achieved with the regression models (7–11 percentage points lower RMSE for stem number; 39–40 percentage points lower RMSE for basal area compared to the best filtering conditions). However, an advantage of the individual tree-based models is that a list of DBH is estimated at the same time. When individual trees are derived from local maxima in an nDSM, the filter sizes and the conditions for filtering the nDSM must be carefully selected. Criteria for selection of filter sizes and conditions still remain to be defined for different forest types.
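The RMSE differences quoted in the conclusions are differences between relative RMSE values (percent of the field-measured mean). As a quick arithmetic check, the snippet below recomputes two of them from figures reported in the abstract; the dictionary keys are labels chosen for this example only.

```python
# Relative RMSE values (% of mean) as reported in the abstract
reported = {
    "stem_volume_plot_level": 41.9,    # height/density metrics, whole field plot
    "stem_volume_raster_best": 37.3,   # height/density metrics, 0.5 m raster cells
    "stem_number_regression": 52.7,    # area-based regression model
    "stem_number_ndsm_best": 63.4,     # lowest RMSE among the nDSM filter settings
}

def difference_in_points(a, b):
    """Difference between two relative RMSE values, in percentage points."""
    return round(a - b, 1)

volume_gain = difference_in_points(
    reported["stem_volume_plot_level"], reported["stem_volume_raster_best"])
stem_number_gain = difference_in_points(
    reported["stem_number_ndsm_best"], reported["stem_number_regression"])
```

Both differences are absolute gaps between percentages, not relative reductions of the RMSE itself.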

The field inventory and the acquisition of ALS data were financed by the Hildur and Sven Wingquist foundation. Parts of this work were done within the project LASER-WOOD (822030), funded by the Austrian Klima- und Energiefonds in the framework of the program “NEUE ENERGIEN 2020”. We would like to thank Heather Reese and Neil Cory who have checked the language in the manuscript.

Map of Sweden showing the location of the study area (

Side view of ALS data in a 10 m wide and 40 m long north-south transect in one field plot.

Estimates of stem volume from regression models in (

Estimates of stem volume from regression models in (

Estimates of stem number from (_{99}_{max}_{99}_{max}

Estimates of stem number from (_{99}_{max}_{99}_{max}

Estimates of basal area (_{99}_{max}_{99}_{max}

Estimates of basal area (_{99}_{max}_{99}_{max}

RMSE and bias for stem volume from regression models in

Method | RMSE (m^{3}·ha^{−1}) | RMSE (% of mean) | Bias (m^{3}·ha^{−1}) | Bias (% of mean) |
---|---|---|---|---|
Regression model in | 75.1 | 41.9% | 0.3 | 0.2% |
Regression model in | 68.6 | 38.2% | −0.3 | −0.2% |
Regression model in | 66.9 | 37.3% | −0.4 | −0.2% |
Regression model in | 71.5 | 39.8% | 0.7 | 0.4% |
Regression model in | 68.2 | 38.0% | 0.2 | 0.1% |

RMSE and bias for stem volume from regression models in

Method | RMSE (m^{3}·ha^{−1}) | RMSE (% of mean) | Bias (m^{3}·ha^{−1}) | Bias (% of mean) |
---|---|---|---|---|
Regression model in | 65.0 | 36.8% | 0.2 | 0.1% |
Regression model in | 57.1 | 32.3% | −0.1 | 0.0% |
Regression model in | 55.7 | 31.5% | −0.3 | −0.2% |
Regression model in | 60.1 | 34.0% | 1.0 | 0.5% |
Regression model in | 58.0 | 32.8% | 0.3 | 0.2% |

RMSE and bias for stem number from regression models in

Method | RMSE (ha^{−1}) | RMSE (% of mean) | Bias (ha^{−1}) | Bias (% of mean) |
---|---|---|---|---|
Regression model in | 410.8 | 55.8% | −13.0 | −1.8% |
Regression model in | 387.4 | 52.7% | −4.3 | −0.6% |
Mean filtered 3 × 3 | 507.8 | 69.0% | −252.9 | −34.4% |
Mean filtered 3 × 3 if _{99} | 506.3 | 68.8% | −241.2 | −32.8% |
Mean filtered 3 × 3 for local _{max} | 466.4 | 63.4% | 65.7 | 8.9% |
Mean filtered 3 × 3 if _{99} | 593.0 | 80.6% | −3.6 | −0.5% |
Mean filtered 3 × 3 for local _{max} | 675.7 | 91.9% | 460.0 | 62.5% |
Mean filtered 5 × 5 | 584.4 | 79.4% | −372.2 | −50.6% |
Mean filtered 5 × 5 if _{99} | 582.3 | 79.2% | −359.5 | −48.9% |
Mean filtered 5 × 5 for local _{max} | 476.2 | 64.7% | −17.6 | −2.4% |
Mean filtered 5 × 5 if _{99} | 634.1 | 86.2% | −92.0 | −12.5% |
Mean filtered 5 × 5 for local _{max} | 662.4 | 90.0% | 426.2 | 57.9% |
Mean filtered 7 × 7 | 633.2 | 86.1% | −432.0 | −58.7% |
Mean filtered 7 × 7 if _{99} | 631.5 | 85.8% | −419.7 | −57.0% |
Mean filtered 7 × 7 for local _{max} | 496.7 | 67.5% | −71.5 | −9.7% |
Mean filtered 7 × 7 if _{99} | 653.2 | 88.8% | −127.8 | −17.4% |
Mean filtered 7 × 7 for local _{max} | 651.0 | 88.5% | 388.8 | 52.9% |

RMSE and bias for stem number from regression models in

Method | RMSE (ha^{−1}) | RMSE (% of mean) | Bias (ha^{−1}) | Bias (% of mean) |
---|---|---|---|---|
Regression model in | 420.2 | 53.4% | −14.4 | −1.8% |
Regression model in | 393.9 | 50.0% | −4.5 | −0.6% |
Mean filtered 3 × 3 | 518.2 | 65.8% | −293.3 | −37.3% |
Mean filtered 3 × 3 if _{99} | 516.6 | 65.6% | −280.7 | −35.7% |
Mean filtered 3 × 3 for local _{max} | 447.3 | 56.8% | 26.7 | 3.4% |
Mean filtered 3 × 3 if _{99} | 580.8 | 73.8% | −49.5 | −6.3% |
Mean filtered 3 × 3 for local _{max} | 649.0 | 82.4% | 431.9 | 54.9% |
Mean filtered 5 × 5 | 604.6 | 76.8% | −411.6 | −52.3% |
Mean filtered 5 × 5 if _{99} | 602.4 | 76.5% | −397.9 | −50.5% |
Mean filtered 5 × 5 for local _{max} | 464.4 | 59.0% | −58.6 | −7.4% |
Mean filtered 5 × 5 if _{99} | 630.1 | 80.0% | −138.6 | −17.6% |
Mean filtered 5 × 5 for local _{max} | 635.4 | 80.7% | 396.1 | 50.3% |
Mean filtered 7 × 7 | 656.3 | 83.4% | −468.8 | −59.5% |
Mean filtered 7 × 7 if _{99} | 654.5 | 83.1% | −455.4 | −57.8% |
Mean filtered 7 × 7 for local _{max} | 491.5 | 62.4% | −113.3 | −14.4% |
Mean filtered 7 × 7 if _{99} | 652.0 | 82.8% | −173.7 | −22.1% |
Mean filtered 7 × 7 for local _{max} | 622.7 | 79.1% | 356.1 | 45.2% |

RMSE and bias for basal area from regression models in

Method | RMSE (m^{2}·ha^{−1}) | RMSE (% of mean) | Bias (m^{2}·ha^{−1}) | Bias (% of mean) |
---|---|---|---|---|
Regression model in | 6.7 | 23.2% | 0.0 | 0.0% |
Regression model in | 6.2 | 21.5% | 0.0 | −0.1% |
DBH local maxima, mean filtered 3 × 3 | 37.6 | 130.1% | 14.5 | 50.3% |
DBH local maxima, mean filtered 3 × 3 if _{99} | 37.6 | 130.0% | 14.7 | 50.7% |
DBH local maxima, mean filtered 3 × 3 for local _{max} | 40.7 | 140.7% | 19.9 | 68.7% |
DBH local maxima, mean filtered 3 × 3 if _{99} | 39.1 | 135.0% | 21.0 | 72.8% |
DBH local maxima, mean filtered 3 × 3 for local _{max} | 50.8 | 175.5% | 38.7 | 133.9% |
DBH local maxima, mean filtered 5 × 5 | 21.6 | 74.8% | 0.4 | 1.3% |
DBH local maxima, mean filtered 5 × 5 if _{99} | 21.6 | 74.6% | 0.5 | 1.8% |
DBH local maxima, mean filtered 5 × 5 for local _{max} | 23.8 | 82.1% | 6.4 | 22.0% |
DBH local maxima, mean filtered 5 × 5 if _{99} | 23.5 | 81.3% | 8.1 | 27.9% |
DBH local maxima, mean filtered 5 × 5 for local _{max} | 34.4 | 119.1% | 27.9 | 96.5% |
DBH local maxima, mean filtered 7 × 7 | 16.6 | 57.2% | −7.8 | −27.1% |
DBH local maxima, mean filtered 7 × 7 if _{99} | 16.5 | 57.0% | −7.7 | −26.5% |
DBH local maxima, mean filtered 7 × 7 for local _{max} | 17.4 | 60.0% | −1.7 | −5.8% |
DBH local maxima, mean filtered 7 × 7 if _{99} | 18.3 | 63.3% | 0.7 | 2.4% |
DBH local maxima, mean filtered 7 × 7 for local _{max} | 26.2 | 90.7% | 20.7 | 71.5% |

RMSE and bias for basal area from regression model in

Method | RMSE (m^{2}·ha^{−1}) | RMSE (% of mean) | Bias (m^{2}·ha^{−1}) | Bias (% of mean) |
---|---|---|---|---|
Regression model in | 6.5 | 22.1% | 0.0 | 0.1% |
Regression model in | 6.1 | 20.8% | 0.0 | −0.1% |
DBH local maxima, mean filtered 3 × 3 | 39.0 | 132.0% | 16.1 | 54.4% |
DBH local maxima, mean filtered 3 × 3 if _{99} | 39.0 | 131.9% | 16.2 | 54.8% |
DBH local maxima, mean filtered 3 × 3 for local _{max} | 42.2 | 142.9% | 21.6 | 73.1% |
DBH local maxima, mean filtered 3 × 3 if _{99} | 40.6 | 137.2% | 22.6 | 76.3% |
DBH local maxima, mean filtered 3 × 3 for local _{max} | 52.6 | 177.8% | 41.1 | 138.9% |
DBH local maxima, mean filtered 5 × 5 | 22.3 | 75.3% | 1.2 | 4.0% |
DBH local maxima, mean filtered 5 × 5 if _{99} | 22.2 | 75.2% | 1.3 | 4.5% |
DBH local maxima, mean filtered 5 × 5 for local _{max} | 24.6 | 83.2% | 7.4 | 24.9% |
DBH local maxima, mean filtered 5 × 5 if _{99} | 24.3 | 82.3% | 8.9 | 30.1% |
DBH local maxima, mean filtered 5 × 5 for local _{max} | 35.6 | 120.5% | 29.5 | 99.9% |
DBH local maxima, mean filtered 7 × 7 | 16.8 | 56.7% | −7.4 | −25.1% |
DBH local maxima, mean filtered 7 × 7 if _{99} | 16.7 | 56.4% | −7.3 | −24.6% |
DBH local maxima, mean filtered 7 × 7 for local _{max} | 17.8 | 60.4% | −1.2 | −4.0% |
DBH local maxima, mean filtered 7 × 7 if _{99} | 18.8 | 63.6% | 1.1 | 3.7% |
DBH local maxima, mean filtered 7 × 7 for local _{max} | 27.0 | 91.5% | 21.7 | 73.5% |