Estimation of tree lists from airborne laser scanning by combining single-tree and area-based methods

Individual tree crown segmentation from airborne laser scanning (ALS) data often fails to detect all trees depending on the forest structure. This paper presents a new method to produce tree lists consistent with unbiased estimates at area level. First, a tree list with height and diameter at breast height (DBH) was estimated from individual tree crown segmentation. Second, estimates at plot level were used to create a target distribution by using a k-nearest neighbour (k-NN) approach. The number of trees per field plot was rescaled with the estimated stem volume for the field plot. Finally, the initial tree list was calibrated using the estimated target distribution. The calibration improved the estimates of the distributions of tree height (error index (EI) from 109 to 96) and DBH (EI from 99 to 93) in the tree list. Thus, the new method could be used to estimate tree lists that are consistent with unbiased estimates from regression models at field plot level.


Introduction
Estimation of the size and position of every tree in a forest has become a realistic option with the commercial development of high-frequency airborne laser scanners and high-resolution digital aerial cameras. Many modern systems for forest management planning require forest information at the individual tree level (Söderberg and Ledermann 2003, Backéus et al. 2005, Kärkkäinen et al. 2008) or, at the very least, information about the diameter at breast height (DBH) distribution. For the purpose of forest resource planning, it is also essential that the estimates are unbiased.
Data from airborne laser scanning (ALS) can give information about the height and density of the tree canopy. Recent developments in ALS technology have increased the pulse repetition frequency, making high spatial coverage possible without decreased ground sampling rate. There are already commercial projects in Sweden where high ground sampling rate (≥ 10 emitted pulses m⁻²) ALS data have been retrieved for large areas (≥ 100 000 ha) to produce estimates of forest variables for the forest industry. The high ground sampling rate ALS data can be used to measure tree canopy height and shape. By creating a canopy height model (CHM) from the ALS data and using an algorithm that identifies maxima in the canopy heights and divides the tree canopy into tree crown segments based on the height, it is possible to segment individual tree crowns (Brandtberg 1999, Hyyppä et al. 2001, Persson et al. 2002, Holmgren and Wallerman 2006, Solberg et al. 2006). Field data from the laser scanned area are needed to create statistical models to estimate variables relevant to [...]
When deriving stand properties from ALS data and field plots, the remotely sensed data and field data need to be accurately co-registered to reduce errors (Gobakken and Naesset 2008), preferably with submetre accuracy. However, field data with such accurate positions are not always available. There are a number of techniques to automatically correct poorly co-registered datasets: Dorigo et al. (2010) found 67% of the true field plot positions by searching for the best match between an ALS-derived tree crown height model and field plot tree positions and heights; Korpela et al. (2007) suggested a method that combines photogrammetric observations of tree tops and field triangulation; and Flewelling (2008) used a computer-assisted system and Voronoi tessellations to associate trees with individual tree crown segments.
A method to cross-correlate tree position images for the co-registration of field and remotely sensed data has also been proposed by Olofsson et al. (2008). This technique requires input data from a remote sensing individual tree detection method (e.g. Erikson and Olofsson 2005).
The aim of this study was to develop and validate a method to produce a list of individual trees, with each tree having estimated variables such as tree height, DBH and stem volume. The most important requirement is that the estimated tree list should produce unbiased estimates if estimated tree variables, such as stem volume, are aggregated over an area, for example a forest stand. The developed method combines analysis at the individual tree level and the field plot level. The idea is to use the information about most individual trees that can be measured directly in the high ground sampling rate ALS data and calibrate these results with estimates from area-based methods at plot level with high accuracy and low bias.

Study area
The study area is 1989 ha and located in the north of Sweden (latitude 64°25′ N, longitude 14°50′ E) with elevation ranging from 325 to 658 m asl, which means that the site is located close to the local limit for productive forest. The dominating tree species are Norway spruce (Picea abies), birch (Betula spp.) and Scots pine (Pinus sylvestris).

ALS data
The ALS data acquisition was performed on 7 and 8 August 2007 using a Leica ALS50-II system carried by a helicopter. The flying altitude was 600 m above ground level and the scan angle ±16°, resulting in a scan width of 375 m and a scan density of about 10 emitted pulses m⁻². The first and last returns for each pulse were recorded and used in the further analysis. Laser returns were classified as ground or nonground using a progressive triangular irregular network (TIN) densification method (Axelsson 1999, 2000) in TerraScan software (Soininen 2004), and the ground returns were used to derive a digital elevation model (DEM). Laser canopy height (LCH) was derived by subtracting the relevant DEM value from each laser return.

Field data
The area was divided into five strata using an existing forest map with associated stand register, and a total of 179 field plots were allocated (table 1). Different field plot radii and minimum DBH values were selected for cost-efficiency because of the large number of stems in young forest compared to old forest. The positions of the field plots were measured using a Global Navigation Satellite System (GNSS). The Forest Management Planning Package (Jonsson et al. 1993) was used to measure the trees in the field plots. Within the field plots, all trees with a DBH greater than the minimum DBH were measured using a calliper, and the tree species were recorded. The height was also measured for a subsample of the trees selected randomly with probability proportional to size (PPS) sampling proportional to basal area. The positions of the trees were registered relative to the centre of each field plot by measuring azimuth and distance with a compass and ultrasonic device, respectively. No position was registered for trees with a large inclination, that is with a horizontal distance greater than 3 m between the tree top and the position at breast height (3% of the total number of trees). The tree positions were clustered and the canopy had larger gaps than most managed forests in Scandinavia. The mean stem density was 1147 stems ha⁻¹ and the mean stem volume was 96 m³ ha⁻¹.

Methods
The method developed in this study is outlined in figure 1. The first step is estimation of individual trees. This includes automatic segmentation of individual tree crowns from ALS data and field plot matching to co-register the coordinates. The number of trees per segment is estimated and their height and DBH are estimated from regression models. The second step is estimation at the field plot level, which includes estimation of stem volume from regression, and estimation of percentiles of the tree height and DBH distributions by using a k-nearest neighbour (k-NN) approach. The last step is calibration of the tree list from the first step using a target distribution of tree height and DBH estimated at the field plot level from the second step. This includes removal of excess trees and addition of missing trees.

Estimation at individual tree level
3.1.1 Individual tree crown segmentation. Tree crowns were segmented automatically based on geometric tree crown models (Holmgren and Wallerman 2006). The segmentation process included several steps with different rasters, all with a cell size of 0.25 m × 0.25 m. A CHM was created from LCH data by setting the raster cell value to the maximum LCH value within each raster cell if this value was above a 2 m height threshold, otherwise the cell was given a zero value. The height threshold was used to avoid the influence of rocks and low vegetation. Raster cells containing no laser returns were set to zero. However, some of these raster cells could be in a tree crown. Such zero-values within crowns were regarded as noise and were therefore filled using a morphological closing operation. This involved several steps. First, a binary canopy raster was derived from the CHM, in which a raster cell was set to one if the corresponding CHM raster cell had a value greater than zero, or was left as a zero-value otherwise. This binary canopy raster should define areas containing crowns. To remove zero-values of canopy raster cells located within crowns, closing was done with a structure element of 3 × 3 cells. The next step was to update the CHM based on the binary canopy raster in order to set non-zero values of the CHM at crown locations. Zero-values of the CHM were updated if the raster cell was at a tree crown location according to the binary canopy raster (value = one). Neighbour cells with values greater than zero within the smallest window needed to cover at least one value greater than zero were used to update the CHM raster cell. If only one non-zero value was found within the window, that value was assigned, otherwise the mean was used.
[Figure panel 'Estimation at plot level': The result is estimates of stem density, stem volume, and percentiles of DBH and tree height for each plot.]
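The CHM construction and gap filling described above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the row/column convention, and the clipping of the 3 × 3 window at the raster edge are assumptions made for illustration, and the square search window in `fill_chm` is one possible reading of "the smallest window".

```python
def build_chm(returns, n_rows, n_cols, cell=0.25, min_height=2.0):
    """Keep the maximum LCH per cell; cells with no return above the threshold stay zero."""
    chm = [[0.0] * n_cols for _ in range(n_rows)]
    for x, y, lch in returns:
        r, c = int(y // cell), int(x // cell)
        if 0 <= r < n_rows and 0 <= c < n_cols and lch > min_height:
            chm[r][c] = max(chm[r][c], lch)
    return chm


def _filter_3x3(binary, op):
    """Apply op (max = dilation, min = erosion) over each 3 x 3 window, clipped at the edges."""
    n_rows, n_cols = len(binary), len(binary[0])
    return [[op(binary[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols)))
             for c in range(n_cols)]
            for r in range(n_rows)]


def close_3x3(binary):
    """Morphological closing: dilation followed by erosion."""
    return _filter_3x3(_filter_3x3(binary, max), min)


def fill_chm(chm, closed):
    """Fill zero CHM cells inside crowns from the nearest non-zero neighbours."""
    n_rows, n_cols = len(chm), len(chm[0])
    filled = [row[:] for row in chm]
    for r in range(n_rows):
        for c in range(n_cols):
            if chm[r][c] > 0 or not closed[r][c]:
                continue
            # Grow a square window until it covers at least one non-zero value.
            for half in range(1, max(n_rows, n_cols)):
                vals = [chm[rr][cc]
                        for rr in range(max(r - half, 0), min(r + half + 1, n_rows))
                        for cc in range(max(c - half, 0), min(c + half + 1, n_cols))
                        if chm[rr][cc] > 0]
                if vals:
                    # The mean of a single value is that value itself.
                    filled[r][c] = sum(vals) / len(vals)
                    break
    return filled
```

The binary canopy raster would be derived from the CHM as `[[1 if v > 0 else 0 for v in row] for row in chm]` before calling `close_3x3`.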
A correlation raster was created based on both the CHM raster and LCH data. The correlation between geometric models of tree crowns and LCH values was calculated at each raster cell location of the correlation raster using generalized ellipsoids of revolution (GER). Pollock (1996) introduced the use of GER-based models of tree crowns for automatic detection of individual trees in aerial images. At each raster cell location, the cell value of the correlation raster was set to the maximum correlation from tests with several GER models. A correlation value was calculated by placing the centre of the GER at the cell location, setting the height (i.e. the vertical radius of the ellipsoid) to the value of the CHM at the cell location, and testing a horizontal radius. The tested horizontal radii of the GER were 0.5, 0.7, ..., 4.0 m. To reduce the number of false trees, a correlation value was only calculated for a GER model with a radius to height ratio ≥ 0.1 because, based on field experience, a smaller ratio would probably model an individual branch and not a whole tree. In addition, to avoid overfitting, correlation values were only calculated for GER models with at least 25 laser returns within the radius.
The correlation raster was smoothed and used for segmentation. A starting point, a seed, was placed at each raster cell with a CHM value above a 2 m height threshold and with a positive correlation value. For each seed, the current location was updated by changing the position to the position of the neighbour cell (eight neighbours) with the highest value. This was repeated until the position could not be updated because a local maximum was reached. The seeds with their final location at the same local maximum defined a tree crown segment. A raster cell that had not been included in a segment but was enclosed by raster cells from the same segment was assigned to the surrounding segment.
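The seed-climbing step can be sketched as follows. This is a minimal illustration, not the authors' code: `segment_by_hill_climbing` and its arguments are hypothetical names, the smoothing of the correlation raster and the final assignment of enclosed cells are left out, and ties between equal neighbours are resolved by staying in place.

```python
def segment_by_hill_climbing(corr, chm, min_height=2.0):
    """Label each seed cell by the local maximum of corr that it climbs to (0 = no segment)."""
    n_rows, n_cols = len(corr), len(corr[0])

    def climb(r, c):
        while True:
            best = (corr[r][c], r, c)
            # Look for a strictly higher value among the eight neighbours.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols and corr[rr][cc] > best[0]:
                        best = (corr[rr][cc], rr, cc)
            if (best[1], best[2]) == (r, c):
                return r, c  # local maximum reached
            r, c = best[1], best[2]

    labels = {}
    segments = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            # Seeds need a CHM value above the threshold and a positive correlation.
            if chm[r][c] > min_height and corr[r][c] > 0:
                peak = climb(r, c)
                segments[r][c] = labels.setdefault(peak, len(labels) + 1)
    return segments
```

Because each move goes to a strictly higher value, every climb terminates.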
The result of the segmentation was tree crown segments, each including an individual tree or a group of trees. The maximum LCH value (H) within the segment was used to estimate the tree height. The area of a segment (A) was derived by counting the number of raster cells (0.25 m × 0.25 m) of a segment and the width (W) of a segment was then derived assuming that a tree crown was circular.
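As a small worked example of the segment measures, the area follows from the cell count and, under the stated circular-crown assumption, the width is the diameter of a circle with that area. The function name below is hypothetical.

```python
import math


def segment_area_and_width(n_cells, cell=0.25):
    """Return segment area A (m^2) and the width W (m) of a circle with the same area."""
    area = n_cells * cell * cell
    width = 2.0 * math.sqrt(area / math.pi)
    return area, width
```

For example, a segment of 16 cells covers 1 m² and has a width of roughly 1.13 m.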
3.1.2 Field plot matching. The field plot-measured tree positions had a precision of < 1 m but lower accuracy. This was because tree positions relative to the plot centre were measured with approximately decimetre accuracy but GNSS measurements of the plot centre could have a bias of several metres depending on the satellite configuration and poor receiving conditions of the satellite signals due to tree crowns. Therefore, the two coordinate systems, that is the local tree positions from the field inventory and the global coordinate system of the ALS data, needed to be co-registered before the field plot trees could be linked to the trees detected from ALS. To achieve this, the method by Olofsson et al. (2008) was used. In the search process, the estimated field plot centre was displaced up to ±20 m in the north-south direction and up to ±20 m in the east-west direction, giving a search area estimated from the expected GNSS error to be large enough to contain the real field plot centre.
To compensate for possible compass errors, the field plot image was rotated 2° between each calculation (with minimum/maximum field plot rotation end values of ±16°). The standard deviation of the tree displacement precision was set to 0.75 m based on the precision of the field plot-measured tree positions. The size of the raster cell used in the field plot matching was set to 0.25 m × 0.25 m based on the stem density.
3.1.3 Estimation of number of trees per segment. Individual tree crown segmentation often produces segments that, in reality, contain several trees, which causes underestimation of the stem density (Persson et al. 2002). However, the shape and size of each tree crown segment could contain information about the actual number of trees, where, for instance, large and elongated segments probably contain several trees. To obtain an estimate of the correct stem density, a training dataset was used where the number of field-measured trees for each tree crown segment was known. Segments close to a field plot boundary usually cover ground outside the field plot and it is not known how many trees they contain. The models for individual trees were therefore based only on segments where the centre was located inside a field plot and at least 2 m from the boundary, from now on referred to as reference segments (figure 2). The estimation of variables at the individual tree level did, however, include all segments where the centre was located inside a field plot. We excluded trees with a large inclination from the field data when estimating the models at the individual tree level, but included them when estimating the models at the field plot level and validating the estimated results.
Figure 2. Example of polygons from segmentation of ALS data and field-measured trees shown as circles with radius proportional to DBH. The outer large circle represents the field plot and the inner large circle represents the area where the centres of the reference segments are located.
As a first step, mean values for the number of trees per segment were estimated for reference segments in the training dataset based on the variables $W_j$ and $W_j/W_{s,i}$, for each segment $j$, where $W_{s,i}$ is the mean of segment widths within the same field plot $i$ (table 2). $W_j$ and $W_j/W_{s,i}$ were selected because they showed the highest correlation with the number of trees per segment of all available measures. The variables were divided into eight intervals of equal size. The number of intervals was selected based on the data to have a sufficient number of trees in each interval. Each reference segment was placed in the interval determined by the value of $W_j$ and $W_j/W_{s,i}$ for that segment, resulting in 64 intervals. For each interval, the mean number of field-measured trees was calculated. As a second step, the number of trees within segments was predicted for all segments. To estimate the number of trees inside a segment, it was first placed in an interval determined by $W_j$ and $W_j/W_{s,i}$ for that segment. The number of trees $m_j$ inside the segment was estimated as the mean number of field-measured trees for segments in the same interval of $W_j$ and $W_j/W_{s,i}$ in the training dataset.
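The two-step lookup described above can be sketched as a table of class means over an 8 × 8 grid. The sketch assumes that "intervals of equal size" means equal-width intervals between the smallest and largest training value, and that an interval with no training segments falls back to one tree; both choices, and all function names, are illustrative assumptions rather than details given in the text.

```python
def make_edges(values, n_bins=8):
    """Interior edges of n_bins equal-width intervals spanning the training values."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + k * step for k in range(1, n_bins)]


def bin_index(value, edges):
    """Index of the interval containing value (values outside the range are clamped)."""
    return sum(value >= e for e in edges)


def fit_tree_count_table(widths, rel_widths, tree_counts, n_bins=8):
    """Mean number of field-measured trees per (W_j, W_j / W_s,i) interval."""
    w_edges = make_edges(widths, n_bins)
    rw_edges = make_edges(rel_widths, n_bins)
    sums = {}
    for w, rw, n in zip(widths, rel_widths, tree_counts):
        key = (bin_index(w, w_edges), bin_index(rw, rw_edges))
        total, count = sums.get(key, (0, 0))
        sums[key] = (total + n, count + 1)
    means = {key: total / count for key, (total, count) in sums.items()}
    return w_edges, rw_edges, means


def predict_tree_count(w, rw, w_edges, rw_edges, means, default=1.0):
    """Estimated number of trees m_j for a segment; default is used for empty intervals."""
    key = (bin_index(w, w_edges), bin_index(rw, rw_edges))
    return means.get(key, default)
```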
3.1.4 Estimation of tree variables. Regression models for the field-measured variables 4-11 in table 2 were estimated using the reference segments. The regression was done separately for segments containing different numbers of field-measured trees. For each segment, the variables 4-11 in table 2 were estimated as the weighted mean value of the regression result for segments containing different numbers of field-measured trees. The result was an estimate of variables for the largest tree in each segment (i.e. 4, 6, 8 and 10 in table 2) and an estimate of variables for the rest of the trees in the segment (i.e. 5, 7, 9 and 11 in table 2). The latter estimate $\hat{A}_j$ was divided by the estimated number of trees minus one to obtain an estimate ($\hat{A}'_j$) for the average tree:
$$\hat{A}'_j = \frac{\hat{A}_j}{m_j - 1}$$
If the result for the DBH was below the minimum value for the field-measured DBH, $m_j$ was reduced iteratively by one until the resulting DBH was above the minimum value. If the DBH was below the minimum value for the field-measured DBH even for one tree, the estimate was discarded. The estimates for the largest tree and the rest of the trees in all segments formed a list of tree candidates.
Table 2 (rows 6 to 11).
6  $H_{max}$  Maximum field-measured tree height found within a segment
7  $H_{rest}$  Mean of field-measured tree heights for the rest of the trees within a segment
8  $B_{max}$  Maximum field-measured basal area found within a segment
9  $B_{rest}$  Sum of field-measured basal areas for the rest of the trees within a segment
10  $V_{max}$  Maximum field-measured stem volume found within a segment
11  $V_{rest}$  Sum of field-measured stem volumes for the rest of the trees within a segment
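The average-tree step and the iterative reduction of $m_j$ might look like the sketch below. Several details are assumptions made only for illustration: the DBH is derived from the summed basal area of the remaining trees ($B_{rest}$, in m²) via the area of a circle, the estimated $m_j$ is rounded to an integer, and the function name and return convention are hypothetical.

```python
import math


def average_rest_tree_dbh(b_rest_sum, m_est, min_dbh):
    """Return (number of remaining trees, DBH in m of their average tree), or None if discarded."""
    n_rest = int(round(m_est)) - 1  # trees in the segment besides the largest one
    while n_rest >= 1:
        mean_basal_area = b_rest_sum / n_rest
        dbh = 2.0 * math.sqrt(mean_basal_area / math.pi)
        if dbh > min_dbh:
            return n_rest, dbh
        n_rest -= 1  # too thin: assume fewer trees share the basal area
    return None  # below the minimum DBH even for a single remaining tree
```

With a summed basal area of 0.02 m², four estimated trees and a 0.10 m minimum DBH, three remaining trees would be too thin (about 0.092 m each), so the sketch settles on two trees of about 0.113 m.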

Estimation at field plot level
3.2.1 Estimation of forest variables in field plots. The mean of LCH, the standard deviation of LCH, percentiles of LCH and the vegetation ratio were derived based on the LCH distribution by using vegetation returns at the field plot level. The vegetation ratio is defined as the number of vegetation returns divided by the total number of laser returns in a field plot. To exclude laser returns from below the tree canopy (e.g. shrubs and rocks), vegetation returns were defined as laser returns with a vertical distance to the DEM > 1 m and > 10% of the maximum height within the field plot (e.g. Naesset 2002). A regression model was used to estimate the stem volume per hectare ($V_r$) for each field plot $i$. To investigate the contribution from the available variables, different combinations were tested with the best subset regression, and a regression model (equation (2)) was selected in which $p_{30}$ is the LCH 30th percentile, $p_{95}$ the LCH 95th percentile, and $R_v$ the vegetation ratio. All variables had a p-value ≤ 0.06 in the model and the variance inflation factor (VIF) was 1.207-4.183.
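The plot-level LCH metrics can be illustrated with a short sketch. The percentile definition is not given in the text, so a nearest-rank percentile is assumed here; the function name and the dictionary keys are likewise hypothetical, and at least two vegetation returns are needed for the standard deviation.

```python
import math
import statistics


def plot_metrics(lch_values, percentiles=(30, 90, 95)):
    """LCH metrics for one field plot from the canopy heights of all its laser returns."""
    # Vegetation returns: more than 1 m and more than 10% of the maximum height.
    threshold = max(1.0, 0.10 * max(lch_values))
    veg = sorted(h for h in lch_values if h > threshold)
    metrics = {
        "veg_ratio": len(veg) / len(lch_values),
        "mean": statistics.mean(veg),
        "std": statistics.stdev(veg),
    }
    for p in percentiles:
        rank = max(1, math.ceil(p / 100 * len(veg)))  # nearest-rank percentile (assumption)
        metrics[f"p{p}"] = veg[rank - 1]
    return metrics
```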
The stem density (number of stems per hectare, $d_r$) was estimated for each field plot $i$ from a regression model (equation (3)) in which $p_{90}$ is the LCH 90th percentile. All variables had a p-value ≤ 0.001 in the model and the VIF was < 1.083.
The distributions of tree heights, DBH and basal area have been estimated previously from ALS data. The Weibull distribution has been commonly used to describe tree size-related distributions such as DBH and basal area (Naesset 2004, Maltamo et al. 2007). Some results (Breidenbach et al. 2008) indicate that distribution methods may fail in strata with large gaps in the canopy. Distributions of DBH have also been estimated using percentiles with non-parametric methods (Maltamo et al. 2006, Bollandsås and Naesset 2007). A comparison of a Weibull DBH distribution and DBH percentiles showed no significant difference between the two validated methods. Maltamo et al. (2006) compared estimated DBH percentiles and a Weibull DBH distribution. Of these two, DBH percentiles gave the best result for estimation of stem volume. Studies have also shown that the parametric distribution methods are not suitable for estimation of multilayered forest (Kangas and Maltamo 2000). For the present study, we used tree height and DBH percentiles to estimate the target distributions because theoretical distributions were not suitable for use with the small field plots.
For each field plot, the DBH percentiles and tree height percentiles were estimated from seemingly unrelated regression (SUR) models (Zellner 1962) (table 3), where $z_{avg}$ is the mean of LCH, $z_{stdh}$ is the standard deviation of LCH, and the other symbols are as defined earlier.

Estimation of distributions by using a k-NN approach.
To estimate the final tree height and DBH distributions, we used a k-NN approach. The estimates of the stem volume, stem density and tree height and DBH percentiles (equations (2) and (3) and table 3) were used to identify field plots with similar tree height and DBH distributions with the k-NN approach. The 10 field plots with the smallest sums of squared differences between the values were selected as the nearest-neighbour field plots.
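The neighbour selection can be sketched as a ranking by the sum of squared differences between estimated values. Whether and how the values were scaled before the comparison is not stated, so the sketch compares them as given; `nearest_plots` and the dictionary layout are hypothetical.

```python
def nearest_plots(target, reference, k=10):
    """Return the ids of the k reference plots closest to the target feature vector.

    target: list of estimated values for the plot of interest.
    reference: dict mapping plot id to a list of the same estimated values.
    """
    def sum_sq_diff(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    ranked = sorted(reference, key=lambda plot_id: sum_sq_diff(target, reference[plot_id]))
    return ranked[:k]
```

In a leave-one-out setting the plot being estimated would be left out of `reference` before the call.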
Tree height percentiles and DBH percentiles were derived from the field-measured trees in the nearest-neighbour field plots. The field-measured trees at the nearest-neighbour plots were placed in a distribution matrix where each row corresponded to a tree height percentile class and each column to a DBH percentile class (table 4).
The number of trees $n_{fij}$ in each matrix cell was multiplied by a scaling factor ($S$):
$$S = \frac{\hat{V}_r}{V_{k\text{-NN}}}$$
where $\hat{V}_r$ is the stem volume per hectare estimated from regression and $V_{k\text{-NN}}$ is the stem volume per hectare from nearest-neighbour field plots. The result was an estimated target distribution matrix (table 4), where each element corresponded to a tree height percentile and DBH percentile, and the value $n_{tij}$ corresponded to the number of trees in the field plot in each combination of percentiles. The aim was to make the sum of the volume of the trees in the target distribution matrix consistent with $\hat{V}_r$ at the field plot level.
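A sketch of the target distribution matrix follows. It assumes that the scaling factor is the ratio of the regression-estimated volume to the nearest-neighbour volume, and that each percentile class is identified by its upper boundary; both are readings of the description rather than confirmed details, and the names are hypothetical.

```python
from bisect import bisect_left


def target_matrix(trees, h_bounds, d_bounds, v_regression, v_knn):
    """Scaled tree counts per (height class, DBH class).

    trees: (height, dbh) pairs from the nearest-neighbour field plots.
    h_bounds, d_bounds: increasing upper class boundaries (percentile values).
    """
    n_h, n_d = len(h_bounds), len(d_bounds)
    counts = [[0.0] * n_d for _ in range(n_h)]
    for height, dbh in trees:
        # First class whose upper boundary is not exceeded; larger values go to the last class.
        i = min(bisect_left(h_bounds, height), n_h - 1)
        j = min(bisect_left(d_bounds, dbh), n_d - 1)
        counts[i][j] += 1
    s = v_regression / v_knn  # scaling factor S (assumed to be this ratio)
    return [[s * n for n in row] for row in counts]
```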

Calibration of tree list from estimation at field plot level
The aim was to predict tree lists with variables that produce unbiased estimates if aggregated over an area. Therefore, the list of individual tree candidates from the segmentation was calibrated using estimates of stem volume and estimates of tree height and DBH distributions at the field plot level. First, we calculated the distribution of the tree candidates from the segmentation by summing the number of tree candidates $n_{stij}$ in each percentile class given by the target distribution (table 4). Tree candidates with a height or DBH greater than the 100th percentile were excluded from the list. For each percentile class, the difference between the target distribution (i.e. the distribution estimated at field plot level by using the k-NN approach) and the candidate distribution was calculated. If the number of tree candidates was too large according to the target distribution, that number of trees was excluded from the list. If the number of tree candidates was too small, that number of trees was added to the list by selecting trees with correct height and DBH for the specific percentile class at random from the list of field-measured trees in the nearest-neighbour field plots. The aim was a list of trees with size distribution and stem volume consistent with the estimates at the field plot level. The results were aggregated to the field plot level, and the procedure was repeated 50 times to study the average accuracy of the estimation.
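The calibration step can be sketched as a per-class comparison of candidate and target counts. The sketch makes several assumptions that the text does not settle: target counts are rounded to integers, excess candidates are removed at random, and added trees are drawn with replacement. The `classify` callback and all names are hypothetical.

```python
import random


def calibrate(candidates, target, pool, classify, rng=random):
    """Return a tree list whose class counts match the target distribution.

    candidates, pool: lists of (height, dbh) trees; pool holds the field-measured
    trees from the nearest-neighbour plots. target: dict class -> number of trees.
    classify: maps a tree to its class, or None above the 100th percentile.
    """
    by_class, pool_by_class = {}, {}
    for tree in candidates:
        cls = classify(tree)
        if cls is not None:  # candidates beyond the 100th percentile are excluded
            by_class.setdefault(cls, []).append(tree)
    for tree in pool:
        pool_by_class.setdefault(classify(tree), []).append(tree)

    result = []
    for cls, n_target in target.items():
        n_target = int(round(n_target))
        have = by_class.get(cls, [])
        if len(have) >= n_target:
            result.extend(rng.sample(have, n_target))  # drop the excess candidates
        else:
            result.extend(have)
            donors = pool_by_class.get(cls, [])
            if donors:  # add missing trees drawn at random from the field data
                result.extend(rng.choices(donors, k=n_target - len(have)))
    return result
```

Passing a seeded `random.Random` instance as `rng` makes one repetition reproducible; repeating the call with different seeds mirrors the repeated runs used to study the average accuracy.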

Validation
All steps in the method were validated by using leave-one-out cross-validation. One field plot at a time was excluded from the dataset, and the rest of the dataset was used as reference data to estimate forest variables for the excluded field plot. This was done for estimates at both the individual tree level and the field plot level. The RMSE and bias for stem volume were calculated as follows:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{V}_i - V_i)^2}{n}}$$
$$\mathrm{bias} = \frac{\sum_{i=1}^{n} (\hat{V}_i - V_i)}{n}$$
where $V_i$ is the observed and $\hat{V}_i$ is the predicted stem volume of field plot $i$ and $n$ is the number of field plots. Corresponding expressions were used to calculate the RMSE and bias for stem density. The error index (EI) for tree height, DBH and basal area in each field plot was also calculated. The EI measures the proportion of mismatches between two histograms based on given class boundaries and is, according to Reynolds et al. (1988), defined as:
$$\mathrm{EI} = 100 \sum_{j=1}^{m} \frac{|I_j - T_j|}{N_T}$$
where $I_j$ is the number of estimated trees in histogram class $j$, $T_j$ is the number of actual trees in class $j$, $m$ is the number of histogram classes, and $N_T$ is the total number of actual trees. In this study, the tree height, DBH and basal area were divided into 10 intervals each.
Table 5 shows the result for the estimation of the number of trees per segment. Most of the segments contained one or two trees. The accuracy was high for segments that contained one or two trees (0.71 and 0.65, respectively) and much lower for segments that contained three trees or more. The estimation had an overall accuracy of 0.60.
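The validation measures described in this section (RMSE, bias and EI) can be written out as a short sketch. The sign convention for the bias (predicted minus observed, so that underestimation is negative) follows the way the results are reported; the exact form of the EI, in particular the factor 100 and the normalisation by the number of actual trees only, is an assumption based on the symbols defined in the text.

```python
import math


def rmse(observed, predicted):
    """Root mean square error over field plots."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def bias(observed, predicted):
    """Mean difference, predicted minus observed (negative = underestimation)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def error_index(estimated_counts, actual_counts):
    """Histogram mismatch in percent of the number of actual trees (assumed form)."""
    mismatch = sum(abs(i - t) for i, t in zip(estimated_counts, actual_counts))
    return 100.0 * mismatch / sum(actual_counts)
```

Relative values, as reported in the tables, would be obtained by dividing the RMSE or bias by the mean of the observed values.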

Estimation of stem volume and stem density
Estimation at the individual tree level only (table 6, methods 1a and 1b) resulted in a large RMSE and a large underestimation for stem density (52%, -35% and 45%, -18%, respectively). Estimation of the number of trees per segment with mean values (table 6, method 1b) resulted in a smaller RMSE and bias both for stem volume (34%, -3%) and stem density (45%, -18%) than estimation with one tree per tree crown segment (36%, -14% and 52%, -35% respectively) (table 6, method 1a). Estimation at the field plot level (table 6, method 2) resulted in very little or no bias for stem volume (0%) and stem density (-3%). However, these are not estimates for individual trees. Calibration of the tree candidate list from the estimation at the field plot level (table 6, methods 3a and 3b) reduced the RMSE and bias for stem density (37%, -3% and 37%, -3% respectively). For stem volume, the results were similar for all methods except for estimation at the individual tree level (with one tree per tree crown segment), which was an underestimation (-14%). Figure 3 shows the field-measured stem volume and the volume estimated from different methods. The estimates are distributed on both sides of the one-to-one line. The estimates are more accurate for small stem volume than for large stem volume.
Table 5. Result for the classification of the number of trees in each segment. The columns correspond to the correct number of trees per segment according to field data and the rows correspond to the estimated number of trees per segment.
Table 6. RMSE and bias for stem volume and stem density estimates, cross-validation results aggregated over field plots, relative to the mean values at the field plot level using the methods 1a, 1b, 2, 3a and 3b. 1a, estimation at the individual tree level; 1b, estimation at the individual tree level including estimation of the number of trees per segment; 2, estimation at the field plot level; 3a, calibration of tree candidate list from results at the field plot level; 3b, calibration of tree candidate list including estimation of the number of trees per segment from results at the field plot level.

Estimation of tree-size distributions
The EI was used to validate if the combined method improved the estimation of the tree-size distribution compared to estimates at the individual tree level. The EI, which measures the proportion of mismatches between two histograms, decreased after the calibration of the tree candidate list with the estimation at the field plot level. This was observed for the distributions of tree heights (109 to 96), DBH (99 to 93) and basal area (92 to 89), although the difference was most obvious for the distributions of tree heights and DBH (table 7). The error index was considerably lower for estimation of distributions at the field plot level (82, 76 and 69, respectively).

Discussion
The use of individual tree-based methods based on ALS data is limited without including methods that also model the non-detected trees. The non-detected trees cause biased estimates if the individual tree estimates are simply aggregated to field plots or forest stands. This study has introduced a method to produce a list of trees (not just the trees that can be detected by finding local maxima in ALS data) using a combination of analysis at individual tree level and plot level. The method used individual tree detection followed by estimation of the number of trees per segment.
In the next step, target distributions of tree heights and DBH at plot level were estimated by using a k-NN approach and rescaling with the plot-wise volume estimates. These target distributions were used to compensate for missing trees in the original individual tree detection. As expected, the individual tree detection underestimated the stem density (Persson et al. 2002). For this study area, clustered trees made the tree detection task more difficult compared to the more evenly distributed trees in most managed forests in Scandinavia. However, the estimation of the number of trees per segment using variables derived from ALS data reduced the negative bias for stem density estimates from -35% to -18%. For stem volume, the negative bias of -14% was replaced by a weak positive bias by using this method. One reason why the bias was not completely eliminated may be that individual tree detection underestimates the number of trees because a segment cannot be defined if there are too few laser returns within a tree crown. This usually occurs for small tree crowns.
The area-based method resulted in an RMSE for stem volume at the plot level comparable to previous studies in spruce-dominated forest although the relative RMSE was larger due to a smaller mean stem volume (Naesset 2002, Maltamo et al. 2006). One error source may be that the field plot size was small. Trees standing close to a field plot boundary may have a large part of their branches on the other side. The proportion of deciduous forest was high compared to most boreal forests in Scandinavia, approximately 30% in strata 2 and 3, which may also degrade the accuracy. For other studies, stratification of data and the use of separate regression models for different strata have improved the accuracy. The strata definitions are based on, for instance, tree species composition and the strata are mapped by the interpretation of aerial photographs. For this study area, it would have been difficult to use separate regression models based on tree species because many field plots contained a mixture of tree species. However, the aim of this study was to compare the different methods and their results relative to each other using the same dataset.
We used a new technique to first estimate individual trees in whole field plots by using a k-NN approach and then multiply the number of stems with a scaling factor derived from stem volume estimates. This technique efficiently eliminated underestimations of stem volume and stem density. In several other studies, k-MSN approaches have been used to estimate forest variables with ALS data as input. One advantage of these techniques is that the covariance structure is reasonably maintained. For instance, Packalen and [...] predicted species-specific stand variables by using the k-MSN method without producing unrealistic combinations of estimates. For the present study, the k-NN approach was used to estimate distributions at field plot level because better estimates were achieved with this compared to regression estimates of percentiles and stem density. The elimination of the bias for stem density estimates after estimation of tree lists by using a k-NN approach and rescaling with the stem volume indicates that the applied method is useful to achieve accurate tree lists. Thus, there is no bias for stem volume and stem density estimates and the covariance structure is realistic. The estimation of tree height and DBH percentiles at the field plot level resulted in a much lower EI than did estimation at the individual tree level. After calibration of the tree list from the individual tree detection, the EI and RMSE increased as compared to the estimates at field plot level that defined the target distribution. One reason for this is that there are few trees in each field plot, and a small error in the estimated tree height or DBH will change the histogram. The target distribution includes more trees, and small errors will not change the histogram. Previous studies have reported an EI for DBH distributions of 30-43% in large field plots (3121-4219 m²) and 55-78% in smaller field plots (300-1000 m²) (Naesset 2005, Bollandsås and Naesset 2007).

Conclusions
In this study, the bias for the estimates of stem volume and stem density from individual tree detection was reduced by using estimation of the number of trees per segment. The best estimates of distributions of tree height and DBH were obtained by using a k-NN approach with estimated percentiles at field plot level as reference variables. The stem volume of the trees in the calibrated tree list aggregated to field plot level had an estimation error (RMSE and bias) that was similar to the direct regression estimates of stem volume.
The estimates of stem volume at field plot level were based on variables derived from the LCH. One problem with analysis at the field plot level is that the variables are derived from a population of trees and the accuracy of the estimates of these variables depends on the size of the field plots. For small field plots, tree crown segmentation is more flexible. To include variables derived from the tree crown segments, such as mean values, might improve the regression models. A mixture of tree species within field plots is also a motivation for tree crown segmentation because tree species information may be derived from the tree crown segments (Brandtberg et al. 2003). The results for the calibrated tree list rely on the estimates from the area-based method, and better estimates would probably improve the end-result. To validate the method, it should be applied to another study area with managed boreal forest. To further improve the results, other input variables for the k-NN approach could be used. The selection method in this study was based on the Euclidean distance between field plot level estimates of tree height and DBH percentiles. A better option may be to use optimized weights for the percentiles of LCH. The EI was lower for the tree lists calibrated with results from the area-based method. Individual tree detection works best for larger trees and the area-based method probably adds most information for smaller trees. It may be possible to improve this by deriving larger trees from individual tree detection and calibrating the distribution according to results from the area-based method for smaller trees. This is in line with the method used by Maltamo et al. (2004).