Assessing a Template Matching Approach for Tree Height and Position Extraction from Lidar-Derived Canopy Height Models of Pinus Pinaster Stands

In this paper, an assessment of a method using a correlation filter over a lidar-derived digital canopy height model (CHM) is presented. The objective of the procedure is to obtain stem density, position, and height values, on a stand with the following characteristics: ellipsoidal canopy shape (Pinus pinaster), even-aged and single-layer structure. The process consists of three steps: extracting a correlation map from CHM by applying a template whose size and shape resembles the canopy to be detected, applying a threshold mask to the correlation map to keep a subset of candidate-pixels, and then applying a local maximum filter to the remaining pixel groups. The method performs satisfactorily considering the experimental conditions. The mean tree extraction percentage is 65% with a coefficient of agreement of 0.4. The mean absolute error of height is ~0.5 m for all plots except one. It can be considered a valid approach for extracting tree density and height in regularly spaced stands (i.e., poplar plantations) which are fundamental for extracting related forest parameters such as volume and biomass.


Introduction
Forestry density and height distribution are among the most important parameters used for forest assessment and planning.Obtaining this data over large areas would require extensive (and expensive) OPEN ACCESS ground-truth sampling.Remote sensing techniques are a valid alternative if correctly applied [1].Sample plots can then be used as support for validating models used for extracting such parameters.In recent years, lidar has proven to be a valid technology for forestry applications, especially for accurately measuring height distributions, calculating volumes using regression, canopy modeling and stem position extraction.With the increase in use of information systems and modeling software, it is also interesting to use lidar-extracted information as input to 3D models of canopy structure [2].
Discrete return lidar data consists of three dimensional points with ordinal return number and intensity information.Data are measured using an emitted laser pulse which is reflected a number of times from the different intercepted surfaces until the last return, usually reflected from the ground.Airborne-carried sensors are ideal for covering large forest areas with a pre-determined point density which can be set, depending on requested spatial accuracy and flight height, by changing the laser pulse frequency, scan rate and flight height.From calculation of the time of flight of laser echoes, various structure metrics can be calculated, inferred or modeled [3] as support for forest inventories.Falkowski et al. [4] have found up to a 95% agreement rate when inferring forest inventory classes from lidar data.Stem density calculation over Eucalyptus stands using discrete return lidar [5] showed the feasibility of using lidar for management of such types of plantations.Korpela et al. [6] did an extensive study using lidar for estimating single tree position, diameter, height and relative volume, finding a 4.7% error on height estimation, 2% and 10% respectively for commission and omission errors of stem position detection, and 20% error on stem diameter estimation.
Most tree top detection methods are based on application of local maximum moving-window filter (LMF) applied to the canopy height model [7,8].Popescu et al. [9,10] have applied variable window size LMF, successfully improving tree position detection.Barilotti et al. [11] adopt a more complex method where a single tree can be modeled and its position extracted when there is a high enough lidar point density (>2-4 pts/m 2 ).
Integration of lidar with imagery to improve extraction of forest parameters is also being investigated.Tests on using multispectral data with lidar data gave significant improvement of results [12,13].Also structural forest information retrieval is better when imagery and lidar are used together [14].Imagery undoubtedly gives added value to lidar data, but it is not always available.
Forest characteristics for habitat studies can be inferred from lidar data [15].Laser with discrete returns has proven effective for extracting digital terrain and canopy height models (DTM and CHM respectively) even if an under-estimation error of tree height metrics is expected due to the low probability of the laser intercepting the topmost part of the tree-depending on point density and laser footprint size.The tree height and position random bias error is correlated with tree density, canopy density and shape, and of course is influenced by the algorithm used to extract CHM and DTM [16].Naesset [17] has proven that forest homogeneity impacts on accuracy.Significant correlation between inferred and real tree position in different (coniferous and broad-leaved) forest types was found by Barilotti et al. [18].Using morphological operators to detect the actual stem position from the crown's barycenter instead of the local maximum topmost part, can differ from the position of the tree base due to morphological growth characteristics, thus adding another source of error.Maltamo et al. [19] used Loyd's histogram threshold method over height distribution to estimate the mean heights of the dominated layer.
The approach that is assessed in this paper is a template kernel to extract tree position and corresponding height using correlation with CHM as a similarity measure.This method is not new in computer vision as it has been applied in feature recognition [20].Grudin [21] shows such a technique applied to feature recognition in faces.Canopy is a simpler feature to identify than facial elements because it is usually symmetric so no rotation of the template is needed, only the scale factor has to be considered.Similar techniques, which use morphological characteristics of trees, have been used, even proposing to infer tree species from its shape [22,23].

Study Area and Data Collection
The experimental area is situated in the north-eastern part of Italy, in the Veneto region.Its center is approximately at 12°57′22″ longitude and 45°37′20″ latitude relative to the 1984 World Geodetic System datum (WGS84).It is located in the Mediterranean Basin and falls in the Temperate Oceanic bioclimate.Four areas have been plotted inside an even-aged stand of Pinus pinaster which was introduced by reforestation activity to protect the nearby northbound Pinus pinea stands from marine aerosol.The area borders a shoreline environment towards the south and Pinus pinea stands and grain agriculture fields at the north (Figure 1).The survey campaign was done during the month of November 2008 using a reflector-less Leica TC200 total station to measure stem positions.The plots were georeferenced using a Geotop Hiper Pro GPS receiver to position six tie points in open-sky conditions.The georeference system used is the UTM-WGS84 datum (European Petroleum Survey Group-EPSG code = 32632).This instrument has a dual-frequency receiver which, using RTK connection with a fixed base antenna, allowed precise (±0.01 m) positioning of the six points.At least three of the six tie points were then measured from the total station each time it was positioned in each plot, thus allowing calculation of the position and orientation of the plot itself.Stem coordinates were recorded by measuring angle and distance from the instrument to the stem surface and correcting distance with the stem diameter at breast height, which was also recorded.Heights of all trees whose top was visible along the line-of-sight of the instrument were also recorded.Characteristics of the four areas are listed in Table 1.As can be seen the areas are very similar, except area B which has an empty patch at its south-eastern quadrant and thus has lower tree overall density.Lidar data was obtained by a sensor mounted on an AS350 B32 helicopter: characteristics of sensor and data are listed in Table 2. Point accuracy provided by the lidar survey company was double-checked using four of the six control points measured with GPS for plot positioning.All four points had a positional error well within given accuracy range, thus no further corrections were necessary.

Lidar Processing and Software Used
Processing consists of several steps which take as input the lidar point cloud and give as output a vector point map with inferred tree-top position and heights.It can be divided into a lidar-processing phase, a filter phase and a post-processing phase (Scheme 1).The lidar processing phase was done using Terrasolids' TerraScan module for Benteley Microstation.The filter phase and the post-processing phase are done using GRASS (Geographic Resources Analysis Support System) open source software.GRASS's open code and raster-processing libraries provided the opportunity to implement the correlation filter used in the filter phase and integrate it in the GRASS environment.The module is available at author's webpage and can be easily integrated into existing GRASS Linux distributions or compiled for any platform (comments are welcome).The resulting vector points in output from the filter procedure are spatially analyzed using R statistical language [24] procedures.Scheme 1. Procedure of filter application in its three phases.

Lidar Processing Phase
In the lidar processing phase the point cloud is classified into ground points and not-ground points using Axelsson's iterative triangulation algorithm [25].Successive gridding of the triangulated ground surface into a 0.5 m resolution raster produced the digital terrain model (DTM).The digital surface model (DSM) was created at the same resolution from gridding the triangulated surface of first + unique return points.No other process was necessary because neither buildings nor other disturbances (power lines) for vegetation-detection were present in the area.The difference between the DSM and the DTM provides the canopy height model (CHM).

Filter Phase
The filter phase consists of application of the correlation filter.This consists of a template matrix where each cell is populated with values using a Gaussian normal function: where m and n are column and row number of filter window of size m x n where m = n and are both odd values and σ is the standard deviation.The standard deviation can be set according to tree canopy type.For conical shapes, which require a sharper peak, a low standard deviation (1-3) can be used, whereas more elliptical canopies require a higher standard deviation value (7)(8)(9)(10).A value of 10 is used for this study case due to the similarity of such kernel with the canopy shape of Pinus pinaster.Geometrical representations of the template matrix with two different standard deviations can be seen in Figure 2.
where: r is the normalized correlation value at column i* and row j* which is the center-respectively i* = I − (m − 1)/2 and j* = j − (n − 1)/2 and i and j are the ith and jth row and column in the grid-B is the CHM, A is the kernel matrix with m columns and n rows which have to be odd identical values, B* is the part of CHM window overlapping the kernel (see Figure 3).The denominator in the expression is the standard deviation of values in the kernel and in the overlapped CHM and serves to normalize the result such that −1 ≤ r ≤ 1.The result of this phase is a raster representation of correlation values which corresponds to the degree of similarity between the template kernel and the corresponding grid values of the CHM.Since the CHM resolution is 0.5 m, and the maximum distance between nearest neighbor trees was found to be 5 m, we tested 5 window sizes; 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11 thus covering a real-size from 1.5 m to 5.5 m. Figure 4 depicts a representation of CHM and corresponding correlation map as well as a profile showing the typical ellipsoidal canopy shape.

Post Processing Phase
The correlation map is processed to extract local maximum correlation values.Masking is first applied to the correlation map in order to only keep cells which present significant correlation using a threshold.An objective method to decide the threshold value had to be found, so it was decided to use the lowest value that would reject the null hypothesis, with a 99% confidence that correlation is not significant.This value, (r), was calculated considering sample size equal to the number of cells in the kernel matrix (n) with t-value (t) relative to corresponding degrees of freedom: The value of t was chosen relative to a 99% confidence that correlation values above r will be significantly different from no correlation (r = 0).This is done to avoid local maxima detecting correlation values which are actually not significant.The next step in the post-processing phase is to isolate the local maximum of the correlation values around a neighborhood of the same size as the correlation kernel.This filter assesses if the cell at the center of the window has the highest value and accepts it only if true.There is plentiful documentation in the literature on its use directly on CHM [6,22,23]; in this study it is used on the correlation model and not directly on the canopy height model.The advantage should be that a correlation kernel considers overall shape similarity with the local CHM area, while the local maximum approach directly on CHM does not.The isolated pixels, which represent local maximum, are then converted to points to be analyzed as a point pattern in the R statistical analysis software.

Tree Density and Position
In Figure 5 the surveyed positions of tree trunks at breast height and of lidar-inferred tree tops from best performing window sizes (3 × 3, 5 × 5 and 7 × 7) are plotted.Tree count and corresponding density are extracted for all tested kernel sizes, and error with ground truth calculated (dD = D ground-truth − D lidar-inferred ) and plotted in Figure 6.From Figure 6, it can be affirmed that the best result of the tree density parameter is obtained using the 5 × 5 kernel.Interpolation of results between window sizes for all areas except area C shows that a theoretical 6 × 6 window would have performed even better.This is most probably influenced by the nearest neighbor distance which shows that the average distance between trees in those three areas is ~2.3 meters.The 5 × 5 window, in our 0.5 meter resolution raster, is the closest size as it embraces a 2.5 × 2.5 area.The fact that it actually under-estimates the tree count is most probably due to trees whose canopy is completely embedded with another tree, thus making separation very difficult.Area C has a slightly different result, showing a slight over-estimation using the 5 × 5. Since this area has the same characteristics as the others in terms of mean diameter, height and density, the reason might be found in irregular canopy shape.This aspect has to be investigated further as it can provide interesting information on the behavior of the method.An evaluation of accuracy of tree-top position detection was done using an agreement test.Agreement tests improve on simple agreement by considering also the agreement expected by chance and giving an estimated coefficient of agreement.Cohen [28] and Fleiss [29] have proposed different applications for the coefficient of agreement, respectively for categorical items and for categorical ratings in a number of class items.This study case considers reliability of crown-top position correspondence to position of trunk at breast height.These two elements can have a spatial offset, thus the objective is not to estimate absolute position error, but to estimate agreement in tree detection considering expected chance agreement together with omission and commission errors (false negatives and false positives).These last two elements are calculated as complements of the two point sets (ground truth positions and lidar-inferred positions) where matching nearest neighbors between surveyed points and lidar-inferred points are considered as common points in sets: where S is the surveyed point set and I is the lidar-inferred point set, C err is commission errors and O err is omission errors, M is the matching point set.Matching was done by sorting the results from cross nearest neighbor calculations from smallest to greatest; a point in I is considered as matching a point in S if it is within a threshold distance equal to the average tree distance measured (~2.3 m).Matches have to be unique in the sense that a tree in S cannot match more than one tree in I. Once all trees from ground survey (S) have been matched, eventual remaining points are considered as omission errors, while unmatched points and eventual remaining points from the lidar-inferred (I) set are considered commission errors.Having all this information the kappa coefficient of agreement (K) can be calculated: where Pr a is observed agreement and Pr e is chance agreement, N m is number of matching points, N S is number of surveyed points, N c is number of points assigned as commission errors, and N o is number of points assigned as omission errors.In Table 3 results are reported and calculated K values are plotted as a function of correlation filter window size in Figure 7. Percentage of tree extraction is the ratio of matched to surveyed tree ratio: N m /N S .The agreement coefficient between tree extraction and measured tree positions shows the best correlation window size to be 5 × 5 for all four areas, in line with the above observations that the scale of the window size has to be proportioned to the scale of the canopy investigated.It is interesting to note that the percentage of tree extractions in Table 3 is always higher for smaller kernel sizes.This of course does not account for false positives and false negatives like the agreement coefficient does and is therefore, misleading.In this context, it can be affirmed that an a priori knowledge of average tree nearest neighbor distance can be used to decide kernel size.This is of course valid for stands which present a spatial distribution which is not clustered, but would work well for regularly spaced trees such as in plantations.For example, poplar plantations are common in this area and in many regions of Italy and are encouraged with incentives given by afforestation programs as well [30].They could be ideal stands for the application of this method.Table 3. Coefficient of agreement (K) and percentage of tree extraction (%) for the four areas.KS = kernel size, N S = total number of ground-truth surveyed tree positions, N I = total number of lidar-inferred tree tops, N C = commission error (false positives), N O = omission error (false negatives) and N m = number of matching (correct) points.

Tree Height Distribution
As reported in Table 1, a sub-selection of trees in each plot were measured for height.The measured height distribution is compared with the height extracted after tree extraction.This is done by querying the CHM pixel value underneath the lidar-inferred tree-top position using results from the 5 × 5 kernel size which proved to have the highest accuracy.Measured tree height and corresponding lidar-inferred height are used to calculate error by subtracting measured value from inferred value.Error distribution statistics are reported in Table 4.It can be seen that lidar inferred height is underestimated in all four plots, probably due to the point density which is not very high (4-5 points per square meter) as has already been reported in the literature [16,[31][32][33][34].The mean absolute error is lower than other tests in different contexts [33,34] because some factors which influence height estimation are in this case ideal; terrain is flat and canopies are in a single layer.Another point worth mentioning is that the canopy shape is such that the slope value around the tree top is very low.For this reason a small positional error does not imply a significant height error like it would in the case of conical canopies.

Conclusions
This paper discussed results of a different approach towards tree extraction, positioning, and height estimation, on a single tree basis.Forest environments are, in reality, varied, and this method does not pretend to be accurate in all contexts, but wants to give an alternative to other methods which might work less well in situations like the one presented in the test site.As mentioned in the discussion section, the ideal context for this application should be regular spaced trees, without a significant dominated layer and an ellipsoidal shaped canopy.Conical shaped canopies will benefit mostly from a local maximum filter approach as application of a correlation filter would be redundant.A clustered spatial distribution would imply embedded canopies which are harder to detect (with any approach).The kernel size in relation to the canopy size is an important parameter as a larger window would embrace multiple canopies thus giving low correlations also when centered on a tree top, whereas smaller windows would be subject to -noise‖ effects of canopy morphology resembling tree tops and thus overestimate the number of tree positions.It would be interesting to conduct future investigations to understand the effect of lidar point density and respective CHM resolution on the filter performance.It is likely that higher point densities and higher CHM resolutions improve results because the matching process would be more precise.This method could be applied to stands with the preferred characteristic of having few or no dominant trees and where canopy shape makes it difficult for other methods to give good results.The ability to obtain stand density and spatial distribution of its elements over large areas using a semi-automatic process can be of great help as input to inventories and as parameters for derived ecological indices.Also less ground-truth surveys would be needed and could just be used to control the results, bringing an economical advantage.

Figure 1 .
Figure 1.Study area with the four ground-surveyed areas (A-B-C-D).

Figure 2 .
Figure 2. A 3D plot of the Gaussian kernel used as template here is shown over a graphical representation of generic canopies.(a) σ = 2 over a conical-shape canopy.(b) σ = 10 over a ellipsoidal-shape canopy.

Figure 3 .
Figure 3. Schematic representation of CHM grid and kernel matrix window for calculation of 2D cross-correlation value between A (kernel) and B* (CHM).See equation 2 and corresponding explanation for details.

Figure 4 .
Figure 4. (a) CHM grid and (b) corresponding correlation map resulting from the application of a 5 × 5 correlation filter with a mask over values below zero.The normalized correlation values have been scaled ×100.(c) Canopy and terrain profile of transect over area.

Figure 7 .
Figure 7. Kappa coefficient of agreement of tree detection for each kernel size.

Table 1 .
Summary of field data: A-D plots, N = number of stems in plot, SD = standard deviation, DBH = breast height diameter.

Table 2 .
Lidar flight characteristics.Z refers to vertical accuracy, XY to planar accuracy.

Table 4 .
Error statistics for height measures: N = Number of trees with height measurements, SD = standard deviation, M M = Mean of measures, M L = mean of lidar inferred heights, ME = mean of errors, MAE = mean of absolute errors, RMSE = root of the mean of sum over squared errors, % = RMSE relative to mean measured height.