Extraction of Spectral Information from Airborne 3D Data for Assessment of Tree Species Proportions

: With the rapid development of photogrammetric software and accessible camera technol-ogy, land surveys and other mapping organizations now provide various point cloud and digital surface model products from aerial images, often including spectral information. In this study, methods for colouring the point cloud and the importance of different metrics were compared for tree species-speciﬁc estimates at a coniferous hemi-boreal test site in southern Sweden. A total of three different data sets of aerial image-based products and one multi-spectral lidar data set were used to estimate tree species-speciﬁc proportion and stem volume using an area-based approach. Metrics were calculated for 156 ﬁeld plots (10 m radius) from point cloud data and used in a Random Forest analysis. Plot level accuracy was evaluated using leave-one-out cross-validation. The results showed small differences in estimation accuracy of species-speciﬁc variables between the colouring methods. Simple averages of the spectral metrics had the highest importance and using spectral data from two seasons improved species prediction, especially deciduous proportion. Best tree species-speciﬁc proportion was estimated using multi-spectral lidar with 0.22 root mean square error (RMSE) for pine, 0.22 for spruce and 0.16 for deciduous. Corresponding RMSE for aerial images was 0.24, 0.23 and 0.20 for pine, spruce and deciduous, respectively. For the species-speciﬁc stem volume at plot level using image data, the RMSE in percent of surveyed mean was 129% for pine, 60% for spruce and 118% for deciduous.


Introduction
Forest companies commonly utilize lidar-based forest information, estimated primarily using area-based methods [1][2][3]. In boreal forest, these methods deliver stand level estimation accuracies in terms of relative root mean square errors (RMSEs) typically in the range of 2.5-13.6% for basal area-weighted mean tree height, 5.9-15.8% for basal area-weighted mean stem diameter and 8.4-16.6% for mean stem volume [3,4]. This generally outperforms traditional sources for forest management data, such as subjective field inventory.
In forest management planning, tree species information is vital. Therefore, extending the lidar methodology by adding spectral data, e.g., from aerial images, to achieve tree species-specific estimates using various frameworks [5] such as non-parametric methods like k-MSN [6] has been of interest. In [6], RMSEs of 33-52%, 56-63% and 84-103% for pine, spruce and deciduous stem volume, respectively, were reported at plot level. The corresponding figures for stand level accuracies were 28% for pine, 32% for spruce and 62% for deciduous stem volume [7]. However, combining lidar and aerial images generally requires separate acquisitions as optimal flight specifications differ between the two sensor types; the separate acquisitions, thus, increases the cost.
A new and interesting option for acquiring three-dimensional (3D) and spectral data is multi-spectral airborne lidar. A three-sensor lidar system, with two near-infrared and one green laser, was used in a Canadian study to classify broadleaf and needle leaf trees with a classification error of 4.6%; when separating eight species the error was found to be 24.3% [8]. In a Finnish study [9], multi-spectral lidar was compared to uni-spectral lidar combined with aerial images for assessment of species composition. The results for multi-spectral lidar were in terms of RMSE for pine 18.2%, spruce 19.5% and broadleaved 19.5%, compared to the results for the combination of uni-spectral lidar and aerial images with RMSE of 16.7%, 19.3% and 15.4%, respectively, at plot level.
Aerial images and stereo photogrammetry have been used in many studies to predict structural forest variables with good accuracy [10][11][12][13][14]. Therefore, the option of using both structural (3D) and spectral information from images is an interesting alternative. This was studied in [10], using images with 60% forward and 30% side (60%/30%) overlap and the method proposed in [6]. This method involves pre-classification of each point in the point cloud to a specific tree species followed by k-MSN modelling to predict species-specific stem volume. The results in [10] showed RMSEs of 90.6%, 26.4% and 72.6%, for pine, spruce and deciduous mean stem volume, respectively, at stand level. In another study from Norway [15], species-specific mean stem volume was estimated from both 60%/30% and 80%/30% overlapping image data, and simple averaging of spectral metrics was used at plot level. The results in [15] showed RMSE of 43.3%, 35.0% and 80.1% for pine, spruce and deciduous mean stem volumes, respectively, at stand level, for 60%/30% image overlap data set. By increasing the image forward overlap to 80%, the accuracy increased slightly. Comparing multi-spectral lidar to aerial images as a single sensor solution for assessing tree species-specific stem volume, it was concluded in [16] that aerial images performed marginally better than multi-spectral lidar and that the combination of uni-spectral leaf-on lidar and aerial images was clearly better.
Spectral-based predictions and classifications are dependent on the quality of the aerial images, specifically: type of sensor; atmospheric conditions; sun-sensor-target geometry and phenology of the trees [17,18]. In [19], for example, ortho-rectified imagery and template matching were used to identify the sunlit part of single tree crowns to classify tree species with an overall accuracy of 89%. However, a shift in the relation of the spectral properties for Scots pine and Norway spruce was seen between June and October image data sets. Research has been aimed at calibrating sensors and performing radiometric correction of aerial images to improve tree species classification [20,21]. Nevertheless, the proximity effects, i.e., the effect of neighbouring trees on each other, in mixed stands was found to influence the mean reflectance by 1-17% in the visible bands and up to 33% in the near-infrared band, adding substantial classification errors [21].
Due to the rapid development of photogrammetric software and accessible camera technology, land surveys and other mapping organizations now provide various point cloud and digital surface model (DSM) products from aerial images often including spectral information. The spectral content originates from the images, which are radiometrically calibrated using the mapping camera software package with the aim of creating homogenous looking image blocks and orthophotos [22]. Alternatively, more advanced methods like radiometric block adjustment [23,24] can be done on the individual images. However, when the spectral information is projected to the point cloud it can be done in different ways as each point is seen in more than one aerial image [25]. Given a coloured point cloud, there are different ways to extract information, i.e., metrics that could be used for modelling tree species-specific variables like mean stem volume or tree species proportions, considered a research gap in the review article [14].
The aim of this study is to compare how different methods for colouring image point cloud data affects the accuracy of tree species-specific proportions and stem volume estimations using standard image products. Different spectral metrics and their importance for tree species estimation are also compared. A total of three different colouring options are used on two seasonally different aerial image data sets (leaf-on and leaf-off). These are also combined with the aim to improve estimation accuracy. Finally, a coloured DSM product available from the Swedish National Land Survey (Lantmäteriet) is included in the comparison as well as a multi-spectral lidar data set.

Study Area
The study area is part of the Remningstorp forest estate, which is situated at 58 • 30 N, 13 • 40 E (Figure 1). The estate is privately owned by a foundation and managed for timber production with a productive forest area of about 1200 ha. The terrain is relatively flat with a topography varying between 120 m and 140 m above sea level. The forest is mainly dominated by Norway spruce (Picea abies (L.) H.Karst.), Scots pine (Pinus sylvestris L.) and birch species (Betula spp.).

Field Data
Circular field plots with 10 m radius were objectively surveyed in a regular quadratic grid with 200 m spacing between adjacent plots over the study area in 2010 [26]. The starting point of the grid was chosen randomly. Each training plot was surveyed using the methods and state-estimating models of the Heureka forestry decision support system [27]. For field plots with basal area-weighted mean tree height less than 4 m or basal area-weighted mean stem diameter at breast height (i.e., 1.3 m above ground) less than 5 cm, height and species of all saplings were recorded. For the remaining field plots, recording of tree species and callipering of all trees at breast height including only trees greater than 4 cm in stem diameter, and sub-sampling of trees to measure height and age, were performed. Heights of callipered trees on the field plots were estimated using models relating tree height to stem diameter [28]. Plot location was measured using differential code GPS with post-processing producing sub-meter accuracy. The plots were re-surveyed in 2014 and 2015 using the same inventory method. In total, 263 plots were surveyed. However, in the data pre-preparation process, plots that did not have data (field or remotely sensed) for both times, had a mean tree height under 5 m or were clear-cut, were omitted from this study, leaving 156 field plots unaffected. For these field plots, the basal area-weighted mean tree height range was 6.0-28.1 m (mean 18.1 m), basal area 1.9-68.9 m 2 ha −1 (mean 24.8 m 2 ha −1 ) and the mean stem volume 3.4-790.1 m 3 ha −1 (mean 218 m 3 ha −1 ) in year 2014. The corresponding tree species-specific mean stem volume range was 0-355.1 m 3 ha −1 (mean 43.0 m 3 ha −1 ), 0-790.1 m 3 ha −1 (mean 142 m 3 ha −1 ) and 0-383.9 m 3 ha −1 (mean 32 m 3 ha −1 ) for pine, spruce and deciduous, respectively. The distribution of species proportions of mean stem volume for the plots are shown in Figure 2.

Aerial Images
Aerial images were captured on the 11th and 26th of July 2014 and the 11th and 8th of May 2016 using UltraCam XP-wa and UltraCam Eagle Mark 1 digital camera, respectively (hereafter referred to as data sets UC2014 and UC2016). The UC2014 images were acquired with an image size of 17,310 × 11,310 pixels and a focal length of 70 mm (field of view 73 • ) at 2800 m and the UC2016 images were acquired with an image size of 20,010 × 13,080 pixels and a focal length of 80 mm (field of view 66 • ) at 3700 m altitude above ground level (a.g.l.), generating images with a ground sampling distance (GSD) of 0.25 m and with a forward overlap of 60% and a side overlap of 30%. The images were block triangulated using bundle adjustment and radiometrically corrected by Lantmäteriet, as part of their operational aerial image production. The radiometric correction was conducted using a model-based approach, which included correction of haze, atmospheric effects, hotspots and an adjustment of the final colour tone [29], resulting in pan-sharpened colour infrared (CIR) images (green, red, infrared) with an 8 bit radiometric resolution. As a comparison to the UC2014 and UC2016 image data sets a photogrammetrically derived standard DSM product from Lantmäteriet, generated from the same 2016 images, but filtered to 2 × GSD resolution, was also used (delivered as a regularly spaced point cloud in las-format; UC2016DSM_SURE). The 2016 DSM product was not available for the year 2014.

Multi-Spectral Lidar
Multi-spectral lidar data were captured on the 21st of July 2016 using the Optec Titan sensor mounted in a fixed wing aircraft flying at 400 m a.g.l. The Titan sensor has three separate lidars, two infrared (channel 1: 1550 nm and channel 2: 1064 nm) and one green (channel 3: 532 nm). With a pulse rate of 225 kHz and a scan angle of 30 degrees the point density is more than 30 points/m 2 . The intensity values of each lidar point was calibrated by weighting with the distance to the sensor.

Stereo Photogrammetry
Photogrammetric processing of the images to produce point cloud data was done using the SURE software [30], which generates a height value for each pixel using a modified semi-global matching algorithm [31]. The software setting AERIAL6030 (predefined settings optimized for aerial surveys with 60%/30% overlap) was used to define parameters for point cloud generation. Finally, the point cloud, DSM and lidar height values were transformed from height above mean sea level to height above ground level by subtracting the height of the ground provided by the national digital terrain model (DTM) from Lantmäteriet (with 2 m spatial resolution and 0.2 m vertical accuracy, [32]).

Colouring Options
The UC2014 and UC2016 point clouds generated by SURE have colour values assigned to each point. SURE colours the points by assigning the DN value from one of the images in the stereo pair that has generated the point, but which is unknown. Using the photogrammetric software Inpho [25], each data set was re-coloured into two new data sets where for each point; (i) the spectral values of the nearest image was used and (ii) the mean of the spectral values of all images where the point was visible was assigned. These together with the DSM product and the multi-spectral lidar data sets resulted in eight different data sets (Table 1).

Spatial Metrics
Spatial predictor variables, i.e., point cloud metrics from the 3D-coordinates of the points, were calculated from all data sets (for lidar, only channel 2 was used) using FU-SION [33], which calculates various statistical summary measures of the point cloud data at each training plot. Density metrics of points above a specific height threshold were calculated using a height limit of 2 m, which is commonly used in lidar-based modelling [34]. All returns were utilized when metrics were calculated.
Lidar penetrates the canopy, giving echoes (points) within the canopy; therefore, metrics like vegetation ratio (i.e., percentage of points over a certain height threshold) describe forest density well and are often used in forest modelling [11,35]. However, for image-based point clouds, which only describe the surface of the canopy, density metrics like vegetation ratio tend to result in 0 or 1 and, thus, poorly describe the forest density. Therefore, for the image-based point clouds, in addition to the metrics from FUSION, a canopy height model (CHM) was generated, with 0.5 m GSD, assigning the maximum height to each raster cell. Metrics describing the surface of the canopy were calculated (i.e., mean of the following: canopy height, slope, aspect, surface ruggedness and surface roughness [36] for each training plot. These metrics were also calculated with all no-data pixels (i.e., occluded areas) set to zero instead of being ignored in the calculation. Additionally, using a filter approach, where a 3 by 3 pixel window identifies the centre cell as local maxima if the eight neighbours have a lower or equal height value, was applied and sum of squared heights of the local maxima were calculated from the image-based CHM for each training plot. The sum of squared heights was also calculated from a CHM, which had been smoothed using a 3 by 3 pixel mean filter. Using the planar position of the local maximum, two spatial descriptive statistics was generated for each plot; (i) spatial dispersion, calculated as the mean distance between nearest neighbours [37] and (ii) deviations from spatial homogeneity using Ripley's K function [38]. These other than FUSION metrics were calculated aiming to improve the characterization of forest density information from the image-based point clouds, but not calculated from the lidar-based point clouds.

Spectral Metrics
Spectral predictor variables, i.e., point cloud metrics from the green, red and nearinfrared values of the points (for lidar: channel 1, channel 2, channel 3), were calculated using FUSION for each plot. FUSION calculates various statistical metrics such as minimum; maximum; mean; standard deviation; mode; covariance; skewness; moments and percentiles, for each spectral band. For the image-based point clouds, the mean of the spectral values was calculated from the CHM cells above 2 m and with an aspect facing the sun (± 20 degrees of the sun's position at the mean time of image acquisition), where the remaining cells not satisfying these conditions were excluded in the calculation of the metrics. The reason for setting up these criteria were to coarsely get the mean value of the sunlit part of the crowns. This was not done for the lidar data since the illumination angle and viewing angle is the same. The spectral predictor variables for the lidar data set were calculated using the intensity of first and only returns. Due to the large number of derived spectral metrics, the metrics were sorted in groups according to their origin and expected operational applicability. Tree species estimations were then made and evaluated for each group separately as well as in combination. The groups of metrics are outlined in Table 2. Table 2. Metrics used in the study divided into five different groups with associated motivation for each group.

Metrics Motivation
A minimum, maximum, mean, standard deviation and variance of each spectral band * fundamental distributional statistics B mode, covariance, skewness, L-moments (L1, L2, L3, L4), the 1st, 5th, 10th, 20th, 25th, 30th, 40th, 50th, 60th, 70th, 75th, 80th, 90th, 99th percentile and generalized means for the 2nd and 3rd power, of each spectral band extensive distributional statistics C mean value of the sunlit part of the CHM for each band and normalised by dividing the value for each band by the sum of all bands data from the sunlit part only D groups A, B and C all spectral metrics E groups A, B, C and all spatial metrics (see Section 3.3. above) all spectral and spatial metrics * Mean, standard deviation and variance were normalised by dividing the value for each band by the sum of all bands.
Each colourized data set (Table 1) was then used together with each of the five spectral groups ( Table 2) to estimate species-specific proportions and mean stem volume. The UC2014 and UC2016 data sets were combined based on colour method and spectral metric groups to predict species-specific variables for year 2014.

Stem Volume
Linear regression analysis was used to model the relationship between spatial variables and mean stem volume. Selection of final models was based on produced RMSE, bias and adjusted R 2 values, significance and correlation of metrics and studies of residual plots. In the first phase, the best metric was selected by testing all possible variables in a simple linear model. In the second phase, the best performing metric was combined with a second metric and the best model was selected. Thereafter, logarithmic and square root transformation of the response variable, i.e., mean stem volume, were examined. Finally, a third metric was added if it improved the model, and a maximum of three metrics were selected for each model to avoid overfitting. Mean stem volume was predicted for each data set, i.e., UC2014, UC2016, UC2016DSM_SURE and Lidar2016_MS, using the field data from the corresponding year. For the data set where spectral data from both 2014 and 2016 images were used, the mean stem volume was predicted using the spatial metrics of the UC2014 data set (the 2016 data set is leaf-off, generating poorer estimates of mean stem volumes, [39]).

Species Proportion
Estimations of species-specific (pine, spruce and deciduous) proportions of mean stem volume from the spectral metrics were made using a non-parametric method, rather than a model-based approach. This, since the targeted proportions show complex dependencies to the very large number of addressed metrics, and the technical difficulties in modelling finite range processes (i.e., ratios limited to values strictly between 0 and 1) and skewed multimodal error distributions. Thus, estimations of the addressed species proportions from the various metrics were developed using the non-parametric method Random Forest [40,41] implemented in the R package randomForest [42,43]. In short, Random Forest combines the ideas of regression trees and bootstrap aggregating ("bagging", [40]) to fit and evaluate a large number of regression trees (a "forest"). Each tree is fitted using a random sample of the training data, and each node of the tree is defined by the best splitting variable out of a small random selection of the independent variables, i.e., the metrics. Error and variable importance are assessed using the training data left out (out-of-bag data). Given the forest of regression trees, estimation is made using a majority of votes.
For each combination of data set, colouring method and metric group, the proportions for pine, spruce and deciduous were predicted. Each species proportion was predicted independently, but was normalised by division with the sum of all predicted species proportions. Additionally, species-specific mean stem volume was calculated by multiplying the predicted species proportion by predicted total mean stem volume. The predictions were made by leave-one-out cross-validation at plot level, and accuracies were reported in terms of RMSE and bias for mean stem volume, species proportions and species-specific mean stem volume.

Results
Mean stem volume was predicted using the spatial metrics from each data set. The metrics used in the regression models and the resulting RMSEs are shown in Table 3. The RMSE in percent of surveyed mean stem volume varies only marginally between the different data sets, ranging from 36.0% to 36.7%. Table 3. Regression models of mean stem volume (see Section 3.5.1. above) used for each data set, with corresponding RMSEs (absolute and in percent of the surveyed mean). SumH 2 is the sum of squared heights for the local maxima; Roughness is the surface roughness of the CHM; Elev.P25 is the 25th height percentile and Elev.mean is the mean height.

Lidar2016_MS
Vol = β 0 + β 1 Elev.P25 + β 2 Elev.mean + ε i 82. 5 36.6 The lidar data set showed superior performance when predicting the species proportions with accuracies of 0.22, 0.22 and 0.16 for pine, spruce and deciduous, respectively, in terms of absolute RMSE. For the image-based data sets the accuracies ranged from 0.22 to 0.34, 0.21 to 0.34 and 0.16 to 0.30 for pine, spruce and deciduous proportions, respectively ( Figure 3). However, when comparing different colouring methods for the same data set and metric group, the variation was small. Metric groups A and B performed well for all the data sets and colouring methods, group D resulted in the poorest performance for UC2014, whereas the poorest performing metric group for the 2016 data sets (UC2016 and UC2016DSM_SURE) was group C. Comparing the years, the 2016 leaf-off data set generally had a lower RMSE for identifying pine proportions than the 2014 data set, while for spruce, the 2016 data set performed slightly better. The UC2016DSM_SURE data set, which is a filtered and thinned point cloud, showed very similar performance as other UC2016 combinations. Combining 2014 leaf-on and 2016 leaf-off spectral data was better than using either data set alone, resulting in the best performance for predicting species-specific proportions.
The best accuracy for the image data sets, measured as lowest mean of species-specific stem volume RMSEs in percent of field surveyed mean, was found for the combination of 2014 and 2016 spectral data, coloured by the mean value from the overlapping images and group B metrics. The RMSE for the species proportion was 0.24, 0.23 and 0.20 for pine, spruce and deciduous, respectively (Figure 3). For the species-specific stem volume at plot level, the RMSE in percent of surveyed mean was found to be 129%, 60% and 118% for pine, spruce and deciduous, respectively.
Compared to the field surveyed tree species proportions (Figure 2), the predicted proportions are underestimated at the ends of the range and overestimated in the middle, i.e., species-pure forests are underestimated and mixed forests are overestimated (Figure 4).

Discussion
The best performance for tree species-specific proportion was achieved using multispectral lidar. Compared to aerial images, lidar show no problems with sun-sensor-target geometry, since the system provides its own illumination source, and illumination and viewing angles are equal. This will, to a large extent, probably eliminate the problem of reflectance anisotropy and reflectance from adjacent trees described as a large source of error in species classification [21]. Compared to other studies where only aerial images were used, the results of the present study (i.e., RMSE of 129%, 60% and 118% for pine, spruce and deciduous, respectively, at plot level) have lower performance than, for example, Ref. [15] who reported RMSEs of 56.2%, 42.0% and 91.7% for pine, spruce and deciduous mean stem volume, respectively. Significant improvement in accuracy can be expected when results are aggregated to stands, for example, as in [15], which showed RMSEs of 43.3%, 35.0% and 80.1% for pine, spruce and deciduous stem volume, respectively, at stand level. In an earlier study, performed at the same study area as the present study (Remningstorp), RMSEs of 90.6%, 26.4% and 72.6% for pine, spruce and deciduous stem volume, respectively, were reported for stand level prediction [44].
In [15], an extensive transformation of predictor variables and an exhaustive variable selection method using best subset regression was used, aiming to find the best performing models. However, in the present study, the aim was to compare the effects of point cloud colorization; spectral metrics and seasonal data. Therefore, a simplified model approach was used, which may account for lower RMSEs.
The method of colouring the point cloud showed marginal impact on the results. This indicates that the method for colourization of the point cloud has little importance when predicting species proportions or mean stem volume. One reason for this result may be that the radiometric calibration done during the image post-production has already manipulated the spectral values towards an average of the overlapping images and a general tone for the image block. However, the aim with the relative radiometric calibration is to minimize the difference in colour tone between similar objects in different images, to improve image interpretation and analysis within the image block [20].
Group A metrics were as good or better than other metric groups, indicating that simple mean values of spectral information at the plot are sufficient. Creating more advanced metrics, group C did not improve the performance but rather had the opposite effect, indicating that information from only the directly sunlit part of the canopy does not improve performance. This is, to some extent, supported by the thorough study by [21], which reported that the difference in reflectance was higher in the sunlit part of the crown than in the self-shaded, for three tree species. This partly contradicts the results in [19], where the sunlit part of the crowns were used to classify tree species with high accuracy (88% and 89%), however, using a small evaluation data set and limited variation in view angle. Another explanation may be that the CHM is not accurate enough to properly model the sunlit part. The metric groups using many metrics (D and E), had lower performance than groups A and B, and this is probably due to the large number of descriptive variables used, making the parameter selection in Random Forest more difficult.
The difference in performance between seasons indicate that the prediction for specific tree species varies with the spectral variation as a result of seasonal change in tree phenology. This is because the spatial height distribution of the points and their spectral properties vary due to the phenological state of the trees, especially deciduous, over the seasons [45]. However, this variation in spectral information over the seasons can be utilized if multiple image data sets are used, indicated here by that the lowest RMSE result was obtained when combining spectral data from 2014 and 2016 data sets. The RMSE clearly improved for predicting deciduous proportion, which shows that the difference between foliated trees in the summer and barren branches and the ground in the spring images improves prediction. Changes in height distribution of the point cloud data between leaf-on and leaf-off images has been suggested for deciduous forest mapping [39]; however, in this study, none of the height metrics was ranked as top five in importance by Random Forest.

Conclusions
In this study, method of colorization marginally affected the prediction results for tree species-specific proportions when radiometric correction has been applied to the images. Simple mean values of the spectral metrics had the highest importance. Using spectral data from two seasons improves predictions of species proportions, especially proportion of deciduous trees. The predicted tree species proportions are underestimated for species-pure plots and overestimated for mixed plots. Further research should be directed towards the sub-plot level, for example, semi-ITC classification of tree species for stand level aggregation to species proportion.