
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

The vegetation in the forest-tundra ecotone zone is expected to be highly affected by climate change and requires effective monitoring techniques. Airborne laser scanning (ALS) has been proposed as a tool for the detection of small pioneer trees for such vast areas using laser height and intensity data. The main objective of the present study was to assess a possible improvement in the performance of classifying tree and nontree laser echoes from high-density ALS data. The data were collected along a 1000 km long transect stretching from southern to northern Norway. Different geostatistical and statistical measures derived from laser height and intensity values were used to extend and potentially improve simpler models that ignore the spatial context. Generalised linear models (GLM) and support vector machines (SVM) were employed as classification methods. Total accuracies and Cohen’s kappa coefficients were calculated and compared to those of simpler models from a previous study. For both classification methods, all models revealed total accuracies similar to the results of the simpler models. Concerning classification performance, however, the comparison of the kappa coefficients indicated a significant improvement for some models both using GLM and SVM, with classification accuracies >94%.

Particularly in the boreal regions, forest ecosystems are expected to be highly affected by increasing temperatures caused by climatic changes [

A large proportion of the total land area in Norway is constituted by the forest-tundra ecotone. For such vast areas, cost-efficient monitoring will most likely have to involve remote sensing techniques. However, the small size and sparse distribution of the objects of interest limit the monitoring capabilities of most available spaceborne optical remote sensing instruments because of their limited spatial resolutions. Trees located in the forest-tundra ecotone have an assumed height growth of 1 to 10 cm per year depending on locality and the prevailing microclimate, and a remote sensing technique with the capability to detect subtle changes in growth and colonization patterns in the forest-tundra ecotone is therefore needed. In this context, airborne laser scanning (ALS) may be a well-suited tool for monitoring changes regarding tree migration both further north and to higher altitudes. Several studies on the prediction of biophysical parameters have documented the suitability of ALS on a single-tree level at different scales [^{−2} for the discrimination of individual trees with a minimum tree height of 2 m. Based on positive laser height values as a criterion for successful tree detection inside field-measured tree crown polygons, Næsset and Nelson [^{−2}) for the detection of small pioneer trees irrespective of tree height. Detection success rates of over 90% for coniferous and at least 84% for mountain birch trees were reported for trees with tree heights ≥1 m [

With regard to forest inventory utilizing ALS data, it is more common to merely employ the height information of the laser echoes instead of using the full suite of available information. Spectral data,

The main objective of this study was to assess the capability of geostatistical and standard statistical measures derived directly from high-density ALS data to improve the classification of tree and nontree echoes. For this purpose, the following variables were derived from laser height and intensity values using a moving window and tested as discriminators in different classification models: (1) a geostatistical measure represented by the variogram-derived mean semivariance; and (2) standard statistical measures represented by the arithmetic mean, the standard deviation and the coefficient of variation. Based on two different classification methods, the accuracy and performance of the diverse models were assessed and finally compared to simpler models from a previous study [

The study area covered a 1000 km long and approximately 180 m wide longitudinal transect encompassing hundreds of mountain forest and alpine elevation gradients. The transect stretches from Mo i Rana in northern Norway (66°19′N 14°9′E) to Tvedestrand in the southern part of the country (58°3′N 9°0′E) (

The field work in the transect was carried out at 25 different field sites allocated along the transect during summer 2008 in order to provide

Each field site consists of two to four sample plots to cover the width of the forest-tundra ecotone. Because the width of the forest-tundra ecotone varies between different locations, the number of sample plots in each site was determined in the field based on both visual and practical judgment of the altitudinal range of the ecotone in each case. Furthermore, sample plots within field sites were laid out with 50 m interdistance to avoid overlap. These procedures resulted in a total number of 77 sample plots. Two Topcon Legacy E+ 20-channel dual-frequency receivers observing pseudo range and carrier phase of both Global Positioning System and Global Navigation Satellite System satellites were used as base and rover receivers for real-time kinematic differential Global Navigation Satellite Systems (dGNSS) navigation and positioning. For each field site, the closest suitable reference point of the Norwegian Mapping Authority was selected to establish the base station. For the selection of the sample trees in the field, a modified version of the point-centered quarter sampling method (PCQ) [

For each sample tree, several tree metrics were recorded individually. Tree species was determined and tree height was measured using a steel tape measure for smaller trees and a Vertex III hypsometer for tall trees. Stem diameter was callipered at root collar and crown diameters were measured in the cardinal directions with a steel tape measure. Finally, the precise position for each tree was determined using the dGNSS-based procedure described above.

In this study, a total of 524 trees were used, ^{2} to 19.54 m^{2}. A summary of the tree metrics is given in

Airborne laser scanner data were acquired on 23 and 24 July 2006 with an Optech ALTM 3100C laser scanning system.

A Piper PA-31 Navajo aircraft carried the laser scanning system at an average flying altitude of 800 m above ground level. The flight speed was approximately 75 m s^{−1}. The scan frequency was 70 Hz, the maximum half angle was 7°, and the average footprint diameter was estimated at 20 cm. Furthermore, the pulse repetition frequency was 100 kHz, which resulted in a mean pulse density of 6.8 m^{−2}. The 1000 km long transect was split into 98 individual flight lines to keep the flying altitude across the mountains and hence the pulse density as constant as possible.
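As a rough plausibility check, the reported pulse density can be related to the flight parameters given above. The following sketch assumes flat terrain and pulses spread evenly over the swath; the function name and this simplified geometry are illustrative assumptions and not the calculation used by the data provider.

```python
import math


def nominal_pulse_density(prf_hz, speed_m_s, altitude_m, half_angle_deg):
    """Approximate pulse density (pulses per m^2) over flat terrain.

    Assumes every emitted pulse lands inside the swath and that pulses
    are spread evenly across it (a simplification).
    """
    # Swath width on the ground for a symmetric scan about nadir.
    swath_width_m = 2.0 * altitude_m * math.tan(math.radians(half_angle_deg))
    # Ground area covered per second of flight.
    area_per_second_m2 = speed_m_s * swath_width_m
    return prf_hz / area_per_second_m2


# Flight parameters as stated in the text.
density = nominal_pulse_density(
    prf_hz=100_000, speed_m_s=75.0, altitude_m=800.0, half_angle_deg=7.0
)
print(f"{density:.1f} pulses per m^2")
```

Under these assumptions the nominal density comes out close to the reported mean of 6.8 m^{−2}.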

Pre-processing of the laser scanning data was conducted by the contractor (Blom Geomatics, Norway). For all laser points, planimetric coordinates (

For the derivation of the terrain model, laser echoes labelled as “last-of-many” and “single” (LAST) were used. Ground echoes were classified from the planimetric coordinates and the corresponding height values of the LAST echoes, and based on an iteration distance of 1.0 m and an iteration angle of 9°, a triangulated irregular network (TIN) was derived using the TerraScan software [

Laser echoes labelled as “first-of-many” and “single” (FIRST) were used for the analyses. For this purpose, FIRST echoes were projected onto the TIN surface to interpolate the corresponding terrain height values on these locations. Furthermore, the differences between the FIRST echo height values and the corresponding interpolated terrain heights were computed and stored. In this study, merely the FIRST echoes, hereafter referred to as laser echoes, with height values greater than zero were included because this criterion represents the sole indicator for the presence of objects on the terrain surface.
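The projection of an echo onto the TIN surface amounts to a linear interpolation inside the triangle that contains the echo. A minimal sketch of this step is given below, assuming that the enclosing triangle has already been found; the function names are hypothetical and the code does not reproduce the TerraScan workflow.

```python
def interpolate_tin_height(triangle, x, y):
    """Linearly interpolate the terrain height at (x, y) inside one TIN triangle.

    triangle: three (x, y, z) ground vertices.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    # Barycentric weights of the three vertices.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1 * z1 + w2 * z2 + w3 * z3


def height_above_terrain(echo, triangle):
    """Echo height minus the interpolated terrain height at the echo position."""
    x, y, z = echo
    return z - interpolate_tin_height(triangle, x, y)


def keep_positive_echoes(echoes, triangle):
    """Keep only echoes with a height above terrain greater than zero."""
    return [e for e in echoes if height_above_terrain(e, triangle) > 0.0]
```

For example, with a triangle whose vertices are (0, 0, 10), (1, 0, 11) and (0, 1, 12), an echo at (0.25, 0.25, 11.5) lies 0.75 m above the interpolated terrain and would be retained.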

The ALTM 3100C instrument may record up to four echoes per laser pulse with a minimum vertical distance of 2.1 m between two subsequent echoes of an individual pulse. However, this instrument property in combination with low vegetation in the present study resulted in very few pulses with more than a single echo. Hence, the LAST and FIRST datasets were almost identical for most of the sample plots.

For assessing the capability of discriminators represented by geostatistical and standard statistical measures derived from the laser echoes to improve the classification of tree and nontree echoes, a sequence of computations had to be conducted prior to the analysis.

First, the field-measured crown diameters were used to compute elliptical tree crown polygons to select the tree echoes. Trees with a crown diameter value less than 1.0 m in at least one cardinal direction were assigned a tree crown polygon with a constant radius of 0.5 m. This was done to take into account the precision of the laser echoes (see Section 5).
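The selection of tree echoes by crown polygon can be sketched as a point-in-ellipse test. The sketch assumes an axis-aligned ellipse centered on the measured tree position with one north-south and one east-west crown diameter; these geometric details and all names are assumptions made for illustration.

```python
def echo_in_crown(echo_xy, tree_xy, diam_ns, diam_ew,
                  min_diam=1.0, fallback_radius=0.5):
    """Return True if an echo falls inside the crown polygon of a tree.

    Crowns with a diameter below min_diam (m) in at least one direction
    are replaced by a circle with fallback_radius (m), as in the text.
    """
    dx = echo_xy[0] - tree_xy[0]
    dy = echo_xy[1] - tree_xy[1]
    if diam_ns < min_diam or diam_ew < min_diam:
        a = b = fallback_radius  # small crown: constant 0.5 m radius
    else:
        a = diam_ew / 2.0  # semi-axis in the east-west direction
        b = diam_ns / 2.0  # semi-axis in the north-south direction
    return (dx / a) ** 2 + (dy / b) ** 2 <= 1.0
```

A tree with crown diameters of 0.6 m and 2.0 m would thus be treated as a circle of 0.5 m radius, so that an echo 0.4 m from the stem counts as a tree echo and an echo 0.6 m away does not.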

Furthermore, areas within the sample plots where it was ensured that there were no trees because of the basic properties of the PCQ sampling method were identified in order to find and select nontree laser echoes. These areas were those sectors of the four quadrants that were closer to the plot center than the closest recorded tree irrespective of tree size class. In this process, the crown polygon of the closest tree was erased from the nontree sector to ensure that only laser echoes emerging from nontree objects were included.
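The selection of candidate nontree echoes can be illustrated in the same way. The sketch below assumes quadrants aligned with the cardinal directions and a known distance from the plot center to the closest recorded tree in each quadrant; the quadrant numbering and function names are hypothetical. Removal of the crown polygon of the closest tree would have to be applied as a separate step.

```python
import math


def quadrant_index(dx, dy):
    """Quadrant of an offset from the plot center: 0=NE, 1=NW, 2=SW, 3=SE."""
    if dx >= 0 and dy >= 0:
        return 0
    if dx < 0 and dy >= 0:
        return 1
    if dx < 0 and dy < 0:
        return 2
    return 3


def is_nontree_candidate(echo_xy, center_xy, closest_tree_dist):
    """True if the echo is closer to the plot center than the closest tree
    recorded in the same quadrant.

    closest_tree_dist: mapping from quadrant index to distance (m).
    """
    dx = echo_xy[0] - center_xy[0]
    dy = echo_xy[1] - center_xy[1]
    return math.hypot(dx, dy) < closest_tree_dist[quadrant_index(dx, dy)]
```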

The laser height and intensity values from the laser echoes were used for the computation of discriminators for the classification analyses. Concerning the laser height, the numerical height values were used directly. For laser intensity, the raw intensity values (_{Raw}_{Ref}

For the computation of the geostatistical and statistical measures, each of the 77 sample plots was overlaid with equally spaced grid points with an interdistance of 1 m. A moving window consisting of a circular buffer with a radius of 3 m was employed to select laser echoes for the estimation of the different geostatistical and statistical measures at each grid point both based on the laser height and intensity values. A radius of 3 m was chosen so that the moving window would be larger than the largest tree crown in the data material. Thereafter, each laser echo was assigned the computed measures of its closest grid point (
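The moving-window step can be sketched as follows for the standard statistical measures. The sketch assumes planar coordinates in metres, uses the sample standard deviation, and returns no result for windows with fewer than two echoes; these choices and the function names are assumptions and are not taken from the study.

```python
import math
import statistics


def window_measures(grid_xy, echoes, radius=3.0):
    """Mean, standard deviation and coefficient of variation of the echo
    values within a circular window around one grid point.

    echoes: iterable of (x, y, value), where value is height or intensity.
    """
    gx, gy = grid_xy
    values = [v for x, y, v in echoes if math.hypot(x - gx, y - gy) <= radius]
    if len(values) < 2:
        return None  # too few echoes for a standard deviation
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    cv = sd / mean if mean != 0 else math.nan
    return {"mean": mean, "sd": sd, "cv": cv}


def nearest_grid_point(echo_xy, grid_points):
    """Grid point closest to an echo; its measures are assigned to the echo."""
    ex, ey = echo_xy
    return min(grid_points, key=lambda g: math.hypot(g[0] - ex, g[1] - ey))
```

Each echo would then inherit the measures computed at the grid point returned by `nearest_grid_point`.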

Semivariograms were employed as the geostatistical discriminator and were used in the analysis as a means to characterize differences in the behavior of spatial correlation of laser height and intensity values for those tree and nontree echoes with positive height values.
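A semivariance-based measure of this kind can be sketched with the classical estimator, in which the semivariance of a lag class is half the mean squared difference between all pairs of observations separated by that lag. The lag width, the maximum lag, and the averaging over non-empty lag classes below are assumptions made for illustration; the settings actually used in the study are not reproduced here.

```python
import math
from itertools import combinations


def experimental_variogram(points, lag_width=0.5, max_lag=3.0):
    """Experimental variogram of (x, y, z) points.

    Returns a list of (lag_center, semivariance, n_pairs) for each
    non-empty lag class.
    """
    n_bins = int(round(max_lag / lag_width))
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        h = math.hypot(x2 - x1, y2 - y1)
        if h > max_lag:
            continue
        b = min(int(h / lag_width), n_bins - 1)
        sums[b] += (z1 - z2) ** 2
        counts[b] += 1
    return [
        ((b + 0.5) * lag_width, sums[b] / (2.0 * counts[b]), counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]


def mean_semivariance(points, lag_width=0.5, max_lag=3.0):
    """Average of the semivariances over all non-empty lag classes."""
    variogram = experimental_variogram(points, lag_width, max_lag)
    if not variogram:
        return None
    return sum(gamma for _, gamma, _ in variogram) / len(variogram)
```

A window containing only ground-level echoes would give semivariances near zero, whereas a window containing a tree crown would give larger values, which is the contrast the discriminator is meant to capture.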

A measure for the spatial correlation of a variable is derived from the calculation of the semivariances of multiple pairs of observations as a function of their separation distance [

The semivariances and hence the spatial variability of a variable can be illustrated by a semivariogram, which is usually referred to as a variogram. In case of spatial dependence, a univariate experimental variogram is characterized by an increase in semivariance with distance

For computation of the experimental variograms specifically, variograms were calculated individually for each grid point of the 77 sample plots using the

In addition to the geostatistical discriminator, statistical summary measures were employed. The arithmetic mean (

Generalised linear models (GLM) and support vector machines (SVM) were employed as classification methods in the analyses. Simple models (

Geostatistical and statistical measures that revealed a significant improvement of the model compared to the simple model when used individually were subsequently combined in extended models using all possible combinations (

GLM are commonly used in regression analysis; however, they also represent a suitable tool for binary classification problems, predicting probabilities on a transformed scale [

In the statistical computing software R, the different GLM models (

SVM, which were developed by Cortes and Vapnik [

The different models (

A leave-one-out cross-validation was used to assess the classification performance of the modeling with GLM and SVM. In the validation, each entire field site (

For each model fitted for prediction irrespective of the classification method, the total percentage of correct prediction and the Cohen’s kappa coefficient [κ_{1} and κ_{2} are the two independent kappa coefficients, and σ_{κ1} and σ_{κ2} represent the respective standard errors. Kappa coefficients were evaluated quantitatively according to the grading suggested by Landis and Koch [
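The validation and comparison steps can be sketched as follows. The sketch assumes that validation leaves out one entire field site at a time, computes Cohen’s kappa from a 2 × 2 confusion matrix, and uses the common form of the test statistic for two independent kappa coefficients, i.e., their absolute difference divided by the square root of the sum of their squared standard errors. The function names and the example standard errors are hypothetical.

```python
import math


def leave_one_site_out(site_ids):
    """Yield (held_out_site, train_indices, test_indices) for each field site."""
    for held_out in sorted(set(site_ids)):
        train = [i for i, s in enumerate(site_ids) if s != held_out]
        test = [i for i, s in enumerate(site_ids) if s == held_out]
        yield held_out, train, test


def cohens_kappa(tp, fn, fp, tn):
    """Cohen's kappa for a two-class (tree / nontree) confusion matrix."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    return (observed - expected) / (1.0 - expected)


def kappa_z(kappa_1, se_1, kappa_2, se_2):
    """Test statistic for the difference between two independent kappas."""
    return abs(kappa_1 - kappa_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)


# Hypothetical standard errors of 0.01; |Z| > 1.96 corresponds to p < 0.05.
z = kappa_z(0.605, 0.01, 0.573, 0.01)
print(round(z, 3))
```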

Classifications of the laser echoes into tree and nontree echoes using GLM and SVM models including geostatistical and statistical measures revealed total accuracies of at least 93.6% (

Furthermore, kappa coefficients were improved by at least 0.032 (

The classifications of the laser echoes using GLM revealed total accuracies between 93.6% and 94.9% (

The total accuracies differed with 1.3 percentage points between models (_{SV}_{AM}_{SD}_{CV}_{SD}_{CV}

Assessing the corresponding kappa coefficients, higher kappa coefficients were found for models including the geostatistical measure and/or the statistical measures represented by the arithmetic mean and the standard deviation derived from the laser height values (_{SV}_{AM}_{SD}_{SV_}H_{AM}_{AM}_{SV}_H_{AM}

Comparing the kappa coefficients of the nine estimated models to the simple model (

Using the geostatistical and statistical measures derived from the laser height values, a significant contribution could be found for the mean semivariance and the arithmetic mean (H_{SV}, H_{AM}, H_{SV}_H_{AM}

For the SVM classification method, the twelve different models revealed total accuracies ranging from 94.7% to 95.7% (

The twelve models had a maximum difference in total accuracy of 1.0 percentage points (_{SV}_{AM}_{SD}_{SV}_H_{AM}_{SV}_H_{SD}_{AM}_H_{SD}_{SV}_{SD}_{SV}_H_{SD}

Furthermore, the corresponding kappa coefficients were higher for models including the mean semivariance, the arithmetic mean, and the standard deviation derived from the laser height values, both individually and in combination with one another (H_{SV}, H_{AM}, H_{SD}, H_{SV}_H_{AM}, H_{SV}_H_{SD}, H_{AM}_H_{SD}

The comparison between the kappa coefficients of the simple model

No significant contribution could be found for any of the models consisting of the geostatistical and statistical measures derived from the laser intensity values (_{SV}_{SD}_{CV}_{AM}

For the geostatistical and statistical measures derived from the laser height, a significant contribution was found for six models including the mean semivariance, the arithmetic mean, and the standard deviation individually or in combination with one another (H_{SV}, H_{AM}, H_{SD}, H_{SV}_H_{AM}, H_{SV}_H_{SD}, H_{AM}_H_{SD}

The classification into tree and nontree echoes including geostatistical and statistical measures revealed total accuracies that are equivalent to the results obtained by Stumberg

Kappa coefficients indicated a significant improvement when including geostatistical and statistical measures for some models in comparison to the classification performances reported by

In the present study, the time difference between the acquisition of the ALS data and the field registrations will most likely have caused small differences between the two datasets. These would be due to tree growth and mortality or other external factors affecting the trees. We do, however, expect the resulting errors to be small.

The ALTM 3100C instrument used to acquire the ALS data in the present study has an expected precision of around 0.1 m vertically and 0.2–0.3 m horizontally [

To conclude, the classification of tree and nontree echoes based on previous models from the study conducted by Stumberg

Adding a geostatistical measure represented by the mean semivariance derived from the laser height values significantly improved the results compared to the basic model for both the GLM and the SVM classification methods. For this discriminator, total accuracies of at least 94% could be obtained irrespective of the classification method and of whether it was used individually or in combination with other statistical measures. The mean semivariance estimated from the laser intensity values, however, did not reveal a significant contribution to the classification performances.

With regard to the statistical measures, the arithmetic mean derived from the laser height had a significantly positive effect on the classification performances for both classification methods when used individually and in most combinations with other measures. The laser intensity-derived arithmetic mean, however, revealed an equivalent performance for GLM and a worse performance using SVM. Concerning the standard deviation, no significant contribution could be found using GLM for either the laser height-derived or the laser intensity-derived values. Employing SVM, a significant improvement was obtained only for the discriminator derived from the laser height. The coefficient of variation revealed no significant contribution to either of the basic models

In general, the highest improvement of a basic model was found for the

This research has been funded by the Research Council of Norway (project #184636/S30). We wish to thank Blom Geomatics AS, Norway, for collection and processing of the airborne laser scanner data. Thanks are also due to Vegard Lien at the Norwegian University of Life Sciences, who was responsible for the fieldwork. Furthermore, Nadja Stumberg would like to thank Hans Ole Ørka and Liviu Ene at the Norwegian University of Life Sciences for valuable remarks during the analysis process. Finally, we would like to thank the three anonymous reviewers for valuable and constructive comments and suggestions.

Nadja Stumberg has been the main author of the manuscript, carried out calculations and analysis in the study, and conducted parts of the field work. Marius Hauglin has co-authored and revised the manuscript. Ole Martin Bollandsås has planned and prepared the field data and revised parts of the manuscript. Terje Gobakken has prepared the remote sensing data, supervised parts of the study and has revised parts of the manuscript. Erik Næsset has planned and prepared the remote sensing data, detailed the field sampling design, supervised the study and revised parts of the manuscript.

The authors declare no conflict of interest.

Overview of the study area with the 25 specific field sites (black points). The 1000 km long transect (black line) stretches from 66°19′N 14°9′E to 58°3′N 9°0′E.

Illustration of a PCQ sample plot (

Summary of field measurements of trees.

Species | Metric | n | Mean | Min | Max |
---|---|---|---|---|---|
Mountain birch | Height (m) | 404 | 1.41 | 0.04 | 7.80 |
Mountain birch | Diameter (cm) | 404 | 4.24 | 0.10 | 34.00 |
Mountain birch | Crown area (m^{2}) | 404 | 1.13 | 0.001 | 19.54 |
Norway spruce | Height (m) | 67 | 1.67 | 0.07 | 7.00 |
Norway spruce | Diameter (cm) | 65 ^{a} | 6.54 | 0.20 | 19.10 |
Norway spruce | Crown area (m^{2}) | 67 | 1.45 | 0.006 | 5.69 |
Scots pine | Height (m) | 53 | 1.33 | 0.10 | 5.10 |
Scots pine | Diameter (cm) | 53 | 5.00 | 0.30 | 18.90 |
Scots pine | Crown area (m^{2}) | 53 | 0.81 | 0.002 | 7.28 |

Note: ^{a} Missing values due to tree properties.

Geostatistical and statistical measures used for classification.

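Source | Measure | Symbol |
---|---|---|
Laser height | Mean Semivariance | H_{SV} |
Laser height | Arithmetic Mean | H_{AM} |
Laser height | Standard Deviation | H_{SD} |
Laser height | Coefficient of Variation | H_{CV} |
Laser intensity | Mean Semivariance | I_{SV} |
Laser intensity | Arithmetic Mean | I_{AM} |
Laser intensity | Standard Deviation | I_{SD} |
Laser intensity | Coefficient of Variation | I_{CV} |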

Models used for classification with GLM and SVM.

^{a} | |
---|---|
Basic models GLM | H_{SV}, H_{AM}, H_{SD}, H_{CV}, I_{SV}, I_{AM}, I_{SD}, I_{CV} |
Additional models GLM | H_{SV}_H_{AM} |
Basic models SVM | H_{SV}, H_{AM}, H_{SD}, H_{CV}, I_{SV}, I_{AM}, I_{SD}, I_{CV} |
Additional models SVM | H_{SV}_H_{AM}, H_{SV}_H_{SD}, H_{AM}_H_{SD}, H_{SV}_H_{AM}_H_{SD} |

Note:

Summary of the discriminator variables.

Class | Variable | Mean | Min | Max |
---|---|---|---|---|
Tree | Height (m) | 1.59 | 0.04 | 6.49 |
Tree | Height: mean semivariance | 0.95 | 0.00 | 6.28 |
Tree | Height: mean | 1.25 | 0.08 | 4.24 |
Tree | Height: standard deviation | 0.91 | 0.00 | 2.58 |
Tree | Height: coefficient of variation | 0.80 | 0.00 | 2.24 |
Tree | Intensity | 51.62 | 4.24 | 90.95 |
Tree | Intensity: mean semivariance | 114.36 | 0.00 | 603.08 |
Tree | Intensity: mean | 53.80 | 34.21 | 76.58 |
Tree | Intensity: standard deviation | 10.86 | 0.00 | 22.80 |
Tree | Intensity: coefficient of variation | 0.21 | 0.00 | 0.48 |
Tree | Slope (°) | 16.49 | 1.05 | 49.89 |
Non-tree | Height (m) | 0.17 | 0.01 | 4.72 |
Non-tree | Height: mean semivariance | 0.04 | 0.00 | 4.02 |
Non-tree | Height: mean | 0.19 | 0.04 | 4.17 |
Non-tree | Height: standard deviation | 0.12 | 0.00 | 2.46 |
Non-tree | Height: coefficient of variation | 0.51 | 0.00 | 2.64 |
Non-tree | Intensity | 56.22 | 0.51 | 110.82 |
Non-tree | Intensity: mean semivariance | 60.14 | 0.00 | 1462.73 |
Non-tree | Intensity: mean | 56.10 | 10.65 | 94.01 |
Non-tree | Intensity: standard deviation | 7.56 | 0.00 | 38.26 |
Non-tree | Intensity: coefficient of variation | 0.14 | 0.00 | 1.04 |
Non-tree | Slope (°) | 16.54 | 0.005 | 79.68 |

Performance of the different models used for classification with GLM.

Model ^{a} | | Total accuracy | Kappa | Test statistic ^{b} | Significance |
---|---|---|---|---|---|
H_{SV} | 0.85 | 0.947 | 0.605 | 2.333 | ^{*} |
H_{AM} | 0.85 | 0.946 | 0.606 | 2.482 | ^{*} |
H_{SD} | 0.80 | 0.943 | 0.590 | 1.255 | |
H_{CV} | 0.75 | 0.936 | 0.526 | 3.469 | ^{**} |
I_{SV} | 0.75 | 0.948 | 0.570 | 0.285 | |
I_{AM} | 0.70 | 0.948 | 0.565 | 0.626 | |
I_{SD} | 0.65 | 0.949 | 0.573 | 0.029 | |
I_{CV} | 0.70 | 0.949 | 0.565 | 0.577 | |
H_{SV}_H_{AM} | 0.85 | 0.946 | 0.606 | 2.480 | ^{*} |
Simple model | 0.75 | 0.949 | 0.573 | | |

Notes: Level of significance: ^{*} <0.05; ^{**} <0.005.

^{b} As received by the comparison between two independent kappa coefficients,

Performance of the different models used for classification with SVM.

Model ^{a} | Cost ^{b} | RBF parameter ^{c} | Total accuracy | Kappa | Test statistic ^{d} | Significance |
---|---|---|---|---|---|---|
H_{SV} | 100 | 0.1 | 0.957 | 0.666 | 4.995 | ^{**} |
H_{AM} | 1000 | 0.1 | 0.956 | 0.655 | 4.183 | ^{**} |
H_{SD} | 100 | 0.1 | 0.957 | 0.660 | 4.539 | ^{**} |
H_{CV} | 100 | 0.1 | 0.951 | 0.605 | 0.352 | |
I_{SV} | 1000 | 0.1 | 0.953 | 0.613 | 0.901 | |
I_{AM} | 1000 | 0.1 | 0.947 | 0.576 | 1.772 | |
I_{SD} | 100 | 0.1 | 0.953 | 0.608 | 0.570 | |
I_{CV} | 1000 | 0.1 | 0.950 | 0.605 | 0.353 | |
H_{SV}_H_{AM} | 100 | 0.1 | 0.955 | 0.643 | 3.186 | ^{**} |
H_{SV}_H_{SD} | 100 | 0.1 | 0.957 | 0.664 | 4.875 | ^{**} |
H_{AM}_H_{SD} | 100 | 0.1 | 0.954 | 0.634 | 2.556 | ^{*} |
H_{SV}_H_{AM}_H_{SD} | 1000 | 0.1 | 0.952 | 0.621 | 1.552 | |
Simple model | 1000 | 0.1 | 0.953 | 0.600 | | |

Notes: Level of significance: ′ <0.1; ^{*} <0.05; ^{**} <0.005.

^{b} Cost or penalty parameter.

^{c} Parameter regulating the radial basis function.

^{d} As received by the comparison between two independent kappa coefficients,