Combining Airborne Laser Scanning and Aerial Imagery Enhances Echo Classification for Invasive Conifer Detection

Dash, Jonathan P.; Pearse, Grant D.; Watt, Michael S.; Paul, Thomas

doi:10.3390/rs9020156

by Jonathan P. Dash ¹,*, Grant D. Pearse ¹, Michael S. Watt ² and Thomas Paul ¹

¹ Scion, 49 Sala Street, Private Bag 3020, Rotorua 3046, New Zealand

² Scion, P.O. Box 29237, Fendalton, Christchurch 8041, New Zealand

* Author to whom correspondence should be addressed.

Remote Sens. 2017, 9(2), 156; https://doi.org/10.3390/rs9020156

Submission received: 6 November 2016 / Revised: 7 February 2017 / Accepted: 9 February 2017 / Published: 15 February 2017

(This article belongs to the Special Issue Fusion of LiDAR Point Clouds and Optical Images)

Abstract:

The spread of exotic conifers from commercial plantation forests has significant economic and ecological implications. Accurate methods for invasive conifer detection are required to enable monitoring and guide control. In this research, we combined spectral information from aerial imagery with data from airborne laser scanning (ALS) to develop methods to identify invasive conifers using remotely-sensed data. We examined the effect of ALS pulse density and the height threshold of the training dataset on classification accuracy. The results showed that adding spectral values to the ALS metrics/variables in the training dataset led to significant increases in classification accuracy. The most accurate models (kappa range of 0.773–0.837) had either four or five explanatory variables, including ALS elevation, the near-infrared band and different combinations of ALS intensity and red and green bands. The best models were found to be relatively invariant to changes in pulse density (1–21 pls/m²) or the height threshold (0–2 m) used for the inclusion of data in the training dataset. This research has extended and improved the methods for scattered single tree detection and offered valuable insight into campaign settings for the monitoring of invasive conifers (tree weeds) using remote sensing approaches.

Keywords:

LiDAR; classification; near-infrared; random forest; weeds; ALS; simulation; data thinning; invasive conifer; invasion ecology; data fusion

1. Introduction

Exotic conifers are the foundation of the plantation forest industry in many Southern Hemisphere countries, providing significant economic and social benefits. However, a number of exotic conifer species have become invasive and, under certain environmental settings, propagate beyond the plantation boundary and are now serious tree weeds [1,2,3]. In New Zealand, invasive conifers, often referred to as ‘wilding conifers’, are dominantly invading indigenous and semi-native grass and shrublands across large areas of the South and North Island [3,4]. Recent estimates place the total area affected in New Zealand at 1.7 M ha, with the rate of spread estimated at 5%–6% per annum [5]. Invasive conifers are estimated to occupy, with highly variable densities (from less than one tree per hectare to full cover), an area equivalent to the national plantation forest estate [5]. The costs from lost pasture land alone are estimated at between $88 and $221 million [6]. A range of chemical and physical control methods can be deployed to control invasive conifers [7], but these depend on successful detection of individuals and characterisation of the infestation level to be deployed cost effectively. Extensive closed canopy stands of mature invasive conifers are relatively easy to detect across the landscape, although smaller groups and isolated individuals of various sizes and particularly juvenile and stunted trees are more problematic. The control of early invasion stages, characterised by smaller, juvenile and more scattered trees, is the most cost-effective way to prevent further expansion of conifer infestations [4,8]. The detection of such trees is critical for invasive conifer management, even if the detection of individuals at such an early stage of invasion is a complex and laborious task [9,10]. 
Importantly, effective control in these early stages of invasion, supported by a high detection rate of mature trees, can reduce the cost and intensity of later management efforts on a site because widely-scattered seed producers are eliminated [8,11]. Current surveillance and monitoring methods detect invasive conifers, often across wide areas, through helicopter-based surveys by skilled observers [12], ground surveys in smaller areas or a combination of both [13], often with varying levels of detection success. These approaches are expensive, even when combined with simultaneous control operations (search and destroy), and rely heavily on the observers’ ability to correctly and rapidly identify individual invasive conifers of variable size occurring at various densities, often across multiple vegetation types.

Remote sensing has been widely used to detect and monitor invasive species across natural environments ranging from arid areas [14] and estuaries [15] to dense tropical rain forests [16] and coastal scrub communities [17]. The success of remote sensing techniques has been noted to vary according to the structural and phenological traits of the invasive species in relation to the invaded habitat [18]. Spectral and other properties from imagery alone can be used to successfully identify invasive species [18].

Airborne laser scanning (ALS) offers the appealing advantage of providing precise elevation and structural data from vegetation returns, which makes it well suited to the detection of taller isolated trees in otherwise short-stature vegetation types [19,20]. Combining height and structural information derived from LiDAR data with imagery has been shown to improve the identification of invasive species [21,22]. The literature on identifying invasive conifers in New Zealand’s unique environment using remote sensing techniques is not well developed, and we are unaware of efforts to use ALS for invasive conifer detection in this context. However, relevant examples of this approach may be found in efforts to detect and monitor shifts in the tree line in boreal ecotones affected by climate change [23,24]. Height information from high-density ALS (7.7 pls/m²) has been successfully used to identify isolated small-stature trees in tundra environments, with nearly 91% of trees >1 m successfully detected according to elevation of echoes above a digital terrain model (DTM) generated from the same data [19]. Detection of smaller trees was more difficult, with trees <1 m returning discernible positive elevation values 5%–73% of the time depending on species and DTM properties. The decrease in accuracy could be partly attributed to the inherent vertical error in ALS elevation values, the fact that smaller individuals are less likely to be sampled or generate sufficient return energy and an increase in terrain features, such as rocks with a height close to that of the target trees [19]. The use of additional ALS-derived variables, such as intensity and topographic features, has been proposed as a means of improving the classification of trees and returns from other sources, especially at lower height thresholds [24,25]. 
In one study, the inclusion of intensity and slope with echo elevation allowed tree and non-tree echoes to be distinguished with an accuracy of 93%, and detection was high for nearly all tested models that included elevation and intensity [25]. However, other studies have found intensity data to be of little use in identifying ground vegetation [26]. This was at least in part due to the uncalibrated and sensor-dependent nature of intensity information from ALS data, which can reduce the applicability and utility of these data across surveys [26]. Echo-based classification requires substantial computational power to evaluate datasets over large areas that invariably contain millions of records. Stumberg et al. [25] explored the use of an unsupervised classification approach using a raster-based algorithm to identify pixels containing trees. Detection of trees using this approach was generally good with detection rates for larger trees (>1 m) ranging from 36%–73%. Overall, larger trees had an increased probability of detection and resulted in reduced errors of commission.

Research on the effect of ALS pulse density would ideally use data acquired from campaigns with varying altitude, speed or pulse frequency. Repeated campaigns would capture both changes in final pulse density, as well as related impacts, such as changes arising from beam divergence or altered energy per pulse [27,28,29,30]. Unfortunately, this is prohibitively expensive in most cases, and so, researchers have developed methods to thin ALS data and simulate adjusted flight patterns. Numerous studies have investigated the effect of pulse density on forest properties estimated from ALS data [30,31,32,33]. These studies have encompassed a wide range of forest types and data thinning methods [29,34]. However, the impact of pulse density on the detection of small, pioneer trees has not been well studied through simulation or campaign selection. Næsset and Nelson [19] achieved good detection of isolated trees with ALS data containing 7.7 pls/m². At this density, nearly every tree over 1 m was sampled by at least one pulse. Indeed, several other studies have used similar pulse densities in the range of 6–8.5 pls/m² [20,25,35] to achieve good rates of detection for similarly-sized trees (>1 m). Detection of pioneer trees at the boundary of the Arctic-boreal regions has been achieved from surveys with densities as low as 0.25 pls/m²; however, in this case, the trees had a modal height of 6.6 m [23]. Næsset and Nelson [19] used data on detection success and tree crown diameter to estimate the size of crowns required for detection at different pulse densities. Their estimate suggested that at 1 pls/m², the crown diameter would need to be 2.8–3.3 m to ensure detection. Overall, work in the boreal ecotone has demonstrated that individual pioneer trees in relatively open areas can be accurately detected using LiDAR, potentially over very large areas. 
In general, trees smaller than 1 m are problematic to detect without increasing the rate of false detections and other sources of error [19,20,25].

Remote sensing-based approaches for detecting invasive trees that rely principally on the height of ALS echoes are likely to be well suited to many low-stature vegetation types that are susceptible to conifer infestations in New Zealand. Particularly in short-stature grasslands, problematic invasive conifers are able to establish more successfully than most native tree species [8], simplifying the task of detection and avoiding a high commission error. Other vegetation types highly susceptible to conifer invasions in New Zealand, such as indigenous shrublands or tall tussock grasslands [36] mixed with complex terrain, pose a much greater challenge to detection efforts.

Our study area contained numerous vegetation types and, in some areas, the dominant indigenous vegetation formed closed canopy stands up to 4 m in height. A successful method for classifying invasive conifers requires the capacity to differentiate between echoes from invasive conifers and other vegetation. The ALS data alone may not contain sufficient information for this purpose. Fortunately, the spectral properties of invasive conifers are quite different to those of the other surrounding tree vegetation in this environment. This means that spectral information from aerial imagery may provide a practical means of separating invasive conifers from other vegetation types. Previous studies, working at a coarser resolution, have found that fusing structural information from ALS data with spectral data from satellite imagery provides a useful method for vegetation classification [37,38].

Controlling the spread of invasive conifers is critical to protection of New Zealand’s natural heritage, threatened ecosystems and ecosystem services and maintaining the licence to operate for plantation forest managers in the face of increasing environmental scrutiny. There is a need to develop feasible and effective detection methods to understand and monitor the spread of invasive conifers and to guide management and control efforts of infested areas. In this study, we attempt to develop a method for invasive conifer detection using an extensive dataset that included ALS and spectral data collected from an area dominated by indigenous and semi-native grass and shrublands in New Zealand and a field dataset that sampled 825 solitary conifers of various sizes.

Using this dataset, the objectives of this research were to (i) compare the accuracy of detection models developed using various combinations of ALS data (elevation, intensity) and aerially-acquired spectral data and (ii) determine the sensitivity of classification accuracy in these models to the height threshold used for inclusion and the density of the ALS data.

2. Materials and Methods

2.1. Study Site

The study site was located in the vicinity of Geraldine forest in the Canterbury region in the South Island of New Zealand (Figure 1). Geraldine forest and the adjacent study site are positioned in the foothills of the Southern Alps. The topography is characterised by steep and broken terrain with elevations ranging from 203–780 m above sea level. Silty-loam soils dominate, and the climate is temperate with a mean annual temperature of 8.6 °C and an annual rainfall of 864 mm. The dominant production tree species planted at Geraldine forest are Pinus radiata D. Don (P. radiata) and Pseudotsuga menziesii (Mirb.) Franco (Ps. menz.). The study site is enclosed by plantation forest and therefore prone to high seed-rain from the adjacent plantations, resulting in a high presence of self-established conifers of various ages and sizes. The dominant land covers are short-tussock grasslands dominated by Festuca spp. and Poa spp., and patches of indigenous shrub dominated by manuka (Leptospermum scoparium) or ferns dominated by bracken (Pteridium esculentum (G.Forst.) Cockayne). One distinctive area of the study site was dominated by the invasive shrub gorse (Ulex europaeus L.).

2.2. Field Data

A field survey was carried out between 16 May and 29 June 2016 to assess the severity of the conifer infestation across the study area. In the first instance, a grid with a randomised start point and orientation was used to locate 46 field plots throughout the study area, providing a sample with good spatial coverage of the study area. Five of these plots were abandoned because the terrain was too steep for them to be safely measured, leaving a total of 41 systematically established plots. An additional 27 plots were selectively placed by the field crews in locations representing areas with light, moderate and dense cover of invasive conifers. This provided a dataset across all major vegetation types, enabling us to characterise their structural and spectral properties using the remotely-sensed data.

The sampling unit was a slope-corrected 0.04 ha circular bounded field plot with the plot centres fixed using a Trimble Geo7X GNSS (Trimble Navigation Ltd., Sunnyvale, CA, USA). The accuracy of the recorded plot centre positions was increased by differential correction using a local base station network maintained by Land Information New Zealand (LINZ). For all invasive conifers found within the field plots, the species, total height and diameter were recorded, with diameter measured at breast height (1.4 m) for trees and at ground level for saplings. Trees were defined as those individuals with a measurable diameter at breast height. The distance and bearing of each tree from the plot centre were also recorded.

In total, 825 invasive conifers were identified, located and measured within the 68 field plots. Ps. menz. was the dominant species, constituting 98.5% (813) of all invasive conifers. The mean height of this species was 1.72 m and ranged from 0.05–12.90 m. A small number of P. muricata and P. radiata were found in the plots, with heights averaging 1.99 and 2.00 m, respectively (Table 1).

2.3. ALS Data

An ALS survey was completed over the study area on 13 and 14 June 2016 using a Riegl Q1560 two-channel scanner system with the settings shown in Table 2. A laser pulse rate of 330 kHz and a maximum scan angle of 14° off nadir were used. Flight planning ensured substantial overlap across the entire area of interest to remove the possibility of data voids. During field work, 99 ground control test points were obtained and compared to interpolated elevation values from the ALS data. The results indicated a mean difference in elevation of −0.004 m (SD 0.017 m) and an RMS of 0.017 m. Initial ALS data processing, including tiling and classification, was carried out by the supplier using TerraScan (TerraSolid, Helsinki, Finland). Intensity values in the ALS data were delivered uncalibrated and ranged between 0 and 65,535 (median = 41,600).

2.4. ALS Data Thinning

Several methods for thinning ALS data are commonly employed depending on the objectives of the operation. Where the objective is to reduce data size, these methods focus on iterative removal of points while minimising the loss of accuracy on the target output, such as elevation [39]. Other methods randomly remove every n-th echo until a target density is achieved. However, this approach does not simulate a reduction in pulses per unit area as might occur with a change in acquisition settings because the regular scan pattern will not be replicated [40,41]. A custom algorithm was developed using Scientific Python [42] to better simulate changes in pulse density that may be expected from increasing flight altitude or reducing overlap in order to achieve higher spatial coverage at the cost of lower final pulse density. The algorithm removed all echoes originating from a pulse marked for removal, with the target density determining the regularity of pulse removal. The thinned datasets retained some of the inevitable variation in pulse density contained in the original survey while approximating the target mean pulse density.

The selected target pulse densities of 10, 5, 2 and 1 pls/m² represented a compromise between achieving regularly-spaced intervals and incorporating common minimum pulse densities specified for ALS surveys in New Zealand. The data thinning algorithm successfully produced datasets with average realised properties that were consistent with the target pulse densities (Table 3).
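As an illustration only, the pulse-based thinning described above can be sketched in Python as follows. This is not the custom algorithm used in the study. The record layout (a `pulse_id` key on each echo), the function name and the assumption that pulse identifiers follow emission order are all hypothetical choices made for this sketch.

```python
def thin_by_pulse(echoes, current_density, target_density):
    """Drop whole pulses at a regular interval to approximate a target density.

    echoes: list of dicts, each with a hypothetical "pulse_id" key shared by
    all echoes backscattered from the same emitted pulse.
    Densities are mean values in pulses per square metre.
    """
    if not 0 < target_density <= current_density:
        raise ValueError("target density must be positive and not exceed the current density")
    keep_fraction = target_density / current_density

    # Assumption: sorting pulse ids reproduces the order of emission,
    # so regular removal mimics a sparser scan pattern.
    pulse_ids = sorted({e["pulse_id"] for e in echoes})

    kept = set()
    for i, pid in enumerate(pulse_ids):
        # Keep a pulse each time the running quota crosses an integer,
        # which spaces the retained pulses evenly along the sequence.
        if int((i + 1) * keep_fraction) > int(i * keep_fraction):
            kept.add(pid)

    # All echoes of a removed pulse are removed together.
    return [e for e in echoes if e["pulse_id"] in kept]
```

Removing whole pulses, rather than individual echoes, keeps first and later returns of a retained pulse together, which is the property the text identifies as necessary for simulating a change in acquisition settings.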

2.5. Aerial Imagery

Aerial photography was captured over Geraldine forest and surrounding areas on 17 March 2016 using a Vexcel digital UltraCamEagle (UCE) camera with specifications shown in Table 4. Imagery was captured with a ground sample distance (GSD) of 0.30 m from a flying height of 5770 m. All imagery was free from cloud and cloud shadow and had a minimum sun angle of +35°. Image processing was carried out by the supplier and included ortho-rectification, map projection, mosaicking and the removal of atmospheric and topographical effects. Imagery was processed to Level 3 and manually checked for colour correctness and even tonal balance across the project area.

2.6. Echo Classification

Echo elevation values were converted to local normalised elevation values using a DTM triangulated from the ground classified echoes. These data were used to produce a pit-free canopy height model (CHM) [34] for the area of interest with a 0.3 m resolution. This resolution was selected because it approximates the footprint size of the laser beam in this study. Echo classification was evaluated as a point cloud processing method to investigate the detection of invasive conifers in the study area. To form a training dataset for echo classification, the location of each invasive conifer within the field plots was used to estimate the two-dimensional canopy area associated with each tree. Tree locations and the high-resolution CHM were used to train an algorithm based on the rLiDAR package [43]. The CHM was loaded into R and used to delineate the approximate canopy areas using the field measured tree locations and the ForestCAS function of rLiDAR. ForestCAS includes a user-defined parameter that sets the percentage threshold of subject tree height at which pixels are excluded from a tree canopy. Initially, the default (0.3) for this parameter was used, and this was adjusted, where required, in an exploratory manner following visual inspection of the resultant canopy polygons, CHM and imagery until the results were deemed to be accurate. Echoes within the final canopy polygons were classified as invasive conifer echoes; those outside the invasive conifer canopy area, but within the field plots, were classified as non-invasive conifer echoes.

Using echo classification, we sought to develop a method for classifying individual returns from the ALS point cloud into those that were backscattered from an invasive conifer and those that were not. Developing a classification of this type may offer an efficient means of mapping invasive conifers across the landscape using ALS. The elevation of the return (Z) above the ground logically provides a useful variable for the classification of returns from trees and shrubby vegetation, as these have a value considerably greater than zero. However, the height of the return contains no useful information on the other properties of the target object. The intensity of the echo has previously been used to improve the classification of vegetation type and is more useful following range calibration [26]. Previous research has indicated that backscatter intensity is useful for differentiating between echoes originating in shrubby vegetation and those from other sources [44]. This suggests that, at the least, intensity can be useful for differentiating between vegetation and non-vegetation objects. Consequently, investigating the potential of backscatter intensity for classification of invasive conifer echoes is worthwhile. In this research, we examined the utility of uncalibrated ALS intensity values for improving echo classification.

We extracted spectral data from the high-resolution aerial imagery of the study area and combined these data with the ALS echoes. The data processing chain for the ALS data could not easily accommodate a fourth spectral band, and manual inclusion resulted in unwieldy computation times; therefore, we chose to drop the blue band. This was motivated by the fact that this band is sensitive to interference [45] and appeared to be less influential in previous work on spectral-based invasive species detection [18]. Each echo received the near-infrared, red and green values of the spatially co-incident pixel from the orthophotographs. Our method relied only on the simple assignment of spectral data to ALS points. This approach lacks the ability to account for the geometric effects that prevent image pixels from being reliably tied to spatially coincident ALS returns [46,47]. However, the imagery available was captured for the purpose of creating orthomosaics, and the high degree of overlap and knowledge of sensor geometry required by more sophisticated approaches was not available to us [46]. Nonetheless, as much of the vegetation was fairly low and the accuracy of assignment would also be limited by pixel size, we judged the loss of precision to be acceptable.
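The simple assignment of spectral values to echoes can be illustrated with a short Python sketch. It assumes a north-up orthophoto whose origin is its upper-left corner and whose bands are held as nested lists. The function names, argument names and data layout are hypothetical and are not taken from the processing chain used in the study.

```python
def pixel_index(x, y, x_min, y_max, gsd):
    """Return (row, col) of the pixel containing map coordinate (x, y).

    Assumes a north-up raster with upper-left corner (x_min, y_max)
    and square pixels of size gsd (metres).
    """
    col = int((x - x_min) // gsd)
    row = int((y_max - y) // gsd)
    return row, col


def colour_echoes(echoes, bands, x_min, y_max, gsd=0.30):
    """Attach the co-incident pixel value of each band to each echo.

    echoes: list of (x, y, z, intensity) tuples.
    bands: dict mapping a band name (e.g. "nir") to a 2D list of values.
    Echoes falling outside the raster are skipped.
    """
    first_band = next(iter(bands.values()))
    n_rows, n_cols = len(first_band), len(first_band[0])

    coloured = []
    for x, y, z, intensity in echoes:
        row, col = pixel_index(x, y, x_min, y_max, gsd)
        if not (0 <= row < n_rows and 0 <= col < n_cols):
            continue  # no spatially coincident pixel
        record = {"x": x, "y": y, "z": z, "intensity": intensity}
        for name, grid in bands.items():
            record[name] = grid[row][col]
        coloured.append(record)
    return coloured
```

Because the lookup uses only the horizontal position of the echo, it shares the limitation noted above: it cannot correct for relief displacement or other geometric effects in the imagery.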

The coloured ALS point cloud was also used to summarise the spectral and structural properties of invasive conifers and the other major vegetation types in the study area (Figure 1). Echoes originating from invasive conifers were used to characterise their properties. Each study plot was classified according to its dominant vegetation type as either grassland, shrubs, ferns or manuka. Echoes originating from invasive conifers were excluded, and the remaining echoes within each plot boundary were used to characterise the properties of the dominant vegetation types in that plot.

2.7. Random Forest

Random forest (RF) is an ensemble decision tree classifier that uses bootstrap aggregated sampling (bagging) to construct many individual decision trees, from which a final class assignment is determined [48]. RF is increasingly being applied to natural resource problems [49] and has previously been used to successfully model several plantation forest variables using remotely-sensed data [50,51,52,53]. The RF algorithm constructs each decision tree using a bootstrap sample from the available training data, with the remaining observations assigned as out-of-bag (OOB) samples. At each node, a random subset of predictor variables is tested to partition the observation data into increasingly homogeneous subsets. The node-splitting variable selected from the variable subset is the one that gives the greatest increase in data purity (measured by variance or Gini impurity) across the node split [54]. This process ceases when there are no further gains in purity. Response variables can be continuous, in which case predictions are calculated by averaging across all decision trees, or categorical, in which case predictions are derived by a vote amongst all decision trees. The computational load of the algorithm is reduced, as only a subset of variables is used at each node split. This process also reduces the correlation between trees, improving both predictive power and classification accuracy. The OOB sample data are used to compute accuracies and error rates, averaged over all predictions, and to estimate variable importance [49,54]. RF provides two methods to estimate the importance of each predictor variable in the model. The mean decrease in accuracy (MDA) importance measure is calculated as the normalised difference between the OOB accuracy obtained with the original observations and that obtained after randomly permuting the values of a variable [49,54]. An alternative variable importance measure is calculated by summing all of the decreases in Gini impurity at each tree node split, normalised by the number of trees [49,55]. 
RF is a well-regarded machine learning tool that has the capacity to identify complex and non-linear relationships in the fitting dataset and offers high classification accuracy [54,55].
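To make the node-splitting criterion concrete, the following Python sketch computes the Gini impurity of a set of class labels and the decrease in impurity obtained by splitting a single predictor at a threshold. It is a minimal illustration of the criterion only, not of a full RF implementation, and the function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def best_split(values, labels):
    """Return the midpoint threshold with the largest impurity decrease, or None."""
    unique = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(unique, unique[1:])]
    if not midpoints:
        return None  # a constant predictor cannot split the node
    return max(midpoints, key=lambda t: gini_decrease(values, labels, t))
```

In an RF, this search is repeated at every node over the random subset of predictors, and the decreases recorded for each variable are what the Gini-based importance measure accumulates.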

RF categorical classification models were developed using the implementation of the RF algorithm available through the Ranger package [56] in R. This approach was chosen as it offered high performance classification and straightforward parallelisation. Computing performance was important due to the large size of the training datasets. The classification training dataset included the invasive conifer key, computed from the field data, as the response variable and LiDAR elevation, intensity, near-infrared, green and red DN values as candidate variables.

RF models were initially fitted for 15 combinations of the predictor variables (Table 5) at an unthinned pulse density and with a height threshold of 0 m to examine the relative importance of ALS and spectral data for classifying invasive conifer echoes. These models ranged from a single variable model including only ALS elevation to a five variable model including all available spectral and ALS metrics. The effect of pulse density on classification accuracy was examined by fitting all 15 RF models with each of the four thinned datasets (Table 3), including all echoes regardless of their elevation. The effect of height threshold on classification accuracy was examined by varying the threshold below which the points were excluded from the training dataset between 0 m and 2 m, at intervals of 0.5 m. Using the unthinned ALS dataset, the 15 RF models were refitted at each height threshold.

2.8. Accuracy Assessment

Classification performance for each model was assessed using Cohen’s kappa coefficient [57] (kappa), based on both a leave-one-out cross-validation (LOOCV) from the RF classification models and a leave-one-plot-out cross-validation (LOPOCV). LOPOCV was implemented using a custom R function that sequentially excluded all echoes associated with a single reference plot and used all remaining echoes to train an RF model to predict classification values for the excluded plot. This approach provided a completely independent validation dataset for assessing predictive accuracy. LOPOCV provides a much more conservative estimate of predictive accuracy than the other statistics calculated but is more indicative of model performance when applied to an independent dataset. Due to the high computational cost of calculating LOPOCV, this statistic was only calculated for the best performing model. In our approach, LOOCV was used to compare the relative accuracy of the models developed, and LOPOCV provided a measure of model accuracy and transferability to independent data that is more reflective of an operational deployment of this technique.
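The LOPOCV procedure can be sketched generically in Python. The study used a custom R function with an RF model. The sketch below instead accepts any `fit` and `predict` functions and, purely as a stand-in so that the example runs without external libraries, supplies a nearest-centroid classifier. All names and the record layout are hypothetical.

```python
def leave_one_plot_out(records, fit, predict):
    """Hold out each plot in turn, train on the rest, predict the held-out echoes.

    records: list of (plot_id, features, label) tuples, one per echo.
    Returns a list of (actual_label, predicted_label) pairs.
    """
    results = []
    for held_out in sorted({plot_id for plot_id, _, _ in records}):
        train = [(f, y) for p, f, y in records if p != held_out]
        test = [(f, y) for p, f, y in records if p == held_out]
        model = fit(train)  # the model never sees echoes from the held-out plot
        results.extend((y, predict(model, f)) for f, y in test)
    return results


# Stand-in classifier (NOT a random forest): nearest class centroid.
def fit_centroids(train):
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(model, features):
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))
```

Grouping the folds by plot, rather than by echo, is what makes the held-out data independent: echoes from the same tree or plot never appear in both the training and validation sets.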

Kappa is a widely-used metric for assessing the agreement between two sets of observations. Kappa was calculated using the ‘psych’ R package [58], and unweighted kappa values were reported. The kappa statistic is generally deemed to be robust because it accounts for agreements occurring through chance alone. Several authors propose that the agreement expressed through kappa, which typically varies between 0 and 1, can be broadly classified as slight (0–0.20), fair (0.21–0.40), moderate (0.41–0.60) and substantial (0.61–1) [38,59]. Confidence intervals for kappa values were calculated using the methods proposed by Fleiss et al. [60] available through the ‘psych’ R package. Receiver operating characteristic (ROC) curves were also used to examine the accuracy of the classification. ROC curves are graphical representations of the accuracy of binary classifiers. The true positive rate (sensitivity) is plotted on the y-axis, and the false positive rate forms the x-axis. The ROC curve is plotted by calculating the cumulative distribution function on both of these axes, with a diagonal reference line plotted to indicate where classification is no better than chance. The area under the curve (AUC) can be calculated from ROC curves and is used to quantify classification quality. AUC values for ROC curves vary between 0.5, indicating classification no better than chance, and 1, indicating a perfect binary classification. ROC curves were plotted, and AUC was calculated, using the pROC R package [61].
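For readers who wish to check the two accuracy statistics by hand, a minimal Python sketch is given below. It is not the ‘psych’ or pROC implementation: unweighted kappa is computed directly from its definition, (p_o − p_e)/(1 − p_e), and AUC is computed through the equivalent rank formulation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case). Function names are hypothetical, and class 1 is assumed to denote the positive (invasive conifer) class.

```python
def cohen_kappa(actual, predicted):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    n = len(actual)
    classes = set(actual) | set(predicted)
    # Observed agreement: proportion of identical labels.
    p_o = sum(a == p for a, p in zip(actual, predicted)) / n
    # Chance agreement: product of marginal proportions, summed over classes.
    p_e = sum((list(actual).count(c) / n) * (list(predicted).count(c) / n)
              for c in classes)
    if p_e == 1.0:
        return float("nan")  # undefined when only one class is present
    return (p_o - p_e) / (1.0 - p_e)


def auc(actual, scores):
    """Area under the ROC curve via pairwise comparison; ties count one half."""
    positives = [s for a, s in zip(actual, scores) if a == 1]
    negatives = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))
```

As a worked example, with ten echoes of which three are conifers, a prediction that gets eight echoes right with three predicted conifers gives p_o = 0.8 and p_e = 0.3 × 0.3 + 0.7 × 0.7 = 0.58, so kappa = 0.22/0.42 ≈ 0.52, which falls in the moderate band.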

In addition to the misclassification error, there will also be invasive conifers that were not sampled by the ALS campaign. In this case, there is no chance that these trees will be correctly classified as they will not be included in the sample population. The number of invasive conifer polygons containing no returns was used as a measure of the number of trees that would be omitted through this ‘out-of-sample error’ for each pulse density.
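Counting the canopy polygons that contain no returns can be sketched with a standard ray-casting point-in-polygon test. The polygon representation (a list of vertex coordinates) and the function names are assumptions made for this example. The study itself derived canopy polygons with rLiDAR in R.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon: list of (x, y) vertices in order; the last vertex is
    implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_unsampled(polygons, echoes):
    """Number of canopy polygons that contain no echo at all.

    echoes: list of (x, y) positions of returns at a given pulse density.
    """
    return sum(
        1 for polygon in polygons
        if not any(point_in_polygon(x, y, polygon) for x, y in echoes)
    )
```

Running `count_unsampled` on the echoes of each thinned dataset in turn would give the out-of-sample count for that pulse density.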

3. Results

3.1. Spectral and Structural Properties

The height profiles of echoes from invasive conifers and from plots containing manuka were superficially similar (Figure 2a), as both vegetation types formed continuous tree cover in some plots. Plots dominated by ferns, grassland and other shrubs had considerably different structural properties from invasive conifers. In the near-infrared band, echoes originating from the different vegetation types were quite different. This is particularly evident for plots dominated by manuka (Figure 2c). The differences in the green and red bands (Figure 2b,d) were marginally less distinct between echoes originating in invasive conifers and those originating in plots dominated by other vegetation types. The data summarised in Figure 2 suggest that combining elevation data from ALS with spectral data should provide a means of accurately classifying echoes originating in invasive conifers. The intensity values of echoes originating in invasive conifers spanned the entire range of intensity values (range = 0–65,535) in the study area and had a median intensity value (32,055) close to the study area median (41,600). This suggests that the uncalibrated intensity values used in this study would likely have little value in classification models for invasive conifers as they would overlap the values from all other vegetation types.

3.2. Classification Accuracy

The most accurate classification model (Model 1) included covariates from ALS data and data from all spectral bands (Table 5). Model 1 displayed substantial agreement between predicted and actual echo classification (kappa = 0.837, AUC = 0.885). A comparison of the four-variable models showed that Model 1 was fairly insensitive to removal of the green band (Model 2: kappa = 0.785, AUC = 0.856), red band (Model 4: kappa = 0.781, AUC = 0.854) or ALS intensity data (Model 5: kappa = 0.773, AUC = 0.849), but was more sensitive to removal of the near-infrared band (Model 3: kappa = 0.744, AUC = 0.828). Compared to four-variable models, there was a marked decline in model accuracy for models with three variables (Table 5). Three-variable models that included ALS elevation, near-infrared and either intensity (Model 6) or another spectral band (Models 7 and 8) were far more accurate than three-variable models with only spectral bands or a combination of ALS elevation and spectral bands other than near-infrared (Models 9, 10 and 13). Model 10 contained spectral information only and was substantially less accurate than models that contained ALS elevation and two spectral bands (Models 7, 8 and 9), but performed better than models that included only elevation and a single spectral band (Models 13 and 14), unless the single band was near-infrared (Model 12). Of the two-variable models with ALS elevation data and a single spectral band, Model 12, which included the near-infrared band, was most accurate (kappa = 0.355, AUC = 0.597). By comparison, models fitted using ALS elevation data and either intensity (Model 11: kappa = 0.292, AUC = 0.597), the red band (Model 13: kappa = 0.224, AUC = 0.571) or the green band (Model 14: kappa = 0.221, AUC = 0.569) were considerably less accurate. Models developed using ALS elevation data alone displayed minimal classification accuracy (Model 15: kappa = 0.101, AUC = 0.529).

The ROC curves for all models were plotted and examined to determine the performance of the binary classification. This analysis highlighted a significant discrepancy between the best and worst performing models (Figure 3). The most accurate models displayed significantly more area under the ROC curve, indicating a far greater true positive and lower false positive classification rate. The worst performing models (e.g., Model 15 in Figure 3) were considerably closer to the diagonal line, indicating that their classification accuracy was closer to that expected through chance. The variable importance scores for the best random forest model (Model 1) were calculated through permutation. This analysis indicated that the near-infrared band (importance = 0.147) was the most important predictive variable, followed by the green band (importance = 0.143). ALS elevation values and data from the red band were slightly less important, and ALS intensity data were considerably less useful (importance = 0.06).
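
The idea behind permutation-based importance can be sketched as follows: shuffle one predictor at a time and record the resulting drop in accuracy. This Python sketch is a generic, model-agnostic version evaluated on a supplied dataset. It is not the random forest routine used in the study, whose importance scores are typically derived from out-of-bag samples, and the names `accuracy` and `permutation_importance` are assumptions made for illustration.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows for which predict(row) equals the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when each predictor column is shuffled.

    rows is a list of tuples (one value per predictor); predict maps a
    row to a class label. Returns one importance value per column.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only column j replaced by its shuffled value.
            permuted = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A predictor that the classifier never uses receives an importance of exactly zero under this scheme, which is the sense in which a low score indicates a less useful variable.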

3.3. Pulse Density and Height Threshold

Overall, pulse density did not have a significantly detrimental impact on classification accuracy for the best models (Figure 4). The more accurate four and five variable models (Models 1–5) showed little change in kappa with pulse density. The most accurate model (Model 1) was the only model where kappa increased slightly as the pulse density increased. All of the remaining models (Models 6–15) showed a decrease in kappa as the pulse density increased to 5 pls/m², which then stabilised and plateaued, or decreased marginally, at higher pulse densities.

Classification accuracy increased with the height threshold for all models (Figure 5). Across all models, mean kappa increased from 0.485 at a 0 m threshold to 0.713 at a 2 m threshold. The increase in performance was particularly marked for the models that offered moderate classification accuracy, with fewer than four variables. Model 10 showed the greatest gain in kappa, increasing from 0.308 at a 0 m threshold to 0.887 at 2 m. Although Model 10 did not include any ALS predictors, this model was among the best performing models at the 2 m height threshold. Model 5, a model that excluded ALS intensity values, became the most accurate classifier once the height threshold was increased above zero. The most accurate classifier at a 0 m threshold (Model 1) displayed modest improvement as the height threshold was increased. At a 0 m threshold, the classification accuracy for Model 1 was already very high (kappa = 0.837), and this increased steadily to provide an exceptionally accurate classifier at a threshold of 2 m (kappa = 0.914). It is noteworthy that the shapes of the curves in Figure 4 and Figure 5 are influenced by the number of objects sampled by the ALS data under each scenario. The influence of ALS pulse density and height threshold on classification accuracy must be interpreted with reference to the changes in out-of-sample errors.

3.4. Leave One Plot Out Cross-Validation

LOPOCV was implemented to test the applicability of the best performing model (Model 1) to independent data. Across all independent observations, the kappa value for the LOPOCV was 0.284 (confidence interval = 0.279–0.289). Using the categories previously proposed [59], this suggests a fair agreement between observed and predicted values. The overall AUC value for the LOPOCV analysis was 0.605 (confidence interval = 0.601–0.607), considerably lower than the AUC values acquired using LOOCV. The outputs of the best performing model (Model 1) confirm that model performance varied across the range of vegetation types in the study (Figure 6). Invasive conifers are accurately classified in many instances (e.g., Figure 6a), although some misclassified echoes are also present. Areas with alternative vegetation composition contain some misclassified echoes (e.g., Figure 6b (ferns)), but in other examples, classification is considerably more accurate (Figure 6c,d).
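
The splitting logic behind a leave-one-plot-out scheme can be sketched in a few lines of Python. Each fold holds out every echo from one field plot and trains on the echoes from all remaining plots, so the test echoes never share a plot with the training echoes. The function name `leave_one_plot_out` and the plot identifiers are hypothetical and used only for illustration.

```python
def leave_one_plot_out(plot_ids):
    """Yield (held_out_plot, train_indices, test_indices) for each plot.

    plot_ids gives, for every echo, the identifier of the plot it came from.
    """
    for held_out in sorted(set(plot_ids)):
        test_idx = [i for i, p in enumerate(plot_ids) if p == held_out]
        train_idx = [i for i, p in enumerate(plot_ids) if p != held_out]
        yield held_out, train_idx, test_idx
```

Pooling the predictions from all folds before computing kappa and AUC gives a single accuracy estimate across all independent observations.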

3.5. Out-Of-Sample Errors

A linear model was used to examine the relationship between the proportion of missed trees per plot, ALS pulse density and the estimated canopy area of the missed individuals. A logit transformation of the proportion of missed trees was used as the response variable. This model (R² = 0.37, AIC = 971) indicated that ALS pulse density, canopy area and the interaction between these two explanatory variables all had a significant effect on the proportion of individuals missed by the ALS sampling at the 0.05 significance level. The number of invasive conifers that were missed decreased as the pulse density increased, from an average of 6.2 trees per plot at 1 pls/m² to 0.6 trees per plot at 21 pls/m². The proportion of trees accounted for increased from 76% at 1 pls/m² to 96.4% at 21 pls/m². This result has important implications for guiding suitable campaign settings for invasive conifer detection. Below 5 pls/m², considerable numbers of invasive conifers start to be omitted from the ALS sampled area (Figure 7). It is also clear that at lower pulse densities a greater number of larger trees are omitted from the ALS sample than at higher pulse densities (Figure 7).
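
The general form of the linear model described above can be written out as follows. This is a sketch of the structure only: the symbols $p_i$ (proportion of missed trees for observation $i$), $D_i$ (ALS pulse density), $A_i$ (estimated canopy area) and the coefficients $\beta_0, \dots, \beta_3$ are notation assumed here for illustration. The text does not state how proportions of exactly 0 or 1, for which the logit is undefined, were handled.

```latex
\operatorname{logit}(p_i) = \ln\!\left(\frac{p_i}{1 - p_i}\right)
  = \beta_0 + \beta_1 D_i + \beta_2 A_i + \beta_3 D_i A_i + \varepsilon_i
```

Here $\beta_3$ captures the interaction between pulse density and canopy area, and $\varepsilon_i$ is the residual error term.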

4. Discussion

The principal finding of this study is that using data from ALS in combination with spectral data from aerial imagery provides a fairly accurate means of classifying LiDAR echoes from invasive conifers in an invasion-prone vegetation type. The study results indicate that this approach may offer a promising method for detecting invasive conifers that invade relatively complex terrain with a vegetation structure composed of short tussock grassland intermixed with shrub species.

The ALS elevation and intensity data and the spectral data from aerial imagery were of limited value for detecting returns from invasive conifers in their own right. Fusion of the two data types improved echo classification. Related approaches relying on ALS elevation data are often concerned only with identifying all tree species in the landscape [19,20,26]. Numerous studies have fused ALS data with spectral data to improve classification [38,62]. However, these approaches often rely on pixel-based methods to classify different species [63,64] or the aggregation of remote sensing data sources to larger scales, e.g., individual tree crowns [62]. To the best of our knowledge, this is the first time that spectral data have instead been added to individual ALS echo data to assist with the identification of invasive conifers. This method appears to improve classification accuracy. Although we did not test alternative methods, the removal of the aggregation step may offer some advantages over approaches relying on the aggregation of LiDAR data into surfaces. Of the spectral bands available, the near-infrared band was the most important for detecting invasive conifers in combination with ALS data.
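
Attaching spectral values to individual echoes amounts to a pixel lookup for each echo position. The Python below is a minimal sketch under stated assumptions: a north-up image held as a nested list `raster[row][col]` of (NIR, red, green) tuples, with a known upper-left corner (`x_min`, `y_max`) and a square pixel size. The function name, the band order and the data layout are all hypothetical and do not describe the processing chain used in the study.

```python
import math


def attach_spectra(echoes, raster, x_min, y_max, pixel_size):
    """Append the band values of the underlying pixel to each echo.

    echoes: iterable of (x, y, z) tuples in the same coordinate system
            as the image.
    raster: raster[row][col] -> tuple of band values; row 0 is the
            northern edge of the image.
    Echoes falling outside the image are dropped.
    """
    n_rows, n_cols = len(raster), len(raster[0])
    attributed = []
    for x, y, z in echoes:
        col = math.floor((x - x_min) / pixel_size)
        row = math.floor((y_max - y) / pixel_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            attributed.append((x, y, z) + tuple(raster[row][col]))
    return attributed
```

Because every echo within a pixel receives the same spectral values regardless of its height, a ground echo beneath a crown would inherit canopy spectra under this scheme.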

Pulse density had a negligible effect on classification accuracy for the five most accurate models, but the number of missed invasive conifers was highly dependent on the pulse density, with lower densities increasing the probability of omission. Raising the height threshold for the inclusion of echoes in the training dataset was found to have little impact on four- and five-variable models, but did markedly improve the classification accuracy of the remaining models. Our approach is reliant on very accurate alignment between both datasets used and will be limited by the coarsest resolution data available. In our study, the coarsest remotely-sensed data used were the aerial imagery (0.3 m GSD), and it is possible that finer resolution imagery would further improve classification accuracy. As pixel values represent an aggregation of spectral values, the attribution to a return could be erroneous (e.g., ground echoes could be assigned canopy spectra values). This could well be the causal mechanism for improved classification accuracy at higher height thresholds. However, the resolution of this dataset is relatively fine, and lower cost alternatives, such as satellite imagery, would be unlikely to provide a useful classification for any but the most severe conifer infestations or larger scattered conifers. Satellite imagery with very high spatial and spectral resolution could likely provide results comparable to those from the aerial imagery employed in this research. Further research should be able to provide insight into the optimal spectral and spatial resolutions for invasive conifer detection across a wider range of vegetation types. Emerging sensor technologies will inevitably offer improvements in this area. Photogrammetric point clouds derived using structure from motion could provide a functional, and potentially less expensive, alternative to the combination used in this research. However, our imagery contained insufficient overlap to test this thoroughly.

In this study, field data, comprising a characterisation of the present vegetation, as well as the presence and tree metrics of invasive conifers, were collected across the study area covering the dominant vegetation types present. The objective was to ensure that remotely-sensed data were available for all of the common vegetation types in an area vulnerable to conifer infestation. Consequently, classifiers developed from this field dataset should be more robust to false positive classification of echoes from other trees and shrubs in the study area than if the sampling design had focussed solely on the invasive conifer infestation and had not encompassed such a wide range of vegetation types. The size of the ground sample of invasive conifers in this study (n = 825) is equivalent to or larger than that of other remote sensing studies aimed at detecting pioneer trees in the boreal-alpine transition zone [19,20,26]. Unlike previous research, we did not record crown width measurements in the field survey plots, which saved significant data collection effort. Instead, a novel technique was applied where the crown area of invasive conifers was extracted from the CHM using a semi-automated technique, and this was used to provide a binary classification key for the training dataset. Following some manual corrections, this approach appeared to offer a successful alternative to field measurement. However, we were not able to empirically determine the quality of crown area estimates with this dataset; this would be a valuable topic for further research. The apparent success of the canopy delineation approach was probably due, in part, to the exceptionally high density of the original ALS dataset, which provided a high quality, fine resolution CHM with minimal distance between subsequent returns.

The ALS data thinning algorithm employed was based on systematically removing all echoes associated with a pulse in a manner that simulated increased pulse spacing on the ground. We believe this technique provided a reasonable approximation of the effect of reduced pulse density in the ALS data. However, recently-proposed, sophisticated methods for simulation of LiDAR campaign effects on pulse density may offer more realistic results [29]. The complexity of these methods and the computation time associated with such a high-density base dataset motivated our selection of a simpler pulse-based thinning algorithm. Regardless, no data thinning technique currently available can fully simulate the increased laser footprint size (further exaggerated on steep slopes), the effect of the increased thickness of the atmosphere experienced when flying at a greater altitude [65] or more oblique scanning angles that may be associated with different campaign settings [29,66]. Sophisticated simulations based on ray tracing [67] offer the best opportunity for those seeking a more complete understanding of the influence of campaign settings on the data generated.
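
The principle of pulse-based thinning, in which a discarded pulse takes all of its echoes with it, can be sketched in Python as below. This is a deliberately simplified illustration and not the algorithm developed for the study: it assumes each echo carries a `pulse_id` field and simply retains every k-th pulse in order of first appearance, whereas the study's method was designed to simulate increased pulse spacing on the ground. The function name and the dictionary layout are hypothetical.

```python
def thin_by_pulse(echoes, keep_every):
    """Keep all echoes of every keep_every-th pulse; drop the rest.

    echoes: list of dicts, each with a 'pulse_id' key identifying the
            emitted pulse that produced the echo.
    """
    if keep_every < 1:
        raise ValueError("keep_every must be a positive integer")
    # Record pulses in the order in which they first appear.
    pulse_order = []
    seen = set()
    for echo in echoes:
        pid = echo["pulse_id"]
        if pid not in seen:
            seen.add(pid)
            pulse_order.append(pid)
    kept = set(pulse_order[::keep_every])
    # Echoes are never split: a pulse is either fully kept or fully removed.
    return [echo for echo in echoes if echo["pulse_id"] in kept]
```

Thinning by pulse rather than by individual echo preserves the relationship between first and subsequent returns, which thinning echoes independently would break.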

The most accurate RF classification models developed in this study provided highly accurate echo classification that could be useful for invasive conifer detection. Using the entire dataset, the most accurate classification model included information from ALS data and from all available spectral bands. The kappa value for this model (0.837) was substantially higher than the kappa value (0.594) of the best reported model from a similar echo classification study in Norway [20]. It is reasonable to speculate that this improved performance may be due to the inclusion of high-resolution spectral data, although the properties of the ALS dataset and intensity of the field survey may also be contributing factors.

Removal of the ALS intensity value from the classification model led to only a negligible decrease in model performance, and the RF importance scores indicated that these data contributed the least to successful classification. This is consistent with other recent research that suggested that the utility of LiDAR intensity data for classifications of trees and non-trees is ‘far from significant’ [26]. However, it is possible that intensity values may be useful in situations where there is extensive inorganic material (e.g., rocky terrain) with relatively distinct backscatter characteristics. In this study, we found that a successful classifier was characterised by the ability to differentiate between vegetation and non-vegetation echoes, as well as between vegetation types. For this reason, models based only on ALS elevation data or spectral data were of little use. Models that combined ALS data with all available spectral bands were the most accurate classifiers. However, of the spectral bands tested, the near-infrared was the most valuable for classifying invasive conifers. It is well established that coniferous species show distinctive optical characteristics [68]. In comparison to other species groups, conifers are characterised by higher levels of absorption across the visible wavelengths [68] with especially high levels of absorption observed in the near-infrared portion of the spectrum [69,70]. It is noteworthy that the differences in the visible spectrum are related to changes in leaf chemistry, while differences in the near-infrared portion of the spectrum are primarily related to differences in leaf structure [68,69]. We deem it likely that these properties can partly explain the importance of all bands, as well as the increased importance of the near-infrared data in discerning coniferous species from other vegetation types. This has important practical implications for data collection campaigns aimed at detecting the spread of invasive conifers. Future research should seek to further capitalise on the distinctive spectral properties of conifers in these landscapes by investigating other wavelengths and the possibility of using vegetation indices in conjunction with ALS elevation data.

ALS pulse density was investigated in this research and, as expected, was found to have very little impact on classification accuracy. However, this result should not be interpreted as an indication that low-density ALS data are as useful for invasive conifer detection as high-density data. In addition to the echo classification error, a further source of error is the omission of invasive conifers that are not scanned due to the campaign settings or occlusion from surrounding terrain or vegetation. Further research is required to investigate the effect of pulse density on the probability of detecting invasive conifers of various sizes using remotely-sensed datasets of varying resolution.

We varied the height threshold used for inclusion in the classification dataset and investigated the effect of height threshold on classification accuracy. This analysis showed that the most accurate models were relatively insensitive to increases in the height threshold. This suggests that these models provide an accurate classification right down to the ground level, and so, the echo classification approach has the potential to correctly identify smaller trees. Interestingly, the model that included all predictors except ALS intensity data (Table 4, Model 5) was more accurate than Model 1, which included ALS intensity data, at height thresholds exceeding 1 m. This could be because at heights below 1 m, the echo classifier needs to be able to differentiate between invasive conifers and inorganic material (e.g., rocks), and ALS intensity may provide useful information within this height range. However, at higher height thresholds, this is no longer an issue, and so, the ALS intensity data do not contribute to classification accuracy. Model 10 was fitted with spectral data only and was the most sensitive to the height threshold used. When the height threshold was 0 m, the accuracy of the model was only fair, but the accuracy rate increased rapidly with the height threshold, and by 2 m, this was amongst the best performing models. This is most likely caused by the fact that above 2 m, all echoes will be from trees or large shrubby vegetation, and the spectral values have significant power to differentiate between invasive conifers and other vegetation types. This suggests that aerial imagery has utility for detecting larger invasive trees, but if detection prior to maturity is required, then ALS data, in combination with spectral values, have greater capacity for detection.

The characteristics of the ALS survey and the size of the target have a significant effect on the probability of invasive conifers being included in the campaign out-of-sample error. This result has important practical implications for guiding suitable campaign settings for invasive conifer detection. Below 5 pls/m², considerably more invasive conifers were missed, and those missed individuals were also considerably larger. Detecting and eradicating invasive conifers before they start to produce cones is vital to controlling conifer spread and minimising their ecological impact. As a result, based on the evidence presented in this paper, it is suggested that a pulse density of 5 pls/m² or greater should be used if invasive conifer detection is a major objective of the data acquisition. For large-scale management, it may be infeasible to collect ALS data at this density due to financial or time constraints. In these situations, successive, lower density surveys may offer the possibility to detect missed individuals as height and canopy volume increase while background objects that interfere with detection, such as rocks and mounds, remain static [19,20]. The high growth rates of invasive conifers may further assist detection from repeated surveys. Estimates from Ledgard and Paul [10] suggest that during the earlier stages of invasion, height growth for Pinus contorta Dougl. ex Loudon (one of the most problematic species) may be as high as 30 cm per annum. This is significantly higher than growth rates in the boreal ecotone, where research has shown that data from multi-temporal ALS campaigns contain valuable information for monitoring changes in the tree-line [71]. The time required before previously undetected invasive conifers would reach sufficient height and crown diameter to be detected in subsequent lower density surveys may be quite short, and the implementation of this approach may represent an effective and efficient means of monitoring and controlling the spread of invasive conifers over large areas of New Zealand.

5. Conclusions

The objectives of this study were (i) to compare the accuracy of detection models developed using various combinations of ALS data (elevation, intensity) and aerially acquired spectral data and (ii) to determine the sensitivity of classification accuracy in these models to the height threshold used for inclusion and the ALS pulse density used. Through this research, we found that combining spectral data with ALS data resulted in much greater classification accuracy than either ALS or spectral data alone. Uncalibrated ALS intensity data were the least useful candidate variable tested, and of the spectral bands examined, the near-infrared was the most valuable. The most accurate model contained ALS elevation and intensity data, as well as all three spectral bands examined. When this model was applied to a completely independent dataset through a LOPOCV, the classification accuracy was fair. Varying the height threshold for inclusion in the training dataset and the ALS pulse density had very little effect on classification accuracy. However, ALS pulse density had a significant effect on the size and number of invasive conifers that were not sampled by the ALS survey. We found that considerably more, and larger, invasive conifers were excluded from the sample when the ALS pulse density was reduced below 5 pls/m². Both the accuracy of the models developed and the effect of pulse density on the probability of sampling invasive conifers are specific to this terrain and vegetation type. However, the findings of this research can be used as a basis to inform practitioners planning surveys that include remotely-sensed data for monitoring the spread of, and planning control efforts for, these invasive species.

This research has proposed a novel approach to classifying echoes from ALS data for the classification of invasive conifers in a grassland environment by incorporating spectral values from aerial imagery.

Acknowledgments

This research was funded through Scion’s core funding and Land Information New Zealand (LINZ). Blakely Pacific Ltd. and LINZ generously provided the ALS data used. The authors would like to thank Rod Brownlie, Dave Henley and Les Dowling of Scion for completing the field work, for contributing to the study design and for useful comments on the study. Mark Kimberley of Scion also provided useful advice during the study design. We would like to thank the academic editor and the three anonymous reviewers for their valuable comments that have improved the quality of this paper.

Author Contributions

All authors conceived of and designed the research. J.P.D. and M.S.W. designed the study. J.P.D. developed the analysis methodology and completed the data analysis. G.D.P. developed the ALS data thinning algorithm. J.P.D. and G.D.P. wrote the initial manuscript. All authors contributed to the writing of the paper and edited the final version.

Conflicts of Interest

The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analyses or interpretation of data; in the writing of the manuscript; nor in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

ALS	Airborne laser scanning
AUC	Area under the curve
AUCC	95% confidence interval associated with AUC
CHM	Canopy height model
DTM	Digital terrain model
GSD	Ground sample distance
Kappa	Cohen’s kappa coefficient
KappaCI	95% confidence interval associated with a kappa value
LiDAR	Light detection and ranging
LINZ	Land Information New Zealand
LOESS	Locally-weighted scatter plot smoothing
LOOCV	Leave one out cross-validation
LOPOCV	Leave one plot out cross-validation
MDA	Mean decrease in accuracy
NIR	Near-infrared
OOB	Out-of-bag
pls/m²	Laser pulses emitted from the airborne laser scanner per m²
pts/m²	Points (or echoes) returned to the airborne sensor per m²
RF	Random forest
ROC	Receiver operator characteristic
UCE	Ultra cam Eagle

References

Richardson, D.M.; Rejmánek, M. Conifers as invasive aliens: A global survey and predictive framework. Divers. Distrib. 2004, 10, 321–331.
Richardson, D.M.; Higgins, S.I. Pines as invaders of the southern hemisphere. In Ecology and Biogeography of Pinus; Richardson, D.M., Ed.; Cambridge University Press: Cambridge, UK, 1998; pp. 450–470.
Howell, C.J.; McAlpine, K.G. Native plant species richness in non-native Pinus contorta forest. N. Z. J. Ecol. 2016, 40, 1.
Ledgard, N.J. Wilding Control. Guidelines for the Control of Wilding Conifers; Scion: Rotorua, New Zealand, 2009.
Anonymous. The Right Tree in the Right Place—New Zealand Wilding Conifer Management Strategy 2015–2030; Ministry for Primary Industries: Wellington, New Zealand, 2011.
Velarde, S.J.; Paul, T.; Monge, J.; Yao, R. Cost Benefit Analysis of Wilding Conifer Management in New Zealand. Part I—Important Impacts Under Current Management; Report S0013; Scion: Bloxham, UK, 2015.
Ledgard, N. Wilding control guidelines for farmers and land managers. N. Z. Plant Prot. 2009, 62, 380–386.
Froude, V.A. Wilding Conifers in New Zealand: Beyond the Status Report; Report Prepared for the Ministry of Agriculture and Forestry, Pacific Eco-Logic, Bay of Islands, 44; Ministry of Agriculture and Forestry: Wellington, New Zealand, 2011.
Clifford, V.; Paul, T.; Pearce, G. Quantifying the Change in High Country Fire Hazard From Wilding Trees; Report Prepared for Rural Fire New Zealand; New Zealand Fire Service Commission: Wellington, New Zealand, 2013.
Ledgard, N.; Paul, T. Vegetation successions over 30 years of high country grassland invasion by Pinus contorta. N. Z. Plant Prot. 2008, 61, 98–104.
Buckley, Y.M.; Brockerhoff, E.; Langer, L.; Ledgard, N.; North, H.; Rees, M. Slowing down a pine invasion despite uncertainty in demography and dispersal. J. Appl. Ecol. 2005, 42, 1020–1030.
Woods, D. The highs and lows of wilding conifer control operations: The good, the bad and the ugly! In Managing Wilding Conifers in New Zealand—Present and Future, Proceedings of the NZ Plant Protection Society Workshop, Christchurch, New Zealand, 11 August 2003; Hill, R.L., Zydenbos, S.M., Bezar, C.M., Eds.; New Zealand Plant Protection Society: Christchurch, New Zealand, 2003; pp. 55–63.
Cochrane, P.; Grove, P. Exotic Wilding Conifer Spread Within Defined Areas of Canterbury High Country; Environment Canterbury: Prebbleton, New Zealand, 2013.
Mureriwa, N.; Adam, E.; Sahu, A.; Tesfamichael, S. Spectral Discrimination of Prosopis Glandulosa (Mesquite) in Arid Environment of South Africa: Testing the Utility of in Situ Hyperspectral Data and Guided Regularized Random Forest Algorithm; Asian Association on Remote Sensing: Manila, Philippines, 2015.
Hestir, E.L.; Khanna, S.; Andrew, M.E.; Santos, M.J.; Viers, J.H.; Greenberg, J.A.; Rajapakse, S.S.; Ustin, S.L. Identification of invasive vegetation using hyperspectral remote sensing in the California Delta ecosystem. Remote Sens. Environ. 2008, 112, 4034–4047.
Hall, S.J.; Asner, G.P. Biological invasion alters regional nitrogen-oxide emissions from tropical rainforests. Glob. Chang. Biol. 2007, 13, 2143–2160.
Underwood, E.; Ustin, S.; DiPietro, D. Mapping nonnative plants using hyperspectral imagery. Remote Sens. Environ. 2003, 86, 150–161.
Niphadkar, M.; Nagendra, H. Remote sensing of invasive plants: Incorporating functional traits into the picture. Int. J. Remote Sens. 2016, 37, 3074–3085.
Næsset, E.; Nelson, R. Using airborne laser scanning to monitor tree migration in the boreal–alpine transition zone. Remote Sens. Environ. 2007, 110, 357–369.
Stumberg, N.; Ørka, H.O.; Bollandsås, O.M.; Gobakken, T.; Næsset, E. Classifying tree and nontree echoes from airborne laser scanning in the forest–tundra ecotone. Can. J. Remote Sens. 2013, 38, 655–666.
Hantson, W.; Kooistra, L.; Slim, P.A. Mapping invasive woody species in coastal dunes in the Netherlands: A remote sensing approach using LIDAR and high-resolution aerial photographs. Appl. Veg. Sci. 2012, 15, 536–547.
Bork, E.W.; Su, J.G. Integrating LIDAR data and multispectral imagery for enhanced classification of rangeland vegetation: A meta analysis. Remote Sens. Environ. 2007, 111, 11–24.
Rees, W.G. Characterisation of Arctic treelines by LiDAR and multispectral imagery. Polar Rec. 2007, 43, 345–352.
Thieme, N.; Martin Bollandsås, O.; Gobakken, T.; Næsset, E. Detection of small single trees in the forest–tundra ecotone using height values from airborne laser scanning. Can. J. Remote Sens. 2011, 37, 264–274.
Stumberg, N.; Bollandsås, O.M.; Gobakken, T.; Næsset, E. Automatic detection of small single trees in the Forest-Tundra Ecotone using airborne laser scanning. Remote Sens. 2014, 6, 10152–10170.
Næsset, E. Discrimination between Ground vegetation and small pioneer trees in the Boreal-Alpine Ecotone using intensity metrics derived from airborne laser scanner data. Remote Sens. 2016, 8, 548.
Thomas, V.; Treitz, P.; McCaughey, J.H.; Morrison, I. Mapping stand-level forest biophysical variables for a mixedwood boreal forest using LiDAR: An examination of scanning density. Can. J. For. Res. 2006, 36, 34–47.
Morsdorf, F.; Frey, O.; Meier, E.; Itten, K.I.; Allgöwer, B. Assessment of the influence of flying altitude and scan angle on biophysical vegetation products derived from airborne laser scanning. Int. J. Remote Sens. 2008, 29, 1387–1406.
Wilkes, P.; Jones, S.; Suarez, L.; Haywood, A.; Woodgate, W.; Soto-Berelov, M.; Mellor, A.; Skidmore, A. Understanding the effects of ALS pulse density for metric retrieval across diverse forest types. Photogramm. Eng. Remote Sens. 2015, 81, 625–635.
Keranen, J.; Maltamo, M.; Packalen, P. Effect of flying altitude, scanning angle and scanning mode on the accuracy of ALS based forest inventory. Int. J. Appl. Earth Obs. Geoinf. 2016, 52, 349–360.
Maltamo, M.; Eerikäinen, K.; Packalén, P.; Hyyppä, J. Estimation of stem volume using laser scanning-based canopy height metrics. Forestry 2006, 79, 217–229.
Gobakken, T.; Næsset, E. Assessing effects of laser point density, ground sampling intensity, and field sample plot size on biophysical stand properties derived from airborne laser scanner data. Can. J. For. Res. 2008, 38, 1095–1109.
Watt, M.S.; Adams, T.; Gonzalez Aracil, S.; Marshall, H.; Watt, P. The influence of LiDAR pulse density and plot size on the accuracy of New Zealand plantation stand volume equations. N. Z. J. For. Sci. 2013, 43, 1–10.
Khosravipour, A.; Skidmore, A.K.; Isenburg, M.; Wang, T.; Hussin, Y.A. Generating pit-free canopy height models from airborne LiDAR. Photogramm. Eng. Remote Sens. 2014, 80, 863–872.
Hauglin, M.; Næsset, E. Detection and segmentation of small trees in the Forest-Tundra Ecotone using airborne laser scanning. Remote Sens. 2016, 8, 407.
Taylor, K.T.; Maxwell, B.D.; Pauchard, A.; Nuñez, M.A.; Peltzer, D.A.; Terwei, A.; Rew, L.J. Drivers of plant invasion vary globally: Evidence from pine invasions within six ecoregions. Glob. Ecol. Biogeogr. 2015, 25, 96–106.
Reese, H.; Nyström, M.; Nordkvist, K.; Olsson, H. Combining airborne laser scanning data and optical satellite data for classification of alpine vegetation. Int. J. Appl. Earth Obs. Geoinf. 2014, 27, 81–90.
Hauglin, M.; Ørka, H.O. Discriminating between Native Norway Spruce and Invasive Sitka Spruce—A comparison of multitemporal Landsat 8 imagery, aerial images and airborne laser scanner data. Remote Sens. 2016, 8, 363.
Liu, X. Airborne LiDAR for DEM generation: Some critical issues. Prog. Phys. Geogr. 2008, 32, 31–49.
Baltsavias, E. Airborne laser scanning: Basic relations and formulas. ISPRS J. Photogramm. Remote Sens. 1999, 54, 199–214.
Jakubowski, M.K.; Guo, Q.; Kelly, M. Tradeoffs between LiDAR pulse density and forest measurement accuracy. Remote Sens. Environ. 2013, 130, 245–253.
Millman, K.J.; Aivazis, M. Python for Scientists and Engineers. Comput. Sci. Eng. 2011, 13, 9–12. [Google Scholar] [CrossRef]
Silva, C.A.; Crookston, N.L.; Hudak, A.T.; Vierling, L.A. rLiDAR: LiDAR Data Processing and Visualization, R package version 0.1. 2015.
Wing, B.M.; Ritchie, M.W.; Boston, K.; Cohen, W.B.; Gitelman, A.; Olsen, M.J. Prediction of understory vegetation cover with airborne LiDAR in an interior ponderosa pine forest. Remote Sens. Environ. 2012, 124, 730–741. [Google Scholar] [CrossRef]
Kaufman, Y.J.; Tanre, D. Atmospherically resistant vegetation index (ARVI) for EOS-MODIS. IEEE Trans. Geosci. Remote Sens. 1992, 30, 261–270. [Google Scholar] [CrossRef]
Packalen, P.; Suvanto, A.; Maltamo, M. A Two Stage Method to estimate species-specific growing stock. Photogramm. Eng. Remote Sens. 2009, 75, 1451–1460. [Google Scholar] [CrossRef]
Packalen, P.; Maltamo, M. Predicting the plot volume by tree species using airborne laser scanning and aerial photographs. For. Sci. 2006, 52, 611–622. [Google Scholar]
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Mellor, A.; Haywood, A.; Stone, C.; Jones, S. The performance of random forests in an gperational settingfor large area sclerophyll forest classification. Remote Sens. 2013, 5, 2838–2856. [Google Scholar] [CrossRef]
Dash, J.P.; Marshall, H.M.; Rawley, B. Methods for estimating multivariate stand yields and errors using k-NN and aerial laser scanning. Forestry 2015, 88, 237–247. [Google Scholar] [CrossRef]
Dash, J.P.; Watt, M.S.; Bhandari, S.; Watt, P. Characterising forest structure using combinations of airborne laser scanning data, RapidEye satellite imagery and environmental variables. Forestry 2016, 89, 159–169. [Google Scholar] [CrossRef]
Watt, M.S.; Dash, J.P.; Bhandari, S.; Watt, P. Comparing parametric and non-parametric methods of predicting Site Index for radiata pine using combinations of data derived from environmental surfaces, satellite imagery and airborne laser scanning. For. Ecol. Manag. 2015, 357, 1–9. [Google Scholar] [CrossRef]
Watt, M.S.; Dash, J.P.; Watt, P.; Bhandari, S. Multi-sensor modelling of a forest productivity index for radiata pine plantations. N. Z. J. For. Sci. 2016, 46, 1–14. [Google Scholar] [CrossRef]
Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J. Random forests for classification in ecology. Ecology 2007, 88, 2783–2792. [Google Scholar] [CrossRef] [PubMed]
Criminisi, A.; Konukoglu, E.; Shotton, J. Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning; NOW Publishers: New York, NY, USA, 2012. [Google Scholar]
Wright, M.N. Ranger: A Fast Implementation of Random Forests, R package version 0.5.0. 2016.
Cohen, J. A Coefficient of Agreement for Nominal Scales. Educ. Psychol. Meas. 1960, 20, 37–46. [Google Scholar] [CrossRef]
Revelle, W. Psych: Procedures for Psychological, Psychometric, and Personality Research, R package version 1.6.12; Northwestern University: Evanston, IL, USA, 2016. [Google Scholar]
Richard, J.; Landis, G.G.K. The measurement of observer agreement for categorical data. Biometrics 1977, 33, 159–174. [Google Scholar]
Fleiss, J.; Cohen, J.; Everitt, B. Large sample standard errors of kappa and weighted kappa. Psychol. Bull. 1969, 72, 323–327. [Google Scholar] [CrossRef]
Robin, X.; Turck, N.; Hainard, A.; Tiberti, N.; Lisacek, F.; Sanchez, J.C.; Müller, M. pROC: An open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 2011, 12, 77. [Google Scholar] [CrossRef] [PubMed]
Holmgren, J.; Persson, A.; Soderman, U. Species identification of individual trees by combining high resolution LiDAR data with multi-spectral images. Int. J. Remote Sens. 2008, 29, 1537–1552. [Google Scholar] [CrossRef]
Dalponte, M.; Bruzzone, L.; Gianelle, D. Tree species classification in the Southern Alps based on the fusion of very high geometrical resolution multispectral/hyperspectral images and LiDAR data. Remote Sens. Environ. 2012, 123, 258–270. [Google Scholar] [CrossRef]
Ghosh, A.; Fassnacht, F.E.; Joshi, P.; Koch, B. A framework for mapping tree species combining hyperspectral and LiDAR data: Role of selected classifiers and sensor across three spatial scales. Int. J. Appl. Earth Obs. Geoinf. 2014, 26, 49–63. [Google Scholar] [CrossRef]
Goodwin, N.R.; Coops, N.C.; Culvenor, D.S. Assessment of forest structure with airborne LiDAR and the effects of platform altitude. Remote Sens. Environ. 2006, 103, 140–152. [Google Scholar] [CrossRef]
Lovell, J.L.; Jupp, D.L.; Culvenor, D.S.; Coops, N.C. Using airborne and ground-based ranging LiDAR to measure canopy structure in Australian forests. Can. J. Remote Sens. 2003, 29, 607–622. [Google Scholar] [CrossRef]
Disney, M.; Kalogirou, V.; Lewis, P.; Prieto-Blanco, A.; Hancock, S.; Pfeifer, M. Simulating the impact of discrete-return LiDAR system and survey characteristics over young conifer and broadleaf forests. Remote Sens. Environ. 2010, 114, 1546–1560. [Google Scholar] [CrossRef]
Gates, D.M.; Keegan, H.J.; Schleter, J.C.; Weidner, V.R. Spectral Properties of Plants. Appl. Opt. 1965, 4, 11–20. [Google Scholar] [CrossRef]
Slaton, M.R.; Raymond Hunt, E.; Smith, W.K. Estimating near-infrared leaf reflectance from leaf structural characteristics. Am. J. Bot. 2001, 88, 278–284. [Google Scholar] [CrossRef] [PubMed]
Williams, D.L. A comparison of spectral reflectance properties at the needle, branch, and canopy level for selected Conifer species. Remote Sens. Environ. 1991, 35, 79–93. [Google Scholar] [CrossRef]
Nyström, M.; Holmgren, J.; Olsson, H. Change detection of mountain birch using multi-temporal ALS point clouds. Remote Sens. Lett. 2013, 4, 190–199. [Google Scholar] [CrossRef]

Figure 1. Overview and geographic location of the study site and field plots (light green filled circles). The right-hand panel shows examples of the dominant vegetation types in the study area: (a) invasive conifers; (b) areas dominated by bracken; (c) grassland areas; and (d) areas dominated by manuka trees up to 4 m tall.

Figure 2. Box and whisker plots showing the spectral and structural properties of invasive conifers (conifer) and the other major vegetation types in the study area derived from the coloured ALS point cloud. Panels show (a) return elevation; (b) spectral properties in the green band; (c) spectral properties in the near-infrared band; (d) spectral properties in the red band.

Figure 3. The receiver operator characteristic (ROC) curve for the most accurate (Model 1) and the least accurate (Model 15) models. The ROC curves for all other models lie between these two.

Figure 4. Classification accuracy (kappa) for all random forest models (Table 5) at each of the tested pulse densities.

Figure 5. Classification accuracy (kappa) for all models at all height thresholds tested.

Figure 6. Performance of the RF classifier when applied to the plots shown in Figure 1 containing: (a) invasive conifers; (b) areas dominated by bracken; (c) grassland areas; and (d) areas dominated by manuka trees. Invasive conifer classified echoes are shown in light green, and the invasive conifer outlines used to train the model are shown in blue. Echoes not classified as invasive conifers are not shown.

Figure 7. Proportion of invasive conifers missed per plot at each pulse density. Points are jittered to avoid over-plotting, and the point size is proportional to the mean canopy area of the missed invasive conifers in the plot. The grey line shows a linear model.

Table 1. Field tree data summary with range shown in brackets. Diameters shown were measured at breast height (1.4 m) for larger trees and ground level for saplings.

Species	Count	Mean Height (m)	Mean Diameter (mm)
Ps. menz.	813	1.72 (0.05–12.90)	28.18 (4–250)
P. muricata	8	1.99 (1.02–3.94)	22.16 (7–40)
P. radiata	4	2.00 (1.28–2.70)	20.50 (5–41)

Table 2. Campaign settings for ALS data acquisition.

Variable	Value
Scanner	Riegl Q1560, 2 channel
Laser pulse rate	330 kHz
Max scan angle	14°
Echo types	1st, 2nd, 3rd, …, 7th and last
File format	LAS 1.4
Map projection	NZTM2000
Horizontal datum	NZGD2000
Vertical datum	NZVD2009
Mean pulse density	21.1 pls/m²
Mean echo density	36.5 pts/m²
Pulse spacing	0.22 m

Table 3. Summary of original and thinned ALS datasets.

Target Pulse Density (pls/m²)	Realised Point Density (pts/m²)	Realised Point Spacing (m)	Realised Pulse Density (pls/m²)	Realised Pulse Spacing (m)
1	1.8	0.8	1.1	1
2	3.7	0.5	2.1	0.7
5	9.1	0.3	5.3	0.4
10	18.2	0.2	10.6	0.3
unthinned	36.5	0.2	21.1	0.2
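
The spacing columns in Table 3 are consistent with the usual nominal relation between density and spacing, spacing ≈ 1/√density. The sketch below is an illustration under that assumption (a uniform square grid of pulses); the source does not state how the tabulated spacings were computed, and the function name `nominal_spacing` is hypothetical.

```python
import math


def nominal_spacing(density_per_m2: float) -> float:
    """Nominal spacing (m) for a density (per m^2), assuming a uniform square grid."""
    if density_per_m2 <= 0:
        raise ValueError("density must be positive")
    return 1.0 / math.sqrt(density_per_m2)


# Realised pulse densities (pls/m^2) copied from Table 3, keyed by target density.
realised_pulse_density = {"1": 1.1, "2": 2.1, "5": 5.3, "10": 10.6, "unthinned": 21.1}

# Round to one decimal place, the precision used in Table 3.
spacing = {
    target: round(nominal_spacing(density), 1)
    for target, density in realised_pulse_density.items()
}

for target, value in spacing.items():
    print(f"target {target}: nominal pulse spacing {value} m")
```

Under this assumption the unthinned pulse density of 21.1 pls/m² gives about 0.22 m, in line with the pulse spacing reported in Table 2. The relation does not reproduce every tabulated point spacing exactly (1.8 pts/m² gives roughly 0.75 m against the 0.8 m reported), so the published values may have been derived differently.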

Table 4. Summary of aerial imagery data.

Variable	Value
Radiometric resolution	32-bit colour (4 × 8 bits per band)
Spectral resolution	Red, green, blue, near-infrared
Pixel resolution	0.3 m GSD
Spatial accuracy	±4 m at the 95% confidence interval in the clear open space (2 sigma) over the area of interest
Data format	GeoTiff with associated world file (TFW)
Forward overlap	60% (min 54%)
Side overlap	30% (min 15%)

Table 5. Classification accuracy expressed through Cohen’s Kappa and associated 95% confidence interval (KappaCI), area under curve (AUC) and the associated 95% confidence interval (AUCCI) from receiver operator characteristic (ROC) curves for all 15 models examined. Predictor variables denoted with an * were included in the model.

Model	Elevation	Intensity	NIR	Red	Green	Kappa	KappaCI	AUC	AUCCI
1	*	*	*	*	*	0.837	0.835–0.839	0.885	0.884–0.887
2	*	*	*	*		0.785	0.782–0.788	0.856	0.854–0.858
3	*	*		*	*	0.744	0.741–0.747	0.828	0.826–0.829
4	*	*	*		*	0.781	0.777–0.783	0.854	0.851–0.855
5	*		*	*	*	0.773	0.771–0.776	0.849	0.847–0.850
6	*	*	*			0.521	0.517–0.524	0.698	0.695–0.699
7	*		*	*		0.513	0.508–0.515	0.693	0.691–0.694
8	*		*		*	0.508	0.504–0.511	0.691	0.689–0.693
9	*			*	*	0.316	0.313–0.321	0.602	0.601–0.604
10			*	*	*	0.308	0.306–0.313	0.599	0.597–0.600
11	*	*				0.292	0.288–0.295	0.597	0.595–0.598
12	*		*			0.355	0.352–0.359	0.625	0.623–0.626
13	*			*		0.224	0.220–0.227	0.571	0.569–0.572
14	*				*	0.221	0.218–0.225	0.569	0.568–0.571
15	*					0.101	0.099–0.104	0.529	0.528–0.531
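
For readers unfamiliar with the kappa statistic reported in Table 5, the following is a minimal sketch of how Cohen's kappa is computed from a confusion matrix. It is not the authors' implementation, and the counts in `example` are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: reference, columns: predicted)."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    n_classes = len(confusion)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / total ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical echo counts: rows are reference (conifer, other), columns are predicted.
example = [[40, 10],
           [5, 45]]
kappa = cohens_kappa(example)
print(f"kappa = {kappa:.3f}")
```

In this hypothetical case the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is (0.85 − 0.50) / (1 − 0.50) = 0.70. Kappa therefore discounts the agreement that would be expected by chance alone.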

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

