Mapping and monitoring the tree species composition of Earth’s forests using remote sensing technologies is a challenging task that has motivated an active research community over the past three decades [1
]. In between the global coverage and long-standing historical records of satellite platforms [8
] and the flexibility and high spatial resolution achieved by unmanned aircraft systems [11
], airborne platforms provide a valuable intermediate scale of local to regional ecosystem monitoring capabilities [3
]. Passive optical (multispectral [10
] and hyperspectral [18
]) and active (such as Light Detection and Ranging (LiDAR) [19
]) remote sensing systems have each been used to map tree species on their own with varying levels of success, but combining spectral and structural remote sensing data has generally been shown to improve tree species detection compared to using either data type alone [2
]. For instance, the combination of hyperspectral and LiDAR data has the ability to differentiate between species with similar reflectance properties but different mean heights [3
These remotely sensed spectral and structural characteristics of trees are used to predict species using a variety of pixel-based and object-based classification approaches [6
]. Three commonly used non-parametric machine learning classifiers are Support Vector Machines (SVM), Random Forest (RF), and neural networks [6
]. SVM and RF tend to perform similarly in terms of classification accuracy and training time [29
]. Neural networks are increasingly used in ecological remote sensing studies for their ability to identify trends and patterns from data, model complex relationships, accept a wide variety of input predictor data, and produce high accuracies, at the expense of requiring large amounts of training data [13
]. Tree species classification accuracies reported throughout the literature vary widely, from approximately 60% to 95%, depending on the type and number of sensors used, the biodiversity within forests, and the classification methods utilized [6
]. Many studies highlight the value of deriving metrics such as texture to quantify crown-internal shadows and foliage properties [35
], calculating vegetation indices and employing dimensionality reduction when working with hyperspectral data [36
], and removing dark or non-green pixel outliers that may contain shadows or soil [37
] when preparing remote sensing image features for species classification [6
]. Other factors that impact tree species classification accuracy include tree species complexity [38
], and the time of acquisition within a year or season, especially for deciduous trees with distinctive phenological patterns [27
Large amounts of open ecological and remote sensing data have become increasingly available in recent years, enabling and motivating advancements in tree species classification efforts [6
]. A notable source of these data is the National Ecological Observatory Network (NEON), a National Science Foundation (NSF) funded Grand Challenge project awarded with the purpose of measuring ecological change for a span of 30 years [40
]. NEON provides publicly available data at 81 field sites in 20 distinct eco-climatic domains across the continental United States, Alaska, Hawaii and Puerto Rico [41
]. NEON collects airborne remote sensing observations at a subset of these sites every year using the Airborne Observation Platform (AOP). The AOP sensor suite includes multispectral, hyperspectral, and LiDAR instruments to map regional land cover at high spatial resolutions, ranging from 0.1 to 1 m [42
]. NEON generates a series of publicly available image data products from the AOP data and provides documents with detailed descriptions of their standardized data collection protocols and processing algorithms. Within the extent of AOP coverage, NEON field technicians collect terrestrial field-based plant measurements towards the aim of monitoring changes in biodiversity, species abundance, and productivity [41
]. In-situ measurements of individual trees are collected at stratified random plot locations at vegetated NEON sites every 1–3 years, including stem locations, species, and crown diameter [43
NEON data has enabled a new wave of tree detection and classification research, and with it a need for integrative and reproducible analysis and synthesis [34
]. This research has been further accelerated by an ecological data science competition that has tasked research groups with tree crown segmentation, alignment of data, and species classification at the open canopy longleaf pine ecosystem at the Ordway-Swisher Biological Station in Florida [37
]. A second iteration of this competition is currently underway with the focus of developing methods that generalize to other NEON sites. Preparing accurate training data that connects field-based ground truth species measurements with remote sensing observations is a critical, yet challenging, aspect of these efforts [37
]. One recent study proposed a new approach to create tree crown training data using a consumer-grade GPS and tablet to spatially match individual trees measured in the field directly onto an AOP-derived image [47
]. Another recent study leveraged existing unsupervised algorithms to delineate tree crown boundaries based on AOP LiDAR canopy height data, and refined these boundaries using almost 3000 hand-annotated bounding boxes drawn around individual trees using AOP multispectral imagery collected at the San Joaquin Experimental Range (SJER) NEON site in California [28
]. These studies are making exciting progress towards creating high-quality, high-volume tree species training data at NEON sites. Ultimately, manually creating training data is an expensive and time-consuming task that introduces human decision making and/or may require external data sets or information beyond what is provided publicly within NEON data products and protocols.
Our work explores training data set preparation using openly available NEON data without requiring external data collection or manual delineation steps. We integrate in-situ vegetation measurements with multisensor AOP remote sensing data at the Niwot Ridge Mountain Research Station (NIWO) subalpine forest NEON site in Colorado to evaluate the impact of our hands-off training data preparation approaches on tree species classification accuracy. We create a series of training data sets by representing individual trees as points and circular polygons with various sizes based on in-situ crown diameter measurements. We propose a preprocessing workflow to remove small, suppressed trees and clip areas of overlap between neighboring tree crown polygons in layered canopies to give preference to taller trees that are more likely to be seen by the airborne remote sensing platform. To assess the impact of training data preparation on species classification accuracy, we train random forest (RF) models to predict tree species using each training set. We also evaluate variable importance to assess the contribution of each AOP remote sensing data product for species classification. This work contributes to the open development of forest composition mapping efforts using NEON data without the need for manual delineation or external data sources. We explored the following objectives:
Evaluate which training set preparation approach yields the highest tree species classification accuracy. We expected that smaller tree polygons would capture more valuable variation in canopy features than stem location points, and less noise and neighboring material than larger circular polygons.
Evaluate the value added of our proposed tree crown polygon clipping workflow, which removes tree crown polygons with small area values and clips overlapping tree crown regions based on associated in-situ tree height measurements.
Assess the tree species classification accuracies achievable for the four dominant subalpine conifer species in a region of the Southern Rockies, Colorado, USA using the proposed NEON training data preparation approaches.
Determine which NEON AOP imagery-derived features are the most important for predicting tree species to help inform overarching tree species classification efforts. We anticipated the hyperspectral imagery to be the most important compared to RGB or LiDAR-derived features.
Contribute open reproducible tools so that the NEON data user community can use and build upon these techniques across diverse vegetated ecosystems.
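The polygon preparation steps described above (circular crowns sized from in-situ crown diameter, removal of small crowns, and preference for taller trees where crowns overlap) can be illustrated at the pixel level. The following is a minimal sketch under stated assumptions, not the workflow used in this study: the tree records, the `scale` factor (0.5 is assumed here to stand for the half-diameter case), the `min_area` threshold, and the function names are all hypothetical, and overlap is resolved by assigning each contested pixel to the tallest covering tree rather than by clipping vector polygons.

```python
import math


def crown_radius(tree, scale):
    """Radius of the circular crown: scaled crown diameter divided by two."""
    return scale * tree["crown_diameter"] / 2.0


def assign_pixels(trees, grid_size, pixel=1.0, scale=0.5, min_area=0.0):
    """Map each (col, row) pixel to the id of the tallest tree covering it.

    trees: list of dicts with keys id, x, y, height, crown_diameter
           (hypothetical plot-local coordinates, all in meters).
    min_area: crowns with a smaller circular area are dropped (assumed threshold).
    """
    # Step 1: drop small crowns, which are likely suppressed trees.
    kept = [t for t in trees
            if math.pi * crown_radius(t, scale) ** 2 >= min_area]
    # Step 2: visit the tallest trees first so they claim contested pixels.
    kept.sort(key=lambda t: t["height"], reverse=True)
    labels = {}
    for tree in kept:
        radius = crown_radius(tree, scale)
        for row in range(grid_size):
            for col in range(grid_size):
                center_x = (col + 0.5) * pixel
                center_y = (row + 0.5) * pixel
                inside = math.hypot(center_x - tree["x"],
                                    center_y - tree["y"]) <= radius
                if inside:
                    # setdefault keeps the first (tallest) claim on a pixel.
                    labels.setdefault((col, row), tree["id"])
    return labels


# Hypothetical example: B overlaps the taller A, and C has a very small crown.
example_trees = [
    {"id": "A", "x": 5.0, "y": 5.0, "height": 20.0, "crown_diameter": 8.0},
    {"id": "B", "x": 7.0, "y": 5.0, "height": 8.0, "crown_diameter": 6.0},
    {"id": "C", "x": 2.5, "y": 2.5, "height": 3.0, "crown_diameter": 1.5},
]
example_labels = assign_pixels(example_trees, grid_size=10, min_area=1.0)
print(sorted(set(example_labels.values())))
```

In this toy plot the pixels shared by A and B go to A, B keeps only the pixels that A does not cover, and C is removed by the area filter before any pixels are assigned.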
The AOP hyperspectral reflectance curves appear very similar across all four tree species, including the characteristic green peak (550 nm) within the visible wavelength region, the steep slope at the edge between the red and near-infrared regions (750 nm), and the shoulder, or flattening off, into the near-infrared region (800 nm) (Figure 6
). To compare reflectance magnitudes in different wavelength regions, we overlaid all four mean spectral reflectance curves with standard deviation shading (Figure 7
). The overlaid mean reflectance curves extracted within the clipped half-diameter polygons have very similar shapes across the conifer species but they appear to be biased or separated vertically to varying degrees. The relationship between reflectance magnitude across species curves differs across wavelength regions. Features and biases in reflectance may aid in differentiating between species during classification.
When we compared the overall classification accuracies of the RF models, each trained using a different training set (Table 3
), the clipped half-diameter polygon training set yielded the highest overall accuracy (OA) values of 69.3% and 60.4% for out-of-bag and independent validation evaluations, respectively.
The user’s and producer’s accuracies for the clipped half-diameter polygon RF model varied greatly across the four species, from 32.6% to 94.9% (Table 4
). Pinus flexilis (limber pine) was consistently the most accurately classified species, while Abies lasiocarpa (subalpine fir) and Picea engelmannii (Engelmann spruce) were less accurately classified. Abies lasiocarpa was often incorrectly predicted as Picea engelmannii. The confusion matrices of the other five training data sets demonstrate analogous species-specific classification performance.
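For readers less familiar with these accuracy measures, the sketch below shows how overall, producer's, and user's accuracies follow from a confusion matrix. The counts are hypothetical and chosen only for illustration (they are not the values in Table 3 or Table 4), only three of the species codes are used, and the function name is an assumption. Rows are taken to be reference (field) classes and columns predicted classes.

```python
def accuracy_metrics(matrix, labels):
    """Compute overall, producer's, and user's accuracy.

    matrix[i][j] is the number of samples whose reference class is
    labels[i] and whose predicted class is labels[j].
    """
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    overall = correct / total
    producers = {}
    users = {}
    for i, name in enumerate(labels):
        reference_total = sum(matrix[i])                        # row sum
        predicted_total = sum(matrix[r][i] for r in range(n))   # column sum
        # Producer's accuracy: share of reference samples that were found.
        producers[name] = (matrix[i][i] / reference_total
                           if reference_total else float("nan"))
        # User's accuracy: share of predictions of this class that are right.
        users[name] = (matrix[i][i] / predicted_total
                       if predicted_total else float("nan"))
    return overall, producers, users


# Hypothetical counts for illustration only.
species = ["ABLAL", "PIEN", "PIFL2"]
counts = [
    [10, 8, 2],   # reference ABLAL
    [5, 12, 3],   # reference PIEN
    [1, 1, 18],   # reference PIFL2
]
overall, producers, users = accuracy_metrics(counts, species)
print(round(overall, 3), producers, users)
```

A low producer's accuracy for one class paired with many off-diagonal counts in a single column, as in the first row of this toy matrix, is the pattern expected when one species is frequently predicted as another.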
When we ranked the AOP-derived descriptive features in order of importance (Figure 8
), the structural features derived from the LiDAR data (aspect, slope, and canopy height) were ranked as the top three most important variables. Variable importance appeared to taper off after this, with the following multispectral- and hyperspectral-derived features listed as the next most important across the MDA and MDG metrics: blue intensity from the digital camera, ARVI (Atmospherically Resistant Vegetation Index), PRI (Photochemical Reflectance Index), and NDVI (Normalized Difference Vegetation Index).
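As a reminder of how the three vegetation indices named above are defined, the sketch below uses their commonly cited formulations. The reflectance values are hypothetical, the function names are illustrative, and the exact band selections used in the NEON data products may differ from the ones assumed here (531 nm and 570 nm for PRI, and an ARVI weighting of gamma = 1).

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def arvi(nir, red, blue, gamma=1.0):
    """Atmospherically Resistant Vegetation Index.

    The red band is replaced by a red-blue combination; gamma = 1 is the
    commonly used default and is an assumption here.
    """
    red_blue = red - gamma * (blue - red)
    return (nir - red_blue) / (nir + red_blue)


def pri(r531, r570):
    """Photochemical Reflectance Index from reflectance at 531 and 570 nm."""
    return (r531 - r570) / (r531 + r570)


# Hypothetical reflectance values (unitless, 0 to 1) for one canopy pixel.
pixel = {"blue": 0.03, "red": 0.05, "nir": 0.40, "r531": 0.06, "r570": 0.07}
print(round(ndvi(pixel["nir"], pixel["red"]), 3))
print(round(arvi(pixel["nir"], pixel["red"], pixel["blue"]), 3))
print(round(pri(pixel["r531"], pixel["r570"]), 3))
```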
We built an open reproducible workflow to create training data sets using NEON data without the need for manual delineation or external data sources. We represented individual trees as points and circular polygons with various sizes based on in-situ crown diameter measurements, and we implemented a clipping workflow to give preference to larger, taller trees that are more likely to be seen in layered canopies by the airborne remote sensing platform. We trained RF models to predict species using each training set paired with AOP remote sensing features, and determined which remote sensing data products were most important for species identification. Our results show that 60–69% overall classification accuracies are achievable at the NIWO site without any manual refinement of the training data or incorporation of data beyond the NEON collection protocols. These accuracy values fall within the lower end of the reported range of 65–95% for studies utilizing combined sensor systems (spectral and structural data) [6
]. Based on the extracted spectra, the mean and standard deviation of reflectance for the four coniferous tree species appear to be very similar (Figure 7).
Accuracy values were highest when half-diameter circular polygons were used, rather than points or maximum-diameter polygons (Table 3
). This is useful information for creating future training data sets using NEON in-situ measurements. Minimum crown diameter is also measured, and could be used to approximate circular crown size, as there are no other measurements that give a more specific indication of crown shape in the NEON data collection protocol. After applying the proposed clipping workflow, classification accuracy increased slightly (no more than 2%) between each pair of reference data sets. We expected a larger accuracy increase, under the assumption that removing tree crowns with small area values and clipping overlap between neighboring crown polygons would yield a purer spectral and structural signal for each species.
Accuracy values varied across each of the four coniferous species in our study area (Table 4
). The worst classification results were obtained for Subalpine fir (ABLAL) and Engelmann spruce (PIEN), while the best classification results were obtained for limber pine (PIFL2). Based on the spectral reflectance curves in Figure 7
, PIFL2 appears to be noticeably better separated from the other three species across the longer wavelengths, from 1500 nm out to 2500 nm, and this spectral separation may help explain why PIFL2 was classified most accurately.
Besides the clipping workflow and using the CHM to remove pixels associated with a canopy height of zero, points and polygons derived from the in-situ tree measurements were used as-is, without any site-specific parameters, manual quality assessment, or alignment procedures. Our visualizations indicate that some stem locations and polygons extend beyond the actual location and boundaries of tree crowns in the AOP imagery (Figure 4
). This could result from field measurement errors and uncertainties, georectification and image formation artifacts introduced during the creation of the remote sensing data products, and/or asymmetric crowns.
We performed pixel-based classification, but existing studies show potential for improved results using manual and automated tree crown delineation to refine the selection of pixels within their reference data sets and perform object-based classification [28
]. We recognize that individual tree detection and segmentation are complicated tasks that often require site-specific tuning parameters to avoid under/over segmentation [5
]. Tree detection methods often struggle with transferability across geographies and vegetation types. We aimed to create and evaluate a training data processing method that takes full advantage of standardized NEON-collected tree measurements, and avoids being site-specific. We are aware of other studies such as [62
] that are making impressive strides towards creating transferable methods for large-scale tree detection across NEON sites. Note that such methods rely on tens of thousands of hand-annotated tree crowns for model training.
Beyond using the CHM to identify canopy pixels, additional spectral data can be used such as an NDVI threshold to isolate forest pixels and eliminate shadow pixels [36
]. An NDVI threshold could also be used to identify pixels with trees that have died since the collection of the field data, as there was a time difference of 1–2 years between the in-situ and remote sensing data collections in this study. Incorporating additional steps to refine the reference data sets, ensure alignment with the remote sensing imagery, and delineate individual crown boundaries instead of assuming perfectly circular crowns may improve upon the accuracies achieved here. We anticipate that employing outlier removal steps may further separate the mean spectral reflectance curves and reduce the standard deviation of hyperspectral reflectance per species that are presented in Figure 6
and Figure 7.
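The pixel screening discussed in this paragraph, dropping pixels with a canopy height of zero and optionally applying an NDVI threshold, can be sketched as a simple filter. This is an illustrative sketch only: the NDVI cutoff of 0.3, the function name, and the pixel records are hypothetical, and an appropriate threshold would need to be chosen for the site and imagery in question.

```python
def keep_pixel(chm_height, ndvi_value, ndvi_min=0.3):
    """Return True for pixels treated as live canopy.

    chm_height: canopy height model value in meters (0 means ground).
    ndvi_value: NDVI of the pixel.
    ndvi_min:   hypothetical cutoff below which a pixel is treated as
                shadow, soil, or dead vegetation.
    """
    return chm_height > 0 and ndvi_value >= ndvi_min


# Hypothetical pixels extracted within one tree crown polygon.
crown_pixels = [
    {"chm": 12.4, "ndvi": 0.71},  # live canopy
    {"chm": 0.0, "ndvi": 0.65},   # ground pixel, removed by the CHM check
    {"chm": 9.8, "ndvi": 0.12},   # shadowed or dead, removed by the NDVI check
]
canopy_pixels = [p for p in crown_pixels if keep_pixel(p["chm"], p["ndvi"])]
print(len(canopy_pixels))
```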
We found that the LiDAR-derived remote sensing features of aspect, slope, and canopy height were consistently ranked as the most important for species classification (Figure 8
). This is interesting, as canopy height on its own is typically limited for robust species classification as it is dependent on tree age [6
]. In addition to using the CHM, other studies have found that calculating metrics related to the vertical distribution, density and intensity of individual LiDAR returns improved classification results, so these metrics may be promising to incorporate in the future [6
]. Incorporating LiDAR point cloud-based metrics per pixel and/or tree crown object may improve species classifier performance in the future. Aspect and slope are important drivers of microclimate conditions such as temperature, moisture, and sun and wind exposure, so their importance has ecological merit in the mountainous landscape of the Southern Rockies. The importance of aspect and slope in our variable assessment may indicate that this classifier is not just learning how to identify trees using inter-species spectral variability, but also incorporating spectral information from the habitat or niche where each species resides. Future work would be required to assess the consistency of these observations with documented patterns of species distributions, but this highlights an interesting potential avenue for assessing how species habitats may shift in mountainous regions across the 30-year duration of the NEON project.
Following the LiDAR-derived variables, the ranking of RGB and NIS-derived variable importance varied slightly between importance metrics (Figure 8
). These variable importance rankings may be useful for iterative variable selection in future classification efforts, such as backwards feature selection [36
]. In addition, performing a sensitivity analysis would be valuable to fine-tune the ntree
RF parameters in subsequent analyses to potentially achieve higher species classification accuracies.
From the AOP hyperspectral imagery, we utilized a series of vegetation indices and the first two principal components (PCs) in our RF models. To further improve the utility of the hyperspectral data, we will consider employing other dimensionality reduction techniques and including additional features in future work. We incorporated the first two PCs, which explained 96% of the variance in the hyperspectral data. Including additional PCs to explain additional variance may improve our classification results in future work. PCA and Minimum Noise Fraction (MNF) were the most commonly used dimensionality reduction methods in a recent tree species classification review [6
]. There is also potential for MNF to improve classification results, and this would be an interesting comparison to make in future research. Hyper-dimensional image data volume may also be reduced by selecting specific spectral bands that are shown to successfully discriminate species [15
]. There are specific wavelengths that have been shown to vary as a function of vegetation type and health, the foundation for vegetation indices as well, such as reflectance at 550 nm (the green peak within the visible wavelengths) [64
] and reflectance at 750 nm (at the NIR shoulder) [65
]. Additionally, spectral separability analysis could be useful to analyze our extracted reflectance curves presented in Figure 6
and Figure 7
. Separability metrics such as the Jeffries–Matusita distance and transformed divergence have been used to identify wavelengths that play a significant role in vegetation classification [66
]. Regarding the contribution of the RGB imagery, we consider using texture metrics in future analyses to quantify spatial patterns within tree crowns such as shadows and other foliage characteristics to potentially improve classification results [35
]. Calculating explicit texture metrics may also help to ensure that the RGB data is not adding redundant information that is already described by the visible bands within the hyperspectral data.
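To make the spectral separability idea mentioned above concrete, the sketch below computes the Jeffries–Matusita (JM) distance between two species at a single wavelength, assuming each species' reflectance at that wavelength is approximately normally distributed. The means and standard deviations are hypothetical, the function names are illustrative, and the convention used here is JM = 2(1 - exp(-B)), which ranges from 0 (identical distributions) to 2 (fully separable); some authors report the square root of this quantity instead.

```python
import math


def bhattacharyya_1d(mean1, sd1, mean2, sd2):
    """Bhattacharyya distance between two univariate normal distributions."""
    var1, var2 = sd1 ** 2, sd2 ** 2
    mean_term = 0.25 * (mean1 - mean2) ** 2 / (var1 + var2)
    spread_term = 0.5 * math.log((var1 + var2) / (2.0 * sd1 * sd2))
    return mean_term + spread_term


def jeffries_matusita_1d(mean1, sd1, mean2, sd2):
    """JM distance on the 0 to 2 scale for one band and two classes."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya_1d(mean1, sd1, mean2, sd2)))


# Hypothetical per-species reflectance statistics at two wavelengths.
# Each entry: (mean of species 1, sd of species 1, mean of 2, sd of 2).
bands = {
    "visible band": (0.050, 0.010, 0.052, 0.010),    # nearly identical
    "shortwave band": (0.120, 0.015, 0.180, 0.015),  # better separated
}
for name, stats in bands.items():
    print(name, round(jeffries_matusita_1d(*stats), 3))
```

Repeating this calculation for every band and every species pair would give a simple ranking of the wavelengths at which the extracted reflectance curves are most distinct.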
This analysis can also be performed at other NEON sites to generate additional training data sets for regional species classification. However, the types of vegetation and topography vary among NEON domains, and we expect this to influence the resulting variable importance rankings and species classification accuracies. For instance, the LiDAR-derived features of slope and aspect describe the underlying terrain steepness and orientation. These two variables were found to be important for species classification at the mountainous NIWO site in the Southern Rockies in our study. However, the Ordway–Swisher Biological Station (OSBS) site in north-central Florida is relatively flat. We do not expect the slope and aspect variables to be as useful for species classification at a site such as OSBS where slope and aspect are relatively constant across space.
We expect overall species classification to be influenced by the diversity and tree canopy complexity at different ecosystems across NEON sites. For instance, San Joaquin Experimental Range (SJER) in central California features open woodland dominated by large oak trees, pine trees, and scattered shrubs and grasses. Harvard Forest (HARV) in Massachusetts features primarily closed-canopy mixed forest composed of both coniferous and hardwood trees, densely packed with overlapping crowns. We expect greater species classification accuracies to be achieved at SJER compared to at HARV, because an open woodland offers simpler canopy structure and clearer separation between neighboring tree crowns. In addition, open woodland enables more accurate GPS measurements to be collected at plot corner locations as opposed to in dense, closed-canopy tree cover.
Incorporating additional training data and/or implementing a more complex classification model may improve our results. Complete field data entries for just 699 trees were available at the NIWO site for our analysis. Deep learning methods are gaining traction in the forest remote sensing field for their ability to perform complicated tasks such as tree detection and species classification, although they require large volumes of training data [32
]. Deep learning methods may also enable more effective transferability of species classification methods across geographies and vegetation types. When a recently developed deep learning classification method was applied across multiple NEON sites, tree detection was the least successful at the NIWO site compared to three other NEON sites with oak woodland, mixed pine, and dense deciduous forest types [62
]. Taking both our results and the findings of [62
] into consideration, species classification may be an inherently difficult task at the NIWO site due to the spectral similarity and structural characteristics of the coniferous vegetation species present there.