1. Introduction
Coastal redwoods (Sequoia sempervirens) are an exceptionally charismatic species. They are the tallest trees on Earth, with some individuals taller than 100 m and older than 2000 years. Redwoods are endemic to the central coast of California and southern Oregon, and redwood forests are locally and globally valued ecosystems [1]. Historically, redwoods have been an important source of lightweight, decay-resistant timber, and redwood logging remains an important source of revenue in some regions. Redwood forests are unique sites for recreation, and visitors to redwood-dominated state and national parks contribute $34 million each year to local economies [1]. Redwoods have also been the focus of recent research interest due to their unique potential for carbon storage [2], with some redwood forests estimated to store 2600 Mg of aboveground carbon per hectare, the highest values recorded on Earth [3].
Given the multiple and diverse interests in redwoods within the state of California, strategic land management is needed to bolster conservation goals [1]. Redwood conservation has a long history in California, beginning with the founding of the first state park in 1901 following intensive and unregulated logging of old-growth redwoods during the second half of the nineteenth century. While some old-growth forests were protected, logging in many redwood forest regions in California continued until the 1970s, when the passing of the Endangered Species Act increased the cost of obtaining a permit to harvest redwoods [4]. Currently, approximately 90% of extant redwood forest is second-growth forest that has been logged at some point during the last two centuries. Redwood forests are fragmented by past harvest history and current land use pressures, and accordingly, there is a need for local-scale, high-resolution management of redwood forest land. Maps of redwood distributions are currently available at 1-km spatial resolution, but due to the complex topography, geology, and climate of coastal California, the abundance and properties of redwoods can vary enormously within a 1 km² area. Higher-resolution maps have not yet been produced due to the challenge of conducting field work to identify trees over large areas in topographically complex regions.
Remote sensing image analysis provides an array of potential tools for mapping vegetation properties [5]. Sensors differ in type, spectral resolution, and spatial resolution, and these differences determine the applications to which their data are suited. Passive sensors such as the Landsat Multispectral Scanner (MSS) have been used for mapping vegetation types at the community level and at regional scales [6]. Active remote sensing has also been applied to wetland vegetation classification, for example through the identification of mangroves in Vietnam from Advanced Land Observing Satellite Phased Array type L-band Synthetic Aperture Radar (ALOS PALSAR) data [7]. With imagery at higher spatial resolution, discrimination of individual tree species is possible. Sentinel-2A, a satellite multispectral spectrometer with 13 spectral bands at 10–60 m resolution, was used to map tree species with 65% overall accuracy in Germany [8]. QuickBird satellite data at less than 3-m resolution were used to map 40 tree species with kappa coefficients between 0.68 and 0.94 [9], and WorldView-2 satellite data at 2-m resolution were used to map 10 tree species in Austria with producer’s accuracies of 33–94% and user’s accuracies of 57–92% [10].
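The accuracy measures quoted above (overall accuracy, producer’s accuracy, user’s accuracy, and the kappa coefficient) are all derived from a confusion matrix. The following is a minimal sketch of how they could be computed, assuming the convention that rows hold reference (field) classes and columns hold mapped classes. The function name `accuracy_metrics` and the three-class counts are hypothetical and are not taken from any of the cited studies.

```python
def accuracy_metrics(cm):
    """Return overall accuracy, producer's and user's accuracies, and Cohen's kappa.

    Assumed convention: cm[i][j] is the number of samples whose reference
    class is i and whose mapped (predicted) class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(n)]                        # reference totals
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]   # mapped totals
    diag = [cm[i][i] for i in range(n)]                              # correctly mapped samples

    overall = sum(diag) / total
    # Producer's accuracy: share of each reference class that was mapped correctly.
    producers = [diag[i] / row_sums[i] for i in range(n)]
    # User's accuracy: share of each mapped class that is correct on the ground.
    users = [diag[i] / col_sums[i] for i in range(n)]
    # Kappa: agreement beyond what the marginal totals would produce by chance.
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa


# Hypothetical confusion matrix for three species (illustrative counts only).
cm = [
    [45, 3, 2],
    [5, 35, 10],
    [0, 5, 45],
]
overall, producers, users, kappa = accuracy_metrics(cm)
print(f"overall={overall:.3f} kappa={kappa:.3f}")
print("producer's:", [round(p, 3) for p in producers])
print("user's:", [round(u, 3) for u in users])
```

In this illustration, producer’s accuracy relates to omission errors (reference trees that the map missed) and user’s accuracy to commission errors (mapped trees that are not that species), which is why studies often report both alongside a single overall figure.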
Classification of imagery high in both spectral and spatial resolution has been an especially effective approach for mapping individual tree species across a multitude of environments. Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) imagery, at 3.5-m resolution and including 224 bands, was used to map urban tree species in Southern California with 94% accuracy at the tree type level and 70% accuracy at the individual tree level [11]. Hyperspectral data including 214 bands at approximately 2-m resolution from the Carnegie Airborne Observatory (CAO) were used to map tree species in a tropical forest with 94–97% producer’s accuracy and 94–100% user’s accuracy [12]. Combining high-resolution hyperspectral data with structural attributes of species derived from LiDAR data has also yielded maps with high accuracy. Including LiDAR-derived height and canopy profile information improved producer’s accuracies by 5.1–11.6% and user’s accuracies by 8.4–18.8% for mapping 11 tree species in Canada from Airborne Imaging Spectrometer for Applications (AISA) hyperspectral data at 2-m resolution and LiDAR data at 0.8-m resolution [13].
Tree species classifications from remotely sensed imagery have been implemented with both parametric and nonparametric classification algorithms [14]; however, recent approaches have focused on nonparametric machine learning (ML) algorithms because they have been shown to be high-performing and efficient [15]. Support vector machine (SVM) is the most widely used algorithm for tree species mapping from remotely sensed data [14]; however, artificial neural networks (ANN) [16] and decision-tree-based methods such as random forest (RF) [17,18] and gradient boosted regression trees (GBRT) have also been used [19]. For example, a recent comparison of SVM with an RF classifier for mapping multiple tree species in Muir Woods National Monument in California with AISA hyperspectral imagery found that the SVM produced slightly higher overall accuracy (95.02% for SVM and 92.91% for RF) [20]. GBRT are similar to RF; in GBRT, however, regression trees are fitted sequentially to the observations that are modelled poorly by the existing set of trees, which has the potential to improve their performance [21]. GBRT have been used less frequently than RF thus far, but have performed well for species mapping in recent studies. For example, a recent study using GBRT to map giant sequoia trees from CAO hyperspectral imagery from the southern Sierra Nevada mountains in California yielded 95.2–98% overall accuracy [22], and a study comparing SVM with GBRT for detecting Ohi’a crowns infected with rapid Ohi’a death on Hawaii Island found that a combined SVM and GBRT approach yielded higher performance than either algorithm independently [19]. Other decision tree algorithms such as rotation forest [23] and logistic model tree algorithms [7] have recently been applied successfully to mapping mangrove species.
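The sequential fitting that distinguishes GBRT from RF can be illustrated with a small example. The sketch below is a simplified illustration and not the implementation used in any cited study: it assumes a squared-error loss (so that each new tree is fitted to the residuals of the current ensemble), one-split regression trees ("stumps"), and a single predictor. The function names `fit_stump` and `boost` and the toy data are hypothetical.

```python
def fit_stump(x, r):
    """Fit a one-split regression tree (stump) to residuals r by least squares."""
    best = None
    # Candidate thresholds: every distinct x except the largest,
    # so that both sides of the split are non-empty.
    for t in sorted(set(x))[:-1]:
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - lm) ** 2 for ri in left) + sum((ri - rm) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm


def boost(x, y, n_trees=50, learning_rate=0.1):
    """Gradient boosting under squared-error loss.

    Each new stump is fitted to the residuals of the current ensemble, so the
    observations that are currently modelled worst drive the next split.
    """
    base = sum(y) / len(y)            # initial prediction: the mean response
    pred = [base] * len(y)
    trees, errors = [], []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(x, residuals)
        trees.append(tree)
        # Shrink each tree's contribution by the learning rate.
        pred = [pi + learning_rate * tree(xi) for xi, pi in zip(x, pred)]
        errors.append(sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y))

    def model(xi):
        return base + learning_rate * sum(t(xi) for t in trees)

    return model, errors


# Toy data (hypothetical): the response steps up between x = 3 and x = 4.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0.1, 0.0, 0.2, 0.1, 0.9, 1.0, 0.8, 1.1]
model, errors = boost(x, y)
print(f"training MSE after 1 tree: {errors[0]:.4f}, after 50 trees: {errors[-1]:.4f}")
```

An RF, by contrast, would fit each tree independently to a bootstrap sample of the data and average the results, so no tree would be informed by the errors of the others.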
Here, using imaging spectroscopy data collected by the Carnegie Airborne Observatory (CAO) and field training data, we used the unique reflectance signatures of redwoods in a GBRT classification model to identify redwoods with high accuracy in hyperspectral images. We used a grid search to identify a GBRT model that minimizes false detections of redwoods while maintaining high computational efficiency for application to large datasets. We then applied this model to three coastal redwood forests in California and compared its performance across the forests using multiple measures of model performance. We used the resulting maps to assess variability in redwood distributions at 10-m resolution and discussed the potential of these maps for advancing knowledge of redwood ecology and improving the efficiency of redwood conservation and management.
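A grid search of this kind can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the configuration used in this study: the synthetic data stand in for per-pixel reflectance spectra, the hyperparameter values in `param_grid` are arbitrary, and precision is chosen as the selection criterion on the assumption that minimizing false detections corresponds to maximizing precision for the redwood class.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for labelled pixels: 20 "bands", binary label (1 = redwood).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical grid: fewer, shallower trees are cheaper to apply to large images.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

# Assumption: scoring on precision penalizes false detections of the positive class.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    scoring="precision",
    cv=3,
)
search.fit(X_train, y_train)

# Evaluate the selected model on held-out data.
y_pred = search.predict(X_test)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
print("selected parameters:", search.best_params_)
print(f"held-out precision={precision:.3f} recall={recall:.3f}")
```

Reporting recall alongside precision matters here because a model selected only to avoid false detections could achieve that by missing many true redwoods.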