Deep Learning and Phenology Enhance Large-Scale Tree Species Classification in Aerial Imagery during a Biosecurity Response

Abstract: The ability of deep convolutional neural networks (deep learning) to learn complex visual characteristics offers a new method to classify tree species using lower-cost data such as regional aerial RGB imagery. In this study, we use 10 cm resolution imagery and 4600 trees to develop a deep learning model to identify Metrosideros excelsa (pōhutukawa)—a culturally important New Zealand tree that displays distinctive red flowers during summer and is under threat from the invasive pathogen Austropuccinia psidii (myrtle rust). Our objectives were to compare the accuracy of deep learning models that could learn the distinctive visual characteristics of the canopies with tree-based models (XGBoost) that used spectral and textural metrics. We tested whether the phenology of pōhutukawa could be used to enhance classification by using multitemporal aerial imagery that showed the same trees with and without widespread flowering. The XGBoost model achieved an accuracy of 86.7% on the dataset with strong phenology (flowering). Without phenology, the accuracy fell to 79.4% and the model relied on the blueish hue and texture of the canopies. The deep learning model achieved 97.4% accuracy with 96.5% sensitivity and 98.3% specificity when leveraging phenology—even though the intensity of flowering varied substantially. Without strong phenology, the accuracy of the deep learning model remained high at 92.7%, with sensitivity of 91.2% and specificity of 94.3%, despite significant variation in the appearance of non-flowering pōhutukawa. Pooling the time-series imagery did not enhance either approach: the accuracies of the XGBoost and deep learning models were, respectively, 83.2% and 95.2%, intermediate between those of the separate models.


Introduction
The early stages of a biosecurity response to a newly arrived plant pathogen can have a significant bearing on the final outcome and cost [1,2]. Once an unwanted pathogen has been positively identified, mapping and identification of potential host species become essential for managing the incursion [3]. Identification of host plants must be carried out by trained personnel, and the hosts may be located across a mixture of public and private property or in hard-to-access areas. For these reasons, carrying out large-scale searches for host plants can be very costly and challenging to resource.
The level of host detection and surveillance required in the face of an incursion is usually defined by the response objective. Eradication of a pathogen necessitates exhaustive detection of host species to monitor spread and enable the destruction of infected plants or even hosts showing no signs of infection to limit future spread. A monitoring objective may require only the identification of key indicator species to define the infection front and monitor impacts and host range. Finally, long-term management strategies may require large-scale but inexhaustive host identification to locate resistant individuals within a population for breeding programmes or other approaches to biological control [4,5].
Remote sensing can complement all of these objectives by offering an efficient and scalable means of identifying host species [6,7]. Imagery acquired from UAVs, aircraft or even space-borne optical sensors can be used to identify both potential hosts as well as the symptoms of pathogen infection on susceptible host species [6,8]. However, the detection and classification of species from remotely sensed data comprise a complex sub-discipline. Fassnacht et al. [9] carried out a comprehensive review of methods for tree species classification using remotely sensed data and highlighted clear themes in the literature. Multispectral and hyperspectral data were identified as being the most useful data sources for accurate species classification with LiDAR data being highly complementary. Through capturing reflected light outside the visible spectrum, the use of multi/hyperspectral data sources increases the chance of observing patterns of reflectance related to structural or biochemical traits that may be unique or distinctive to species or groups.
Multispectral data (4–12 bands) are relatively easy to capture and have been widely used in combination with machine learning methods to perform species classification [10][11][12]. However, accurate classification is often limited to broad groups such as conifer vs. deciduous forest types [13]. Hyperspectral data contain many more (>12) narrow spectral bands, enhancing the ability to observe small differences that may be present between the spectra of tree species, and have been well studied for fine-grained species classification tasks [14,15]. The idea of unique spectral 'signatures' for species has been present in the literature for several decades; however, [9] concluded that these signatures appear to be rare in practice and, when present, require observation of a wide portion of the spectrum using sophisticated sensors [16].
Although hyperspectral data have been successfully used to classify as many as 42 species [6,9,17], large-scale applications of hyperspectral-based species classification face challenges related to practicality and cost. The increased spectral resolution usually demands careful acquisition from expensive sensors and is constrained by illumination and atmospheric requirements. The post-processing of these data can also be complex and requires careful correction of atmospheric impacts and noise reduction. Finally, the substantial volumes of data must often be subjected to dimensionality reduction before analysis can proceed [13,18]. Classification is based on patterns in the calibrated reflectance spectra from the canopy and differences in data sources and quality can reduce the transferability of the classifiers [19]. Other information content, such as the structure, shape, texture and other distinctive but hard to quantify characteristics are often neglected or partially utilised. Efforts to characterise the texture or the shape of the crown or other attributes typically rely on a small number of engineered features to summarise complex attributes [11,13].
In contrast, the human visual system allows experienced individuals to distinguish many species by visual inspection alone. Some cryptic species remain hard to tell apart visually, but trained experts (and even non-experts) can discriminate a surprising number of species [20]. This has led to the development of sites such as iNaturalist, where members of the public can upload images of species for experts to identify [21]. Recently, the advent of deep learning models based on convolutional neural networks (hereafter referred to as deep learning) has transformed the capability of machines to perform fine-grained classification of images, often reaching or exceeding human-level accuracy [22,23]. The architecture of these networks allows them to effectively learn the features important for classification. This is an important contrast with other approaches, as the features are not engineered or pre-selected but rather learned by the network from labelled training examples with little requirement for image pre-processing.
Deep learning has been used for tree species classification from various combinations of LiDAR, hyperspectral and multispectral imagery [24][25][26][27]. Many studies have also successfully used simpler RGB imagery for species detection and classification. Importantly, these approaches have demonstrated a remarkable capacity to perform fine-grained species classification from consumer-grade camera imagery that is poorly suited to traditional remote sensing [28,29]. However, these studies have mostly used RGB data collected from UAV [30][31][32] and to a lesser extent high-resolution satellites [33,34], which constrains the ability to scale predictions in the former case or limits the spatial resolution of predictions in the latter case.
Although RGB imagery is routinely captured at regional levels by fixed-wing aircraft in many countries, few studies have undertaken large-scale host species identification using this ubiquitous data source. These data often include only RGB colour channels in uncalibrated radiance values rather than reflectance. The simplicity of these data means that large areas can be captured at high-resolution (<10 cm) for lower unit cost. Successful application of deep learning for large-scale host species identification using aerial imagery offers a scalable method to support biosecurity responses that bypasses many issues facing ground-based surveillance such as permissions and safe accessibility.
Classification of tree species is generally enhanced when there is low spectral variability within a species and high spectral variability between the target and other species [35]. Often there are times during the year when interspecies spectral variability is greater because of variation in phenological attributes such as leaf flush, senescence, or flowering. Little research has examined how phenological variation can be used by deep learning to improve species classification in trees, although we are aware of one such study for an invasive weed [36]. Collection of data from a species during a period of distinctive phenology could assist the use of deep learning through both enhancing predictive precision and providing a means to rapidly generate large training datasets.
Myrtle rust, caused by the fungal plant pathogen Austropuccinia psidii (G. Winter) Beenken (syn. Puccinia psidii), affects a broad range of hosts in the Myrtaceae family, causing lesions, dieback and, in some cases, mortality [37,38]. The pathogen is airborne and has spread rapidly around the globe [39][40][41][42]. New Zealand is home to at least 37 native myrtaceous species [43]. Of these, Metrosideros excelsa Sol. Ex Gaertn (pōhutukawa) has very high cultural value and has been widely planted for amenity purposes. This coastal evergreen tree has a sprawling habit of up to 20 m and produces dense masses of red flowers over the Christmas period [44], earning it the name 'the New Zealand Christmas tree'. Observations from pōhutukawa growing in other countries where myrtle rust is present indicate that the species is susceptible to myrtle rust [45,46].
In May 2017, myrtle rust was detected on the New Zealand mainland for the first time [47]. The disease has spread rapidly and has established on numerous native and exotic host species [48].
The overarching goal of this research was to test novel methods suitable for large-scale identification of key Metrosideros host species, focussing on pōhutukawa as a test case. Specifically, the objectives of the research were to (1) test two state-of-the-art classification methods (XGBoost and deep convolutional neural networks) applied to three-band aerial imagery leveraging the strong phenology of pōhutukawa, i.e., distinctive flowering in summer, (2) test classification of the same trees without the assistance of phenology by using historical aerial imagery, and (3) test how practical and generally applicable these techniques are in real-world conditions by creating a combined dataset from objectives 1 and 2 that contained imagery captured using different sensors in different years and that showed a mixture of flowering and non-flowering trees.

Ground Truth Data
New Zealand maintains an extensive biosecurity surveillance system and an established incursion response protocol. During the first months after the incursion of myrtle rust, sites that were confirmed to contain infected hosts received intensive ground-based surveys to identify and inspect all potential host species within a fixed radius from the infected site. New, confirmed infections triggered additional searches around the new site. A mobile app used by trained inspectors was used to record the genus, GPS location and infection status for every host inspected during the response. These efforts produced a substantial volume of ground surveillance data including GPS locations and positive identification of Metrosideros spp. by trained inspectors. Many of the trees inspected were present within the coastal city of Tauranga (Figure 1), and nearly all the records for Metrosideros spp. in this region were pōhutukawa. The extensive and distinctive red flowers of pōhutukawa are easily identifiable from above in the summer, which made this species an ideal candidate to test the potential to utilise phenology to enhance species identification in RGB aerial imagery. For much of the rest of the year, some buds, flowers, or seed capsules are present but less distinctive. However, the multi-leader crown shape and blueish hue of the large, waxy and elliptical leaves are also distinctive and present all year round (Figure 2).
Aerial imagery captured over Tauranga during the 2018-2019 summer period (Table 1) was overlaid with the ground surveillance locations in a GIS. Locations were collected using consumer-grade GPS and could only be considered approximate.

Table 1. Summary of multitemporal imagery used to develop classification models.

Imagery Dataset              Phenology              Resolution, Colour Channels
Tauranga, summer 2018-2019   Wide-spread flowering  10 cm/pixel, 3-band RGB
Tauranga, March 2017         Limited flowering      10 cm/pixel, 3-band RGB

For each inspection record, a trained analyst examined the GPS point and identified the corresponding tree in the aerial imagery. If the tree showed at least some evidence of flowering, then the imagery was annotated by delineating a bounding box around the canopy extent. The distinctive features of the canopy and strong flowering observed in the imagery greatly assisted identification and annotation; however, inspection records were only at the genus level and other species with similar phenological traits, such as Metrosideros robusta (rātā), may occasionally be found within this region. In addition, some cultivated Metrosideros excelsa 'Aurea' ('yellow' pōhutukawa) appeared to be present within the dataset, but these were removed due to the small number of samples available.
We assessed the purity of the training dataset by inspecting the majority of trees using publicly available street-level imagery, followed by on-site inspections for a smaller subset of trees. All study trees checked in this way, identified through combining the aerial imagery and surveillance records, were confirmed to be pōhutukawa. After completing this process, we considered that the assembled training dataset consisted only of pōhutukawa and that any misclassifications would have been very small in number.
Development of the classifiers also required negative examples. The candidate negative examples were any tree other than Metrosideros spp., hereafter referred to as other species. We once again leveraged the ground inspection efforts to develop this dataset. The intensity of the initial surveillance efforts meant that within inspected areas, such as streets or parks, the locations for nearly every pōhutukawa were recorded. We used these areas to select negative examples and cross-referenced a substantial portion of the dataset against other imagery and field inspections. This approach reduced the chances of accidentally including pōhutukawa or biasing the training set by excluding species that were visually similar to pōhutukawa due to uncertainty. In addition, this provided a realistic set of non-target tree canopies that the classifier might encounter in the areas surveyed for the biosecurity response. Bounding boxes around the canopies were defined against the aerial imagery and annotation proceeded until the dataset was balanced. Figure 3 shows examples of typical and atypical pōhutukawa and other tree species as seen in the aerial imagery.

Imagery Datasets
The aerial imagery datasets consisted of large orthomosaics generated from campaigns carried out in 2017 and 2019 using different aerial cameras (Table 1). The imagery from 2017 showed lower levels of detail, probably due to poorer image matching, and the trees had less visual detail (Figure 3). The bounding boxes were used to extract sub-images from the larger orthomosaics, and each image 'chip' showing a tree canopy was labelled with the dataset year and class (pōhutukawa or other spp.). Very small trees (canopy radius <~1.5 m) were excluded as these canopies contained too few pixels.
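As a concrete sketch of this chip-extraction step, the snippet below cuts a canopy chip out of a north-up orthomosaic held as a NumPy array, given a simple origin/pixel-size transform and a bounding box in map coordinates. The function and parameter names are illustrative; the original workflow used GIS tooling rather than this minimal arithmetic:

```python
import numpy as np

def extract_chip(ortho, transform, bbox):
    """Cut one canopy 'chip' out of an orthomosaic array.

    ortho:     (H, W, 3) image array, row 0 at the northern edge.
    transform: (x_origin, y_origin, pixel_size) for a north-up image,
               with (x_origin, y_origin) the top-left corner in map units.
    bbox:      (xmin, ymin, xmax, ymax) canopy bounding box in map units.
    """
    x0, y0, px = transform
    col_min = int((bbox[0] - x0) / px)
    col_max = int(np.ceil((bbox[2] - x0) / px))
    row_min = int((y0 - bbox[3]) / px)   # map y decreases down the rows
    row_max = int(np.ceil((y0 - bbox[1]) / px))
    return ortho[row_min:row_max, col_min:col_max]
```

At 10 cm/pixel, the <~1.5 m canopy-radius threshold above corresponds to discarding chips narrower than roughly 30 pixels.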
The final datasets included 2300 images of tree canopies evenly split between pōhutukawa and other spp., with images available for both 2017 and 2019 (Table 2). Images of pōhutukawa from 2019 and 2017 were used, respectively, to test classification with and without the assistance of strong phenological features (Table 2). The imagery from the 2017 and 2019 datasets was combined to assess how well the model would generalise under real-world conditions (Table 2). The images of tree canopies were randomly split into training data (70%) used to fit the models, validation data (15%) used to select hyperparameters and evaluate model performance during training, and a test set (15%) used to assess final model performance on completely withheld data (Table 2). Trees were assigned to the same splits in the 2017 and 2019 datasets for a fair comparison of the models. For the combined dataset, data were re-shuffled at the tree level and the imagery from both years was included in the assigned split to prevent data leakage.
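The tree-level re-shuffle for the combined dataset can be sketched with scikit-learn's GroupShuffleSplit, grouping chips by a tree identifier so that a tree's 2017 and 2019 images always land in the same split. The function name and the exact seed are illustrative:

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

def tree_level_split(tree_ids, seed=42):
    """Split chip indices 70/15/15 into train/validation/test by tree ID."""
    ids = np.asarray(tree_ids)
    # First carve off 30% of the trees, then halve that into val and test.
    outer = GroupShuffleSplit(n_splits=1, test_size=0.30, random_state=seed)
    train_idx, rest_idx = next(outer.split(ids, groups=ids))
    inner = GroupShuffleSplit(n_splits=1, test_size=0.50, random_state=seed)
    val_rel, test_rel = next(inner.split(ids[rest_idx], groups=ids[rest_idx]))
    return train_idx, rest_idx[val_rel], rest_idx[test_rel]
```

Because the split is made over tree IDs rather than individual chips, no tree can contribute imagery to more than one split.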

Deep Learning Models
We selected the ResNet model architecture [49] for classification of the tree canopies. The ResNet model is made up of small building blocks called residual blocks. Each residual block primarily consists of two to three stacked convolution layers (depending on the depth of the network). The convolution layers are designed to learn and fit the residual of the target function. The learned residual is then mapped back to the learned function through a skip connection that connects the input of the residual block to the output of the stacked convolution layers. By designing the neural network to learn and optimise on the residual instead of the original function, ResNet can learn the unknown original function more easily, thereby improving accuracy. We used the ResNet-50 architecture, which comprises 49 convolution layers organised into residual blocks and a fully connected layer for classification (Figure 4). A randomly initialised, fully connected layer was trained for 2 epochs to adapt a model pre-trained on the ImageNet [50] task to the binary classification task in this study. Thereafter, differential learning rates in the range 1 × 10⁻³ to 1 × 10⁻⁶ were used to adapt deeper layers of the network at linearly decreasing learning rates for another 30 epochs. At this point, the validation metrics showed no further benefit from additional training. All deep learning models and metrics were implemented using the PyTorch 1.4 deep learning library [51] and the Scikit-Learn Python package [52]. Model training was carried out using an Nvidia Tesla K80 GPU with 12 GB of memory.

XGBoost Models
Approaches to species classification frequently use imagery to generate variables (metrics), such as vegetation indices, to capture features or characteristics useful for discriminating different species [9,13]. This may be done using rule-based methods [53] or machine learning methods such as decision trees [54]. We chose variables that target the distinctive properties of pōhutukawa canopies. These included spectral metrics aimed at capturing the blueish hue of the leathery, elliptical leaves and the strong and distinctive sprays of red flowers present in summer. The canopies also exhibit distinctive textural properties arising from the multi-stem structure and leaf and bud arrangements, independent of the presence or absence of flowers (Figures 2 and 3). Texture analysis using grey-level co-occurrence matrices (GLCMs) [55] was used to capture these characteristics. Computation of the texture images was done using the 'glcm' package [56] in R [57]. The GLCM metric classes and parameters were selected based on the analysis and recommendations of [58]. The raw digital numbers (pixel radiance values) within each canopy bounding box were used to generate patch-level mean values for the predictive variables (Table 3). This was necessary because this type of imagery is optimised for visual appearance and lacks the information required to calculate reflectance.
We selected the XGBoost algorithm to perform binary classification using the 'xgboost' package for R [59]. XGBoost is a tree-based machine learning algorithm that is scalable, fast and has produced benchmark results on classification tasks [60]. The spectral and textural variables were used to train the XGBoost classifier for a maximum of 400 iterations, with early stopping based on validation set metrics used to prevent over-fitting. Subsampling of variables and observations for individual tree learners was also implemented alongside fine-tuning of the gamma hyperparameter to further guard against over-fitting.

Performance Metrics
The predictions made by the final models on the withheld test splits of the three imagery datasets were tallied into true and false positives and negatives for the pōhutukawa and 'other spp.' classes. These counts were used to compute the classification performance metrics shown in Table 4.

Table 4. Performance metrics used to assess classification models. TP = true positive, FP = false positive, TN = true negative, FN = false negative.

Metric                Description                                                                                 Definition
Accuracy              A measure of how often the classifier's predictions were correct.                           (TP + TN) / (TP + TN + FP + FN)
Error                 A measure of how often the classifier's predictions were wrong.                             1 − Accuracy
Cohen's kappa         A measure of a classifier's prediction accuracy that accounts for chance agreement.         (po − pe) / (1 − pe)
Sensitivity (Recall)  The proportion of actual positives (Metrosideros) correctly identified by the classifier.   TP / (TP + FN)
Specificity           The proportion of actual negatives (other species) correctly identified by the classifier.  TN / (TN + FP)

where po is the observed agreement (Accuracy) and pe is the agreement expected by chance.
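These metrics map directly onto confusion-matrix counts; a minimal scikit-learn implementation (function name illustrative) is:

```python
from sklearn.metrics import confusion_matrix, cohen_kappa_score

def classification_metrics(y_true, y_pred):
    # Counts with pohutukawa encoded as the positive class (1).
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "accuracy": accuracy,
        "error": 1 - accuracy,
        "kappa": cohen_kappa_score(y_true, y_pred),
        "sensitivity": tp / (tp + fn),   # recall on actual positives
        "specificity": tn / (tn + fp),   # recall on actual negatives
    }
```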

XGBoost Models
The results from the XGBoost and deep learning models applied to the withheld portions of the datasets used to test classification with phenology (2019 imagery), without phenology (2017 imagery), and on the combined dataset are shown in Table 5. The XGBoost classifiers showed moderately high accuracy on all three datasets (Table 5). The strong phenological traits of the pōhutukawa captured in the 2019 summer imagery produced the model with the highest accuracy (86.7%). The sensitivity and specificity were similar, reflecting nearly equal rates of false negatives and false positives. The variable importance scores extracted from XGBoost are shown in Figure 5. The scaled green pixel values and the RG ratio metric capturing the ratio of red to green pixels had the highest importance in the 2019 model utilising phenology. These two metrics most likely captured differences between the mostly green canopies of other spp. and the extensive showers of red flowers present on many pōhutukawa. The misclassified pōhutukawa often showed lower levels of flowering or were very small (Figure 6a). There were several hundred non-pōhutukawa trees in the dataset with canopies that appeared red in colour. These trees were often falsely classified as pōhutukawa (Figure 6b). This suggests that variables capturing the strong flowering patterns drove the high performance of the XGBoost model but struggled to separate other species with reddish or darker canopies.

Deep Learning Models
The deep learning models performed substantially better than the XGBoost models on all three datasets. The classifier developed using the 2019 imagery with strong phenology achieved an accuracy of 97.4%. The specificity of the model (98.3%) was slightly higher than the sensitivity (96.5%). The false negatives often showed similarities to pōhutukawa with a few exceptions (Figure 7a). The few false positives (Figure 7b) included a single relatively obvious error and some examples with limited or irregular flowering. The classification performance indicated that the model was highly effective at discriminating flowering trees from most other species with reddish canopies or other flowering species present in the data. Without using the strong flowering, the accuracy of the deep learning classifier dropped to 92.7% (Table 5). The model appeared to struggle more with the canopies affected by the lower quality of the imagery: small, blurry canopies without the characteristic appearance visible in other images frequently appeared in the misclassified images, and the false negatives and false positives were visually similar to each other (Figure 7c,d).
As with the XGBoost models, the deep learning model trained on the combined imagery from both 2017 and 2019 (with and without strong phenology) did not show improved performance with a larger dataset. The model achieved 95.2% accuracy and showed the largest difference between sensitivity (93.2%) and specificity (97.3%), reflecting additional false negatives. Most of the misclassified canopies were from the 2017 dataset, and once again these images often showed blurry and indistinct features relative to other correctly classified examples (Figure 7e,f).

Discussion
This study demonstrated that deep learning algorithms could classify pōhutukawa in the study area with a very high level of accuracy using only three-band RGB aerial imagery, with or without the use of phenology to enhance detection. Existing remote sensing approaches to tree species classification rely extensively on calibrated multi or hyperspectral data that can be expensive and complex to capture over larger areas [9,13,18,24,26,27]. In contrast, RGB aerial imagery is routinely captured over large areas. Our results suggest that combining deep learning with this type of imagery enables large-scale mapping of visually distinctive species.
Significant gains in deep learning model accuracy were realised through leveraging the visual distinctiveness of pōhutukawa flowering that was clearly identifiable in 2019 aerial imagery. This distinctive phenological attribute also greatly assisted the collation of a robust number of samples (1150) that was large in comparison to many other tree classification studies [9,24,62]. Although we had access to ground-truth data, the characteristic flowering would have allowed most trees to be readily identified without the ground inspections. Through linking these clearly visible tree locations to previously collected imagery of pōhutukawa that were not flowering, it was possible to rapidly assemble data and train deep learning models that could accurately classify pōhutukawa without the strong phenology. Through combining these two sets of phenologically contrasting images we were able to assemble and train a model from a dataset that more closely approximated a real-world scenario where pōhutukawa exhibited variation in phenological expression. This workflow highlights how imagery with clear phenological traits can be used to rapidly assemble a more general dataset and through this approach mitigate a common bottleneck for training deep learning models.
The phenology of tree species has previously been used to enhance remote sensing classification [63,64]. However, attempting classification using only three-band imagery, with or without phenology, is less common [9]. This imagery lacks the spectral bandwidth required by most traditional methods to discriminate species. The few indices that can be derived are not widely generalisable, as the imagery represents sensor radiance rather than reflectance from the canopy and the imagery is manipulated to enhance visual appearance. To overcome this limitation, we derived features such as textural metrics and simple band ratios aimed at capturing the bright-red, extensive flowering of these species and the characteristic blueish hue and textural properties of the canopies.
This approach was successfully used by the XGBoost classifier for classification in the presence of phenology, and although not as accurate, the model without phenology was still reasonably robust. The performance of both models was high compared to other examples in the literature. For example, [62] achieved 68.3% classification accuracy of pōhutukawa using multispectral satellite data from the Coromandel region in New Zealand. The addition of LiDAR-derived features improved this result to 81.7% but pōhutukawa were noted to be more difficult to detect than several other species targeted in that study. It is likely that having multispectral and LiDAR data would have further improved the XGBoost results in our study, but this would come with higher costs for data acquisition, storage and processing.
The deep learning approach differed in fundamental ways from traditional remote sensing methods. While the models will utilise the colour of the canopies, as demonstrated by the accurate classification of flowering trees, the deep learning approach is also capable of learning harder-to-quantify features. For example, the characteristic appearance of the multi-stem canopy, distinctive canopy texture and extensive budding are relatively easy for knowledgeable analysts to identify in the aerial imagery, and the deep learning models can 'learn' that these or similar features are important. This makes the models harder to interpret but powerful for complex classification tasks [65].
One key methodological limitation of our approach was the need to manually delineate individual tree canopies before training and inference could be carried out. This requirement is present in many traditional remote sensing approaches to species classification. A common workflow is to use LiDAR-derived elevation data alone [66] or in combination with multispectral data (especially the vegetation-sensitive NIR band) to delineate tree canopies [15,67]. While effective, this method introduces the need for costly LiDAR data and substantial analysis to extract canopies. More complex deep learning frameworks may offer an alternative option to perform both segmentation and classification, although the training data are more expensive to collect [31,68].
The high classification accuracies observed in this study are likely subject to some caveats. The models were exposed to the unique characteristics and properties of both aerial imagery datasets. Deep learning approaches do not expect or require calibrated or corrected imagery, but it is possible that subtle differences in resolution or other dataset characteristics may reduce transferability to new, unseen aerial imagery. The level of flowering seen in the 2019 dataset varied widely and many trees showed limited flowering. However, the imagery was also sharper and many of the other characteristic features of pōhutukawa were more easily visible (e.g., buds, canopy form and hue; Figure 3). This provided additional features for the deep learning models and likely contributed to the high accuracy above and beyond the distinctiveness of the flowering. The 2017 imagery had the same nominal resolution (10 cm) but had markedly lower quality and detail (Figure 3). The pōhutukawa all exhibited a blueish hue in this imagery and some of the textural attributes were still discernible, both of which are likely to have contributed to the performance of the deep learning model. For predictions to work in new areas, the features learned from these datasets would need to be discernible in the new imagery. A brief test conducted by reducing the resolution of some of the imagery (bilinear resampling) showed that the accuracy of the combined classifier declined rapidly as the distinctive features were lost, with simulated 15 cm imagery showing only a 70% accuracy rate.
The resolution of the imagery also placed a limit on the size of the trees that could be classified. Many canopies fell between 30 and 60 pixels in size. At this size, the characteristic traits were difficult for a human observer to discern and the models would also have had limited information to learn from. This problem was reduced when phenology could be utilised, but smaller canopies were more frequently misclassified. It is very likely that higher-resolution imagery would have improved the classification accuracy still further and may enhance the transferability of the models. Outside of this domain, for example, where only moderate to low resolution imagery is available, traditional multispectral or hyperspectral methods may be more appropriate as they attempt to recover and utilise the spectral attributes of the canopy that can persist at coarser resolutions or be retrieved through unmixing.
Future work should explore expanding these methods to a greater number of species and validate the transferability of deep learning models across multiple, regional datasets. An extremely promising area of research is the potential for combined deep learning architectures that offer localisation and segmentation as well as classification [31,33]. This work could enable large-scale and repeatable mapping of tree species across a range of environments from lower-cost RGB datasets. This would be useful for biosecurity as well as many other applications.

Conclusions
In this study, we combined distinctive phenological traits and biosecurity surveillance records to develop a high-quality dataset to train and test novel algorithms to detect pōhutukawa from simple three-band (RGB) aerial imagery. Both modelling approaches performed well when the dataset included distinctive phenological traits (extensive, bright red flowers). However, the deep learning algorithm was able to achieve very high accuracies even in the absence of some key traits such as the distinctive flowers. The results of this study suggest that deep learning-based approaches could be used to rapidly and accurately map certain species over large areas using only RGB aerial imagery. Candidate species include those where classification is achievable by an experienced analyst using the same input data. The deep learning approach did appear sensitive to image resolution and quality and higher resolution imagery would likely expand the range of species suitable for classification using this method.