Species Classification in a Tropical Alpine Ecosystem Using UAV-Borne RGB and Hyperspectral Imagery

Páramos host more than 3500 vascular plant species and are crucial water providers for millions of people in the northern Andes. Monitoring species distribution at large scales is an urgent conservation priority in the face of ongoing climatic changes and increasing anthropogenic pressure on this ecosystem. For the first time in this ecosystem, we explored the potential of red, green, and blue (RGB) and hyperspectral imagery borne by unoccupied aerial vehicles (UAVs) for páramo species classification by collecting both types of images in a 10-ha area, together with ground vegetation cover data from 10 plots within this area. Five plots were used for calibration and the other five for validation. With the hyperspectral data, we tested our capacity to detect five representative páramo species with different growth forms using support vector machine (SVM) and random forest (RF) classifiers in combination with three feature selection methods and two class groups. Using RGB images, we could classify 21 species with an accuracy greater than 97%. From hyperspectral imaging, the highest accuracy (89%) was found using models built with RF or SVM classifiers combined with a binary grouping method and feature selection by sequential floating forward selection. Our results demonstrate that páramo species can be accurately mapped using both RGB and hyperspectral imagery.


Introduction
Páramos are located across the highlands of Costa Rica and Panamá down to the northern Andes of Ecuador, Colombia, Venezuela, and Peru [1]. These highly diverse tropical alpine ecosystems provide various services to some of the main capital cities in Latin America, including clean water provision and carbon storage [2], and are an essential reservoir of species with high pharmaceutical potential [3]. Currently, this ecosystem is facing multiple threats that are reducing its surface area. Some of these threats result from unsustainable land use in mining and agriculture, the introduction of invasive plant and animal species, and impacts from climate change [4,5]. It is therefore of paramount importance to understand the vegetation structure, in terms of composition and spatial patterns at multiple scales, to ensure its sustainable management and conservation. However, such understanding has been hindered by the inaccessibility and cost of performing surveys over extents larger than a set of localized plots. While remote sensing has proven useful for acquiring species-level data in other ecosystems, particularly with high-resolution imagery (<1 m/pixel) in the tropical lowlands [6] and temperate ecosystems [7], its practicality in the páramo has not yet been thoroughly evaluated.
The potential of remote sensing lies in its ability to provide continuous data at multiple spatial, temporal, and spectral scales, thereby acquiring information on multiple properties of the landscape. Furthermore, it can be used to inform sampling, or as a sampling approach itself, in areas where sampling is extremely difficult. Digital imagery in the red, green, and blue wavelengths (henceforth RGB imagery), useful in manual species identification, enables the analysis of spatial patterns and provides valuable data to inform field sampling design and to extend field sampling from already sampled sites to unsampled sites [8]. This technology, which was thought to be replaced by the newly available thermal and multispectral sensors, continues to be actively used in the development of digital surface models (DSM), animal monitoring (e.g., orangutans and elephants in Malaysia), geological feature assessment (e.g., coastal features), and low-cost agricultural applications [9]. The development of light-weight hyperspectral sensors has expanded the medical, agricultural, and environmental applications of hyperspectral imaging, as it provides more information on the biochemical structure of surfaces. By measuring reflected light at hundreds of spectral bands, it allows the detection of subtle differences in leaf composition (e.g., pigments, nutrients, water content) and structure (e.g., size, thickness, shape) from the individual plant scale and at varying degrees of aggregation [10]. This detailed information allows the detection of plant diseases, nutrient and water deficits, and invasive plant species, as well as the differentiation of single plant species [11]. Hyperspectral sensors can be mounted on satellites, planes, and unoccupied aerial vehicles (henceforth UAV), with a resulting variation in the spatial resolution of the data and, more importantly, in the availability and affordability of each sensor-vehicle combination [12] in ecological studies.
Ecological studies have benefited from UAV-borne remote sensing technology, which is thought to have revolutionized ecology and conservation [13], especially in developing countries where research and cost-effective monitoring schemes are urgently needed. However, high ecosystem complexity results in painstaking and costly fieldwork campaigns. UAV-borne RGB and hyperspectral studies have mostly been performed in lowland ecosystems, but recently, increased attention has been placed on highland ecosystems, mainly in the alpine grasslands of Europe [14]. In the tropical alpine regions of Latin America, such as the páramo, UAV-borne RGB and multispectral imagery have rarely been used [15,16], and there are no studies using UAV-borne hyperspectral imagery to investigate species detection and classification. One possible explanation is that extracting ecologically meaningful information from hyperspectral data can be complex (i.e., in time and computer processing) because it depends on environmental variables that affect the amount of light reflected to the sensor. Likewise, vegetation properties such as species diversity and plant size and structure can affect species detection. Tropical alpine ecosystems are characterized by conditions that can hinder our ability to use this new technology, such as high humidity and cloudiness, rough topography, and a tremendous diversity of diminutive plant species (around 3500 species), many of them endemic to the páramo (60% of all species) [1]. Consequently, the combination of spectral band selection and species classification methods used to deal with the effect of environmental factors and the high amount of information per pixel (i.e., pixel problem) [17] has a significant effect on the applicability of hyperspectral images in this setting.
Among the most commonly used species classification methods for hyperspectral data are random forests (RF), support vector machines (SVM), and artificial neural networks (ANN). RF, based on the construction of many decision trees from random subsamples of the training data that are then combined using individual tree votes [18], has successfully been used in land cover and plant species classification [19]. SVM, based on a multidimensional feature space where classes are divided using the largest possible separation by applying a kernel function to the training data [20], has resulted in high accuracies, especially in cases of small sample size and a high number of spectral bands [21]. ANN has mostly been used for land cover classification [22]. Raczko and Zagajewski (2017) [23] compared these three methods for tree species classification and found higher overall accuracy using ANN, but more stability in the overall accuracy using RF and SVM when the sample size is small. A large number of bands also increases ANN computational time for model training and testing compared to RF and SVM. Thus, we decided to focus on RF and SVM, as our hyperspectral data have a large number of spectral bands and small sample sizes.
In this study, the first one in the páramo ecosystem, we evaluated the potential of RGB and hyperspectral sensors mounted on UAVs for manual (RGB imagery) and supervised (hyperspectral data) species classification. We quantified the accuracy of species detection from RGB imagery using direct observation. From hyperspectral imagery, we identified the reflectance values and evaluated the potential of hyperspectral data to classify individual species using SVM and RF [23][24][25]. All the analyses were performed at the Matarredonda páramo located in the Cruz Verde-Sumapaz páramo system, at the eastern range of the Colombian Andes (Figure 1).

Materials and Methods
We explored the potential of RGB and hyperspectral high-resolution UAV-borne data for species classification. To this end, we first evaluated the accuracy of species identification from direct observation of high-resolution RGB images (Figure 2B) and, from the results, selected the species to be used for the automated classification using hyperspectral data (Figure 2C). Secondly, for hyperspectral imagery, we tested various approaches to build species classification models to assess the potential of hyperspectral images and identify the most appropriate modeling protocol that ensured reliability and replicability (Figure 2D).

Study Site
The study site is located at Parque Ecológico Matarredonda (4°33′38.1″ N and 74°0′7.3″ W) in the eastern range of the Colombian Andes, which is part of the Cruz Verde-Sumapaz páramo complex (Figure 1). Elevation ranges from 3100 to 3600 m.a.s.l. The mean annual precipitation is 1178 mm, the mean temperature is 8.8 °C, and the mean relative humidity is 88% [26]. Matarredonda comprises 690 ha of páramo ecosystem connected to the Cruz Verde-Sumapaz complex and surrounded by a matrix of forest, roads, and agriculture. In this páramo, vegetation is characterized by short and small growth forms composed of rosettes, shrubs, graminoids, forbs, and mosses. The Cruz Verde-Sumapaz complex has 1857 identified plant species. According to previous surveys, Matarredonda has approximately 30 vascular plant species [27]. However, our annual census has counted more than 97 species, including two emblematic species of the genus Espeletia, Espeletia argentea and Espeletia grandiflora [28]. Three soil orders have been found in Matarredonda: inceptisols, histosols, and entisols, which evidence high humidity levels and organic concentration and reveal past agricultural land use [29].

Figure 2. Workflow for manual species detection using red, green, and blue wavelengths (RGB) (B) and automated classification using hyperspectral data, including preprocessing (C) and automated classification (D). Boxes in dashed lines (A) correspond to input data.

Spectral and Field Data
RGB and hyperspectral data were collected over a 10 ha polygon. RGB data were collected with an FC6310 camera (8.8 mm) mounted on a DJI Phantom Pro drone set at a flying altitude of 79.5 m, which resulted in a set of 94 images with 1 cm pixel resolution. Hyperspectral data were collected with sensor 1003A-20502: Hyperspec Nano VNIR (400-1000 nm with a band width of 2.2 nm), with lens 1004A-21444: F/1.4, 400-1000 nm, compact barrel, C-Mount, 17 mm, a Global Positioning System (GPS), an Inertial Motion Unit (IMU), and a fiber-optic downwelling irradiance sensor (fodis unit), mounted on the UAS-ART DJI Matrice 600 Pro UAV. The flying altitude was 118 m and resulted in a total of 4 strips (henceforth images A, B, C, and D in Table 1) of 272 spectral bands with 3 cm pixel resolution, each image covering an area of ~1738 m².

Table 1. Summary of individuals mapped and validated in the field with their corresponding spectral data for each of the hyperspectral images analyzed (A, B, C, and D). Columns: Species, Image, Individuals, Pixels.

Field data were collected from 10 previously established, permanent plots (1 m × 1 m each), part of an ongoing vegetation experiment, where each plant in the plot has been identified to the species or genus level and its position has been registered (projection system WGS 84). An additional field campaign was performed to add a 100 m² vegetation ground survey around the plots, to include more individuals per species of the 40 species that we manually identified in the RGB imagery based on the 1 m² plot vegetation data.
RGB data were georeferenced and orthorectified using GRASS GIS software. Hyperspectral data were processed using the Hyperspec III software (Headwall Photonics Inc., Fitchburg, MA, USA), which synchronizes the image cubes with the GPS/IMU data to allow orthorectification. Subsequently, we transformed the raw image cube digital numbers (DN) to radiance and reflectance values in the SpectralView software (Headwall Photonics Inc., Fitchburg, MA, USA), using the real-time solar radiance collected by the fodis unit.

RGB Imagery Analysis-Manual Detection
To evaluate the applicability of the RGB images for identifying páramo species, we used a manual training and testing approach. At the 1 m² training plots, we used the ground locations to join each point location to its crown shape in the RGB image. However, since the plot area was too small to include more than a couple of individuals per species, we performed another ground survey in the 100 m² around each plot, collecting the XY location of individuals of the species already matched in the 1 m × 1 m plot. This step increased the training and testing area to 100 m², which was used to develop the identification key and perform the tests.
At the 100 m² training areas, characteristics such as growth form (e.g., rosette, grass, shrub), color, and size were used to develop an identification key for each species in the RGB image that was correctly matched with the ground XY location data. The key consisted of a page with a collection of cropped images of the focal species and a written classification of its size, color, and growth form. The key was then used to help two trained observers consistently map species in the field, dropping a point with the species ID on a newly created vector file. Then, the identification in the test vector file was revised by comparing it with the ground data points collected with the GPS, and the results were evaluated using a confusion matrix and accuracy statistics (Figure 2B). All the analyses were performed using QGIS software [30].
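As an illustration of the accuracy statistics derived from such a comparison, the following minimal sketch computes per-species omission and commission errors from paired ground and image labels. The function name, the use of None for undetected individuals, and the example labels are hypothetical and are not taken from the study data.

```python
from collections import Counter


def omission_commission(reference, mapped):
    """Per-species omission and commission error (%) from paired labels.

    reference[i] is the species recorded on the ground at point i;
    mapped[i] is the species assigned by the observer in the image,
    or None when the individual was not detected (assumed convention).
    """
    ref_total = Counter(reference)
    map_total = Counter(m for m in mapped if m is not None)
    correct = Counter(r for r, m in zip(reference, mapped) if r == m)
    errors = {}
    for species, n_ref in ref_total.items():
        # Omission: ground individuals of this species missed or mislabeled.
        omission = 100.0 * (n_ref - correct[species]) / n_ref
        # Commission: image points labeled as this species that are wrong.
        n_map = map_total[species]
        commission = 100.0 * (n_map - correct[species]) / n_map if n_map else float("nan")
        errors[species] = (omission, commission)
    return errors


# Hypothetical labels for eight ground points (not study data).
reference = ["E. grandiflora"] * 4 + ["Sphagnum sp."] * 4
mapped = ["E. grandiflora"] * 4 + ["Sphagnum sp.", "Sphagnum sp.", None, "E. grandiflora"]
errors = omission_commission(reference, mapped)
for species, (om, com) in errors.items():
    print(f"{species}: omission {om:.1f}%, commission {com:.1f}%")
```

In this toy case a species can have a high omission error (individuals that were not seen) while its commission error stays low, which is the pattern of false absences reported for small or hidden species.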

Hyperspectral Data Analysis-Automated Classification
For the hyperspectral data, we used five of the species correctly identified from the RGB imagery to select at least five individuals per species in each plot and each of the images to perform the classification. Table 1 summarizes the datasets used for each species and image.
The hyperspectral data contained 262 spectral bands (400-1000 nm, band width 2.2 nm). Feature selection is commonly used for this type of data to select the spectral bands with the highest prediction ability [18]. We used three feature selection methods: (1) all spectral bands, (2) the spectral bands with the highest importance based on the RF mean decrease in Gini values, and (3) features selected using the sequential floating forward selection (SFFS). For method (2), the importance of each spectral band was calculated as the gain in the purity of class samples that the band contributes at each split of the individual RF trees. For method (3), the bands were identified using SFFS, based on a Gaussian mixture model (GMM) classifier, which selects bands through backward and forward iterations until it identifies the bands with the highest spectral separability based on the Jeffries-Matusita distance [18].
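To make the separability criterion concrete, the sketch below computes the Jeffries-Matusita distance between two classes one band at a time, assuming each class is univariate Gaussian within a band, and ranks the bands accordingly. This is a simplification for illustration only: it is not the multivariate GMM-based SFFS procedure, it uses the common convention JM = 2(1 - exp(-B)) with B the Bhattacharyya distance, and the reflectance values are hypothetical.

```python
import math
from statistics import mean, variance


def bhattacharyya_1d(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two univariate Gaussians (variances > 0)."""
    pooled = (var_a + var_b) / 2.0
    term_mean = (mean_a - mean_b) ** 2 / (8.0 * pooled)
    term_var = 0.5 * math.log(pooled / math.sqrt(var_a * var_b))
    return term_mean + term_var


def jeffries_matusita(mean_a, var_a, mean_b, var_b):
    """JM distance in [0, 2]; values near 2 indicate well separated classes."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya_1d(mean_a, var_a, mean_b, var_b)))


def rank_bands(class_a, class_b):
    """Return band indices sorted from most to least separable.

    class_a and class_b are lists of pixel spectra (lists of band values).
    """
    scores = []
    for band in range(len(class_a[0])):
        a = [pixel[band] for pixel in class_a]
        b = [pixel[band] for pixel in class_b]
        jm = jeffries_matusita(mean(a), variance(a), mean(b), variance(b))
        scores.append((jm, band))
    return [band for _, band in sorted(scores, reverse=True)]


# Hypothetical reflectance for three bands; only the last band differs in mean.
class_a = [[0.10, 0.30, 0.20], [0.12, 0.31, 0.22], [0.11, 0.29, 0.21]]
class_b = [[0.11, 0.30, 0.60], [0.10, 0.32, 0.62], [0.12, 0.28, 0.61]]
ranked = rank_bands(class_a, class_b)
print(ranked)
```

A floating search would evaluate subsets of bands jointly rather than one band at a time, so the ranking here should be read only as an intuition for why bands with large between-class differences are favored.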
To investigate the effect of an approach using a single focal species vs. an approach with multiple focal species, we tested two class-grouping methods: (1) binary, in which the training data were divided into a focal species class and a non-focal species class, and (2) multiple, in which the training data were divided into 4 to 5 classes corresponding to each of the focal species present in the image.
The RF classifier is a robust method, especially when dealing with complex, highly collinear variables such as hyperspectral bands, and it has been used to identify various types of land cover, from invasive plant species to crop types [31][32][33][34][35]. In this study, the RF classifier was used for two purposes: to assess the importance of the variables via the Gini index and to perform the image classification. The SVM classifier is a non-parametric classifier [36] that has been used successfully in hyperspectral image classification [6,18] (Figure 2D).
A dataset was built extracting pixel values for each spectral band of each individual plant in the sample. The dataset was divided into two random sample partitions using the createDataPartition command in the R package caret [37] that divides the dataset into two groups of pixel values while preserving its class distribution; in this case, the number of pixels per species. The resulting datasets consisted of a training sample from 40% of the total data and a testing sample with the remaining 60%. Additionally, to test the effect of the data partition used (per pixel partition vs. per sample partition), we divided the data for images C and D, which have the highest number of individuals sampled, into training and testing samples. The training data for C and D consisted of 40% of the individuals randomly selected and the testing sample had the remaining 60% of the individuals.
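The difference between the two partition schemes can be sketched as follows. This is a minimal Python illustration, not the caret createDataPartition call used in the study; the function names, the tuple layout (species, individual, spectrum), the stratification of individuals by species, and the toy data are assumptions made for the example.

```python
import random
from collections import defaultdict


def per_pixel_split(pixels, train_fraction=0.4, seed=0):
    """Split pixels at random within each species (class proportions preserved)."""
    rng = random.Random(seed)
    by_species = defaultdict(list)
    for pixel in pixels:
        by_species[pixel[0]].append(pixel)
    train, test = [], []
    for species in sorted(by_species):
        group = by_species[species][:]
        rng.shuffle(group)
        cut = round(train_fraction * len(group))
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


def per_individual_split(pixels, train_fraction=0.4, seed=0):
    """Split whole individuals, so no plant contributes pixels to both samples."""
    rng = random.Random(seed)
    individuals = defaultdict(set)
    for species, individual, _ in pixels:
        individuals[species].add(individual)
    train_ids = set()
    for species in sorted(individuals):
        ids = sorted(individuals[species])
        rng.shuffle(ids)
        cut = round(train_fraction * len(ids))
        train_ids.update((species, i) for i in ids[:cut])
    train = [p for p in pixels if (p[0], p[1]) in train_ids]
    test = [p for p in pixels if (p[0], p[1]) not in train_ids]
    return train, test


# Hypothetical data: 2 species x 5 individuals x 10 pixels, dummy spectra.
pixels = [(sp, ind, [0.0]) for sp in ("C. effusa", "P. goudotiana")
          for ind in range(5) for _ in range(10)]
pixel_train, pixel_test = per_pixel_split(pixels)
ind_train, ind_test = per_individual_split(pixels)
shared = {(p[0], p[1]) for p in ind_train} & {(p[0], p[1]) for p in ind_test}
print(len(pixel_train), len(pixel_test), len(ind_train), len(ind_test), len(shared))
```

With the per-pixel scheme, pixels of the same plant usually end up in both samples, so the testing pixels are not independent of the training pixels; the per-individual scheme avoids this at the cost of fewer independent units.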
The classification models were compared using the same training and test samples. To assess model performance and perform parameter tuning, we evaluated two approaches. The random cross-validation approach does not include spatial autocorrelation, and the spatial validation approach includes the effect of spatially autocorrelated spectral bands in the dataset. For cross-validation, a subsample of the data was left out to test the trained model's performance using the rest of the training data. This process was repeated ten times, and the average of all the tests was used to estimate model performance [38]. For spatial validation, model performance was evaluated using spatial blocks, that is, equally sized polygons dividing the image. Model performance was repeatedly (ten times) tested using the data from all the spatial blocks but one, and the average of all the tests was used to estimate model performance [39].
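The spatial blocking idea can be sketched as below, assuming square blocks defined on the pixel coordinates and a leave-one-block-out scheme. The block size, the coordinates, and the helper names are hypothetical, and the exact blocking used in the study may differ.

```python
def assign_blocks(points, block_size):
    """Map each (x, y) coordinate to the id (column, row) of its square block."""
    return [(int(x // block_size), int(y // block_size)) for x, y in points]


def leave_one_block_out(block_ids):
    """Yield (held_out_block, train_indices, test_indices) for every block."""
    for held_out in sorted(set(block_ids)):
        train_idx = [i for i, b in enumerate(block_ids) if b != held_out]
        test_idx = [i for i, b in enumerate(block_ids) if b == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical pixel coordinates in meters and a 10 m block size.
points = [(0.5, 0.5), (1.5, 0.2), (9.0, 9.0), (12.0, 3.0), (11.0, 14.0), (3.0, 12.0)]
block_ids = assign_blocks(points, block_size=10)
folds = list(leave_one_block_out(block_ids))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Because neighboring pixels fall in the same block, each test fold is spatially separated from its training data, which is why this estimate of performance tends to be lower than the one from random cross-validation.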
Finally, the accuracy of all the combinations of selection and classification approaches was assessed using the testing dataset, to calculate overall accuracy (OA), Kappa accuracy (KA), specificity (SP), and sensitivity (SE) for each species. All the analyses were performed using the R software [40]. For the implementation in R of the caret package, we used the raster [41] and randomForest [42] packages, and for the feature selection process we used kernlab [43], varSel [44] and sf [45].
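For reference, the four statistics can be derived from a binary (focal vs. non-focal) confusion matrix as in the following sketch. The counts are hypothetical and were chosen only to show how overall accuracy and kappa diverge when the two classes are unbalanced.

```python
def binary_metrics(tp, fn, fp, tn):
    """Overall accuracy, kappa, sensitivity and specificity from binary counts.

    tp/fn: focal-species pixels classified correctly/incorrectly;
    tn/fp: non-focal pixels classified correctly/incorrectly.
    """
    n = tp + fn + fp + tn
    overall = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    kappa = (overall - expected) / (1 - expected)
    return {
        "OA": overall,
        "KA": kappa,
        "SE": tp / (tp + fn),  # true-positive rate
        "SP": tn / (tn + fp),  # true-negative rate
    }


# Hypothetical counts: 20 focal pixels against 180 non-focal pixels.
metrics = binary_metrics(tp=10, fn=10, fp=5, tn=175)
print({name: round(value, 3) for name, value in metrics.items()})
```

In this example the overall accuracy is 92.5% while kappa is about 0.53, because most of the agreement comes from the abundant non-focal class.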

Manual Species Identification Using RGB Imagery
Of the 97 species registered in the ground survey, 40 species were identified in the aerial images. From the identified species, we selected those for which at least five individuals in the training areas were correctly identified, to include variation in the species typology (shape, color, texture), which resulted in 21 species that were then searched for in the test plots. For these 21 species, accuracy was above 97%, while omission error ranged between 1.49% and 87.5%. For 12 of those species, omission error was above 10% and up to 87%, showing that these species were difficult to see in the images (false absences). However, if they were spotted, they could be correctly identified with at least 97% accuracy (Table 2, Figure 3). The species with the highest accuracies and precision values and the lowest commission errors corresponded to large rosettes (E. grandiflora, E. argentea, and P. goudotiana), dense clumps (Sphagnum sp.), and species common in the study area (C. effusa and H. goyanesis). These species were selected to perform the automated classification using hyperspectral imagery (Figure 2B).

Automated Species Classification Using Hyperspectral Data
The spectral bands selected using the SFFS and RF approaches were spread across the spectrum, with two clusters located between 420 and 630 nm and between 735 and 840 nm. Different bands were selected for each image, with marked differences for image A and RF feature selection, for which the selected bands are between 550 and 800 nm (Figure 4). The most considerable differences in spectral values among species are located in the NIR zone of the spectrum, where the rosette P. goudotiana has the highest reflectance percentage values while the grass C. effusa has the lowest. These differences are likely to be related to the larger proportion of necromass that usually surrounds grasses compared to the green foliage of the bromeliads. The spectral signatures of two species of the same genus, E. grandiflora and E. argentea, important endemic species of the páramo ecosystem, can be visually differentiated: E. argentea has a higher reflectance percentage than E. grandiflora in the visible range, while in the NIR zone E. grandiflora shows higher values than E. argentea (Figure 4).
Of all the classification methods, the combination of two classes (binary), all bands feature selection, SVM classifier, and random cross-validation obtained the highest overall accuracy percentage (91%), followed by the same combination using the RF classifier (90%) (Figure 5). Overall accuracy was also higher when using two classes (mean 85%) than when including multiple classes (mean 68%), but the differences in the number of samples between the two classes decreased the kappa accuracy to a mean value of 50%, while it remained higher when using multiple classes (mean 67%). The RF method had, in general, higher overall accuracy values than SVM, and both classifiers were affected by the feature selection method, with higher overall and kappa accuracy values for feature selection using the SFFS classifier compared to RF (Figure 5).
As presented in Figure 6, overall and kappa accuracy were similar across images when all the bands were included, independent of the number of classes or the classifiers used. However, when a feature selection method was applied (SFFS or RF), lower overall and kappa accuracy values were observed (SFFS > RF), accompanied by a higher variation between images. This pattern was especially visible when features were selected using RF.
Regarding sensitivity and specificity, four of the five species studied (C. effusa, E. argentea, E. grandiflora, and Sphagnum sp.) had a higher true-negative rate (specificity > 96%) in models constructed from two classes, while the true-positive rate (sensitivity > 53%) was higher for models constructed from multiple classes. The fifth species, P. goudotiana, presented the opposite pattern: higher true-positive rates were found in the models constructed from binary classes, while higher true-negative rates were found in the multiple class models (Table 3). Nevertheless, the models with relatively higher sensitivity and specificity were those with binary classes and the SVM classifier in the case of C. effusa, E. argentea, Sphagnum sp., and E. grandiflora. In general, the models constructed using all bands showed the highest specificity rates (>96%), followed by the models using SFFS feature selection (specificity ~95%). Spatial cross-validation significantly reduced the sensitivity rates for four of the species studied (C. effusa < 10%, E. argentea < 10%, E. grandiflora < 27%, and Sphagnum sp. < 23%), but not for P. goudotiana (<77%). Notably, the highest specificity values for spatial cross-validation were from models constructed using the RF feature selection method and binary classes.
Regarding data partition, the overall accuracy was reduced when using the per-individual data partition approach. However, the effect was smaller for the approaches using the binary classes, particularly in combination with the SVM classifier, where the variation in accuracy was higher. In comparison, the effect was more substantial when using the multiple classes approach, especially in combination with the RF classifier (Supp. Figure S1).

Manual Species Identification Using RGB Imagery
Our study demonstrated the importance of RGB images for low-cost páramo species mapping and monitoring. Despite the size of the study area, which limited the number of individuals per species we could use for training and testing, we were able to identify 41% of the species found in this ecosystem and evaluate the accuracy of 21 species, with accuracy levels above 97%. Many páramo species were often not visible in the RGB image because of their small size and because they are often covered by larger species (e.g., shrubs, large rosettes). However, we found that it was relatively easy to correctly identify some of them thanks to their conspicuous structure (Diplostephium phylicoides) or the large patches they form (Sphagnum sp.). Higher resolution (<1 cm) might reduce the omission errors observed in this study, as we observed in preliminary higher resolution (1 cm pixel) RGB images.
Despite the high diversity of short-stature growth forms and the large proportion of cloudy and rainy days throughout the year, acquiring the images was feasible. It only required a couple of hours of clear sky to obtain the 1 cm pixel resolution used in this study. Previous research has highlighted the importance of this type of imagery in different types of ecosystems, from big trees in highly diverse tropical [46,47] and subtropical forests [48] to small-stature species in temperate grasslands [49], in a diverse set of ecological studies that include mapping, restoration, and monitoring [50][51][52]. In this study, we conclude that high-resolution imagery (1 cm pixel) has great potential for at least 21% of the species, comprising a range of growth forms from big rosettes (P. goudotiana) and endemic species (Espeletia sp.) to mosses (Sphagnum sp.), that are known to be essential in this ecosystem [53][54][55]. Such an outcome shows the potential of UAV-borne RGB imagery to visually detect changes in species composition at low cost in terms of time and money, to enhance ground surveys, and to inform the development of the automated image classification techniques explored in this study.

Automated Species Identification Using Hyperspectral Data
Our study has shown that hyperspectral data effectively differentiated five important páramo species, two of them of the same genus and endemic to this ecosystem, in line with other studies in temperate alpine ecosystems [49,50]. In terms of overall model accuracy, the combination of binary classes, all bands, and the SVM classifier had high accuracy, followed closely by the models developed using binary classes, SFFS for feature selection, and SVM or RF classifiers (Figure 7). Our results agree with Burai et al. 2015 [56], a study performed on vegetation of similar characteristics (herbaceous), where, using all the bands, SVM performed slightly better than RF, but the differences were not significant [50]. Additionally, they found that the number of pixels affected the classification accuracy, which was similar to our results at the species level (P. goudotiana, 15,724 pixels, Supp. Figure S2) but not at the image level. In our study, the image with the highest number of pixels (image A, 20,084 pixels) did not have the highest overall accuracy values, reflecting that image characteristics such as the spatial distribution of the classes affect the accuracy independently of the number of pixels (Figure 6).
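The binary class grouping discussed above can be illustrated with a short sketch. It assumes that the grouping is a one-vs-rest relabelling, in which one focal species is kept and all other classes are merged into a single non-focal class. The function name and the species labels are hypothetical and are not taken from the analysis code of this study.

```python
from collections import Counter


def to_binary_labels(labels, focal):
    """Relabel a multi-class label list as focal vs. 'other' (one-vs-rest)."""
    return [lab if lab == focal else "other" for lab in labels]


# Hypothetical per-pixel labels for three classes (illustrative values only).
multi = ["sp_a"] * 3 + ["sp_b"] * 5 + ["sp_c"] * 4
binary = to_binary_labels(multi, focal="sp_a")

multi_counts = Counter(multi)    # samples per class in the multi-class setup
binary_counts = Counter(binary)  # samples per class after binary grouping
print(multi_counts)
print(binary_counts)
```

Under this relabelling, every pixel that does not belong to the focal species becomes a negative example, so the non-focal class is larger than any single competing class in the multi-class setup.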
Sensitivity and specificity were also higher for the combination of all bands, binary classes, and the SVM model for all the species classified. When comparing overall and kappa accuracies across images, the RF classifier showed more inconsistencies than the SVM classifier, and the SVM classifier had higher accuracy values for all images than the RF classifier. This finding is relevant for monitoring schemes, where multiple images are taken at different points in time; in such cases, the SVM classifier is more stable and consistent despite differences in the hyperspectral data. Regarding the data partition, in cases where the sample size is small and the per-pixel partition is used, we recommend using the binary classes in combination with the SVM classifier. Doing so would result in more consistent outcomes, in terms of variation and decreases in the overall accuracy, when comparing both data partition approaches. This difference in consistency suggests that the combination of binary classes and SVM classifier might be less affected by autocorrelation among pixels, probably because a lower number of classes is involved in the determination of the separating margin with kernel methods. Nevertheless, including high-resolution RGB imagery and visual classification for developing individual species samples, as we did in our study, increases the number of individuals identified and hence the sample size. In our particular case, increasing the RGB imagery resolution would have significantly improved our sample size.
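For reference, the metrics compared above can be derived from a binary confusion matrix as in the following sketch. The counts are hypothetical and the function name is an assumption made for illustration; the formulas are the standard definitions of overall accuracy, sensitivity, specificity, and Cohen's kappa.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute standard metrics from binary confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0)
    and that chance agreement is below 1.
    """
    n = tp + fn + fp + tn
    overall_accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (overall_accuracy - expected) / (1 - expected)
    return {
        "overall_accuracy": overall_accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts for one focal species (not results from this study).
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
print(metrics)
```

With these illustrative counts, the overall accuracy is 0.85, while kappa is about 0.70 because half of the agreement would be expected by chance.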
Upon visual examination of the classified maps, the results were not as expected given the high accuracy values, especially in images with small aggregations of the species studied. On the one hand, species classification methods using a pixel-based approach are prone to inflated accuracy values due to autocorrelated pixel values. On the other hand, the data are collected within a spatial context, and therefore the effect of space should also be included in order to construct models relevant across the landscape. Following Meyer et al. 2019, accounting for space through spatial cross-validation resulted in more variation in the overall accuracy and kappa values [39]. Additionally, we found that this effect is more substantial in the models constructed using multiple classes; with binary classes, overall accuracy did not vary significantly from that obtained with random cross-validation. According to our results, binary models appear to be better suited for species with a highly aggregated spatial distribution because they increase the number of non-focal samples, thereby increasing the accuracy in areas where the number of individuals of the focal species is low. For P. goudotiana, the true negative rate did not decrease as much as for the other species; this might be related to the plant size (e.g., a higher number of pixels), but the true positive rate was affected, which might be due to the dense spatial aggregation of this species.
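The difference between random and spatial cross-validation can be sketched as a difference in how pixels are assigned to folds. The example below is a simplified illustration that assumes square spatial blocks on a pixel grid. It is not the procedure of Meyer et al. or the implementation used in this study, and all function names and parameters are hypothetical.

```python
import random


def random_folds(n_samples, n_folds, seed=0):
    """Assign each sample to a fold at random, ignoring location."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)
    folds = [0] * n_samples
    for position, index in enumerate(order):
        folds[index] = position % n_folds
    return folds


def spatial_block_folds(coords, block_size, n_folds):
    """Assign all samples inside the same square block to the same fold."""
    block_ids = [(int(x // block_size), int(y // block_size)) for x, y in coords]
    unique_blocks = sorted(set(block_ids))
    block_to_fold = {block: k % n_folds for k, block in enumerate(unique_blocks)}
    return [block_to_fold[block] for block in block_ids]


# Hypothetical 4 x 4 pixel grid split into 2 x 2 pixel blocks.
coords = [(x, y) for x in range(4) for y in range(4)]
random_assignment = random_folds(len(coords), n_folds=2)
spatial_assignment = spatial_block_folds(coords, block_size=2, n_folds=2)
print(random_assignment)
print(spatial_assignment)
```

With random folds, neighbouring (and therefore autocorrelated) pixels can end up on both sides of a train/test split, whereas block-wise assignment keeps neighbouring pixels together, which is why accuracy estimated under spatial cross-validation tends to be lower and more variable.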
Our study demonstrates the potential of hyperspectral images for species classification in tropical alpine ecosystems, in accordance with the findings from Marcinkowska-Ochtyra et al. 2018 [14], but in their study, the SVM classifier performed better with a subset of 40 bands, whereas in our analysis, it performed slightly better when using all bands.
Hennessy et al. (2020) [57], in their review on the hyperspectral classification of plants, highlighted the importance of testing multiple feature selection and classification methods due to the high variation in outcomes among studies. Our study supports this statement. We found that the values of accuracy, kappa, sensitivity, and specificity varied depending on the species and image analyzed and the combination of feature selection, number of classes, and classifier applied.
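The number of model configurations implied by these factors can be enumerated with a short sketch. The factor levels below follow the labels used in the Supplementary Materials (ALL, SFFS, RF; Binary, Multiple; RF, SVM; per pixel, per individual; spatial, random). Treating them as a full factorial design is an assumption made here for illustration, and the variable names are hypothetical.

```python
from itertools import product

# Factor levels as labelled in the supplementary figures.
feature_selection = ["ALL", "SFFS", "RF"]
class_grouping = ["Binary", "Multiple"]
classifier = ["RF", "SVM"]
data_partition = ["per pixel", "per individual"]
cross_validation = ["spatial", "random"]

# Model definitions: feature selection x class grouping x classifier.
model_grid = list(product(feature_selection, class_grouping, classifier))

# Evaluation settings: each model under every partition and CV scheme.
full_grid = list(product(model_grid, data_partition, cross_validation))

print(len(model_grid), "model combinations")
print(len(full_grid), "model and evaluation combinations")
for features, grouping, model in model_grid[:3]:
    print(features, grouping, model)
```

Listing the grid explicitly makes it easier to check that every combination has been evaluated for each species and image before comparing accuracy values.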
Our results show that both RGB and hyperspectral imagery have relevant potential for species classification in the páramo, and that the advantages of each technology make them most useful when used together. On the one hand, the low-cost RGB imagery approach allows for higher accuracies with lower computer processing requirements, but has a higher omission rate and can be time-consuming when mapping large areas. On the other hand, classification using hyperspectral imagery, although requiring more computer processing, reaches stable overall accuracies above 75% (when using binary classes, SVM or RF classifiers, and all bands) in a fraction of the time required using RGB imagery. The combination of both can give the best results, as we show in this study, where we took advantage of the RGB imagery to identify plant species that could be used to develop the automated hyperspectral image classification. Thus, RGB imagery appears promising for monitoring several páramo plant species while providing data to develop automated classification with hyperspectral images for a subset of the species.

Conclusions
In this study, we explored the potential of UAV-borne RGB and hyperspectral imagery for species classification in one type of tropical alpine ecosystem (páramo). Our results regarding species identification using RGB imagery highlight the importance of this low-cost technology as a useful tool for vegetation monitoring in this ecosystem, especially given that all the analyses were performed using free and open source software (FOSS), which keeps the costs down and facilitates applicability. However, given the high omission error due to the small vegetation sizes, characteristic of tropical alpine plant communities, we propose the use of higher resolution images, which would not significantly increase flying time but rather increase species identification performance.
Automated hyperspectral species classification, using a combination of multiple feature selection methods, data partitions, classifiers, numbers of classes, and cross-validation approaches, allowed us to thoroughly explore the potential of hyperspectral imagery for páramo species classification. Even though the pixel resolution used in this study allowed us to perform all the analyses, exploring the effect of spatial resolution would provide insight into the resolution thresholds needed for accurate species classification. Future directions should also include a test of the effect of larger samples on model performance, and of the spatial arrangement of those samples, to include the effect of the spatial distribution patterns of the species in the classification process.
The páramo ecosystem is highly threatened by land-use change, plant invasions, and climate change; thus, the potential of this technology to help understand, monitor, and detect threats at the landscape scale opens a promising avenue for species management and conservation in tropical alpine regions. RGB imagery and hyperspectral data offer several advantages for monitoring changes in species survival and distribution patterns. RGB imagery can be used for annual monitoring of selected areas, and hyperspectral imagery can be applied for five-year monitoring at the landscape level, with the advantage of all the other well-known applications of these data (e.g., biochemical signals of change and invasive species detection). Future work could build on this method to explore plant trait mapping using UAV-borne hyperspectral images, monitor individuals and changes in spatial distribution patterns, and explore the transferability of these methods to other páramos.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2504-446X/4/4/69/s1, Figure S1: Variation in model performance per data partition approach (per pixel and per individual) in terms of overall accuracy (for spatial and random cross-validation) given the combination of feature selection, class grouping method (Binary or Multiple), and classification method (random forest (RF) and support vector machine (SVM)). In the case of feature selection: all the spectral bands (ALL), spectral bands selected using Sequential Floating Forward Selection (SFFS) using Jeffries-Matusita distance as a separability index (FSSF), and spectral bands with the highest importance according to the random forest mean decrease in Gini (RF). Figure S2: Variation in model performance per species in terms of producer accuracy (for spatial and random cross-validation) given the combination of feature selection, class grouping method (Binary or Multiple), and classification method (random forest (RF) and support vector machine (SVM)). In the case of feature selection: all the spectral bands (ALL), spectral bands selected using Sequential Floating Forward Selection (SFFS) using Jeffries-Matusita distance as a separability index (FSSF), and spectral bands with the highest importance according to the random forest mean decrease in Gini (RF).

Funding:
This research was funded by University of los Andes: "Fondo de Investigaciones para apoyar programas de profesores de la Facultad de Ciencias de la Universidad de los Andes", grant number INV-2019-84-1805, and by Colciencias "Patrimonio autónomo fondo nacional de financiamiento para la ciencia, la tecnología y la innovación Francisco José de Caldas", grant number 120471451294.