Classiﬁcation of Mediterranean Shrub Species from UAV Point Clouds

: Modelling ﬁre behaviour in forest ﬁres is based on meteorological, topographical, and vegetation data, including species’ type. To accurately parameterise these models, an inventory of the area of analysis with the maximum spatial and temporal resolution is required. This study investigated the use of UAV-based digital aerial photogrammetry (UAV-DAP) point clouds to classify tree and shrub species in Mediterranean forests, and this information is key for the correct generation of wildﬁre models. In July 2020, two test sites located in the Natural Park of Sierra Calderona (eastern Spain) were analysed, registering 1036 vegetation individuals as reference data, corresponding to 11 shrub and one tree species. Meanwhile, photogrammetric ﬂights were carried out over the test sites, using a UAV DJI Inspire 2 equipped with a Micasense RedEdge multispectral camera. Geometrical, spectral, and neighbour-based features were obtained from the resulting point cloud generated. Using these features, points belonging to tree and shrub species were classiﬁed using several machine learning methods, i.e., Decision Trees, Extra Trees, Gradient Boosting, Random Forest, and MultiLayer Perceptron. The best results were obtained using Gradient Boosting, with a mean cross-validation accuracy of 81.7% and 91.5% for test sites 1 and 2, respectively. Once the best classiﬁer was selected, classiﬁed points were clustered based on their geometry and tested with evaluation data, and overall accuracies of 81.9% and 96.4% were obtained for test sites 1 and 2, respectively. Results showed that the use of UAV-DAP allows the classiﬁcation of Mediterranean tree and shrub species. This technique opens a wide range of possibilities, including the identiﬁcation of species as a ﬁrst step for further extraction of structure and fuel variables as input for wildﬁre behaviour models.


Introduction
Wildfires are the main cause of forest ecosystem disturbance in the Mediterranean basin, modifying the vegetation, fauna, soil, and affecting hydrological and geomorphological processes [1,2]. Although wildfires can have a natural origin, playing an important role in ecological cycles [3], their behaviour is being affected in the Mediterranean basin by anthropogenic activity [4][5][6][7]. Humans are modifying land use and climate causing an increase in wildfires and burned area, modifying their natural frequency and reducing their period of recurrence [8,9]. Since the mid-20th century, socioeconomic development has promoted rural-urban migration, causing the abandonment of the traditional land exploitation and increasing fuel loads, which imply a greater risk of igniting a forest fire [10]. Climate change models indicate that the Mediterranean basin will be one of the most affected globally by the rising of temperatures and reduction of precipitation, considering it as a climate change "hot spot" [4,11]. These factors together with the increasing of soil aridity will affect directly the fire regime [5], expecting the increase of wildfire frequency and burned areas [6]. of penetration into the tree canopy, reducing the information obtained from the forest structure [32].
Some studies have used UAV-DAP data for classification of vegetation with accurate results [27,28,33,34]. Nevalainen et al. [27] detected and classified four tree species in boreal forests using UAV-DAP point clouds and hyperspectral image mosaics and compared different techniques of machine learning (k-nearest neighbours, Random Forest, and MultiLayer Perceptron). Tuominen et al. [33] also used UAV-DAP point clouds and hyperspectral image mosaics together with k-nearest neighbours and Random Forest classifiers with a more ambitious goal: to classify 26 different tree species. In the same line, Camile Sothe et al. [28] similarly used UAV-DAP and hyperspectral data to classify 12 species in a subtropical forest in Brazil. What all these studies have in common is that they provide a map of the classified species as a result. In contrast, Mesas-Carrascosa et al. [34] proposed the classification of UAV-DAP point clouds obtained from RGB imagery to determine the points belonging to vineyards. These points were used to determine vine height. Once the classification of the point cloud is done, three-dimensional information can be derived (e.g., tree heights), enabling a further application of this 3D information (e.g., as fire models' input). In this sense, the classification of the point cloud of a forest area would imply the possibility of extracting information of the vegetation structure. In addition, Mesas-Carrascosa et al. [34] performed a classification using only consumer cameras, far from the capabilities and costs of hyperspectral sensors. There are currently few articles describing the classification of vegetation types using UAV-DAP point clouds. In this regard, to our knowledge there is no study that attempted to classify shrub species in addition to tree species using UAV-DAP point clouds, which is particularly interesting due to their importance in wildfire modelling.
In this paper, we propose a methodology for the classification of UAV-DAP derived point clouds in tree and shrub species' types, using multispectral data with low spectral resolution cameras. We evaluated the methods in two Mediterranean shrub-dominated forest areas and compared the performance of different classifiers.

Study Sites
The study area is divided into two zones located in the Natural Park of Sierra Calderona, between the eastern Spain provinces of Valencia and Castellon ( Figure 1). This Park is one of the most emblematic protected natural areas in the province of Valencia. The Sierra Calderona forms part of the last foothills of the Iberian System, being constituted by a NW-SE mountain range. Most of the area is below 1000 m above sea level and it is a typical example of a pre-coastal Valencian Mediterranean mountain range [29]. The climate is coastal Mediterranean with mild changes in temperature and a mean temperature of 16 • C. Rainfall in summer is low, contrasting with autumn and spring, where most of the annual rainfall is accumulated (450 mm) usually with torrential rains. The main species of the study areas and their description are listed in Table 1.
The study site referred to as Area 1 encompasses 2889 m 2 , where the existence of the current taxa derives from a forest fire that devastated 70 hectares of the Natural Park of Sierra Calderona in 2014. The ecological succession caused by the fire is defined by the presence of shrub species adapted to intense solar radiation, with an indifferent substrate and adapted to soils with a high degree of stoniness and poor in nutrients. The density of shrubs is very high and forms an almost continuous horizontal layer of vegetation, where the different species are mixed without exceeding 150 cm in height. There is no tree cover in the area. The study site referred to as Area 2 encompasses 11,455 m 2 , where the vegetation is subject to preventive silvicultural treatments that have modified the fuel pattern. This area corresponds to a strip with a width between 20 and 30 m. It is characterised by scattered Pinus halepensis trees that are pruned to 2/3 of the tree height. The shrubs in this area are well formed and isolated. . Digital elevation of the Natural Park of Sierra Calderona with the two study areas (B); details of the study Area and 2 (D). The green dots represent vegetation points measured in the field. The reference s is EPSG:25830. Digital elevation model was obtained from ALS data, which, together with t thoimages, were provided by the Spanish National Aerial Orthophotography Program (PNOA CC BY 4.0 www.scne.es (accessed on 11 December 2020).

Overview of the Method
A general overview of the methodology is shown in Figure 2. Firstly, data collection was carried out from three different sources. Several multispectral flights were performed over the study areas and the positions of the Ground Control Points (GCPs) were collected by Global Navigation Satellite System (GNSS) and applying Real-Time Kinematic (RTK). This same technique was used to geolocate the position of the individuals studied; at the same time, the species of each individual were identified. ALS point clouds of the study area were obtained from the Spanish National Aerial Orthophotography Program (PNOA). In the second step we carried out the photogrammetric process to obtain the point clouds, where we performed a radiometric calibration of the multispectral images, aligned the images to reconstruct the flight scene, and densified the point cloud obtained during the alignment. In the next step we started the processing of the point clouds with the normalisation of the heights, where the bare ground points from the UAV-DAP and airborne laser scanning (ALS) point clouds were merged. Next, an extraction of spectral, geometric, and neighbourhood features from the point cloud was performed. Using the extracted features and training samples obtained from field data, a comparison of the following Remote Sens. 2022, 14, 199 5 of 22 classifiers was performed: Decision Tree, Extra Trees, Gradient Boosting, Random Forest, and MultiLayer Perceptron. The method with the highest mean cross-validation scores was selected. Subsequently, a feature selection based on permutation feature importance and Pearson correlation was performed. Once the features were selected, the point cloud was classified using the selected classifier. In the last step of point cloud processing, we performed a segmentation of the point cloud based on its geometry, reclassifying it to regularise the classification. Finally, we validated the results obtained.

GNSS and UAV Data Collection
Fieldwork was carried out on 23-24 July 2020, performing an aerial multispectral data collection and a GNSS data collection campaign. UAV field work consisted of two flights (one per study area). Both flights were conducted close to solar noon to minimize shadowing, under sunny conditions, quite windless for the first flight and a light wind for the second flight. The campaign flights were conducted with a DJI Inspire 2 UAV, a quadcopter drone weighting 3.44 kg, with a maximum payload of 0.81 kg. Its four brushless motors are powered by two LiPo batteries of 4280 mAh, allowing for flights of up to 27 min, depending on the payload and meteorological conditions. The DJI Inspire 2 was equipped with a Micasense RedEdge multispectral camera (Micasense Inc., Seattle, WA, USA), with five spectral bands: blue (475-nm centre, 20-nm bandwidth), green (560 nm, 20-nm bandwidth), red (668 nm, 10-nm bandwidth), red edge (717 nm, 10-nm bandwidth), and near-infrared (840 nm, 40-nm bandwidth). The camera is composed of five sensors (4.8 × 3.6 mm) of 1.2 MP of resolution with focal length fixed at 5.5 mm, giving a sensor pixel size of 3.75 µm. The image format of this camera is 16-bit TIFF.

GNSS and UAV Data Collection
Fieldwork was carried out on 23-24 July 2020, performing an aerial multispectral data collection and a GNSS data collection campaign. UAV field work consisted of two flights (one per study area). Both flights were conducted close to solar noon to minimize shadowing, under sunny conditions, quite windless for the first flight and a light wind for the second flight. The campaign flights were conducted with a DJI Inspire 2 UAV, a quadcopter drone weighting 3.44 kg, with a maximum payload of 0.81 kg. Its four brushless motors are powered by two LiPo batteries of 4280 mAh, allowing for flights of up to 27 min, depending on the payload and meteorological conditions. The DJI Inspire 2 was equipped with a Micasense RedEdge multispectral camera (Micasense Inc., Seattle, Washington, USA), with five spectral bands: blue (475-nm centre, 20-nm bandwidth), green (560 nm, 20-nm bandwidth), red (668 nm, 10-nm bandwidth), red edge (717 nm, 10-nm bandwidth), and near-infrared (840 nm, 40-nm bandwidth). The camera is composed of five sensors (4.8 × 3.6 mm) of 1.2 MP of resolution with focal length fixed at 5.5 mm, giving a sensor pixel size of 3.75 μm. The image format of this camera is 16-bit TIFF.
The flight of study Area 1 lasted 6 min, taking 1150 images (one per band) and covering an extension of 2.12 ha with a mean flying altitude of 49.2 m and a longitudinal and transverse overlap of 80%, whereas the flight over study Area 2 lasted 7 min, acquiring 1295 images over an extension of 2.79 ha with a mean flying altitude of 59.6 m and an overlap equal to the previous flight, 80%. In addition, several images of a calibration panel The flight of study Area 1 lasted 6 min, taking 1150 images (one per band) and covering an extension of 2.12 ha with a mean flying altitude of 49.2 m and a longitudinal and transverse overlap of 80%, whereas the flight over study Area 2 lasted 7 min, acquiring 1295 images over an extension of 2.79 ha with a mean flying altitude of 59.6 m and an overlap equal to the previous flight, 80%. In addition, several images of a calibration panel of known reflectance were taken before and after the flights for the radiometric calibration of the images.
During this campaign, the position and species of a total of 1036 individuals were collected, corresponding to the most representative shrub and tree species in both study areas (Table 1). In this work, the projection on the floor of the barycentre of each individual was taken by GNSS positioning. In addition, to increase the accuracy of georeferencing taken by UAV, during these works the centres of 12 GCPs were georeferenced (6 GCPs per study area). The GNSS survey equipment used during the field campaign was a Leica GPS1200 using a RTK technique with an accuracy of ±(10 mm + 1 ppm) and ±(20 mm + 1 ppm) in horizontal and vertical, respectively.

Photogrammetric Processing
All processes were carried out on a Windows 64-bit system, with the following specifications: RAM 32 GB, CPU IntelI CITM) i7-8700 CPU @ 3.20 GHz, and GPU GeForce RTX 1060. The images were processed using Metashape version 1.5.3 (Agisoft, St. Petersburg, Russia). The workflow for generating geometric and radiometric consistent data from multispectral images is presented in [30]. The workflow in Metashape starts with the Remote Sens. 2022, 14, 199 7 of 22 radiometric calibration and continues with identifying, matching, and monitoring the movement of common features between images. Radiometric calibration compensates for sensor black level, sensor gain and exposure settings, sensitivity of the sensor, and lens vignette effects [31]; this calibration was favoured by the optimal weather conditions. The process continues with the feature extraction, where Agisoft uses algorithms similar to the well-known Scale Invariant Feature Transform (SIFT) object recognition algorithm [32]. The next step was to determine the interior orientation parameters of the camera (main point, focal length, and lens distortion) and the exterior orientation parameters (projection centre coordinates and rotation angles around the three axes), improving subsequently their positions with a bundle-adjustment algorithm [32,35]. The GCPs collected in the field were used in this phase to improve the orientation of the images, as well as to scale the photogrammetric block and provide it with absolute coordinates. During this process, the 3D coordinates of the features extracted in the first processing step were obtained, creating a point cloud commonly referred as tie point cloud. In this step, an additional gradual filtering process was carried out to reduce the overall pixel error and to optimize image alignment. Finally, once we obtained the final position and orientation of the images, a pair-wise depth map computation was performed [32] using the tie point cloud to generate an approximate digital terrain model from which new points were obtained, creating a dense point cloud. After the densification process, the resulting point clouds were clipped within the study areas.

Height Normalisation
To introduce the height as a feature in the point classification it was necessary to perform a height normalisation. This process was divided into three steps: detection of ground points, creating an interpolation surface representing the ground or digital terrain model (DTM), and reduction of heights to zero-level. In the first step, bare ground points were detected carrying out a supervised classification. In Area 1, 5417 bare ground points and 8826 vegetation points were taken as training samples. Regarding Area 2, 17,923 bare ground points and 19,540 vegetation points were collected. Point clouds were classified using Random Forest, applying 10-fold cross-validation over the training samples, where only spectral features were used in the model fit. Figure 3 shows an example of the results of the ground point classification.  Due to the lack of penetration of UAV-DAP point clouds in the vegetation, areas covered by vegetation led to a discontinuity of bare ground points. To obtain points from the ground in these areas, we used freely available ALS data provided by the Spanish PNOA to complement the ground points, identifying bare ground points based on adaptive TIN models [36]. To avoid geolocation errors, a registration of the points detected as ground of both clouds was performed using the Iterative Closest Point (ICP) algorithm. The minimum point density of the ALS data is 0.5 points·m −2 , with the elevation accuracy being 15 cm and the horizontal accuracy 30 cm. The ALS data used in this study were collected between October and November 2015.
Secondly, a ground Triangulated Irregular Network (TIN) was constructed using the points classified as bare ground (UAV-DAP and ALS data). In a next step, the height for the remaining points above this TIN were calculated, obtaining the normalised point cloud. After height normalisation, the points of the UAV-DAP cloud classified as bare Due to the lack of penetration of UAV-DAP point clouds in the vegetation, areas covered by vegetation led to a discontinuity of bare ground points. To obtain points from the ground in these areas, we used freely available ALS data provided by the Spanish PNOA to complement the ground points, identifying bare ground points based on adaptive TIN models [36]. To avoid geolocation errors, a registration of the points detected as ground of both clouds was performed using the Iterative Closest Point (ICP) algorithm. The minimum point density of the ALS data is 0.5 points·m −2 , with the elevation accuracy being 15 cm and the horizontal accuracy 30 cm. The ALS data used in this study were collected between October and November 2015.
Secondly, a ground Triangulated Irregular Network (TIN) was constructed using the points classified as bare ground (UAV-DAP and ALS data). In a next step, the height for the remaining points above this TIN were calculated, obtaining the normalised point cloud. After height normalisation, the points of the UAV-DAP cloud classified as bare ground were removed. Points under 20-cm height were removed, with the aim of studying only the bushes of a certain entity, eliminating grass and very small shrubs of the study area.

Feature Extraction
Once the photogrammetric process of obtaining the point cloud and its subsequent normalisation was done, the coordinates (X, Y, Z coordinates) and the reflectance values (blue, green, red, red edge, and near-infrared bands) for each point were stored and 22 spectral features for these points were calculated (Table 2).
Finally, we conducted a neighbourhood analysis of each point, determining the neighbourhood of a point as p ∈ R 3 , with R 3 being the set of points inside a sphere s, of centre p, and radius 10 cm. From this neighbourhood analysis 10 features were obtained (Table 3).

Machine and Deep Learning Models
To carry out the point cloud classification by species, we performed a supervised classification of the point cloud, also known as point cloud semantic segmentation [57]. The training samples used to fit the models were extracted from field work data, where different geolocated individuals of each species were identified. A planimetric buffer of 15 cm was applied to these geolocated points, with this value being the minimum radius of the smallest individuals identified. The resulting polygon was used to clip the point cloud. After obtaining the samples for each class (Table 1), a process of manual filtering was conducted to avoid the introduction of outliers in the training samples. Figure 4 shows a representation of the training samples of each species based on some features extracted from the point cloud. This figure is a pairwise relationship plot of the dataset created from 100 random points of each of the species studied with their raw spectral bands and normalised height. It can be seen from the Kernel Density Estimator (KDE) located on the diagonals how in area 1 the species Anthyllis cytisoides differs from the others in the blue and red bands or how the species Pistacia lentiscus can be differentiated using only the normalised height information. This figure also shows the difficulty in distinguishing some species studied on the basis of these features. For example, in area 2 we found that the species Cistus albidus and Salvia rosmarinus have a similar spectral response and normalised height, making them difficult to differentiate. The relationships established in the upper and lower corners help to visualise the similarity or the difference between species. On the basis of its spectral features, the best distinguished species in area 1 was Anthyllis cytisoides, while the rest of the species were difficult to differentiate. In area 2, the normalised height was the feature that best distinguishes the Pinus halepensis species, while the spectral response of the different species was similar. In summary, Figure 4 shows that all studied species cannot be distinguished from each other on the basis of their raw spectral features and their height, making an in-depth analysis necessary.
Different machine learning and deep learning methods were evaluated for this classification, using the Python library Scikit-learn [58]. The machine learning methods evaluated were DT [59], Extra Trees [60], Gradient Boosting [61], and Random Forest [62], as well as the deep learning method MultiLayer Perceptron [63]. For the evaluation of these methods, a fine-tuning of the hyperparameters (Table 4) was done with the aim of optimizing the models. This fine-tuning was carried out by setting up a grid of hyperparameters. The accuracy of each combination of hyperparameters was assessed by cross-validation with 10-fold to ensure the independence between training and test data. The chosen hyperparameters for each method were those with the highest mean cross-validated score (mCVs). For the selection of the point cloud classification method, the mCVs of each method were also considered. Table 2. Summary of vegetation indices with their respective equations and references; ρ is defined as the digital number of the point for a given band.

Name (Description) Equation
Dist_mean (Mean distance of the point with its neighbouring points) Dist_std (Standard deviation of the point with its neighbouring points)  After selecting the model, feature selection was applied with the aim of reducing the number of features in the final model to avoid overfitting. This feature selection was based on the permutation feature importance of the fitted model as well as the Pearson correlation between features. The permutation feature importance is based on the decrease in the score of a model when a single feature value is randomly shuffled [62]. Using this technique, the features in each area were ordered based on their permutation feature importance value. The Pearson correlation of the features was then calculated, and the features were split into clusters. Each feature, ordered according to its feature importance, was assigned a weight that decreased according to the repeatability of the cluster to which it belonged. Therefore, even if a feature had a high feature importance, if features from the same cluster with higher importance had been selected, the latter might not be selected. After the feature extraction, a classification was done again with the chosen features, using the same classification method used previously.

Point Cloud Segmentation and Reclassification
After finding the model that best fit the training samples and predicting the class of each point, the high spatial heterogeneity of the classified point cloud was reduced by performing a geometric segmentation of the point cloud, with the aim of obtaining point clusters representing different individuals. We applied the algorithm li2012 [64] from the lastrees function of lidR package [65]. This algorithm is based on region growing to determine whether a point is near or far from existing vegetation, taking advantage of the relative separation between objects to discern between individuals. To achieve the objective of containing only one individual per segment, the algorithm was parameterised to perform an over-segmentation, favouring that there were no segments containing two or more individuals of different species. The selected parameters for segmenting the point clouds were: threshold 1 = 0.1 m, threshold 2 = 0.2 m, limit of threshold number 1 = 1.5 m, minimum height of a detected tree = 0.01 m, maximum radius of a crown is 1 m, and search radii = 0.1 m and 0.5 m, respectively, for areas 1 and 2. After segmenting the point cloud, each segment was assigned the most repeated class to increase the spatial homogeneity of the point classification.

Evaluation
For evaluation purposes, we manually segmented and classified different individuals of each species to create the testing set. These point clouds were taken as a reference for an accuracy assessment of the point classification performed. The evaluation was done by comparing the class obtained from each point with the reference class. The confusion matrix of the reference samples was obtained, as well as the precision (Pr), recall (Re), and F-measure (Fm) values from the following equations. between species. On the basis of its spectral features, the best distinguished species in area 1 was Anthyllis cytisoides, while the rest of the species were difficult to differentiate. In area 2, the normalised height was the feature that best distinguishes the Pinus halepensis species, while the spectral response of the different species was similar. In summary, Figure 4 shows that all studied species cannot be distinguished from each other on the basis of their raw spectral features and their height, making an in-depth analysis necessary.
where A represents the confusion matrix, TP is True Positives, FP is False Positives, FN is False Negatives, Pr is the precision, Re is the recall, and Fm is the F-measure. Figure 5 shows a summary of the intermediate results to obtain the classified point cloud. Firstly, the point cloud was obtained, normalised, and the bare ground points were removed. The normalised point cloud was classified using six species' classes. Subsequently, the point cloud was segmented according to its geometry to perform a reclassification that homogenised the previously performed classification.

Generation and Processing of the Point Clouds
The point cloud obtained for Area 1 was composed of 4,107,175 points with an average density of 1421.5 points·m −2 , whereas the point cloud for Area 2 was composed of 11,514,975 points with an average density of 1005.3 points·m −2 . Their positional error was estimated through the Root Mean Square Error (RMSE) between the GCPs and the position of the computed 3D point, being 3.45 cm for Area 1 and 3.79 cm for Area 2.
Regarding the Random Forest classification of the UAV-DAP point cloud to separate bare ground and vegetation points, with respect to the results obtained in each of the iterations carried out using cross-validation, a mCVs of 0.998 and a standard deviation of cross-validated score (stdCVs) of 0.004 were obtained for Area 1. For Area 2, a mCVs of 0.999 and a stdCVs of 0.003 were obtained.
Prior to the merging of the UAV-DAP and ALS bare ground points, we performed a statistical analysis of the point clouds. In order to compare the correct alignment of the UAV-DAP and ALS bare ground points, the nearest neighbour distance of each point between both clouds was calculated. Analysing the distances in the Z-component, a mean distance of −3.4 cm and a standard deviation of 10.6 cm were obtained for Area 1; for Area 2, these values were 1.8 cm and 21.2 cm, respectively. The high standard deviation was due to two reasons: The main one was the discontinuity of the ground in the UAV-DAP point cloud due to gaps caused by vegetation; the second one was the accuracy in the Z component of the methods, 3.2 cm for the UAV-DAP (measured in the GCPs) and 15 cm for the ALS point cloud. The use of the ALS reduced the gaps in the points classified as ground, adding information where photogrammetric clouds were limited due to the reduced penetration of UAV-DAP data through vegetation. Remote Sens. 2022, 14, x FOR PEER REVIEW 13 of 22

Generation and Processing of the Point Clouds
The point cloud obtained for Area 1 was composed of 4,107,175 points with an average density of 1421.5 points·m-2, whereas the point cloud for Area 2 was composed of

Assessment of Classification Methods
The selection of the classifier among the different methods analysed (Decision Tree, Extra Trees, MLP, Gradient Boosting, and Random Forest) was based on their cross-validation results applying the hyperparameters that best adapted to the training samples of vegetation species. The model that achieved the highest reliability and lowest dispersion was Gradient Boosting (Figure 6), both for Area 1 and Area 2. In both areas, the model with the highest reliability for a unique k-fold was also Gradient Boosting with 0.89 and 0.95, for the study areas 1 and 2, respectively. Analysing the set of repetitions, Gradient Boosting obtained higher average accuracy values with lower dispersion (0.82 ± 0.05 and 0.91 ± 0.02 mean and standard deviation of cross-validation score for study Area 1 and 2, respectively). In this respect, there were slight differences between Gradient Boosting and Extra trees (0.81 ± 0.06 and 0.90 ± 0.02 mean and standard deviation of cross-validation score for study Area 1 and 2, respectively). Using Decision Trees, MultiLayer Perceptron, and Random Forest, less accurate results were obtained. Regardless of the classifier used, more accurate results were obtained for Area 2, since the species analysed in Area 1 had a greater geometric and spectral similarity compared to the species analysed in Area 2 ( Figure 4). In this sense, other studies obtained similar results when classifying UAV-DAP point clouds using geometric and spectral features [66]. In this study, Random Forest and Gradient boosting classifiers were compared, finding that the classifier with the lowest error was also Gradient Boosting. The results are also comparable to those obtained classifying satellite images, where the Gradient Boosting classifier outperformed different deep neural networks and other machine learning classifiers using spectral, spatial, textural, and vegetation index features [67]. Similar results were obtained in [68], where the methods Regression Trees, Random Forest, and Gradient Boosting were compared to classify forest fuel types from ALS data and satellite imagery, concluding that the best results were obtained using Gradient Boosting.  Table 5 shows the results of the different combinations of hyperparameters applied to the Gradient Boosting method. The combination of parameters that obtained the highest mCVs for Area 1 was the one formed by the minimum number of samples required to split an internal node of 3, minimum number of samples required to be at a leaf node of 2, and a maximum depth of the tree of 5, with a mean fit time of 255 s for each of the 10 iterations. On the other hand, the highest mCVs for the second area was obtained by the combination of a minimum number of samples required to split an internal node of 2, minimum number of samples required to be at a leaf node of 5, and a maximum depth of the tree of 10, with a mean fit time of 1317 s. Table 5 also shows how the hyperparameter with the greatest influence on the processing time was the maximum depth of the tree, multiplying up to 8 times the time by setting the value to "none", compared to value "5", without affecting the improvement of the model. If value is set to "none", the nodes are expanded until all leaves are pure or until all leaves contain less than the minimum number of samples to split an internal node, which explains the increase in processing time. In contrast, if we analyse the minimum number of samples at a leaf node, the best results were obtained mostly by setting this hyperparameter to "5". Table 5. Heatmap of the hyperparameters (minimum number of samples at a leaf node, minimum number of samples to split an internal node, and maximum depth of the tree) applied to the Gradient Boosting method. The table shows the mean cross-validated scores obtained over 10 iterations  Table 5 shows the results of the different combinations of hyperparameters applied to the Gradient Boosting method. The combination of parameters that obtained the highest mCVs for Area 1 was the one formed by the minimum number of samples required to split an internal node of 3, minimum number of samples required to be at a leaf node of 2, and a maximum depth of the tree of 5, with a mean fit time of 255 s for each of the 10 iterations. On the other hand, the highest mCVs for the second area was obtained by the combination of a minimum number of samples required to split an internal node of 2, minimum number of samples required to be at a leaf node of 5, and a maximum depth of the tree of 10, with a mean fit time of 1317 s. Table 5 also shows how the hyperparameter with the greatest influence on the processing time was the maximum depth of the tree, multiplying up to 8 times the time by setting the value to "none", compared to value "5", without affecting the improvement of the model. If value is set to "none", the nodes are expanded until all leaves are pure or until all leaves contain less than the minimum number of samples to split an internal node, which explains the increase in processing time. In contrast, if we analyse the minimum number of samples at a leaf node, the best results were obtained mostly by setting this hyperparameter to "5". Table 5. Heatmap of the hyperparameters (minimum number of samples at a leaf node, minimum number of samples to split an internal node, and maximum depth of the tree) applied to the Gradient Boosting method. The table shows the mean cross-validated scores obtained over 10 iterations and the mean fit time used in their calculation. The red gradient is associated with Area 1 values, while the blue gradient represents values from Area 2.

Feature Selection and Final Classification Model
Once the classifier that best suited our study, gradient boosting, was selected, we applied the feature selection process. To reduce the number of features, we obtained Pearson's correlation for both areas 1 and 2. Figure 7 shows the existing correlation between features for Area 1, showing the high correlation between spectral features. These features were not directly removed since our first objective was to discern which spectral indices were most important in the classification. Subsequently, the most correlated and least informative features were removed to obtain the final model. Thus, we evaluated the features based on their ability to differentiate between tree and shrub species. From the hierarchical clustering tree, which groups the features based on their Pearson's correlation, it can be observed that there were five clusters of features. The first was formed by the geometric feature Z and a geometric feature extracted from the neighbourhood analysis, the feature Z_mean. The second cluster was formed by 17 features, all of them spectral features except the NDVI_mean, with this being a spectral feature extracted from the neighbourhood analysis; in this cluster are also SRxNDVI, blue, EVI, RDVI, ARVI, SARVI, SAVI, OSAVI, IPVI, NDVI, GNDVI, MSAVI, SR, RedEdge, NIR, and DVI. The third cluster consisted entirely of the spectral features NGBDI, BI, red, MSR, green, RVI, and NBRDI. The fourth cluster was mainly composed of spectral features. These features had maximum correlation in absolute value with the spectral features NormG, GR, NGRDI, and RGRI. The last cluster consisted entirely of features extracted from the neighbourhood analysis: Dist_std, Zmax_Z, Z_Zmin, Z_std, Dif_Z, Dist_mean, and Numbers.
Analysing the permutation importance of the features when applying the model ( Figure 8A,C), among the top 10 features with the highest permutation importance for the model, three were repeated in the two study areas. These features were one of spectral type, BI, and two extracted from the neighbourhood analysis, Z_Zmin and Zmax_Z. In this sense, we can also observe how, depending on the area, the features with greater importance diverged. In Area 1 the most important feature was the NBRDI index, but in Area 2 this feature was the second to last in order of importance. Since each area has different tree and shrub species, with only one species in common, the common features with high importance values in both areas could be considered as adequate descriptors to differentiate the Mediterranean flora. neighbourhood analysis; in this cluster are also SRxNDVI, blue, EVI, RDVI, ARVI, SARVI, SAVI, OSAVI, IPVI, NDVI, GNDVI, MSAVI, SR, RedEdge, NIR, and DVI. The third cluster consisted entirely of the spectral features NGBDI, BI, red, MSR, green, RVI, and NBRDI. The fourth cluster was mainly composed of spectral features. These features had maximum correlation in absolute value with the spectral features NormG, GR, NGRDI, and RGRI. The last cluster consisted entirely of features extracted from the neighbourhood analysis: Dist_std, Zmax_Z, Z_Zmin, Z_std, Dif_Z, Dist_mean, and Numbers. Depending on the cluster they belonged to, a weight was applied to the features, modifying their order of importance. The learning curves in Figure 8 show the final order applied to the model. Visualising the learning curve, it stabilised from 10 features onwards for Area 1 (NBRDI, GR, BI, Z_Zmin, NDVI_mean, MSAVI, NDVI_std, SR, MSR, and Z) and Area 2 (SR, RDVI, NGRDI, Z_stf, Z_mean, BI, NUMBERS, NDVI_std, MSAVI, and IPVI), obtaining a slight increase in the mCV statistic by using 38 features instead of 10.
Analysing the decreasing trend in the importance of the features and the stabilisation of the learning curve in both areas, only 10 predictor features were used to obtain the model. In particular, the first 10 features, as shown in Figure 8B,D, were used to create the final classification. Some of these features, such as SR or BI, were also reported as relevant in other models applied for tree species' classification using UAV-DAP data [69].
Depending on the cluster they belonged to, a weight was applied to the features, modifying their order of importance. The learning curves in Figure 8 show the final order applied to the model. Visualising the learning curve, it stabilised from 10 features onwards for Area 1 (NBRDI, GR, BI, Z_Zmin, NDVI_mean, MSAVI, NDVI_std, SR, MSR, and Z) and Area 2 (SR, RDVI, NGRDI, Z_stf, Z_mean, BI, NUMBERS, NDVI_std, MSAVI, and IPVI), obtaining a slight increase in the mCV statistic by using 38 features instead of 10. Analysing the decreasing trend in the importance of the features and the stabilisation of the learning curve in both areas, only 10 predictor features were used to obtain the model. In particular, the first 10 features, as shown in Figure 8B,D, were used to create the final classification. Some of these features, such as SR or BI, were also reported as relevant in other models applied for tree species' classification using UAV-DAP data [69].

Vegetation Classification Accuracy
Once the points were reclassified, the point cloud was compared with the testing data set, obtaining the confusion matrix shown in Table 6. The classification results had an overall accuracy of 81.9% in Area 1 and 96.4% in Area 2. These results were highly de-

Vegetation Classification Accuracy
Once the points were reclassified, the point cloud was compared with the testing data set, obtaining the confusion matrix shown in Table 6. The classification results had an overall accuracy of 81.9% in Area 1 and 96.4% in Area 2. These results were highly dependent on the number of samples taken from each class (Table 1). It was, therefore, necessary to carry out an individual study of each species. Analysing the results in depth, the highest values for user's accuracy (precision) and F-measure for Area 1 were obtained for the class Anthyllis cytisoides (0.97 and 0.89, respectively). Figure 4 shows how the Anthyllis cytisoides had a differentiated spectral response in the blue and red bands, compared to the rest of the species analysed in Area 1. In relation to Area 2, the highest values for precision, recall, and F-measure were obtained for class Pinus halepensis, achieving values of 0.99, 1.00, and 1.00, respectively. These results are attributed to the fact that it was the only tree species in the area, with the height of all individuals of this species being much greater than that of other species.
The statistics obtained for the Pistacia lentiscus class were remarkable, as it was the only class present in both study areas. In the two areas, we found similar values of recall (0.99 and 0.98, respectively, for areas 1 and 2), precision (0.72 and 0.77), and F-measure (0.83 and 0.86). These lower values were due to confusion with other species, such as Quercus coccifera or Chamaerops humilis in the Area 1 or the Juniperus oxycedrus in Area 2. The Pistacia lentiscus has a high intraspecies' variability depending on the age (Table 1), causing confusion between the different classes in both study areas. This also explains the relative low recall value for the species Chamaerops humilis (0.69) in Area 1 or Juniperus oxycedrus (0.67) in Area 2. Due to the low number of Pistacia lentiscus individuals, it was not possible to create two classes; but it would be recommended to split this class according to their age (young and mature) in further studies. The low recall (0.66) of Cistus albidus in Area 2 was due to its misclassification with Salvia rosmarinus. Both species showed spectral and shape similarity, as described in Table 1. Rhamnus lycioides was also confused with Salvia rosmarinus and Juniperus oxycedrus.
The results obtained are comparable with other studies where species' classification was carried out based on UAV-DAP data. The study performed by Nevalainen et al. [27] allowed for the classification of four boreal forest tree species. They applied k-nearest neighbours, Random Forest, and MultiLayer Perceptron using hyperspectral imagery and UAV-DAP point cloud features. The last two classifiers obtained 95% of overall accuracy. These accurate results can be explained considering the low number of species analysed compared to our study. Tuominen et al. [33] performed a more challenging study using a similar methodology. Their methodology was based on the analysis of k-nearest neighbours and Random Forest classifiers, using also UAV-DAP point clouds and hyperspectral image mosaics for classifying 26 different tree species in southeastern Finland. The highest global accuracy was obtained for the classifier k-nearest neighbours, with a global accuracy of 82%. Depending on the species analysed, producer and user accuracies ranged from 0% to 100%. Sothe et al. [28] obtained a classification of 12 major tree species in a subtropical forest integrating UAV-DAP point cloud and hyperspectral data and applying a support vector machine classifier. The overall accuracy obtained in this study was 72%. All these studies have in common that the final product is a classification map, being able to extract from it only two-dimensional information (e.g., the location of the trees or the perimeter of their crown). This short literature review highlights the results obtained in this work, where 11 different tree and shrub species were classified using multispectral information. The analysis of the point cloud and the extraction of geometric and neighbourhood features allowed us to differentiate species with a similar spectral response. In addition, the classified point cloud allowed the derivation of information on the forest structure that can be used as input for wildfire models.

Improving Wildfire Behaviour Modelling
The present study achieved a new methodology to classify Mediterranean forest species from UAV point clouds with the aim of improving wildfire behaviour modelling. Current wildfire models need to be fed with 3D fuel data for their correct running. These computational models need information about the geometry of the individuals (adapting it to a geometric body) and their properties (e.g., fuel density, surface-volume ratio, or fuel moisture) [70]. These properties are inherent to the species analysed, making a prior classification of the study area necessary. Individual point clouds can be adjusted to a geometric body, as well as obtaining individual properties using allometric equations. These equations describe the relationship between variables extracted directly from the point cloud (e.g., tree height, area, volume, or crown width) with the properties required as input by the wildfire models (e.g., bulk density) [14]. Therefore, the use of UAVs for the characterisation of forest structure allows for a breakthrough in the improvement of wildfire models; being able to identify tree and shrub individuals by semi-automatic species identification is one of the key parameters to be used in fire models.

Conclusions
This investigation developed and proposed a UAV-DAP method for the classification of forest species. Results were very promising, showing that UAV-DAP points clouds have the potential to provide accurate results in species' classification. The spectral, geometrical, and neighbourhood features derived from multispectral images produced good results classifying shrub and tree species.
To the best of the authors' knowledge, this is one of the first investigations studying the classification of shrub and tree species in Mediterranean forests using multispectral imagery obtained from UAVs. Previous studies have not proposed similar objectives using only a multispectral camera and a consumer drone, which highlights the methodology proposed in this article, which can be exported to other fields. The proposed methodology allows for scalability of the area and number of species to be studied. In this sense, if the number of species increases, there may be some geometric or spectral similarities among those species, making it necessary to increase the spectral resolution of the input data to improve the results. Due to the results obtained, and to previous articles using similar features for the classification of species, some of the features proposed in this article are relevant in the classification of Mediterranean shrub and tree species.
The classified point cloud provides valuable input to the wildfire models. Obtaining a classified point cloud can lead to the automatic extraction of different features by species (height, area, volume, crown width, etc.), allowing the estimation of variables such as bulk density using allometric equations, which are key for the correct parameterisation of wildfire models.