Comparison of Machine Learning Algorithms for Wildland-Urban Interface Fuelbreak Planning Integrating ALS and UAV-Borne LiDAR Data and Multispectral Images

Abstract: Controlling vegetation fuels around human settlements is a crucial strategy for reducing fire severity in forests, buildings and infrastructure, as well as protecting human lives. Each country has its own regulations in this respect, but they all share the principle that reducing fuel load in turn reduces the intensity and severity of the fire. Data acquired by Unmanned Aerial Vehicles (UAV), combined with other passive and active remote sensing data, gave the best performance for planning Wildland-Urban Interface (WUI) fuelbreaks through machine learning algorithms. Nine remote sensing data sources (active and passive) and four supervised classification algorithms (Random Forest, Linear and Radial Support Vector Machine and Artificial Neural Networks) were tested to classify five fuel-area types. We used very high-density Light Detection and Ranging (LiDAR) data acquired by UAV (154 returns·m⁻² and an ortho-mosaic of 5-cm pixels), multispectral data from the satellites Pleiades-1B and Sentinel-2, and low-density LiDAR data acquired by Airborne Laser Scanning (ALS) (0.5 returns·m⁻², ortho-mosaic of 25-cm pixels). Through the Variable Selection Using Random Forest (VSURF) procedure, a pre-selection of final variables was carried out to train the model. The four algorithms were compared, and it was concluded that the differences among them in overall accuracy (OA) on training datasets were negligible. Although the highest accuracy in the training step was obtained with SVML (OA = 94.46%) and in testing with ANN (OA = 91.91%), Random Forest was considered to be the most reliable algorithm, since it produced more consistent predictions, with smaller differences between training and testing performance. Using a combination of Sentinel-2 and the two LiDAR datasets (UAV and ALS), Random Forest obtained an OA of 90.66% on the training dataset and 91.80% on the testing dataset.
The differences in accuracy between the data sources used are much greater than those between algorithms. LiDAR growth metrics calculated from point clouds acquired on different dates and multispectral information from different seasons of the year are the most important variables in the classification. Our results support the essential role of UAVs in fuelbreak planning and management and thus in the prevention of forest fires.


Introduction
Since the 1950s, a global phenomenon of rural-to-urban migration has been taking place, mainly in developed countries, leading to profound changes in land use caused by rural abandonment. Though the extent and effects of these changes in rural landscapes vary significantly among regions, in the Mediterranean basin one of the negative consequences is the increase in frequency and intensity of wildfires due to the encroachment of shrublands and young forests into ancient farmlands and pastures [1]. In many cases, the traditional domus, hortus, ager, saltus and silva system has been transformed into a prevalent wildland-urban interface (WUI) [2] in rural areas. This situation becomes particularly unmanageable when the structure of the human settlements is scattered.
In this scenario, controlling vegetation fuels around human settlements is a critical strategy to reduce fire severity in forests, buildings and infrastructures [1]. These specific areas can be synthetically classified into firebreaks and fuel-breaks. In both cases they are fuel-managed areas dedicated to stopping or reducing fire propagation, respectively. Firebreaks are areas of land (usually linear in shape) where the fuel present is completely removed, while fuelbreaks are usually wider, and covered by vegetation, where the fuel is partially removed [3]. When some forest canopy remains after treatment they are referred to as shaded fuelbreaks [1].
Fuelbreaks at the WUI consist of modifying the fuel load in areas adjacent to buildings and infrastructure in order to reduce the probability of ignition and the severity of a potential wildfire and thus to create safer areas for firefighting [4,5]. According to Ascoli et al. [3], in general, a surface fuel load ranging from 0.2 to 0.4 kg·m⁻² is recommended for fuelbreaks. Regarding canopy cover, a value between 10% and 50% is desirable, while a crown base height of more than 2.5-5 m is recommended (depending on the surface fuels available) to avoid vertical continuity of the fuel. Finally, they recommend a recurrence in the operations ranging from 1 to 6 years. Even so, the rules for establishing the recommended fuel load and the permitted vegetation type depend on the legislation of each country. This legislation is generally in agreement with the forest and ownership structures of the area. In addition, the rules for firebreaks always depend on two factors: firstly, on the type of vegetation, and then on its characteristics, usually evaluated based on the height of the trees and their canopy cover.
In the case of Galicia, these rules are based on the fact that the typical Galician forest was mainly made up of deciduous species, such as oak (Quercus robur L.), chestnut tree (Castanea sativa Mill.), and maritime pine (Pinus pinaster Ait.), but these formations have been greatly reduced over the centuries in favor of pastures, agricultural land and shrublands. During the 20th century a remarkable increase in forested area took place, mainly through reforestation with maritime pine and the foreign species blue gum (Eucalyptus globulus Labill.), which has transformed the Galician forest environment into a pine- and eucalyptus-dominated landscape. In fact, these two species, along with the species of the genus Acacia Mill., have been declared forbidden in the fuelbreaks of the WUIs.
To characterize the forest structure, and especially to perform classifications, it is common to use remote sensing technologies, both optical satellite imagery and airborne Light Detection And Ranging (LiDAR). Optical satellite imagery is considered passive data, while LiDAR is considered an active sensor. The purpose of image fusion is to use different types of sensor data to obtain more information from their union than is possible from separate analyses [6]. In this way, the integration of LiDAR data and multispectral imagery provides the geometric and radiometric attributes, respectively [7]. Some authors [8,9] have preferred to use only active sensors, which provide values related to vegetation metrics, while others [10][11][12][13][14] have favored the combination of multispectral imagery with LiDAR information to improve classification accuracies in forest environments. In all cases, the use of remote sensing substantially reduces the cost of the estimation process [15].
A drone or Unmanned Aerial Vehicle (UAV) is a lightweight flying device that is not operated by an on-board pilot, and also has a ground control station and communication components. These devices can be either self-controlled or remotely piloted. Based on the types of wing, UAVs might broadly be classified as fixed-wing or rotary-wing type. Finally, they are classified according to the weight of their platform; thus, the most interesting ones for evaluating vegetation are usually the small unmanned aerial systems (UAS) whose weight is less than 10 kg [16].
Although the origin of UAS dates back to the military sector, especially to surveillance, in recent decades their professional use has become widespread in aerial photography and also in remote sensing [17]. Nowadays, it has become a common working tool in many fields such as forestry, precision agriculture and other sectors related to civil engineering, as well as emergency response [18]. On the other hand, the incorporation of cloud-based and generally more user-friendly data processing systems has greatly helped its expansion as a working tool [19].
Algorithms based on Machine Learning techniques such as Random Forests (RF) are now widely used for classification and prediction purposes in remote sensing applications [20]. Nevertheless, the Artificial Neural Networks (ANN) and Linear and Radial Support Vector Machine (SVML, SVMR) algorithms are increasingly being taken into consideration [21][22][23]. All these algorithms have been successfully applied to estimate forest biophysical parameters through multispectral data [24], biomass and soil moisture recovery [25], crop monitoring [26] or land use classification [27][28][29]. However, RF is the most widely used algorithm for earth observation and, in particular, for classification of land use and forestry applications [30]. Among different studies, we can highlight the use of RF to detect insect infestations according to the physiological characteristics of plants [31] or to estimate timber production [32] and forest biomass [33]. Nevertheless, other authors [34] consider that SVM is the best algorithm to solve complex classification problems, such as differentiation of tree species. Recently, due to the great success that deep learning is having, ANN are being used extensively for remote sensing in Earth observation and often achieve the same accuracies as SVML or even RF [28].
Based on the initial hypothesis that UAVs could be used to identify action areas in fuelbreaks and WUI areas, the general objective of this work is to test the performance of combined passive and active remote sensing data to predict vegetation types through machine learning algorithms. More specifically, we aim at: (i) analyzing the uncertainty of four classification methods (RF, SVML, SVMR and ANN) in an object-based approach; (ii) comparing the accuracy obtained by different combinations of active and passive remotely sensed data (UAV-LiDAR, low-density LiDAR and different satellite images) and their interaction with the four ML algorithms; and finally (iii) mapping the wildland-urban interface fuelbreak planning rules.

Study Area
The study area is located in Porto do Son (Galicia, Northwest Spain). The municipality of Porto do Son encompasses a collection of coastal towns and scattered inland villages. The most represented tree species in the area are chestnut (Castanea sativa), oak (Quercus robur), pine (Pinus pinaster) and eucalyptus (Eucalyptus globulus). The climate of the area is characterized by high rainfall, low temperature variation, mild temperatures and some water deficit in summer. The main characteristics of the atmospheric environment are linked to the influence of the sea. The area of interest at this location covered 3.7 km² (Figure 1).

UAV imagery
Data from the UAV were obtained in July 2019. The integrated LiDAR system comprises a DJI M600 Pro UAV, a Phoenix LiDAR Systems Scout 16 with a Velodyne VLP-16 LiDAR sensor, an A6K RGB camera (based on Sony A600) and an inertial measurement unit (IMU-14). It is a high-precision system (Root Mean Square Error (RMSE) = 30 mm) with a scanning rate of 600k dual return points/s with a 360-degree field of view at a recommended scanning height of 20-60 m. To generate Post-Processing Kinematic (PPK) paths, the system uses a dual-frequency L1/L2 Global Navigation Satellite System (GNSS) receiver. The study was conducted with a flight height of 55 m above the ground at a speed of 4 m/s and at an approximate horizontal distance between adjacent flight lines of 15 m, producing a very high density LiDAR point cloud (154 returns·m⁻²) with redundant coverage in the 90% overlap area.
Aeromedia UAV Inc. combined IMU and GNSS data through LiDARMill software (Phoenix LiDAR Systems) to apply differential corrections and generate a smooth and highly accurate trajectory for the computation of planimetric coordinates and ellipsoid height values. The LasTools (rapidlasso software) environment was used to pre-process the raw data, which involved five tasks: (i) a total of 586 unbuffered tiles (200 × 200 m) was generated using the lastile procedure; (ii) duplicate points were eliminated and noise was reduced using the lasduplicate and lasnoise procedures, respectively; (iii) the overlap was classified by the lasoverage procedure; (iv) ground points were classified using a triangulated irregular network (TIN) algorithm implemented in lasground_new; and (v) the point cloud was classified into vegetation and buildings through the lasclassify procedure [35].
Aeromedia UAV Inc. generated the RGB-orthomosaic (UAV-ORTHO) using the Pix4D software [36]. The overall workflow of Pix4D consists of the following stages: initial photo matching, point cloud densification and ortho-mosaic generation. In the case of forest and dense vegetation it is common to modify some parameters both in flight (increasing the overlap between images to at least 85% front overlap and at least 70% side overlap) and in processing to ensure that the desired quality, accuracy and format of the final product is obtained. The result was a TIFF image with a spatial resolution of 4.14 cm and a high level of geometric accuracy (RMSE_XY = 2.5 cm and RMSE_Z = 2.4 cm).

Large-Scale Remote Sensing Data
Two open access ALS point cloud coverages from the National Program of Aerial Orthophotography (PNOA) of the Spanish Government were used (http://centrodedescargas.cnig.es/). The first coverage (ALS1) was collected between February and April 2011, while the second coverage (ALS2) was acquired between July and September 2015. In both coverages, the nominal laser pulse density was 0.5 points·m⁻² and the vertical and horizontal accuracy was 0.20 and 0.30 m, respectively. The LiDAR sensors used for the two datasets were the RIEGL LMS-Q680i and the LEICA ALS60, respectively. In both cases the sensor was mounted on an airplane operated by an on-board pilot.
The digital orthophoto (ORTHO) supplied by the PNOA was also used in this study. In Spain, the PNOA provides annual country-wide coverage with a spatial resolution of at least 0.5 m. An 8-bit RGB orthophoto image, acquired in June 2017, with a spatial resolution of 0.25 m was used.
Two public data sources based on multispectral satellite imagery from the European Space Agency (ESA) Copernicus program were used: (i) Sentinel-2 (S2) satellite images (only the 10 and 20 m resolution bands were considered), and (ii) a Pleiades-1B (P1B) image from the VHR_IMAGE_2015 coverage. We used images from Sentinel-2 captured in February, May, August, and November of the years 2017, 2018 and 2019. These orthorectified and atmospherically corrected images were downloaded from the Copernicus Open Access Hub (https://scihub.copernicus.eu/). Suitable dates were selected to obtain cloud-free and reflection-enhanced images of deciduous trees using easySat® (föra forest technologies) [29]. The Pleiades-1B image was captured in July 2015. Copernicus delivers this coverage with an approximate geometric correction. For precise geometric correction, a minimum of nine control points per scene and the DTM with 5 m mesh pitch from the National Plan of Aerial Orthophotography are required. The maximum allowed error was 1 m. Table 1 provides detailed information on the spectral and spatial characteristics of each of the bands used.
The ALS data from PNOA can be downloaded already pre-processed, cleaned and classified. The LiDAR data processing, for both UAV and ALS, also consisted of several steps and was executed with easyLaz® (föra forest technologies), a proprietary tool based on the FUSION/LDV software [37]. First, Digital Elevation Models were created. The Digital Terrain Model (DTM), using the GridSurfaceCreate procedure, and the Canopy Height Model (CHM), using the CanopyModel procedure, were generated at 0.5 m and 1 m resolution for UAV and at 1 m for ALS, and their respective slope rasters were also obtained. A series of descriptive statistics for each LiDAR dataset was calculated by the GridMetrics procedure (Canopy Relief Ratio (CRR), height percentile (HP) values and Canopy Cover (CC)), all of them with a resolution of 1 m and 5 m for UAV and 5 m for ALS.
Finally, and only for data from UAVs, point density metrics were calculated using elevation-based slices in every 1-meter height layer through the DensityMetrics procedure. Densities were reported as the proportion of the returns within the layer, e.g., 100 × (total points between 40 and 50 m)/(total points). Table 2 shows the description of all the raster layers obtained after the LiDAR processing, as well as their spatial resolution.
When two LiDAR datasets were available, height growth metrics were calculated. In the case of the ALS data, since there are two point cloud coverages, the height growth between the second coverage (ALS2) and the first coverage (ALS1) was calculated, i.e., between 2015 and 2011. Finally, only when UAV and ALS data were combined, the height growth between 2019 (UAV) and 2015 (ALS2) was calculated. A description of the computed variables is shown in Table 3.
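The density and growth metrics described above can be illustrated with a minimal sketch. It is written in plain Python for illustration only and is not the FUSION/easyLaz implementation used in the study; the function names, the sample return heights and the 2 × 2 canopy height grids are hypothetical.

```python
def layer_density(heights, lower, upper):
    """Percentage of returns whose height above ground lies in [lower, upper)."""
    if not heights:
        raise ValueError("empty point cloud")
    in_layer = sum(1 for h in heights if lower <= h < upper)
    return 100.0 * in_layer / len(heights)


def height_growth(chm_new, chm_old):
    """Cell-wise height difference between two co-registered canopy height grids."""
    return [
        [new - old for new, old in zip(row_new, row_old)]
        for row_new, row_old in zip(chm_new, chm_old)
    ]


# Hypothetical return heights (m) inside one raster cell
heights = [0.2, 0.8, 1.5, 1.9, 2.4, 7.1, 7.6, 12.3]
density_1_2 = layer_density(heights, 1.0, 2.0)  # share of returns in the 1-2 m layer

# Hypothetical canopy height models (m) for an older and a newer survey
chm_2015 = [[10.0, 12.5], [0.0, 3.0]]
chm_2019 = [[12.0, 13.0], [0.5, 2.0]]
growth = height_growth(chm_2019, chm_2015)
# Absolute values are kept as a second set of variables
abs_growth = [[abs(v) for v in row] for row in growth]
```

A negative growth value, as in the last cell, would typically correspond to felling or canopy loss between the two surveys.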

[Table 3. Columns: Denomination, Spatial Resolution, Range. Note: all variables were also calculated as absolute values.]

Multispectral Analysis
In both image sources (Sentinel-2 and Pleiades-1B), four vegetation indexes were calculated based on imagery data: Enhanced Vegetation Index (EVI), Soil Adjusted Vegetation Index (SAVI), Green Normalized Difference Vegetation Index (GNDVI) and Normalized Difference Vegetation Index (NDVI) [29,38]. Table 4 shows the description of all the raster layers obtained.
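The four indexes can be sketched as follows. The paper does not list the formulas, so this block assumes the commonly used formulations (soil adjustment factor L = 0.5 for SAVI; coefficients 2.5, 6, 7.5 and 1 for EVI); the reflectance values are hypothetical and the code is an illustration rather than the authors' processing chain.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def gndvi(nir, green):
    """Green Normalized Difference Vegetation Index."""
    return (nir - green) / (nir + green)


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index (soil_factor = 0.5 is an assumed default)."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the commonly used coefficients (assumed)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Hypothetical surface reflectances (0-1) for a densely vegetated pixel
blue, green, red, nir = 0.03, 0.08, 0.05, 0.45

indexes = {
    "NDVI": ndvi(nir, red),
    "GNDVI": gndvi(nir, green),
    "SAVI": savi(nir, red),
    "EVI": evi(nir, red, blue),
}
```

In an object-based workflow these per-pixel values would then be summarized per segment, for example by their mean.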

Object-Based Image Analysis
Image segmentation is the key to an object-based classification approach. In this way, homogeneous image-objects representing the elements to be classified (e.g., roads, buildings, different types of vegetation) are created by grouping adjacent pixels with homogeneous characteristics [39,40]. An Object-Based Image Analysis was performed through the eCognition (Trimble Geospatial Imaging) software package. This software, like OrfeoToolBox [41], is commonly used in remote sensing applications to carry out the segmentation [42]. Object identification is defined according to the specific parameterization of certain attributes such as shape, spectral criterion of homogeneity, scale, and compactness ratio. The Fractal Net Evolution Approach (FNEA) is a multiresolution segmentation algorithm widely used in object-oriented image analysis. It was first introduced by Baatz and Schäpe [43]. It is based on bottom-up region fusion, i.e., it starts with each image pixel as a separate object and merges pixels into larger objects at each step, based on relative homogeneity criteria. This homogeneity criterion is a combination of spectral and shape criteria, which are customizable. Higher values of the scale parameter produce larger image objects, and vice versa. The homogeneity criterion, in turn, measures how homogeneous or heterogeneous an image object is within itself. To do this, a combination of the objects' color and shape properties is used [44]. Finally, eCognition uses a modular programming language (Cognition Network Language) that defines not only the import routines but also the analysis phases of the different objects. In this work three scale parameters (5, 10 and 15) were tested with the aim of finding an optimum parameterization that defines objects accurately while keeping them large enough to compute the spectral information from Sentinel-2 (10 × 10 m) inside each object.

Field Data
The ground truth was taken at the segment level. Five ground truth classes were defined, as follows: Class 1: No Vegetation; Class 2: Crops; Class 3: Bush and Grass; Class 41: Permitted trees (oak, chestnut tree), and Class 42: Forbidden trees (eucalypt, maritime pine and acacias). During the field work we identified 434 segments (Table 5) distributed through all classes. Once the ground truth was collected, each segment got assigned its vegetation class. Subsequently, zonal statistics were computed for all the variables processed (Tables 1-4) for the segments of both the ground truth and the entire population.

Data Analysis
To reduce processing time in model training, feature selection was conducted using the Variable Selection Using Random Forest (VSURF) procedure [45]. VSURF is a consistent three-step wrapper-based algorithm which uses RF as the base classifier [46] and operates as follows [45]: (i) the thresholding step removes irrelevant variables from the dataset, (ii) the interpretation step selects all variables related to the response for interpretation purposes, and (iii) the prediction step refines the selection by removing redundancy in the set of variables selected. The selection of variables is based on the Mean Decrease in Gini (MDG), a measure of how important a variable is for estimating the value of the target variable across all the trees that make up the Random Forest. The greater the importance of a variable, the larger its MDG and the higher its position in the importance plot, and vice versa.
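To make the Mean Decrease in Gini more concrete, the sketch below computes the Gini decrease produced by a single split on a single variable. In a Random Forest this quantity is accumulated over every node that uses the variable and averaged over all trees; that aggregation, and the VSURF steps themselves, are not reproduced here. The function names, variable values, labels and thresholds are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a set of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity reduction obtained by splitting the samples at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children


# Hypothetical segments: two of Class 42 (forbidden trees), two of Class 41 (permitted)
labels = ["42", "42", "41", "41"]
height_growth = [3.0, 2.5, 0.4, 0.6]  # separates the two classes cleanly
noise_band = [1.0, 2.0, 1.0, 2.0]     # unrelated to the class

informative = gini_decrease(height_growth, labels, threshold=1.0)
uninformative = gini_decrease(noise_band, labels, threshold=1.5)
```

Here the growth variable removes all impurity (a decrease of 0.5, the full impurity of a balanced two-class node), while the unrelated variable removes none, which is the contrast the importance ranking relies on.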
Following Raczko and Zagajewski [47] we compared four nonparametric classification algorithms (ANN, SVML, SVMR, and RF). The Support Vector Machine (SVM) classifier, developed by Vapnik [48], seeks the optimal hyperplane in an n-dimensional classification space with the largest margin between classes. SVM was computed using linear and radial kernel variants. The cost and gamma SVM parameters were set to 1 and to the inverse of the number of predictors, respectively. The Random Forest (RF) classifier, developed by Breiman [49], consists of an ensemble of individual decision trees. The maximum number of variables to try in each individual tree was the square root of the number of variables selected by VSURF. The maximum number of trees was set to 500. Finally, an ANN classifier can be described as a parallel computing system consisting of a very large number of simple processors with interconnections. The learning rate was set to 0.3, with a maximum of 10^5 iterations. To allow the four algorithms to compete, the nnet, svmLinear, svmRadial and rf methods were executed using the caret package in R software [50].
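The parameter settings in this paragraph depend only on the number of predictors retained by VSURF, which the following sketch makes explicit. It is a plain Python illustration, not the caret configuration used by the authors; the dictionary keys are hypothetical names, rounding the square root down is an assumption, and the maximum number of iterations is read as 10^5.

```python
import math


def model_settings(n_predictors):
    """Collect the hyperparameters described in the text for a given number of predictors."""
    if n_predictors < 1:
        raise ValueError("at least one predictor is required")
    return {
        "svm_cost": 1.0,
        "svm_gamma": 1.0 / n_predictors,  # inverse of the number of predictors
        # Assumption: the square root is rounded down to an integer
        "rf_vars_per_split": max(1, math.floor(math.sqrt(n_predictors))),
        "rf_trees": 500,
        "ann_learning_rate": 0.3,
        "ann_max_iterations": 10 ** 5,
    }


# Hypothetical case: 16 variables retained after variable selection
settings = model_settings(16)
```

With 16 predictors this gives a gamma of 0.0625 and four candidate variables per split.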
The ground truth dataset was randomly split into two samples, one for training (70%) and one for testing (30%). The latter was excluded from any training and reserved for testing how well the model generalizes, as this procedure is a key point in supervised classifications aimed at management goals. Next, we executed random k-fold cross-validation (CV) on the training dataset, randomly partitioned into folds, which is probably the most popular approach to estimate the error rate, or accuracy, of machine learning-based classifications [51]. A repeated k-fold CV was performed using 10 folds with three repeats to control overfitting. A 10-fold CV involves dividing the training dataset randomly into 10 parts and then using nine parts for training and one part for validation. With three repeats of 10-fold CV, the reported error is the average over three runs of the same analysis. This analysis was carried out through the function trainControl(method = "repeatedcv", number = 10, repeats = 3) of the caret package in R software [50]. Finally, for each classifier, a confusion matrix was computed for predictions on both training and testing datasets.
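The resampling scheme can be sketched in plain Python as follows. This is an illustration of the 70/30 split and the repeated 10-fold partitioning only, not a reproduction of caret; the function names and the seed are hypothetical, and the split here is a simple random one (whether the authors stratified by class is not stated).

```python
import random


def train_test_split(indices, train_fraction=0.7, seed=0):
    """Randomly split sample indices into a training and a testing subset."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def repeated_kfold(indices, k=10, repeats=3, seed=0):
    """Yield (training, validation) index lists for repeated k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)  # a new random partition for every repeat
        folds = [shuffled[i::k] for i in range(k)]
        for held_out in range(k):
            validation = folds[held_out]
            training = [
                idx
                for fold_number, fold in enumerate(folds)
                if fold_number != held_out
                for idx in fold
            ]
            yield training, validation


# 434 ground truth segments, as reported in the field data section
train_idx, test_idx = train_test_split(range(434))
splits = list(repeated_kfold(train_idx, k=10, repeats=3))
```

The 30 resulting training/validation pairs use only the training subset, so the testing subset never influences model fitting or tuning.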

Mapping Application of Wildland-Urban Interface Fuelbreak Planning Rules
Once the data had been acquired and processed, the ground truth obtained, the databases created and the prediction of the vegetation class of each segment computed, the wildland-urban interface fuelbreak planning rules were applied (Figure 2). These rules usually include two steps. Firstly, we need to classify the vegetation in every segment following the prediction of the top-performance model. Secondly, we need to calculate the metrics for every segment (CHM and CC), which are computed with up-to-date LiDAR information (in our case from UAV-LiDAR). A general overview of the entire materials and methods used is shown in Figure 3.
Figure 4 shows the segmentation output. The best result was obtained with a scale parameter of 15. Those segments are large enough to integrate the information from Sentinel-2 indexes (10 × 10 m). A smaller segment size implies too many segments without spectral information because few Sentinel-2 pixel centroids fall inside them.
Depending on the data used, VSURF selects different sets of variables. Table 6 shows the variables selected by VSURF for the different combinations of datasets. In all cases, the variables are sorted by relative importance; the first variable in each column represents the most influential variable in the classification. When two LiDAR point clouds are used, the growth between both point clouds is a crucial variable. When the dataset includes Pleiades images, these are usually relevant. No fourth-quarter images were selected. The Sentinel images selected were normally from the first quarter and mainly from the NIR or blue band. UAV-LiDAR data are always very significant.

Results and discussion
When focusing on results by data source (Table 7), we observe that six of them reach OA >90% for the training dataset with at least one of the algorithms under discussion. In contrast, only two of them (ALS + S2 and UAV + ALS + S2) exceeded that figure when generalized to the testing datasets. We hypothesize that, on the one hand, because ALS includes two LiDAR coverages, it allowed LiDAR-based vegetation growth to be estimated, which in turn proved to be a relevant variable in the classification procedure. On the other hand, the time series of three complete years (characterized by mean February, May, August and November images) derived from S2 data can satisfactorily characterize the phenology of the different plant species in the area. For those reasons, we consider that the combination ALS + S2 (with or without UAV) is essential to perform optimum classifications in this context. If UAV data are included, results are roughly similar. Nevertheless, we suggest that the use of UAV data should be mandatory to achieve optimum prediction of the actions to be used in the fuelbreak planning. First, UAV data combined with two open access datasets (ALS + S2) reached OA above 90% for all algorithms. Secondly, UAV data provide accurate and updated metrics of the vegetation, which are essential for a correct application of the rules for fuelbreak planning, as they are based not only on the type of vegetation but also on its height.
Regarding the algorithms (Table 7), in 19 combinations of algorithm-data source the OA values surpassed 90% (seven with SVML and SVMR, four with ANN and two with RF). Conversely, only RF achieved OA values above 90% with more than one data source (i.e., ALS + S2 and UAV + ALS + S2) in the testing phase. In addition, RF was the only algorithm with OA above 90% on the testing dataset for a data source including UAV. Furthermore, SVML, SVMR and ANN exhibited larger differences between training and testing OAs, which suggests that they tend to overfit. Therefore, we suggest that the best combination analyzed is the UAV + ALS + S2 dataset with the RF algorithm. Consequently, UAV data identify action areas in fuelbreaks and WUI areas most accurately when combined with other passive and active remote sensing data to predict vegetation types through machine learning algorithms.
Machine Learning-based techniques are suitable tools to optimize classification in land use for fuelbreak planning. Because the differences between the algorithms used are almost negligible (in almost all cases we have an accuracy close to 0.9), it is essential in land use classifications to make all available algorithms compete and also include different databases. In this work, we have found how RF has been the algorithm that offered the most robust results in land use classification purposes in fuelbreak planning due to the small deviation between accuracy values and, more interestingly, more similar values of OA between training and testing datasets. This is particularly relevant when the goal of a supervised classification is to provide managers with operational tools to be applied on large territories far beyond ground truth sample. Over 90% of land use classes (1, 2, 3, 41 and 42) classified from remote sensing data are correct, despite the relatively small size of the ground truth sample. Nevertheless, it is expected that the generalization to a distinct area, even with comparable vegetation attributes, would need an ancillary ground truth sample to retrain the model.
The detailed results for each class separately, for all combinations (dataset × algorithm), are shown in Table 8 (training) and Table 9 (testing). When only LiDAR information and orthophotos were used, errors were higher (nearly 20%). When multispectral images were included, errors decreased by 50%, and never exceeded 25%. Class 3 (Bush and Grass) was the one that obtained the least accurate classification results, with errors between 8.3% (SVMR-UAV + SENTINEL) and 36.9% (ANN-UAV). Forbidden tree species (Class 42) are always classified very precisely, normally with an error below 10%. The classes requiring silvicultural treatments (3, 41 and 42; clearing, thinning, pruning, or felling) tend to be confused with each other when they give false positives. In practice this is not a critical error, since it still indicates that a silvicultural treatment must be applied. In any case, the results of both confusion matrices show that it would be advisable to increase the ground truth in the classes with larger errors.
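The overall accuracy and per-class errors discussed here derive from a confusion matrix, which can be sketched as follows. The observed and predicted labels are hypothetical and are not taken from Tables 8 and 9; the per-class error is computed as the share of each observed class that was misclassified, which is an assumption because the text does not state which error type is tabulated.

```python
from collections import Counter

CLASSES = ["1", "2", "3", "41", "42"]


def confusion_matrix(observed, predicted, classes=CLASSES):
    """Rows are observed classes, columns are predicted classes."""
    counts = Counter(zip(observed, predicted))
    return [[counts[(obs, pred)] for pred in classes] for obs in classes]


def overall_accuracy(matrix):
    """Fraction of segments on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def class_errors(matrix):
    """Share of each observed class that was assigned to another class."""
    return [
        1.0 - row[i] / sum(row) if sum(row) else float("nan")
        for i, row in enumerate(matrix)
    ]


# Hypothetical labels for eight test segments
observed = ["1", "2", "3", "3", "41", "42", "42", "3"]
predicted = ["1", "2", "3", "41", "41", "42", "42", "3"]

matrix = confusion_matrix(observed, predicted)
oa = overall_accuracy(matrix)
errors = dict(zip(CLASSES, class_errors(matrix)))
```

In this toy case the only mistake is a Class 3 segment predicted as Class 41, the kind of confusion among treatment classes that the text describes as non-critical.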
The reliability of Machine Learning for land use classification, in particular for vegetation analysis, has been evaluated in many studies developed in different environments, some of them also based on UAVs combined with other remote sensing data. For example, [22] found that the best ML algorithm to classify forest development stages are RF and ANN, and [23] obtained better results with RF and SVM when classifying mountain forests and shrubland land cover classes. Together with UAV and hyperspectral data [52] used RF to map species in a tropical environment and [53] successfully identified tree species in a mixed coniferous-deciduous forest in the USA. Also, applying Random Forest on UAV images [54] were able to identify seedling stands. The accuracy levels obtained in the present study are consistent with the ones reached in all these previous studies. All this research demonstrates the potential and the efficiency of artificial intelligence to deal with this type of analysis.
Moreover, the practical implications of using classification techniques for land-use planning at large scale, and particularly for the costly and time-consuming processes usually involved in fuel management, stand out when examining Figure 5. Nearly half of the area of interest studied here is classified as "no action" area, which has clear economic consequences. In addition, these data make it possible to adequately gauge the human and machinery resources required to carry out the most suitable amount of fuelbreak work in a given area. Furthermore, since the predicted classification is spatially explicit, it makes it possible to plan not only how many of the available resources to deploy before any field exploration, but also where to deploy them. This would make it feasible to implement optimization procedures [55]. In this regard, the use of high- and very high-resolution DTMs and DSMs derived from the UAV-LiDAR data makes it possible to identify areas with steep slopes and stone fences or other obstacles for machinery, which is crucial for an optimal distribution of resources.
The use of Artificial Intelligence through ML algorithms to classify different land uses is a robust and widely used technique today. It is necessary to make algorithms compete even though the differences between their errors may not seem significant. These differences can become considerable when working on a large scale and without using multispectral imagery. Moreover, differences may not be negligible when classifying larger areas with more types of objects. Thus, the performance of different algorithms and data sources should be tested in every land classification analysis, as their behavior may change substantially depending on the spectral characteristics, size and variety of the object types included in the study area.
Table 6. Data sources and respective variables ordered according to their importance derived from the mean decrease in Gini index of VSURF.

Conclusions
In this study, we have shown that UAVs are suitable tools to provide precise and operational data to identify action areas in fuelbreaks of WUI zones. They have the greatest accuracy combined with other passive and active remote sensors to predict vegetation types through machine learning algorithms. No clear differences in the performance of the different ML algorithms have been found. However, we consider RF to be the most robust, providing similar results between training and testing datasets. The rest of the algorithms tend to slightly overfit.
In contrast, clear differences in prediction ability have been found among different data sources. The use of more than one LiDAR point cloud to calculate vegetation growth provides particularly useful information. The combination of UAV data with large-scale remote sensing data and the RF algorithm has an accuracy greater than 0.9 in both the training and generalization phases, which makes it an appropriate asset for optimizing vegetation classification for fuelbreak planning. Moreover, the use of UAVs should be mandatory whenever other updated LiDAR data are not available, as height and cover metrics of vegetation are indispensable for applying action rules in fuelbreak management planning. Accurate prediction of vegetation type makes it possible to adequately gauge the human and machinery resources required to carry out the most suitable amount of fuelbreak work in a given area, reducing management costs and optimizing field work.
Our results support the essential role of UAVs in fuelbreak planning and management and thus, in the prevention of forest fires and the reduction of damages in human infrastructures and natural environments.