1. Introduction
Since the 1950s, a global phenomenon of rural-to-urban migration has been taking place, mainly in developed countries, leading to profound changes in land use caused by rural abandonment. Although the extent and effects of these changes in rural landscapes vary significantly among regions, in the Mediterranean basin one of the negative consequences is the increase in the frequency and intensity of wildfires due to the encroachment of shrublands and young forests onto former farmlands and pastures [1]. In many cases, the traditional domus, hortus, ager, saltus and silva system has been transformed into a prevalent wildland-urban interface (WUI) [2] in rural areas. This situation becomes particularly unmanageable when human settlements are scattered.
In this scenario, controlling vegetation fuels around human settlements is a critical strategy to reduce fire severity in forests, buildings and infrastructure [1]. These fuel-managed areas can be broadly classified into firebreaks and fuelbreaks, which are intended to stop or to reduce fire propagation, respectively. Firebreaks are areas of land (usually linear in shape) where the fuel present is completely removed, while fuelbreaks are usually wider and covered by vegetation, with the fuel only partially removed [3]. When some forest canopy remains after treatment, they are referred to as shaded fuelbreaks [1].
Fuelbreaks at the WUI consist of modifying the fuel load in areas adjacent to buildings and infrastructure in order to reduce the probability of ignition and the severity of a potential wildfire, and thus to create safer areas for firefighting [4,5]. According to Ascoli et al. [3], a surface fuel load ranging from 0.2 to 0.4 kg·m⁻² is generally recommended for fuelbreaks. Regarding canopy cover, a value between 10% and 50% is desirable, while a crown base height of more than 2.5–5 m is recommended (depending on the surface fuels available) to avoid vertical continuity of the fuel. Finally, they recommend a recurrence of operations ranging from 1 to 6 years. Even so, the rules establishing the recommended fuel load and the permitted vegetation types depend on the legislation of each country. This legislation generally reflects the forest and ownership structures of the area. In addition, the rules for firebreaks always depend on two factors: first, the type of vegetation, and second, its characteristics, usually evaluated from the height of the trees and their canopy cover.
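As a rough illustration of how the ranges quoted from Ascoli et al. could be checked for a single stand, the following Python sketch flags values that fall outside them. The `Stand` fields, the function name, and the decision to test only the upper bounds of fuel load and canopy cover are assumptions made for this example; the rules that actually apply depend on national legislation.

```python
from dataclasses import dataclass
from typing import List

# Upper bounds of the ranges quoted in the text (illustrative constant names).
MAX_SURFACE_FUEL_LOAD_KG_M2 = 0.4
MAX_CANOPY_COVER_PCT = 50.0


@dataclass
class Stand:
    surface_fuel_load: float   # kg per square metre
    canopy_cover: float        # percent
    crown_base_height: float   # metres


def fuelbreak_issues(stand: Stand, min_crown_base_height: float = 2.5) -> List[str]:
    """Return the recommendations that the stand does not meet.

    min_crown_base_height is a parameter because the text gives a range
    (2.5 to 5 m) that depends on the surface fuels available.
    """
    issues = []
    if stand.surface_fuel_load > MAX_SURFACE_FUEL_LOAD_KG_M2:
        issues.append("surface fuel load above recommended range")
    if stand.canopy_cover > MAX_CANOPY_COVER_PCT:
        issues.append("canopy cover above recommended range")
    if stand.crown_base_height <= min_crown_base_height:
        issues.append("crown base height too low")
    return issues
```

A stand with a fuel load of 0.3 kg·m⁻², 30% canopy cover and a crown base height of 4 m would return an empty list under the default threshold.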
In the case of Galicia, these rules are based on the fact that the typical forest was mainly made up of deciduous species, such as oak (Quercus robur L.) and chestnut (Castanea sativa Mill.), together with maritime pine (Pinus pinaster Ait.), but these formations have been greatly reduced over the centuries in favor of pastures, agricultural land and shrublands. During the 20th century, a remarkable increase in forested area took place, mainly through reforestation with maritime pine and the non-native blue gum (Eucalyptus globulus Labill.), which has transformed the Galician forest environment into a pine- and eucalyptus-dominated landscape. In fact, these two species, along with species of the genus Acacia Mill., have been declared forbidden in the fuelbreaks of the WUIs.
To characterize forest structure, and especially to perform classifications, it is common to use remote sensing technologies, both optical satellite imagery and airborne Light Detection And Ranging (LiDAR). Optical satellite imagery is acquired by passive sensors, while LiDAR is an active sensor. The purpose of image fusion is to use different types of sensor data to obtain more information from their union than is possible from separate analyses [6]. In this way, the integration of LiDAR data and multispectral imagery provides the geometric and radiometric attributes, respectively [7]. Some authors [8,9] have preferred to use only active sensors, which provide values related to vegetation metrics, while others [10,11,12,13,14] have favored the combination of multispectral imagery with LiDAR information to improve classification accuracies in forest environments. In all cases, the use of remote sensing substantially reduces the cost of the estimation process [15].
A drone or Unmanned Aerial Vehicle (UAV) is a lightweight flying device that is not operated by an on-board pilot; the complete system also includes a ground control station and communication components. These devices can be either self-controlled or remotely piloted. Based on wing type, UAVs can be broadly classified as fixed-wing or rotary-wing. They are also classified according to the weight of their platform; the most interesting ones for evaluating vegetation are usually the small unmanned aerial systems (UAS), which weigh less than 10 kg [16].
Although UAS originated in the military sector, especially for surveillance, in recent decades their professional use has become widespread in aerial photography and in remote sensing [17]. Nowadays, they have become a common working tool in many fields, such as forestry, precision agriculture and other sectors related to civil engineering, as well as emergency response [18]. In addition, the incorporation of cloud-based and generally more user-friendly data processing systems has greatly helped their expansion as a working tool [19].
Algorithms based on Machine Learning (ML) techniques such as Random Forests (RF) are now widely used for classification and prediction purposes in remote sensing applications [20]. Artificial Neural Networks (ANN) and Linear and Radial Support Vector Machine (SVML, SVMR) algorithms are also increasingly being taken into consideration [21,22,23]. All these algorithms have been successfully applied to estimate forest biophysical parameters from multispectral data [24], to retrieve biomass and soil moisture [25], to monitor crops [26] or to classify land use [27,28,29]. However, RF is the most widely used algorithm for Earth observation and, in particular, for land use classification and forestry applications [30]. Among different studies, we can highlight the use of RF to detect insect infestations according to the physiological characteristics of plants [31] or to estimate timber production [32] and forest biomass [33]. Nevertheless, other authors [34] consider SVM to be the best algorithm for solving complex classification problems, such as the differentiation of tree species. Recently, owing to the great success of deep learning, ANNs are being used extensively for remote sensing in Earth observation and often achieve the same accuracies as SVML or even RF [28].
Based on the initial hypothesis that UAVs could be used to identify action areas in fuelbreaks and WUI areas, the general objective of this work is to test the performance of combined passive and active remote sensing data to predict vegetation types through machine learning algorithms. More specifically, we aim at: (i) analyzing the uncertainty of four classification methods (RF, SVML, SVMR and ANN) in an object-based approach; (ii) comparing the accuracy obtained with different combinations of active and passive remotely sensed data (UAV-LiDAR, low-density LiDAR and different satellite images) and their interaction with the four ML algorithms; and finally (iii) mapping the wildland-urban interface fuelbreak planning rules.
3. Results and Discussion
Figure 4 shows the segmentation output. The best result was obtained with a scale parameter of 15. The resulting segments are large enough to integrate the information from the Sentinel-2 indices (10 × 10 m pixels). A smaller segment size would leave too many segments without spectral information, because few or no Sentinel-2 pixel centroids would fall inside them.
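The effect of segment size on the availability of spectral information can be illustrated by counting how many 10 m pixel centroids fall inside a segment. The sketch below is a simplification under stated assumptions: segments are treated as axis-aligned boxes in map units of metres, and centroids are assumed to sit at (k + 0.5) × 10 m on both axes. The function name is hypothetical and the box dimensions are unrelated to the segmentation scale parameter.

```python
import math

PIXEL_SIZE_M = 10.0  # Sentinel-2 pixel size quoted in the text


def centroids_in_box(xmin, ymin, xmax, ymax, pixel=PIXEL_SIZE_M):
    """Count pixel centroids that fall inside an axis-aligned box.

    Assumes centroids are located at (k + 0.5) * pixel along each axis.
    """
    def count_1d(lo, hi):
        first = math.ceil(lo / pixel - 0.5)   # smallest k with centroid >= lo
        last = math.floor(hi / pixel - 0.5)   # largest k with centroid <= hi
        return max(0, last - first + 1)

    return count_1d(xmin, xmax) * count_1d(ymin, ymax)


# A 30 m x 30 m box aligned with the grid contains nine centroids,
# whereas a 3 m x 3 m box placed between centroids contains none.
print(centroids_in_box(0, 0, 30, 30))  # 9
print(centroids_in_box(1, 1, 4, 4))    # 0
```

Segments for which the count is zero would carry no Sentinel-2 information at all, which is the situation the chosen scale parameter avoids.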
Depending on the data used, VSURF selects different sets of variables. Table 6 shows the variables selected by VSURF for the different combinations of datasets. In all cases, the variables are sorted by relative importance; the first variable in each column is the most influential variable in the classification. When two LiDAR point clouds are used, the growth between both point clouds is a crucial variable. When the dataset includes Pleiades images, these are usually relevant. No fourth-quarter images were selected. The Sentinel images selected were normally from the first quarter and mainly from the NIR or blue band. UAV-LiDAR data are always very significant.
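The growth variable can be thought of as the change in a height metric between two LiDAR acquisitions over the same segment. The following sketch uses the 95th percentile of normalized point heights, which is an assumption made for illustration only; the metric actually used to derive growth is not restated in this section, and the function names are hypothetical.

```python
import statistics


def height_percentile(heights, pct=95):
    """Percentile of normalized point heights (metres), inclusive method."""
    # quantiles() with n=100 returns 99 cut points; index pct - 1 is the pct-th.
    cuts = statistics.quantiles(heights, n=100, method="inclusive")
    return cuts[pct - 1]


def lidar_growth(heights_t0, heights_t1, pct=95):
    """Difference in the chosen height percentile between two acquisitions."""
    return height_percentile(heights_t1, pct) - height_percentile(heights_t0, pct)
```

A positive value indicates that the upper canopy of the segment was taller in the second acquisition, and a strongly negative value could point to felling or clearing between the two dates.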
When focusing on results by data source (Table 7), we observe that six of them reach OA > 90% for the training dataset with at least one of the algorithms under discussion. In contrast, only two of them (ALS + S2 and UAV + ALS + S2) exceeded that figure when generalized to the testing datasets. We hypothesize that, on the one hand, because ALS includes two LiDAR coverages, it allowed us to estimate the LiDAR-based vegetation growth, which in turn proved to be a relevant variable in the classification procedure. On the other hand, the time series of three complete years (characterized by mean February, May, August and November images) derived from S2 data can satisfactorily characterize the phenology of the different plant species in the area. For those reasons, we consider that the combination ALS + S2 (with or without UAV) is essential to perform optimum classifications in this context. If UAV data are included, results are roughly similar. Nevertheless, we suggest that the use of UAV data should be mandatory to achieve optimum prediction of the actions to be used in fuelbreak planning. First, UAV data combined with two open access datasets (ALS + S2) reached OA above 90% for all algorithms. Second, UAV data provide accurate and up-to-date vegetation metrics, which are essential for a correct application of the rules for fuelbreak planning, as these are based not only on the type of vegetation but also on its height.
Regarding the algorithms (Table 7), in 19 combinations of algorithm-data source the OA values surpassed 90% (seven with SVML and SVMR, four with ANN and two with RF). Conversely, only RF achieved OA values above 90% with more than one data source (i.e., ALS + S2 and UAV + ALS + S2) in the testing phase. In addition, RF was the only algorithm with OA above 90% on the testing dataset for a data source including UAV data. Furthermore, SVML, SVMR and ANN exhibited larger differences between training and testing OAs, which suggests that they tend to overfit. Therefore, we suggest that the best combination analyzed is the UAV + ALS + S2 dataset with the RF algorithm. Consequently, UAV data identify action areas in fuelbreaks and WUI areas with the greatest accuracy when combined with other passive and active remote sensing data to predict vegetation types through machine learning algorithms.
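The comparison between training and testing OA used above as an indication of overfitting can be sketched as follows. Overall accuracy is taken here in its usual sense, the share of correctly classified samples in a confusion matrix. The function names are hypothetical, and the matrices in the usage lines are made-up numbers rather than values from Table 7.

```python
def overall_accuracy(confusion):
    """Overall accuracy: samples on the diagonal divided by all samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def generalization_gap(train_confusion, test_confusion):
    """Training OA minus testing OA; larger gaps suggest overfitting."""
    return overall_accuracy(train_confusion) - overall_accuracy(test_confusion)


# Hypothetical two-class matrices (rows: reference, columns: predicted).
train = [[50, 0], [0, 50]]
test = [[40, 10], [10, 40]]
print(overall_accuracy(train), overall_accuracy(test))
print(generalization_gap(train, test))
```

An algorithm whose gap stays small across data sources would be preferred on robustness grounds even if its training OA is not the highest.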
Machine Learning-based techniques are suitable tools to optimize land use classification for fuelbreak planning. Although the differences between the algorithms used are almost negligible (in almost all cases the accuracy is close to 0.9), it is still essential in land use classifications to compare all available algorithms and to include different datasets. In this work, RF was the algorithm that offered the most robust results for land use classification in fuelbreak planning, owing to the small deviation between accuracy values and, more interestingly, the more similar OA values between training and testing datasets. This is particularly relevant when the goal of a supervised classification is to provide managers with operational tools to be applied to large territories far beyond the ground truth sample. Over 90% of the land use class assignments (classes 1, 2, 3, 41 and 42) obtained from remote sensing data are correct, despite the relatively small size of the ground truth sample. Nevertheless, generalization to a different area, even one with comparable vegetation attributes, would be expected to need an ancillary ground truth sample to retrain the model.
The detailed results for each class and for all combinations (dataset × algorithm) are shown in Table 8 (training) and Table 9 (testing). When only LiDAR information and orthophotos were used, errors were higher (nearly 20%). When multispectral images were included, errors decreased by 50% and never exceeded 25%. Class 3 (Bush and Grass) obtained the least accurate classification results, with errors between 8.3% (SVMR - UAV + SENTINEL) and 36.9% (ANN-UAV). Forbidden tree species (Class 42) are always classified very precisely, normally with an error below 10%. The classes requiring silvicultural treatments (clearing, thinning, pruning or felling), namely 3, 41 and 42, tend to be confused with each other when they give false positives. In practice this is not a critical error, since it still indicates that a silvicultural treatment must be applied. In any case, the results of both confusion matrices show that it would be advisable to increase the ground truth sample in the classes with larger errors.
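Per-class errors of this kind are typically read from a confusion matrix. The sketch below computes omission and commission errors for each class, assuming that rows hold the reference classes and columns the predicted ones; which of the two errors Tables 8 and 9 report is not specified in this section, so both are shown. The function name is hypothetical.

```python
def class_errors(confusion, labels):
    """Omission and commission error per class.

    Assumes rows are reference classes and columns are predicted classes.
    """
    errors = {}
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])                    # reference samples of the class
        col_total = sum(row[i] for row in confusion)     # samples predicted as the class
        hit = confusion[i][i]
        errors[label] = {
            "omission": 1 - hit / row_total if row_total else None,
            "commission": 1 - hit / col_total if col_total else None,
        }
    return errors
```

Off-diagonal counts shared among classes 3, 41 and 42 would correspond to the confusions between treatment classes described above, which raise the per-class errors without changing the decision that some treatment is needed.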
The reliability of Machine Learning for land use classification, in particular for vegetation analysis, has been evaluated in many studies carried out in different environments, some of them also based on UAVs combined with other remote sensing data. For example, ref. [22] found that the best ML algorithms to classify forest development stages are RF and ANN, and ref. [23] obtained better results with RF and SVM when classifying mountain forest and shrubland land cover classes. Using UAV and hyperspectral data, ref. [52] applied RF to map species in a tropical environment and ref. [53] successfully identified tree species in a mixed coniferous-deciduous forest in the USA. Also, by applying Random Forest to UAV images, ref. [54] was able to identify seedling stands. The accuracy levels obtained in the present study are consistent with those reached in all these previous studies. All this research demonstrates the potential and the efficiency of artificial intelligence for this type of analysis.
Moreover, the practical implications of using classification techniques for land-use planning at large scale, particularly for the costly and time-consuming processes usually involved in fuel management, stand out when examining Figure 5. Nearly half of the area of interest studied here is classified as a “no action” area, which has clear economic consequences. In addition, these data make it possible to adequately gauge the human and machinery resources required to carry out the most suitable amount of fuelbreak work in a given area. Furthermore, since the predicted classification is spatially explicit, it makes it possible to plan not only how many of the available resources to deploy before any field exploration, but also where to deploy them. This would make it feasible to implement optimization procedures [55]. In this regard, the use of high- and very high-resolution DTMs and DSMs derived from the UAV-LiDAR data makes it possible to identify areas with steep slopes and stone fences or other obstacles for machinery, which is crucial for an optimal distribution of resources.
The use of Artificial Intelligence through ML algorithms to classify different land uses is a robust and widely used technique today. It is necessary to compare algorithms even though the differences between their errors may not seem significant. These differences can become considerable when working at a large scale and without multispectral imagery. Moreover, differences may not be negligible when classifying larger areas with more types of objects. Thus, the performance of different algorithms and data sources should be tested in every land classification analysis, as their behavior may change considerably depending on the spectral characteristics, size and variety of the object types included in the study area.
4. Conclusions
In this study, we have shown that UAVs are suitable tools to provide precise and operational data to identify action areas in fuelbreaks of WUI zones. They achieve the greatest accuracy when combined with other passive and active remote sensors to predict vegetation types through machine learning algorithms. No clear differences in the performance of the different ML algorithms have been found. However, we consider RF to be the most robust, as it provides similar results for the training and testing datasets. The other algorithms tend to overfit slightly.
In contrast, clear differences in prediction ability have been found among the data sources. The use of more than one LiDAR point cloud to calculate vegetation growth provides particularly useful information. The combination of UAV data with large-scale remote sensing data and the RF algorithm has an accuracy greater than 0.9 in both the training and generalization phases, which makes it an appropriate asset for optimizing vegetation classification for fuelbreak planning. Moreover, the use of UAVs should be mandatory whenever other up-to-date LiDAR data are not available, as height and cover metrics of vegetation are indispensable for applying action rules in fuelbreak management planning. Accurate prediction of vegetation type makes it possible to adequately gauge the human and machinery resources required to carry out the most suitable amount of fuelbreak work in a given area, reducing management costs and optimizing field work.
Our results support the essential role of UAVs in fuelbreak planning and management and thus in the prevention of forest fires and the reduction of damage to human infrastructure and natural environments.