1. Introduction
Wildfires represent a hazardous and harmful phenomenon to people and the environment, especially in populated areas. Among the weather-induced emergencies, they constitute one of the most complex scenarios, since wildfires constitute a nonlinear, multi-scale and multi-physics process. According to the EFFIS Annual Report on Forest Fires in Europe, Middle East and North Africa [
1], in the European Union, over 3400 km² of land were burned in 2020. Around 40% of the burned area was part of the EU’s Natura 2000 network [
2], that is, the European coordinated network of protected areas, focused on ensuring the long-term survival of Europe’s most valuable and threatened habitats and species. The damage caused in many of these ecosystems will likely take many years to be restored. Moreover, the summer wildfire season of 2021 ravaged Mediterranean countries, with more than half a million burnt hectares, and took a high toll on firefighter and civilian lives.
Climate change effects on wildfire regimes are becoming more noticeable year after year too. According to the EFFIS report [
1], there is a clear trend towards increasing levels of fire danger, longer fire seasons, and more frequent fast-spreading ‘mega fires’, over which traditional firefighting strategies have little power. In general, fires no longer affect only Southern European states; they are a growing threat for central and northern Europe as well.
However, it should also be noted that, if a longer time window is analyzed, a decrease in burned area and number of fires is observed in Italy and most Southern European countries, as demonstrated by Turco et al. [
3]. Such negative trends at the large time scale can be explained by an increased effort in fire management and prevention after the big wildfire seasons of the 1980s. According to Turco et al., a local increase in wildfire activity may be related to recent socioeconomic transformations, which may lead to more dangerous landscape configurations and also to climate change effects. Currently, fire management is focused mainly on fire suppression, which can lead to higher fuel load and fuel connectivity [
3,
4]. Future fire management strategies should thus improve prevention and adaptation measures, rather than relying on suppression alone. The use of static and dynamic wildfire models, such as the seasonal susceptibility maps proposed in this work, helps in landscape management, prevention, and land use planning during this overall decrease in fire activity, and gives Civil Protection Authorities and decision makers the tools to tackle occasional outliers in wildfire seasonal trends.
Speaking of the causes, more than nine out of ten fires in the EU are human-caused [
1,
5]. Italy is no exception: only two percent of fires are due to natural causes (lightning), while the rest are due to human activities. The latter can be divided into intentional fires, due to pasture renewal and acts of arson, and unintentional fires, due to equipment use and malfunctions, negligently discarded cigarettes, and the burning of plant debris during agricultural and forestry activities [
1,
5]. Italy has seen increasing wildfire activity in recent years. In particular, more wildfires were recorded in 2020 than in 2019: the number of fires and the burned area increased by 12% and 38%, respectively. This increase is mainly related to the severe wildfire activity in the Sicily and Sardinia regions. The trend was then confirmed by the catastrophic outcome of the 2021 summer wildfire season, in which the fire events of Sardinia, Sicily and Calabria made that season even more severe in terms of burned hectares than the annus horribilis of 2017.
Socio-demographic changes in rural areas, such as the abandonment of agricultural lands in the crops-forest interface combined with climate change effects, have created unprecedented and challenging circumstances, which call for an improvement in methodologies and tools for wildfire management and fire reduction. Susceptibility maps represent a valid tool for wildfire management activities.
Despite its widespread use, the concept of wildfire susceptibility has been defined differently by several authors [
6]. For example, Leuenberger et al. [
7] defined wildfire susceptibility as “the probability that fire occurs in a specific area without considering a temporal scale, assessed on the basis of predisposing factors related to terrain’s intrinsic characteristics,” while in other works, such as that of Cao et al. [
8], wildfire susceptibility is defined as the spatially distributed “likelihood of suffering harm,” thus allowing for factors that are not related to the terrain’s intrinsic characteristics (e.g., the simple probability of wildfire occurrence, retrieved from wildfire historical databases). In this work, the definition of Leuenberger et al. is adopted. The authors nonetheless stress that the term “probability” used in the definition, although widely used in a large number of works [
9], should not be understood in a strictly rigorous mathematical sense: susceptibility is treated here as an indicator, ranging from 0 to 1, that is useful for discriminating the more fire-prone areas and for providing priorities and information to several stages of wildfire management [
10].
This quantitative evaluation is carried out by taking into account two aspects: the location and spatial extent of past wildfire occurrences, and the climatic, geo-environmental and anthropogenic predisposing factors (features) that are likely to be connected with wildfire spread. Among these features, the main geo-environmental predisposing factors chosen are the terrain elevation, slope, aspect (northing and easting) and vegetation cover, while the historical means of temperature and cumulative precipitation account for climate features. Finally, anthropogenic factors such as the distance from urban areas, roads, and cultivated land have been taken into consideration, as they could give a hint of the human presence and anthropic activities that may be related to fire ignition. The very definition of susceptibility maps relies on the assumption that future wildfires are expected to take place under anthropic, climatic, and geo-environmental conditions similar to those of past occurrences. Providing accurate and reliable susceptibility maps is the first step towards accurate risk mapping for wildfire risk management, with the assessment of potential wildfire intensity and the identification of exposed assets and their vulnerabilities being the natural next steps.
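The aspect feature is listed above as two components, northing and easting. The text does not give the formula used, so the following is only a minimal sketch under the common assumption that aspect is an angle in degrees measured clockwise from north, decomposed with cosine and sine; the function name `aspect_components` is hypothetical.

```python
import math

def aspect_components(aspect_deg):
    """Return (northing, easting) for an aspect given in degrees clockwise from north.

    Assumption: northing = cos(aspect), easting = sin(aspect). This convention
    is not stated in the text and is used here for illustration only.
    """
    theta = math.radians(aspect_deg)
    return math.cos(theta), math.sin(theta)

# Aspects of 359 and 1 degrees are far apart as raw numbers,
# but their northing components are almost identical.
n1, e1 = aspect_components(359.0)
n2, e2 = aspect_components(1.0)
```

Splitting the angle in this way removes the artificial discontinuity between 0° and 360°, which a tree-based model would otherwise treat as two distant values.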
Several approaches to wildfire susceptibility assessment can be found in the literature. Simple yet robust statistical models can be employed, see, e.g., [
11,
12,
13]. Another approach involves the iteration of wildfire spread model runs with random ignition points and weather conditions, see, e.g., [
14,
15]. Recent advances in machine learning (ML) algorithms have attracted interest from the scientific community in susceptibility mapping for environmental problems [
16,
17,
18]. Such approaches have been pursued in recent times to produce wildfire susceptibility maps in different world regions [
19,
20,
21,
22,
23,
24]. ML techniques are capable of learning from data, modeling the hidden relationships between input variables (features) and output (labels). In Ref. [
25], a stochastic ML approach based on Random Forest (RF) was used to produce wildfire susceptibility maps for the Liguria region, Italy. The same approach, with slight changes, is adopted in this paper: wildfire susceptibility mapping is performed for Italy at the national scale, based on an RF model. The ML model links the topographic, anthropic and climatic characteristics of the zones that experienced wildfires, producing season-specific maps. In the transition from regional to national mapping, it may be necessary to refer to more general global or European level data-sets, circumventing the problem of harmonizing local detailed information available at the regional scale [
26]. For this reason, a coarser classification of the fuel map is adopted with respect to the one used in the pilot wildfire susceptibility study in Liguria [
25]. In this experiment at the national scale, climatic variables, such as mean annual temperature and precipitation, are considered, while the use of ground-retrieved burned polygons is maintained in both approaches. This study demonstrates the ability of the RF technique to discern the most susceptible areas, even with national scale data, with the adopted set of predisposing factors.
2. Study Area
Italy is particularly affected by wildfires because of its remarkable heterogeneity in topography and vegetation cover, its population density, and its climatic conditions [
27,
28]. Blasi et al. [
29,
30,
31] identified and mapped two divisions of the Italian territory, the Temperate and the Mediterranean one. The Temperate Division includes the Alps, the Po Plain, and most of the Apennines. It accounts for 64 percent of the national territory. This area is characterized by a general lack of summer aridity (less than two months) and by marked differences between summer and winter temperatures. The natural vegetation mainly consists of forests, with broad-leaved deciduous plants (Quercus, Fagus and Carpinus species). The Mediterranean Division includes the southern Apennines, the Tyrrhenian and Ionian coasts, the southern Adriatic coast, and the Islands; it accounts for almost 36 percent of the Italian territory. This area is characterized by summer aridity, with precipitation concentrated in autumn and winter. The natural vegetation mainly consists of mixed woods of evergreen and deciduous species, shrublands, and Mediterranean maquis. A representation of the CORINE 2018 land cover [
32] classes corresponding to vegetated areas can be found in Figure 1. The label for each code is reported in the Supplementary Materials.
Italy has a total area of 301,340 km², about 20 percent of which is covered by protected areas of Europe’s Natura 2000 network. These areas include the Sites of Community Importance (SCIs) and the Special Areas of Conservation (SACs), where the natural habitat is protected (Figure 2).
Italy’s varied geological structure contributes to its high climate and habitat diversity. The Italian peninsula is located at the center of the Mediterranean Sea, forming a corridor between central Europe and North Africa, with a total of 8000 km of coastline. Because of the length of the peninsula and the mostly mountainous hinterland, the climate of Italy is highly diverse. In most of the inland northern and central regions, the climate ranges from humid subtropical to humid continental and oceanic. In particular, the climate of the Po river valley is mostly continental, with harsh winters and hot summers. The coastal areas of Liguria, Tuscany and most of the South are generally characterized by the Mediterranean climate.
5. Discussion
The proposed approach of assessing wildfire susceptibility using an ensemble ML algorithm provides good results in terms of performance indicators. This is in line with the current literature on ML methods for evaluating wildfire susceptibility, where ML has generally shown good accuracy and generality. However, most works in the literature focus on study areas at the regional scale [
19,
25,
44,
50] (with several notable exceptions of susceptibility maps at the national scale [
21,
24,
51]). This has indeed motivated the present work, in order to assess the robustness of the adopted framework. The spatial CV phase gave good results: as expected, the final model, trained on 75% of the available pixels, has a better AUC, but satisfactory results (AUC greater than 0.8 for every fold) were also reached in the cross-validation runs of the model, for both wildfire seasons. This means that the model, which reaches high AUC values when run on the whole training dataset, does not lack generality.
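To make the AUC threshold quoted above more concrete, the sketch below computes the AUC directly from its rank interpretation: the fraction of (burned, unburned) pixel pairs in which the burned pixel receives the higher score. The function name `auc` and the toy scores are hypothetical and are not taken from the study; the quadratic pair loop is only meant for illustration, not for national-scale data.

```python
def auc(scores_burned, scores_unburned):
    """AUC as the fraction of (burned, unburned) pairs ranked correctly.

    Ties between a burned and an unburned score count as half a correct pair.
    """
    wins = 0.0
    for b in scores_burned:
        for u in scores_unburned:
            if b > u:
                wins += 1.0
            elif b == u:
                wins += 0.5
    return wins / (len(scores_burned) * len(scores_unburned))

# Hypothetical susceptibility scores for one validation fold.
burned = [0.9, 0.8, 0.6, 0.4]
unburned = [0.7, 0.3, 0.2, 0.1]
fold_auc = auc(burned, unburned)  # 14 of the 16 pairs are ranked correctly
```

Under this reading, an AUC above 0.8 on a held-out spatial fold means that, for more than four out of five such pairs, the burned pixel is ranked as more susceptible than the unburned one.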
The produced susceptibility maps, similarly to what had been developed at the regional scale in [
25], allow for assessing the zoning of wildfire prone areas for both winter and summer fire regimes. Typically, northern regions exhibit a winter fire regime [
52,
53], with particular focus on the Apennine chain, and the alpine and pre-alpine areas. On the other hand, the southern region, also including the Italian islands of Sardinia and Sicily, is characterized by a summer wildfire regime.
The usefulness of the produced static maps lies in their ability to detect the areas where wildfires are more likely to occur in the future. This ability has been assessed by dividing the produced maps according to selected percentile intervals (because the probabilistic value given by the Random Forest prediction has no intrinsic physical meaning per se) and then checking where the tested burned pixels fall. This has been done in two ways. The first used the randomly sampled burned pixels of the test data-set, which comes from a merging of the ground-retrieved burned scars of past wildfires. In this case, the results were good, with more than 83% of the burned pixels assigned to the two highest susceptibility classes, for both seasons. Notably, in the winter case, the provided susceptibility map would in principle allow fire fighting resources and prevention/preparedness activities to be concentrated in the five percent of the vegetated territory that accounts for half of the total wildfire occurrences.
However, since the usefulness of a static map for wildfire management purposes has to be proven on future, possibly catastrophic, events, the remote-sensing retrieved burned scars of the severe 2021 summer wildfire season allowed a thorough testing of the produced maps. The results have shown a good prediction capability also in this more challenging case, with around 70% of the burned area of these wildfires belonging to the two highest classes of susceptibility. In particular, more than 30% of the satellite-retrieved burned area belongs to the top 5% of the Italian summer susceptibility map.
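The percentile-based evaluation described in the two paragraphs above can be sketched as follows. This is only an illustration under stated assumptions: the helper names, the nearest-rank percentile rule and the toy values are hypothetical, and the class boundaries actually used for the maps are not reproduced here.

```python
def percentile_threshold(values, q_pct):
    """Nearest-rank threshold: roughly q_pct percent of `values` lie below it."""
    ordered = sorted(values)
    k = min(len(ordered) - 1, (q_pct * len(ordered)) // 100)
    return ordered[k]

def share_above(burned_values, threshold):
    """Fraction of burned pixels whose susceptibility is at or above `threshold`."""
    return sum(v >= threshold for v in burned_values) / len(burned_values)

# Toy susceptibility map with 20 pixels (hypothetical values 0.00, 0.05, ..., 0.95).
susceptibility_map = [i / 20 for i in range(20)]

# Susceptibility values at four hypothetical burned pixels, taken from the map.
burned = [susceptibility_map[i] for i in (19, 18, 10, 19)]

top5_threshold = percentile_threshold(susceptibility_map, 95)
share_in_top5 = share_above(burned, top5_threshold)
```

Because the thresholds come from the distribution of the map itself, the comparison does not depend on the absolute value of the Random Forest output, which is consistent with the remark that this value has no intrinsic physical meaning.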
However, this analysis was not limited to the production of static maps, since the trained ML models also provide a variable importance measure that ranks input factors by their relevance, as described in Section 3 and presented in Table 4. The method is based on the mean decrease in Gini impurity [
40,
46] provided by a function of the Python scikit-learn package [
41].
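As a reminder of what a decrease in Gini impurity measures, the sketch below evaluates it for a single binary split using only the Python standard library. It is not the scikit-learn implementation, and the text does not name the function that was used (scikit-learn forests expose impurity-based importances through the `feature_importances_` attribute, which may or may not be the one meant). The helper names and toy labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = burned, 0 = unburned)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def impurity_decrease(parent, left, right):
    """Drop in Gini impurity when `parent` is split into `left` and `right`.

    Each child impurity is weighted by the share of parent samples it receives.
    """
    n = len(parent)
    children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - children

parent = [1, 1, 1, 1, 0, 0, 0, 0]

# A split that separates burned from unburned pixels perfectly.
good_split = impurity_decrease(parent, [1, 1, 1, 1], [0, 0, 0, 0])

# A split that leaves both children as mixed as the parent.
poor_split = impurity_decrease(parent, [1, 1, 0, 0], [1, 1, 0, 0])
```

In a forest, decreases of this kind are accumulated over all the splits that use a given feature and averaged over the trees, so a feature that often produces splits like `good_split` ends up with a high importance.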
This feature importance ranking shows that the neighbouring vegetation plays a very important role in determining whether a pixel may experience a wildfire or not, whatever the wildfire season. The other important variables are related to the climate (precipitation and temperature), followed by the aspect components (northing and easting), and by the anthropic factors (distances from urban areas, roads, and crops). The least relevant feature is the binary information on the presence of protected areas (Natura 2000 network).
The importance of each vegetation type, for both wildfire seasons and for both the single-pixel and the neighboring vegetation, is reported in detail in the Supplementary Materials, while the neighboring vegetation importance for the summer season is portrayed in Figure 11. Those importance values are the Gini importances of the single CLC codes, before their aggregation in the list of Table 4. In addition, in the Supplementary Materials, the most important vegetation classes are examined in detail, for both summer (CLC codes 211, 321, 311, 323) and winter (CLC codes 211, 311, 324, 242). Every pixel of the susceptibility map corresponding to each of these CLC codes has been analyzed, and the distribution of the susceptibility values has been plotted. For the summer case, the four most important types of neighboring vegetation are represented in Figure 12. These plots highlight different behaviours of the important classes: some classes are important to the ML algorithm because they are immediately associated with low susceptibility, such as arable land (211), while others are important because they are strongly associated with high susceptibility (such as sclerophyllous/maquis vegetation, class 323). Other classes exhibit more complex behaviour, such as broad-leaves (311) and natural grassland (321). In this case, the interactions with other predisposing factors, such as DEM, slope and climate, are needed by the ML algorithm in order to assign a susceptibility value to the pixels characterized by such vegetation types.
As previously mentioned, the CLC18 is considered here at the third level of detail, and many of the other data used come from open data-sets, except for the synoptic database of wildfire occurrences.
The good results achieved by applying the described ML framework demonstrate that, as long as the spatial perimeters of the considered wildfire events have a good level of accuracy, ML techniques such as Random Forest can make use of quite general predisposing factors, combining them in order to explore all the possible configurations and interactions, and overcoming the limitations that may originate from the broad classes of land use. In Figure 20, the different susceptibility distribution of the same CLC class (311, broadleaves) is shown for the northwestern part of Italy. Areas characterized by lower height above sea level and by vegetation heterogeneity (represented in the model by the neighbouring variables) exhibit higher susceptibility values when compared to the broadleaves located in the Alps and Apennines mountain ranges. These findings thus motivate, at any level, the systematic and precise retrieval of burned areas, which is of utmost importance in producing susceptibility and risk maps.
6. Conclusions
In this work, a methodology to assess the wildfire susceptibility at the Italian national scale has been proposed. Two separate analyses have been performed, one for each of the wildfire regimes occurring in Italy, the summer and the winter one. The adopted model is based on the RF Classifier, and the problem is structured as a classification problem. The ML algorithm assigned to each pixel a value of susceptibility, after training on a balanced data-set based on the past wildfires’ occurrences (label) and the pixels’ geo-climatic and anthropic characteristics (features). Being an ensemble model, RF can return a probabilistic output, thanks to the contribution of different estimators (trees) to the classification task. The resulting classification on each pixel of the study area is associated with the wildfire susceptibility distribution. The proposed model gave satisfactory results on the spatial CV and on the test data-set, composed of randomly selected pixels, in terms of AUC, MSE and accuracy. A more operationally oriented test has been performed on recent burned areas corresponding to the particularly severe 2021 summer season. For this benchmark too, the performance remained good, confirming the good prediction capabilities of the adopted ML framework. Further analyses of the output of the model and of the input importance ranking showed that the ML model is capable of obtaining good results relying on a rather simple description of the vegetation cover, when reliable wildfire polygons are provided along with the other explanatory variables.
Selected test cases give interesting food for thought on how to use the produced maps operationally, since they could potentially help in depicting the possible final extent of recently started wildfire events. Needless to say, event-specific studies should also take into account the dynamic effect of wind, fuel moisture content, phenological state, and fire fighting actions on the fire front extent. Such contributions are not taken into account by the proposed static mapping, but are of course considered by the wildfire spread models and tools available in the literature [
54,
55,
56,
57]. When the focus is on the single wildfire event, fires developed under particularly severe weather conditions or when fire fighting operations are compromised may also affect areas with low fire susceptibility [
25].
Besides helping Civil Protection Authorities and decision makers in wildfire management and long-term land use planning, the proposed susceptibility maps constitute the backbone of several possible hazard and risk mapping procedures, which may range from the static assessment to the dynamic one. Indeed, analogous ML-based susceptibility maps are used operationally by the Italian Civil Protection in order to modulate the outputs of dynamic forecasts of Fine Fuel Moisture Content, potential rate of spread and fire-line intensity, embedded in the RISICO fire danger rating system [
58,
59,
60]. The very same principle expressed in this work—the identification of reduced size areas where most of the wildfire events occur—is adopted in selected outputs of RISICO forecasts, in order to reduce overestimation of wildfire danger in low susceptibility areas.
The work presented in this paper constitutes a milestone of the modeling approach, which started at the regional scale [
25] and is now established at the national scale. Future works will be devoted to trans-boundary case studies, where susceptibility maps at the macro-regional scale could help in transboundary risk assessment procedures.