Modeling Fire Danger in Galicia and Asturias (Spain) from MODIS Images

Forest fires are one of the most dangerous natural hazards, especially when they are recurrent. In areas such as Galicia (Spain), forest fires are frequent and devastating. The development of fire risk models is therefore a very important prevention task for these regions. Vegetation and moisture indices can be used to monitor vegetation status; however, the different indices may perform differently depending on the vegetation species. Eight different spectral indices were selected to determine the most appropriate index in Galicia. This study was extended to the adjacent region of Asturias. Six years of MODIS (Moderate Resolution Imaging Spectroradiometer) images, together with ground fire data on a 10 × 10 km grid, were used. The percentage of fire events was linearly related to the variations of some of the spectral indices in both Galicia and Asturias. The Enhanced Vegetation Index (EVI) was the index leading to the best results. Based on these results, a simple fire danger model was established, using logistic regression, by combining the EVI variation with other variables, such as fire history in each cell and period of the year. A seventy percent overall concordance was obtained between estimated and observed fire frequency.


Introduction
Forest fires have important effects on vegetation but also affect the soil, causing its degradation and erosion [1], the diversity of species [2] and even human lives. They are also a significant source of CO2 and other greenhouse gases [3].
In Spain, according to statistics from the Ministry of Environment [4], an average of 20,887 fires per year were registered during the period 1996-2005, of which approximately 60% were deliberate. The average forested area burned reached 123,459 ha per year (including tree-covered and treeless areas). The Galicia region is an area particularly affected by the devastating effects of forest fires. Around 50% of all fires in Spain take place in Galicia, while in the adjacent region of Asturias only 8% occur. Focusing on forest land cover, burned forest extension in Galicia represents 25% of the total in Spain, and 9% in the case of Asturias.
The plan of prevention and defense against fire in Galicia includes different aspects, such as increasing awareness of the population, planning, organization of the forestry and agricultural spaces, infrastructures, silviculture, preventive vigilance, detection and fight [5]. Prevention is essential in the fight against fire, and one of the measures to adopt is to have an index of fire danger. Different models provide fire danger indices based on the combination of variables such as weather, fire history, proximity to roads and/or people, etc. [6][7][8]. However, field measurements are sometimes required, which are very costly in terms of money, time and human resources. Remote sensing techniques facilitate these tasks. They have already been used in many studies with the aim of obtaining parameters required in fire hazard models [9][10][11][12].
Several studies have demonstrated the existence of a relationship between fire occurrence and vegetation conditions [12][13][14]. In particular, the vegetation water content is a key factor in the generation and spread of fires [11,15]. The most widely used parameter for characterization of vegetation moisture content is the FMC (Fuel Moisture Content). However, this parameter requires field measurements, which limits its direct application to fire prediction. Remote sensing techniques offer viable alternatives to field measurements, allowing indirect estimations of these parameters. Healthy vegetation has a characteristic spectral curve, presenting a relative maximum in the green band, a minimum in the red band and a maximum in the near infrared (NIR). This curve changes with different factors, such as moisture, chlorophyll content, the structure of the plant, etc. In the middle infrared area of the electromagnetic spectrum, there are three water absorption bands centered at 1.4 µm, 1.9 µm and 2.5 µm. Reflectance in these bands increases with decreasing vegetation moisture content. However, a decrease in water content affects the chlorophyll amount in the plant and the internal structure of leaves, and hence the photosynthetic activity and the spectral response of the vegetation, resulting in a decrease in reflectance in the green band and near infrared and an increase in the red band. Remote sensing allows measuring reflectivity response in the visible and near infrared spectral bands. The combination of this reflectivity information from different bands can be used to characterize the vegetation status. Numerous studies have related the FMC with satellite data [16][17][18], especially using indices involving middle infrared bands. Moreover, in [19] a sensitivity analysis was carried out to analyze the relationship between spectral indices and different measurements of vegetation water content (FMC and EWT, Equivalent Water Thickness), and the conclusion is that the spectral indices are generally more sensitive to EWT than to FMC. Furthermore, some studies have shown that vegetation indices can be used directly to characterize the water status of the vegetation [20,21]. Moreover, vegetation indices have been shown to be useful as fire risk indicators [10,22]. However, each study yields different conclusions regarding the most appropriate vegetation index to be used for fire risk estimation. The reason is that the spectral response of vegetation, according to the moisture content, can vary depending on the species and region. Therefore, the best index accounting for the changes in vegetation status must be determined for each study site. In [20], a study was conducted to find the best index to characterize the FMC in an area of grassland and another of shrubs. These authors showed that indices based on red/NIR (EVI: Enhanced Vegetation Index, NDVI: Normalized Difference Vegetation Index, SAVI: Soil Adjusted Vegetation Index) provided better results in grassland areas, while those based on NIR and short-wavelength infrared (SWIR) (NDII: Normalized Difference Infrared Index, GVMI: Global Vegetation Moisture Index, NDWI: Normalized Difference Water Index) worked better in shrubland areas. In [21] a comparison between different spectral indices was performed in a crop and shrubs area, and in a conifer forest. Best results were obtained using EVI in the crop and shrubs sites, while NDWI led to better results in the forested area than the other indices.
Fire danger models usually combine different input variables, such as vegetation condition, fire history, etc. Logistic regression has been widely used in fire danger modeling, since it is very useful for obtaining fire occurrence probability from one or more independent variables that can be continuous or categorical. For example, in [23] logistic regression is used to compare different spectral indices as fire risk indicators. In [14], logistic regression is used to compare several spectral indices combined with topographic variables for fire occurrence modeling. In [24] logistic regression is used to assess the risk of forest fire by combining different static and dynamic variables. In [25], 10 predictor variables, including continuous and categorical variables, were combined using binary logistic regression for mapping fire occurrence probability.
The objective of this work was to select the most appropriate index to be used in fire danger estimation in the regions of Galicia and Asturias. With this aim, a comprehensive study was carried out, comparing different spectral indices from the literature. Finally, a fire danger model was defined using logistic regression to combine the selected vegetation index with the following variables: fire history in each period of the year and in each cell, and the region of study.

Study Area and Dataset
Galicia and Asturias cover an area of 29,575 km² and 10,604 km², respectively. About 70% of the surface of both regions is forest, according to the Second National Forest Inventory [26]. In Galicia, coniferous forests (Pinus pinaster) are predominant, but there is also a very important population of hardwoods such as Quercus, Eucalyptus and Castanea sativa. In Asturias, hardwood forests are more abundant, mainly formed by Castanea sativa, Fagus and Quercus, but there are also populations of conifers (Pinus pinaster, Pinus resinosa and Pinus sabiniana). In both regions, there are also large areas of grasslands.
Climate in these regions is humid with an average rainfall of 1200 mm per year [27], and mean annual values of relative humidity approaching 80%, due to the prevailing westerly winds bringing moist air masses. Summer months are drier, often resulting in moderate drought conditions with at least one month per year usually recording less than 40 mm of rain. Galicia and Asturias are characterized by year-round mild temperatures, with maximum values of 12 °C in winter and minimum values of 15 °C in summer [28]. Agencies fighting forest fires work on the basis of a grid that divides Spain into 10 × 10 km cells, based on the UTM (Universal Transverse Mercator) projection [7]. The historical database on forest fires in Spain is based on this grid. For this reason, the vegetation status is characterized with the same spatial resolution.
For the present study, images from the MODIS sensor onboard the TERRA satellite were used. In particular, version 5 of the products MOD13Q1 [29] and MOD09A1 [30] were selected. MOD09A1 is a composite of eight daily images containing reflectivity in seven different spectral bands (MODIS bands 1 to 7) at 500 m spatial resolution. Each pixel contains the best observation during the eight-day period, selected on the basis of high observation coverage, low view angle, the absence of clouds or cloud shadow, and aerosol loading [30]. These reflectivities were used to obtain the different indices selected for the study. MOD13Q1 is a composite of 16 daily images containing processed NDVI and EVI data at 250 m spatial resolution. The compositing algorithm of this product is optimized for use in vegetation indices. This algorithm is based on the same criteria as those of MOD09A1 but is implemented differently. The main difference is that it includes a correction of the angular effects in the cases where at least five clear pixels are available. The inversion of a BRDF (Bidirectional Reflectance Distribution Function) model is applied at reflectance level; afterwards the vegetation indices are calculated. If fewer than five clear pixels are available, the maximum value of the vegetation index of the pixels with lower view angle is selected [29]. MODIS images were provided by the NASA Land Processes Distributed Active Archive Center (LPDAAC, https://lpdaac.usgs.gov/), covering a six-year period, from 2001 to 2006. By using composite images, problems related to cloud presence (frequent in these regions) can be mitigated since only cloud-free values (based on the quality band included in each MODIS product) are assigned to each particular pixel.

Methodology
The methodology can be divided into four parts: (i) processing of MODIS images, (ii) calculation of the different spectral indices, (iii) search for relationships between the changes experienced by the indices and the fire frequency in each cell, and (iv) combining the vegetation index with other variables in order to define a fire danger model.

MODIS Image Processing
The composition of several days in the MOD09A1 and MOD13Q1 products allows for greater spatial coverage, minimizing the effects of clouds. Nevertheless, pixels with no reliable information due to the presence of clouds, sensor errors, etc. were still present. Thus, a filtering process was carried out to remove those pixels. The processing of the MOD09A1 images began with a filter based on the information contained in the quality band provided in each product. Pixels containing clouds, shadows or snow were removed. A mask of water bodies (oceans, rivers and lakes) and large cities was created from the Corine Land Cover 2000 [31] to remove urban and water pixels. Afterwards, a spatially-based filling process was carried out to assign a value to those pixels or small groups of pixels previously masked. The filling process was performed in a moving window that obtained the average value of the non-masked neighbor pixels; this value was assigned to the central pixel only if it had been previously masked. Then, a composite of two consecutive images was performed with the aim of obtaining an image every 16 days. Average values were considered for those pixels containing information in both images, whereas the single available value was assigned to the composite image when information was missing in the other image. This compositing step is another difference with respect to the MOD13Q1 product. Although the product MOD09A1 provides data with eight-day frequency, we decided to keep working at the 16-day interval because a decline in the vegetation index over a longer period is more significant for fire prediction. Moreover, given the climate of the area, MOD09A1 presented many cloud problems. Finally, the 10 × 10 km grid was superimposed and the average value of all pixels with information contained in each cell was calculated when the cell was occupied by at least 10% of land pixels and 80% of the pixels contained valid information. When working with a resolution of 10 × 10 km, some detail is lost, but doing so is necessary to match the information of the vegetation indices with the number of fires.
MOD13Q1 images are provided with a temporal resolution of 16 days and without noise problems, so only the filtering, filling and cell averaging processes were applied.
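The filling, compositing and cell-averaging steps described above can be sketched as follows. This is a minimal illustration, not the processing chain actually used: the function names, the 3 × 3 window (`radius=1`) and the use of `None` for masked pixels are assumptions, and the 80% validity threshold is assumed here to apply to the land pixels of each cell.

```python
from statistics import mean

def fill_masked(img, radius=1):
    """Give each masked pixel (None) the mean of its valid neighbours in a moving window."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(rows):
        for c in range(cols):
            if img[r][c] is not None:
                continue  # only previously masked pixels receive a value
            neigh = [img[i][j]
                     for i in range(max(0, r - radius), min(rows, r + radius + 1))
                     for j in range(max(0, c - radius), min(cols, c + radius + 1))
                     if img[i][j] is not None]
            if neigh:  # leave the pixel masked if no valid neighbour exists
                out[r][c] = mean(neigh)
    return out

def composite_pair(a, b):
    """Merge two 8-day images: average where both are valid, else keep the single valid value."""
    def combine(x, y):
        if x is not None and y is not None:
            return (x + y) / 2
        return x if x is not None else y
    return [[combine(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def cell_mean(values, is_land, min_land=0.10, min_valid=0.80):
    """Average the valid land pixels of one grid cell, or return None if the cell is rejected.

    Assumption: a cell needs at least 10% land pixels, and at least 80% of
    those land pixels must hold valid information.
    """
    n_land = sum(is_land)
    if n_land == 0 or n_land / len(values) < min_land:
        return None
    valid = [v for v, land in zip(values, is_land) if land and v is not None]
    if len(valid) / n_land < min_valid:
        return None
    return mean(valid)
```

Keeping masked pixels as `None` rather than a numeric fill value makes it harder to average invalid data into a cell by accident.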
Table 1. Spectral indices for the estimation of the vegetation status using MODIS. Recoverable entries: Normalized Difference Vegetation Index; Soil Adjusted Vegetation Index (L = 0.25 [32]); Normalized Difference Infrared Index. Cited in the table: [33], [39].

Most studies are applied to grassland and shrubland areas, showing that NIR and SWIR indices (NDII, GVMI, NDWI) better represent the water status of vegetation in shrubland areas [17,20]. However, in [11] the study region was a conifer area, and the authors concluded that the NDWI can be used to obtain the humidity, but that the amount of vegetation is also needed to estimate the fire risk; they used the NDVI for this purpose. However, the presence of dense vegetation in our study areas saturates some indices such as NDVI. Under these conditions, NDVI does not properly represent the changes in the vegetation status. Moreover, the NDVI is affected by the contributions of soil and atmosphere. The SAVI was defined in order to avoid soil contributions. This index is a modification of the NDVI and includes a parameter related to the amount of vegetation. This parameter (L) was set equal to 0.25 because this is the recommended value in [32] for an area with dense vegetation. GEMI and VARI indices were designed to remove atmospheric disturbances, and the EVI takes into account both soil contributions and atmospheric disturbances.
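As an illustration of how three of the red/NIR indices discussed above are computed from surface reflectances, the sketch below uses their commonly published definitions. These formulas are given here as an assumption, since the table entries are not reproduced in the text; the function and argument names are hypothetical, and the EVI coefficients are the standard MODIS values rather than values quoted in this study. For MODIS, red, NIR and blue correspond to bands 1, 2 and 3.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def savi(nir, red, L=0.25):
    """Soil Adjusted Vegetation Index; L = 0.25 is the value used for dense vegetation."""
    return (1 + L) * (nir - red) / (nir + red + L)

def evi(nir, red, blue, G=2.5, C1=6.0, C2=7.5, L=1.0):
    """Enhanced Vegetation Index with the standard MODIS coefficients (assumed here)."""
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L)

# Hypothetical reflectances for a densely vegetated pixel
nir, red, blue = 0.5, 0.1, 0.05
print(round(ndvi(nir, red), 3), round(savi(nir, red), 3), round(evi(nir, red, blue), 3))
```

With L = 0 the SAVI expression reduces to the NDVI, which shows that SAVI is a soil-corrected modification of it.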

Relationship between Fire Frequency and Changes in the Spectral Indices
Figure 1 shows the histogram of the average value of fire ignitions between 2001 and 2006. Most fires in Galicia are concentrated between the months of February and the first half of October, while in Asturias, the second half of October is also important, so the comparison of the eight spectral indices was restricted to these periods. The fire regime is also different in the two regions. In both cases there are two fire peaks, one in March and another during the summer. However, in Asturias the peak centered in March is more important than the one during the summer, while in Galicia the most important peak is the one during the summer. The number of fire ignitions is much higher in Galicia than in Asturias. Some difficulties may arise from this fact when searching for relationships in Asturias, due to the limited statistics. For all these reasons, the study was conducted separately in each region. A vegetation index variation between two dates indicates the increase or decrease in vegetation greenness, which is related to plant water status. Based on the idea that fire risk is greater in areas and periods of drought, index changes between two periods of 16 days were fitted to the frequency of fires in the next period.
The process was repeated for all the indices. First, the difference between two consecutive images was obtained. Then, cells were grouped in intervals of 0.01 in index variation, and fire-affected cells were counted in each interval. To avoid non-realistic index variations produced by a fire event, the database was filtered, removing those cells in which a fire ignition had been registered during the two weeks prior to the study period, since a fire would yield a significant decrease in the vegetation index. The fire probability was obtained as the ratio between the number of fire-affected cells and the total number of cells in each interval. By representing the fire probability (frequency of fire-affected cells) vs. the index variations, the eventual relationship between these two parameters could be obtained. Fifty percent of the time series was used to obtain the relationship, which was validated using the other 50%. Dates were selected at random to form the training and validation datasets.
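The binning and fitting procedure above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, each input record is one cell on one date (index variation plus a fire flag for the next period), and an ordinary least-squares line stands in for whatever fitting routine was actually used.

```python
import math
from collections import defaultdict

def fire_probability_by_bin(delta_index, had_fire, width=0.01):
    """Group cells by index variation and return (bin centre, fire probability) pairs."""
    counts = defaultdict(lambda: [0, 0])  # bin number -> [fire-affected cells, all cells]
    for d, fire in zip(delta_index, had_fire):
        k = math.floor(d / width)
        counts[k][0] += 1 if fire else 0
        counts[k][1] += 1
    return [((k + 0.5) * width, fires / total)
            for k, (fires, total) in sorted(counts.items())]

def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, with its R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot
```

In use, the bin centres and probabilities from the training dates would be passed to `linear_fit`, and the resulting line applied to the dates held out for validation.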

Combining the Vegetation Index with Other Variables
To develop a model that combines the vegetation index with other variables, logistic regression was used. Logistic regression consists of adjusting Equation (1), where the dependent variable, P, is the probability of fire and must be included as a dichotomous variable, and the independent variables (x_i) can be continuous or categorical. α and β_i are the coefficients of the equation to be adjusted.
$$P = \frac{1}{1 + \exp\left[-\left(\alpha + \sum_i \beta_i x_i\right)\right]} \qquad (1)$$
For this analysis three new variables (in addition to the vegetation index) were selected, related to the occurrence of fires in different areas and at different periods of the year. The three variables selected were: period fire history, cell fire history and region. Period fire history was selected because important annual cyclical patterns of fire occurrence were observed. This variable was calculated as a weight and is the fire frequency in each period of the year (including all periods). However, the seasonal patterns observed in Galicia and Asturias are different; that is why a categorical variable (region) was also included in order to establish differences between these regions. Regional patterns were also observed: while some cells had been recurrently affected by fire, others had had no fires during the six-year period. For this reason, a new variable related to the fire history in each cell was included: cell fire history. This variable is calculated, similarly to the period fire history, as the frequency of fires in each cell. The weight of the cell history accounts for the number of times fires were observed in a particular cell over the entire time series. This may reflect the areas where burning is a habitual practice, in agriculture for example. Cell fire history may be related to sociological factors, while the period fire history may be related to both climatological and sociological factors. The cell and period fire history were obtained from data used for training the model (50% of the total data, randomly selected).
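To make the role of the four predictors concrete, the sketch below evaluates a logistic model of the standard form P = 1 / (1 + exp(-(α + Σ β_i x_i))). The coefficient values are purely hypothetical placeholders chosen for illustration, not the fitted values of this study, and the function name and the coding of region as a 0/1 variable are assumptions.

```python
import math

def fire_probability(alpha, betas, x):
    """Logistic model: probability from an intercept, coefficients and predictor values."""
    z = alpha + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only.
# Predictor order: EVI variation, period fire history, cell fire history, region (1 = Galicia).
alpha = -2.0
betas = [-10.0, 3.0, 4.0, 0.5]

drying_cell = [-0.03, 0.4, 0.5, 1]   # EVI decreased over the previous period
greening_cell = [0.03, 0.4, 0.5, 1]  # EVI increased, all else equal
print(fire_probability(alpha, betas, drying_cell) >
      fire_probability(alpha, betas, greening_cell))
```

A negative coefficient on the EVI variation, as assumed here, encodes the expectation that a falling index raises the predicted probability.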
The present analysis used the statistical software PASW Statistics 17 [40].
Figures 2 and 3 show the percentage of fire-affected cells (probability of fire) vs. the variation of the indices in the regions of Galicia and Asturias. A good fit was found between the variations of three of the indices used and the fire probability. An increasing tendency of fire probability was observed when the value of the indices decreases (Figure 2). A variation in the indices is an indicator of plant health; when the plant does not have enough water or nutrients it may eventually become yellowish and the vegetation index will then decrease. Under these conditions the vegetation is more vulnerable to fire ignition. A linear relationship was observed between their variations and the fire probability for the following indices: EVI, GEMI and SAVI, calculated using reflectivities from product MOD09, and EVI directly given by product MOD13. Determination coefficients obtained ranged between 0.78 and 0.84 in Galicia and between 0.60 and 0.70 in Asturias. Note that the best correlations were obtained when using the EVI (MOD13) in Galicia and the EVI (MOD09) in Asturias. The rest of the indices in Table 1 did not show any correlation with fire probability (Figure 3). Similarly to [11], we observed that moisture-related indices did not show a clear relationship with the fire probability. In areas such as Galicia and Asturias, where the forests are dominated by tall trees, atmospheric disturbances will be more important than soil contributions. In fact, GEMI and EVI were found to be the indices that best characterize the state of vegetation. According to this reasoning, VARI would be expected to offer similar results to GEMI and EVI; however, this index was defined in a wheat area very different from the vegetation of Galicia and Asturias. Thus, it is not surprising that VARI is not a good index to characterize the vegetation state in our study areas. Moreover, the proposed methodology is based on the variations of the indices, which is an important difference from the literature, where usually the index value itself is used, and this can lead to different conclusions.
Table 2. Statistical analysis of the linear fit between the observed and estimated fire probabilities using the validation data for three different indices in Galicia and Asturias. Table notes: (4) Non-systematic Root Mean Square Difference; (5) Mean Absolute Difference.

For validation, the equations previously obtained for each index were applied to the data reserved for validation, and results were compared to the fire data registered during those years. Table 2 presents the results of this adjustment and a statistical analysis [41]. As noted in Figure 2, the best determination coefficients, as well as the lowest estimation errors, were obtained when using the EVI. In this case, the EVI (MOD13) led to the best validation results in both regions. Based on this analysis, it can be concluded that the EVI provided by the MOD13 product is the best indicator of fire risk in Galicia and Asturias. Therefore, using this index the probability of fire occurrence in a cell can be estimated with an error around 10% in Galicia and less than 20% in Asturias. Best results were obtained in Galicia since the high number of fires improves the statistics.
The differences observed between the EVI from the MOD13 product and the EVI from the MOD09 product may be due to the compositing technique used. The MOD13 product uses a compositing technique specific to vegetation indices, so better results are to be expected from this product.

Fire Danger Estimation by Using Logistic Regression
Table 3 shows the coefficients of Equation (1) obtained for each variable, as well as their standard error and the significance obtained by applying the Wald test. This test analyses the efficiency of each variable: a value less than 0.05, obtained for all the variables included in the present analysis, means that the analyzed variable is statistically significant. The corresponding concordance levels obtained with the logistic regression for the training and the validation set are included in Table 4. A 70% overall concordance was obtained. Observed fires were predicted in more than 70% of cases. The predictive ability of the model was assessed with reference to the receiver operating characteristic (ROC) curve, which is a graphical plot of the sensitivity vs. (1 - specificity) for a binary classifier system [24]. The area under the curve (AUC) is the value that indicates the goodness of the model: a value of 0.5 indicates random prediction, and higher values indicate better predictions. In the present study, a value of 0.78 for the AUC was obtained, which indicates a good model.
The goodness of the model was also analyzed with a second method. The method consisted in analyzing some statistical errors [41] from a contingency table, dividing the database into 10 groups on the basis of the predicted fire probability (using percentiles) and comparing the number of events predicted by the model with the number of real events that occurred. By representing the predicted fire probability vs. the observed fire frequency in each of the groups selected (Figure 4), we obtained a linear relationship with a 0.99 determination coefficient. From the statistical errors analyzed, we obtained a negligible BIAS, a RMSE of 0.02, with a systematic component (RMSES) of 0.007 and a non-systematic component (RMSEU) of 0.02, an absolute error (MAD) of 0.02 and an absolute error in percent (MAPD) of 6%.
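The two evaluation measures mentioned above can be sketched in a few lines. This is an illustrative implementation, not the statistical package's own: the AUC is computed through its rank interpretation (the probability that a randomly chosen fire case receives a higher score than a randomly chosen non-fire case, with ties counting one half), and the 0.5 classification cutoff used for the concordance is an assumption, since the cutoff is not stated in the text.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positive and negative cases."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def concordance(scores, labels, cutoff=0.5):
    """Fraction of cases whose predicted class (score >= cutoff) matches the observation."""
    hits = sum((s >= cutoff) == bool(y) for s, y in zip(scores, labels))
    return hits / len(scores)
```

The pairwise form is quadratic in the number of cases, which is acceptable for a sketch but would be replaced by a sort-based computation on a large database.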
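The difference measures quoted above can be computed as in the following sketch. The definitions are the commonly used ones based on a least-squares line P' = aO + b between predicted (P) and observed (O) values, and are stated here as an assumption; in particular, expressing MAPD as MAD relative to the mean observed value is a guess at the definition, and the function name is hypothetical.

```python
import math

def difference_measures(obs, pred):
    """BIAS, RMSE and its systematic / non-systematic parts, MAD and MAPD."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    # Least-squares line pred_hat = a * obs + b
    a = (sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
         / sum((o - mo) ** 2 for o in obs))
    b = mp - a * mo
    hat = [a * o + b for o in obs]
    mad = sum(abs(p - o) for o, p in zip(obs, pred)) / n
    return {
        "BIAS": mp - mo,
        "RMSE": math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / n),
        "RMSEs": math.sqrt(sum((h - o) ** 2 for o, h in zip(obs, hat)) / n),
        "RMSEu": math.sqrt(sum((p - h) ** 2 for p, h in zip(pred, hat)) / n),
        "MAD": mad,
        "MAPD": 100 * mad / mo,  # assumed definition: MAD relative to the mean observation
    }
```

With these definitions the squared RMSE equals the sum of the squared systematic and non-systematic components, which agrees with the reported values: 0.007² + 0.02² is approximately 0.02².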

Fire Risk Levels
Operational systems to predict fire risk work on a graduated scale based on a previous classification. In this paper, we defined four risk levels from the estimated fire probabilities obtained from the logistic regression. The classification of risk levels was chosen so that the probability of fire at different levels does not overlap and so that the percentage of cases in each of the four levels is significant.
Low risk: Probability < 21%
Medium risk: 21% ≤ Probability < 32%
High risk: 32% ≤ Probability < 52%
Extreme risk: Probability ≥ 52%
The observed fire frequency in each level is 10%, 27%, 43% and 66% in the low, medium, high and extreme levels, respectively. This classification allows obtaining fire risk maps every 16 days. Figure 5 is an example of fire risk maps for each region. Note that most fires occur in high risk cells, while in low risk cells there are very few fires.
The fire danger model described in this paper might be affected by several sources of error, ranging from the inaccuracy in the EVI values extracted from the MODIS product to incorrect location coordinates in the fire datasets used to develop the model. The efficiency of the model also depends on human activity, since only natural causes and fire history are accounted for in this work. The combination of the present fire risk model with other models based on different variables, for example on climatic variables, would lead to a more robust index.
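The four-level classification above maps directly onto a small lookup function. The sketch below uses the thresholds given in the text; the function name and the representation of probabilities as fractions between 0 and 1 are assumptions.

```python
def risk_level(probability):
    """Map a predicted fire probability (0 to 1) to one of the four risk levels."""
    if probability < 0.21:
        return "low"
    if probability < 0.32:
        return "medium"
    if probability < 0.52:
        return "high"
    return "extreme"

# Example: classify the predicted probabilities of a few hypothetical cells
print([risk_level(p) for p in (0.05, 0.25, 0.40, 0.70)])
```

Applying the function to every cell of a 16-day period yields the categories needed to draw a risk map.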

Conclusions
Eight different indices were tested as indicators of fire risk in Galicia and Asturias; however, only three of these indices (EVI, SAVI, GEMI) showed a high correlation between their variation and the frequency of fires. The EVI (MOD13) provided the best results in the validation, with relative errors of around 10% in Galicia and less than 20% in Asturias. The combination of EVI variation with other variables, such as fire history in each period of the year, fire history in each cell and the region of study (Galicia or Asturias), by using logistic regression, made it possible to define a fire danger model with a 70% overall concordance between observed and predicted fires. Four levels of fire risk (low, medium, high and extreme) were defined from the probability predicted by the logistic regression. These levels indicate the higher or lower probability of fire ignition in each cell of the grid covering Galicia and Asturias, for fire prevention and suppression purposes.

Figure 1. 16-day cumulative fire events in Galicia and Asturias averaged over 2001-2006.

Figure 2. Linear adjustment between the percentage of fire-affected cells in Galicia and Asturias and the indices variation in the previous two weeks, for the 50% of the total temporal series available and for the indices: (a) EVI (MOD13), (b) EVI (MOD09), (c) GEMI, (d) SAVI.
P_i and O_i are the predicted and observed values, respectively; P_i' = aO_i + b; and <O_i> is the average of the observed values O_i.

Figure 4. Comparison between the probability of fire occurrence predicted by the logistic regression and the fire occurrence observed.

Figure 5. Fire risk maps for Galicia (25 May 2006 to 9 June 2006) and Asturias (26 June 2005 to 11 July 2005). Black points represent the number of fires registered in each cell.

Table 3. Coefficients of the logistic regression.

Table 4. Concordance levels for the logistic regression.