A Linear Regression Model for Live Fuel Moisture Content Estimation during the Fire Season in Shrub Areas of the Province of Valencia in Spain Using Sentinel-2 Remote Sensing Data †

Abstract: Live Fuel Moisture Content (LFMC) describes the amount of water present in any type of vegetation and helps quantify the amount of fuel available in a wildfire. In this paper, a multivariate linear regression model was built to estimate the weighted average LFMC of all shrub-type species present, using the fraction of canopy cover (FCC) of each species as weights. The model was trained with field data obtained during the fire seasons of 2019, 2020 and 2021 in 15 plots of a Mediterranean area where shrub-type vegetation dominates. Different spectral indices extracted from Sentinel-2, together with the mean surface temperature, the accumulated precipitation and seasonal parameters, were considered as predictors. The results were compared with the extrapolation of another model trained with field data collected in 2019 only.


Introduction
Live Fuel Moisture Content (LFMC) is the percentage of water that a plant species contains relative to its dry mass, and it plays a fundamental role in how a wildfire starts and develops [1]. It is directly related to the amount of energy required to evaporate the water before ignition. Therefore, a high moisture content in the vegetation reduces or completely prevents flammability and the consequent spread of fire [2].
Satellite images make spatiotemporal monitoring of the vegetation cover on the Earth's surface possible. These images allow us to analyze the spectral response captured in bands covering different wavelength ranges of the electromagnetic spectrum, which, depending on the type of sensor used, offer a range of possibilities for describing the LFMC [2].
In the work of Costa-Saura et al. [3], an LFMC estimation model was created using data obtained in the fire season of 2019 in 15 shrub plots in the province of Valencia in Spain. Its predictors were a Sentinel-2 spectral index at 10 m spatial resolution, smoothed with the Savitzky-Golay filter, together with a mean surface temperature and a mean wind speed. Later, Arcos et al. [4] showed that higher accuracy was achieved by models that used unsmoothed spectral indices and an accumulated precipitation variable, when using data from two different years (2019 and 2020) in three plots belonging to the study area of [3].
This paper presents an extension of the regression model described in [3], using data obtained from the same 15 plots but over three fire seasons (2019, 2020 and 2021). Our model uses unsmoothed Sentinel-2 spectral indices as in [4] and replaces the average temperature with a static variable that reflects periodic (seasonal) behavior. This variable is introduced together with an accumulated precipitation variable to obtain a model suitable for years with different rainfall regimes.

Study Area
This research was carried out in the Province of Valencia, located in the south-east of the Iberian Peninsula on the Mediterranean coast. The fifteen plots described in [3], where shrub-type species dominate, were considered. Each plot was defined as a circular area of 30 m radius in which the density and type of vegetation were homogeneous and where samples of the existing species were collected.

Field Data
Biweekly field samples for all species in each plot were collected between June and October of the years 2019, 2020 and 2021. The LFMC for each species, date and plot was calculated as the percentage of water contained in the vegetation on a dry weight basis, following Equation (1):

$$\mathrm{LFMC} = \frac{W_f - W_d}{W_d} \times 100 \qquad (1)$$

where $W_f$ is the fresh weight and $W_d$ is the dry weight. To have a single indicator per date for all shrub species in each plot, LFMC_WAS (LFMC weighted average in shrubs) was calculated as the weighted average of the LFMC of all the shrub species existing in the plot, using the fraction of canopy cover (FCC) of each one as weights, according to Equation (2):

$$\mathrm{LFMC}_{\mathrm{WAS}} = \frac{\sum_j \mathrm{FCC}_j \, \mathrm{LFMC}_j}{\sum_j \mathrm{FCC}_j} \qquad (2)$$

where $j$ runs over the shrub species present in the plot.
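As an illustration of the two definitions above, the following minimal Python sketch computes the LFMC of a sample and the FCC-weighted average for one plot. It assumes that Equation (1) has the usual dry-weight form, (W_f - W_d) / W_d x 100, and that Equation (2) is an ordinary weighted mean. The function names and the sample weights are hypothetical and are not taken from the field data.

```python
def lfmc(fresh_weight, dry_weight):
    """LFMC (%) on a dry-weight basis (assumed form of Equation (1))."""
    if dry_weight <= 0:
        raise ValueError("dry weight must be positive")
    return (fresh_weight - dry_weight) / dry_weight * 100.0


def lfmc_was(species):
    """FCC-weighted average LFMC for one plot (assumed form of Equation (2)).

    `species` is a list of (lfmc_percent, fcc) pairs, one per shrub species.
    """
    total_fcc = sum(fcc for _, fcc in species)
    if total_fcc <= 0:
        raise ValueError("total FCC must be positive")
    return sum(value * fcc for value, fcc in species) / total_fcc


# Hypothetical plot with two shrub species (weights in grams, FCC as a fraction).
samples = [
    (lfmc(18.0, 10.0), 0.6),  # species A
    (lfmc(15.0, 10.0), 0.2),  # species B
]
plot_value = lfmc_was(samples)
```

Dividing by the sum of the FCC values means the weights do not need to add up to one, which is convenient when part of the plot is bare soil or covered by non-shrub species.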

Spectral Indices, Meteorological Parameters and Static Variables
Three types of spectral indices were considered [3]: first, indices that respond to the photosynthetic activity of vegetation, e.g., the Normalized Difference Vegetation Index (NDVI), the Visible Atmospherically Resistant Index (VARI) and the Optimized Soil Adjusted Vegetation Index (OSAVI); second, indices related to soil and vegetation water content, e.g., the Normalized Difference Water Index (NDWI) and the Normalized Difference Moisture Index (NDMI); and third, indices related to the greenness of vegetation, e.g., the Transformed Chlorophyll Absorption Index (TCARI) and the Vegetation Index-Green (VIGreen). The ratio TCARI_OSAVI = TCARI/OSAVI was also included [3]. Spectral indices were generated from Sentinel-2 satellite imagery, using Google Earth Engine at a spatial resolution of 20 m, and their value was defined as the mean of the pixels that intersected a circular buffer of 30 m radius centered on each plot. In addition, the average and range of each index over the period studied were calculated for each plot.
The values registered at the meteorological observatories were interpolated using the Meteoland R package to obtain the accumulated precipitation in the previous 15, 30 and 60 days (p15, p30 and p60) and the mean surface temperature in the 7, 15, 30 and 60 days (t7, t15, t30 and t60) prior to each field sample from the years 2019, 2020 and 2021. Other computed variables were the average wind speed, in km/h for 600 s, of the maximum daily wind gusts in the previous 7 and 15 days [3].
The variables sin_DOY and cos_DOY were calculated, respectively, as the sine and cosine of the day of the year [1].
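To make the predictors more concrete, the sketch below shows how a few of them could be computed in Python. Several points are assumptions rather than details given in the text: the index formulas are the commonly used forms of NDVI, NDMI and VARI; the inputs are surface reflectances for the blue, green, red, near-infrared and short-wave infrared bands; the seasonal terms use a 365-day period; and the precipitation window covers the days strictly before the sampling date. The Meteoland interpolation and the Google Earth Engine extraction are not reproduced here, and all function names are hypothetical.

```python
import math
from datetime import date, timedelta


def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b)."""
    return (a - b) / (a + b)


def spectral_indices(blue, green, red, nir, swir1):
    """Commonly used forms of three of the indices (assumed formulas)."""
    return {
        "NDVI": normalized_difference(nir, red),
        "NDMI": normalized_difference(nir, swir1),
        "VARI": (green - red) / (green + red - blue),
    }


def seasonal_terms(day, period=365.0):
    """sin_DOY and cos_DOY, assuming the day of year is mapped onto one cycle."""
    doy = day.timetuple().tm_yday
    angle = 2.0 * math.pi * doy / period
    return math.sin(angle), math.cos(angle)


def accumulated_precipitation(daily_mm, sample_day, window_days):
    """Sum of daily precipitation over the `window_days` days before `sample_day`."""
    start = sample_day - timedelta(days=window_days)
    return sum(mm for day, mm in daily_mm.items() if start <= day < sample_day)


# Hypothetical reflectances and rainfall records for one sampling date.
indices = spectral_indices(blue=0.04, green=0.08, red=0.06, nir=0.30, swir1=0.20)
sample_day = date(2021, 7, 1)
sin_doy, cos_doy = seasonal_terms(sample_day)
daily_mm = {
    sample_day - timedelta(days=10): 5.0,
    sample_day - timedelta(days=40): 3.0,
    sample_day: 7.0,  # rain on the sampling day itself is excluded
}
p15 = accumulated_precipitation(daily_mm, sample_day, 15)
p60 = accumulated_precipitation(daily_mm, sample_day, 60)
```

Using both a sine and a cosine term lets a linear model represent a seasonal cycle with an arbitrary phase, which a single term cannot do.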

Multivariate Linear Regression Models
A multivariate linear regression model (MLR_19_20_21) was calculated using the three years of data and following the steps described in Figure 1. First, a forward stepwise linear regression was applied using the variables described in Section 2.3 as predictors. Second, a maximum of six predictors was selected using the Akaike information criterion. Finally, the Variance Inflation Factor (VIF) of each predictor was calculated to check for multicollinearity among the independent variables of the model. Moreover, the best model of Costa-Saura et al. [3] for the year 2019 (adjusted R² = 0.70) was replicated with our 2019 data (MLR_19) and then extrapolated to the years 2020 and 2021 to compare its fit.
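The selection procedure can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the study: it merges the forward search and the Akaike criterion into a single loop, uses the Gaussian AIC up to an additive constant, solves ordinary least squares through the normal equations, and runs on synthetic data. All names (`solve`, `ols_rss`, `forward_select`, `vif`, `x1` to `x3`) are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_rss(columns, y):
    """Least-squares fit of y on an intercept plus `columns`; returns the RSS."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))


def aic(rss, n, n_params):
    """Gaussian AIC up to an additive constant."""
    return n * math.log(rss / n) + 2 * n_params


def forward_select(data, y, max_predictors=6):
    """Add the predictor that lowers the AIC most until none helps or the cap is hit."""
    n, chosen = len(y), []
    best = aic(ols_rss([], y), n, 1)  # intercept-only model
    while len(chosen) < max_predictors:
        scores = {
            name: aic(ols_rss([data[c] for c in chosen + [name]], y), n, len(chosen) + 2)
            for name in data if name not in chosen
        }
        if not scores:
            break
        name = min(scores, key=scores.get)
        if scores[name] >= best:
            break
        chosen.append(name)
        best = scores[name]
    return chosen


def vif(data, names):
    """VIF_j = 1 / (1 - R^2_j) = TSS_j / RSS_j, regressing predictor j on the others."""
    result = {}
    for name in names:
        target = data[name]
        others = [data[o] for o in names if o != name]
        mean_t = sum(target) / len(target)
        tss = sum((v - mean_t) ** 2 for v in target)
        result[name] = tss / ols_rss(others, target)
    return result


# Synthetic example: y depends on x1 and x2; x3 is pure noise.
random.seed(0)
n_obs = 60
data = {name: [random.gauss(0.0, 1.0) for _ in range(n_obs)] for name in ("x1", "x2", "x3")}
y = [100.0 + 20.0 * a - 10.0 * b + random.gauss(0.0, 1.0)
     for a, b in zip(data["x1"], data["x2"])]
selected = forward_select(data, y, max_predictors=6)
vifs = vif(data, selected)
```

In practice a statistics package would normally replace the hand-written solver; it is written out here only to keep the sketch free of dependencies.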

Results and Discussion
Table 1 shows the selected variables and fitted coefficients of the MLR_19_20_21 and MLR_19 models. The spectral indices used in the first regression model were VARI and TCARI_OSAVI, together with two per-plot statistics of spectral indices (mean_VARI and range_NDVI) calculated using all spectral data in the fire seasons of the three years. A static seasonal parameter (sin_DOY) and an accumulated precipitation variable (p60) also helped to describe the LFMC_WAS. Furthermore, in the replicated model, the fit and coefficients achieved were very similar to those of the original [3]. All predictors were statistically significant, with VIF values below 5, indicating no problematic multicollinearity. The adjusted R² was smaller in the model using the data from all three years because of the inter-annual variability of LFMC_WAS. This also makes the RMSE and MAE higher in the MLR_19_20_21 model, although both remain below 12.
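For reference, the three goodness-of-fit statistics mentioned above can be computed as in the following Python sketch. It uses the standard definitions, with the RMSE taken as the square root of the mean squared residual (dividing by n); whether the reported values used this or a degrees-of-freedom correction is not stated, so this is an assumption. The observed and predicted values are made up for illustration and do not come from Table 1.

```python
import math


def fit_metrics(observed, predicted, n_predictors):
    """Adjusted R^2, RMSE and MAE for a fitted linear model."""
    n = len(observed)
    mean_obs = sum(observed) / n
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - rss / tss
    return {
        "r2_adj": 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1),
        "rmse": math.sqrt(rss / n),
        "mae": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
    }


# Hypothetical LFMC values (%) for six sampling dates.
observed = [90.0, 80.0, 70.0, 60.0, 75.0, 85.0]
predicted = [88.0, 83.0, 66.0, 63.0, 75.0, 86.0]
metrics = fit_metrics(observed, predicted, n_predictors=2)
```

Because the adjusted R² penalizes the number of predictors, it allows a fairer comparison between models with different numbers of variables than the plain R².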

Figure 2 shows the LFMC_WAS values obtained in plot number 2 described in [3], which coincides with plot G5 in [4]. Model MLR_19 achieved higher accuracy for the year 2019, but when it was extrapolated to the other years it lost accuracy because of inter-annual differences, especially in the rainfall regime, in line with the findings of Arcos et al. [4]. The MLR_19_20_21 model estimated values closer to the field values for the years 2020 and 2021, when the minimum values were reached. In addition, the values estimated by MLR_19 at the beginning and at the end of the fire season during 2020 and 2021 were higher than those obtained in the field. Figure 2 also shows the p60 values on the field sampling dates, which indicate the importance of the precipitation variable in the MLR_19_20_21 model, since this model fitted the minimum and maximum values better and followed the trend of the field LFMC_WAS more closely.

Conclusions
This work verified that changes in the precipitation pattern cause LFMC_WAS forecast errors when using a regression model obtained with data from previous years. The model obtained with data collected in the fire seasons of three consecutive years selected as predictors an accumulated precipitation variable, together with a seasonal parameter that depends on the day of the year, instead of other available meteorological variables such as the average temperature or the wind speed. To describe the spatial differences in LFMC_WAS between plots, temporal averages of the spectral indices in each plot had to be used together with their range (the difference between the maximum and the minimum), calculated using all available dates in our study period, so that these statistics varied only spatially. In addition, vegetation indices predominated over water content spectral indices among the selected predictors.

Figure 1. Multivariate linear regression process used to estimate the LFMC.

Figure 2. LFMC_WAS values estimated on the field sampling dates in plot number 2, described in [3] (G5 in [4]), using the MLR_19_20_21 and MLR_19 models, the coefficients of which are given in Table 1. Field LFMC corresponds to the measured data used to calibrate the regression models, and p60 corresponds to the accumulated precipitation in the last 60 days.

Table 1. Multivariate linear regression models for LFMC_WAS, defined by Equation (2), in shrubs. The columns give the name of the model and its variables, the coefficients, the p-value of each coefficient, the variance inflation factor (VIF), the adjusted R², the root mean square error (RMSE) and the mean absolute error (MAE).
* Only data from the year 2019 were used to replicate the model and its statistics.