Characterizing Live Fuel Moisture Content from Active and Passive Sensors in a Mediterranean Environment

: Live fuel moisture content (LFMC) inﬂuences many ﬁre-related aspects, including ﬂamma-bility, ignition, and combustion. In addition, ﬁre spread models are highly sensitive to LFMC values. Despite its importance, LFMC estimation is still elusive due to its dependence on plant species traits, local conditions, and weather patterns. Although LFMC mapping from active synthetic aperture radar has increased over the past years, their utility for LFMC estimation needs further analysis to include additional areas characterized by different vegetation species and ﬁre regimes. This study extended the current knowledge using medium spatial resolution (20 m) time series acquired by active (Sentinel-1) and passive (Sentinel-2) sensors. Our results show that optical-based LFMC estimation may achieve acceptable accuracy (R 2 = 0.55, MAE = 15.1%, RMSE = 19.7%) at moderate (20 m) spatial resolution. When ancillary information (e.g., vegetation cover) was added, LFMC estimation improved (R 2 = 0.63, MAE = 13.4%). Contrary to other studies, incorporating Sentinel-1 radar data did not provide for improved LFMC estimates, while the use of SAR data alone resulted in increased estimation errors (R 2 = 0.28, MAE = 19%, RMSE = 25%). For increased ﬁre risk scenarios (LFMC < 120%), estimation errors improved (MAE = 9.1%, RMSE = 11.8%), suggesting that direct LFMC retrieval from satellite data may be achieved with high temporal and spatial detail.


Introduction
Fire danger, an important component of fire management systems, largely depends on meteorological variables and fuel conditions, as the topography is invariable over time [1,2]. Fuel condition is a critical parameter as it influences flammability, ignition, combustion, and fire spread [3][4][5]. While dead fuel condition is driven by weather patterns (e.g., heat, dryness, wind, rain), the live fuel moisture content (LFMC) depends on plant species traits [6,7]. LMFC spatial variation affects fire occurrence, intensity, and spread [8][9][10][11], with inverse relationships between ignition probability and live fuel moisture content being demonstrated in semi-arid environments [3,12]. In addition, reduced vegetation moisture is related to increasing large fire intensity and occurrence [13,14]. As similar atmospheric conditions can result in differentiated effects due to the physiological characteristics of individual trees or species-related resistance to drought, LFMC estimation based on meteorological information alone may be affected by errors [9,15] or produce results at the coarse spatial resolutions (e.g., 2 km) of the gridded weather data [16]. Therefore, remote sensing technologies [17][18][19][20][21][22] were increasingly used to understand LFMC spatio-temporal dynamics at improved spatial resolutions [23].
Until recently, remote sensing-based LFMC retrieval relied on information acquired within the visible and infrared spectrum (optical sensors) at different time lags, days to weeks, and at spatial resolutions varying from meter to kilometers [17,24,25]. Such studies used empirical relationships to link in situ measurements with surface reflectance or spectral indices [17,18,21,25,26] as well as physically based radiative transfer models [17,20,27]. Such approaches take advantage of the direct effects of tissue water content on near and shortwave infrared (NIR and SWIR) spectral reflectance absorption. However, changes in leaf structure and pigment concentrations may obscure such relationships [28]. In addition, LFMC estimation based on optical sensors may be affected by persistent cloud cover [29], decoupled reflectance values from dry matter content, or variations related to canopy properties [28,30]. The performance of optical-based LFMC retrieval varies with the vegetation type, with lower (10%) root mean squared estimation errors being observed for forests and shrublands when compared to grasslands (~30%) [17,20,22,25,30,31]. For more detailed information on LFMC and its retrieval from remote sensing data, the reader is referred to [28].
To reduce LFMC estimates uncertainties some studies focused on the use of the microwave region of the spectrum taking advantage of passive [32,33] or active microwave [19,29,34] satellite sensors. Past studies showed that vegetation dielectric properties and liquid water content are correlated strongly with water not bound to the vegetation material having a significant influence on the radar signal [35]. Such influence can be used to monitor plant water diurnal variations or water stress [36][37][38]. Indeed, the use of passive microwave improved LFMC estimation error to about 20% using indirect methods based on the time-lagged correlation of LFMC with soil moisture or vegetation optical depth derived from passive radiometers. However, as the spatial resolution of passive radiometers is coarse (e.g., 36 km for Soil Moisture Active Passive mission, SMAP), the derived products are available with low spacings (e.g., 9 km for the SMAP L4 soil moisture product). Therefore, the LFMC estimates derived through such approaches may not provide adequate spatial detail for some components of the fire management systems (e.g., fire spread simulation). Active microwaves sensors such as synthetic aperture radar (SAR) have been used to estimate dead fuel moisture [39,40] as well as vegetation water content [41,42], but until recently, only two studies were available on their use for LFMC retrieval [19,29]. Although these studies suggested estimation errors ranging between 10% and 15% in Canadian and Australian forests, the LFMC monitoring from SAR data was hindered by the low temporal frequency of older SAR satellites or the use of airborne sensors. However, with the launch of the Sentinel-1 mission in 2014, such limitations were removed as it provides for both the spatial resolution and the temporal frequency needed for operational LFMC monitoring. Indeed, a recent study carried out over dry shrubland-dominated sites suggested that using active microwave data and semi-empirical modeling may provide improved LFMC estimates when compared to using spectral reflectance [34]. Further, other authors suggested that combining optical and active microwave sensors within a deep learning framework may enhance modeling performance over the diverse ecological conditions in the western US [23], with the LFMC root mean squared error (RMSE) decreasing from 32% when using optical data alone, to 25% when using both sensor types.
As fire spread models are highly sensitive to LFMC [5], with over 1000% difference in fire rate of spread being induced by only a 10% difference in LFMC estimates [43], the utility of LFMC estimates for fire management systems ultimately depends on their accuracy which in turn may vary with vegetation type, sensor, and modeling approach. SAR-based LFMC estimation is still scarce, with all available studies being limited to short time spans (one or two fire seasons) and few forest stands spread over relatively small areas [19,29,34]. In addition, although optical and microwave sensors may provide complementary information to estimate LFMC, the joint use of active and passive sensors is limited to one study [23].
This study extended the work in [23] to the Mediterranean basin. More important, we derived LFMC estimates at a spatial resolution an order of magnitude higher (20 m vs. 250 m) while also testing the unconventional "handcrafted" variables (i.e., ratios of optical and radar data) proposed in [23] using comparable modeling approaches. We assessed the utility of Sentinel-1 (S1) and Sentinel-2 (S2) sensors individually, as well as their joint use, for LFMC estimation. The influence of temporal differences between satellite data acquisition and in situ measurements was appraised together with the importance of using static variables (e.g., canopy height, vegetation fractional cover, slope, orientation) as LFMC predictors. The study considered LFMC estimation over both the entire year as well as under high fire risk scenarios (LFMC < 120%) associated with increased flammability and fire spread [11]. LFMC modeling was based on artificial intelligence algorithms [44] as such non-linear models may capture parametric relationships without assuming an a priori analytical form. Such approaches allow for understanding the relationships between the dependent (LFMC) and the independent remote sensed based on the data [23]. This is the only second study addressing LFMC retrieval using active and passive sensors and long-term datasets.

Study Area
The study was carried out in the Madrid Region, which extends over 8030 km 2 in the center of the Iberian Peninsula ( Figure 1) at elevations between 400 m above sea level in the south and 2400 m in the north. The relief changes from planes in the Tagus valley in the south to mountains in the north which results in a significant variation in ecological conditions. According to the data from the Spanish state meteorological agency (AEMET), the climate is Mediterranean, with hot summers and an average annual rainfall of 450 mm, which occurs mainly in spring and autumn with an important gradient from the Tagus valley (<500 mm year −1 ) towards the mountains (1500 mm year −1 ). The average complementary information to estimate LFMC, the joint use of active and passive sensors is limited to one study [23]. This study extended the work in [23] to the Mediterranean basin. More important, we derived LFMC estimates at a spatial resolution an order of magnitude higher (20 m vs. 250 m) while also testing the unconventional "handcrafted" variables (i.e., ratios of optical and radar data) proposed in [23] using comparable modeling approaches. We assessed the utility of Sentinel-1 (S1) and Sentinel-2 (S2) sensors individually, as well as their joint use, for LFMC estimation. The influence of temporal differences between satellite data acquisition and in situ measurements was appraised together with the importance of using static variables (e.g., canopy height, vegetation fractional cover, slope, orientation) as LFMC predictors. The study considered LFMC estimation over both the entire year as well as under high fire risk scenarios (LFMC < 120%) associated with increased flammability and fire spread [11]. LFMC modeling was based on artificial intelligence algorithms [44] as such non-linear models may capture parametric relationships without assuming an a priori analytical form. Such approaches allow for understanding the relationships between the dependent (LFMC) and the independent remote sensed based on the data [23]. This is the only second study addressing LFMC retrieval using active and passive sensors and long-term datasets.

Study Area
The study was carried out in the Madrid Region, which extends over 8030 km 2 in the center of the Iberian Peninsula ( Figure 1) at elevations between 400 m above sea level in the south and 2400 m in the north. The relief changes from planes in the Tagus valley in the south to mountains in the north which results in a significant variation in ecological conditions. According to the data from the Spanish state meteorological agency (AEMET), the climate is Mediterranean, with hot summers and an average annual rainfall of 450 mm, which occurs mainly in spring and autumn with an important gradient from the Tagus valley (<500 mm year −1 ) towards the mountains (1500 mm year −1 ). The average   monthly temperature ranges between 1 • C in winter and 32 • C in summer, with a gradient from south (hotter) to north (colder). The area covered by natural vegetation is about 0.44 M ha corresponding to 55% of the Madrid region. The main tree species are oaks (i.e., Quercus ilex L., Q. pyrenaica W.) and pines (i.e., Pinus halepensis M., P. sylvestris L., P. pinea, P. pinaster A.), which account for 26% and 11.5%, respectively, of the natural vegetation cover. An additional 12% of the natural vegetation area consists of mixed forest species. Grasslands and riparian vegetation represent 16.5% and 2%, respectively, with the remaining areas (33%) being covered by shrublands (i.e., Cistus ladanifer L., Q. coccifera L., Retama sphaerocarpa L., Thymus sp., Erica sp. Macrochloa tenacissima L.).

Live Fuel Moisture Content Sampling
The field data consisted of LFMC samples for the main tree (oaks and pines) and rockrose (Cistus sp.) shrub species from 16 plots grouped around six locations IDs (LIDs 1 to 6 in Figure 1). Field samples have been collected since April 2016 with a variable temporal frequency: every two weeks during winter, every week during spring and autumn, and twice a week during summer. Each sample consists of 50-100 g of fine live fuels, including leaves and branches smaller than 6 mm. Random linear transects were established in shrubland areas to avoid re-sampling the same plants. Samples were collected from at least five individuals along the transect. For tree species, five random individuals were sampled at each date to account for intra-species variability. Samples were collected around noon, placed in individual airtight plastic containers, and transported to the laboratory before 3:00 PM to prevent moisture loss. Fresh samples were weighed and oven dried at 100 • C for 24 h. Subsequently, each sample's dry weight was measured to estimate the moisture content as the percentage of water contained in the vegetation with respect to the dry weight. A total of 4109 field samples were available for this study covering the period from April 2016 to December 2020. At each location, the available samples depended on the present species (Table 1). A manual screening was carried out to exclude anomalous samples caused by incorrect handling (e.g., container incorrectly closed, water in the container) that resulted in anomalous LFMC values or errors during field data collection (e.g., samples that included flowers, fruits or woody material, samples collected on rainy days, samples too small). In total, 2962 samples were retained for analysis after data screening.

Remote Sensing Data
Images acquired by the Sentinel-2 Multi-Spectral Instrument (MSI) and the Sentinel-1 C-band SAR sensors were used in this study. Six hundred and thirteen Sentinel-2 granules for the Military Grid Reference System (MGRS) tiles 30TUK, 30TVK and 30TVL were downloaded as atmospherically corrected, topographically normalized surface reflectance images from Theia Data and Services Centre, a French public center that facilitates the use of images from spatial observations. The downloaded data covered the period between April 2016 and December 2020. Only data with a cloud cover below 90% were downloaded. Theia data are derived from the original L1C level imagery through the multi-sensor atmospheric correction and cloud screening (MACSS). MACCS detects clouds and shadows, estimates aerosol optical thickness and water vapor, and corrects for atmospheric effects. MACCS is currently integrated within the MACCS-ATCOR joint algorithm (MAJA), open-source software that takes advantage of multi-temporal methods for optical image corrections [45][46][47]. Surface reflectance data from the 13 visible and infrared bands of the MSI was used to compute 21 spectral indices (SI, Table 2) sensitive to vegetation properties and live fuel moisture content [17,20,21,[23][24][25]30].
In total, 3550 ground range detected (GRD) images (10 m pixel spacing) acquired by the C-band Sentinel-1 satellites in interferometric wide (IW) mode were downloaded from the Copernicus Open Access Hub repository and processed through the s1tiling tool that uses SAR specific algorithms included with the Orfeo ToolBox (OTB), an open-source software developed and maintained by the Centre National D'Etudes Spatiales (CNES), France [48]. Sentinel-1 processing involves thermal noise removal, radiometric calibration to gamma naught, orthorectification, and temporal filtering [49]. Thermal noise removal and radiometric calibration were based on product metadata, while the orthorectification process, carried out at 20 m pixel spacing to match the Sentinel-2 data, used the Shuttle Radar Topographic Mission (SRTM) digital elevation model (DEM) at 30 m spacing. The resulting images, providing the backscatter coefficient for the VV or VH polarizations, are subsequently tiled and registered to the Sentinel-2 images as the same MGRS geographic reference is used for orthorectification. Further, the radar SPAN (VV + VH) and polarization ratio (VV/VH) were computed. Out of all Sentinel-1 relative orbits intersecting the three tiles of interest (30TUK, 30TVK, and 30TVL), only the relative orbits 1 ascending (1A) and 81 descending (81D) were used as these orbits covered the entire area of interest.

Matching LFMC Samples with Ground Areas
As LFMC field sampling was not designed for use with satellite imagery, the field crew collected samples close to specific xy coordinates (i.e., location ID) without spatially explicit identification of each sampled tree or shrubland transect. Therefore, the LFMC samples needed to be matched to ground areas (plots) covered by the sampled species. Such matching was achieved based on fieldwork and high-resolution (0.5 m) orthophotographs of the Spanish National Program of Aerial Photogrammetry (PNOA). Plots of homogeneous vegetation conditions (pure or mixed species) of at least 800 m 2 (two satellite pixels) were delineated on the ground with a handheld GPS in close vicinity (<250 m) to the location ID coordinates and maintained elevation, slope, and aspect. LFMC samples were subsequently matched with the plots identified at each location. In addition, for mixed vegetation plots, the corresponding LFMC values were averaged using as weight the proportion of fractional cover for each species as estimated during the fieldwork. Matching the field samples with ground areas of homogeneous vegetation resulted in 16 plots totaling 2962 LFMC estimates (Table 3). To avoid border effects and location uncertainties, the 20 m Sentinel-2 satellite imagery grid was used to select full pixels well within the GPS-delineated homogeneous area ( Figure 1). The plots were subsequently used to extract information on the surface reflectance and backscatter coefficient from the optical and the radar images, respectively, acquired at the closest date for a given maximum time difference when compared to the field sampling date.

Exploratory Analysis and LFMC Estimation
An exploratory analysis was carried out to identify LFMC annual trends for each sampled species, evaluate LFMC variability between locations and years, and assess the relationships between satellite data and field sampled LFMC in different configurations (i.e., plot level, species level, yearly samples vs. all samples). The exploratory analysis also compared the use of SI against relative SI values, as well as the influence of the allowed time difference between field sampling and satellite data acquisition. The relative values were computed as in Equations (1) and (2). The allowed temporal range was ±3 days and ±6 days. The range allows for one clouded image over a 5-day period for the Sentinel-2 satellite and one missed acquisition over a 6-day period for the Sentinel-1 satellite. It should be noted that while haze, clouds, and cloud shadows may often obscure the landscape in the case of optical data, missed data acquisitions are uncommon for Sentinel-1 and may arise due to changes in the basic operation scenario during emergencies (e.g., natural disasters), orbiting maneuvers, or sensor malfunction. Within the exploratory analysis, we also evaluated the influence of Sentinel-1 processing (temporally filtered vs. unfiltered data), the satellite-looking geometry (ascending vs. descending passes), and the use of the so-called handcrafted variables, "ratios of microwave and optical data", suggested in [23].
where r is relative, SI is the spectral index, LFMC is the live fuel moisture content, and min and max values are relative to the plot/species combination. LFMC estimation was carried out considering the results of the exploratory analysis using random forests (RF) regression, a non-parametric modeling approach [44], found to minimize LFMC estimation errors compared to other machine learning approaches [50]. As non-parametric models have no assumptions regarding the statistical properties of the data and offer the opportunity to include non-linearly related variables, they are often used when enough samples are available for model calibration. RFs use ensemble learning to improve the predictive power by aggregating predictions from constituent sub-models (i.e., trees). Each tree is built using a deterministic algorithm by selecting a random set of variables and a random sample from the training dataset [44].
The LMFC estimation was based on both remote sensing (i.e., optic, radar) and static (St) variables. The use of additional static variables, i.e., vegetation height and fractional cover, species, elevation, slope, eastness (i.e., sine of aspect), and northness (i.e., cosine of aspect), was also evaluated to ascertain the opportunity for LFMC retrieval improvements. Each sensor was individually tested to generate a reference baseline and allow for crosssensor evaluation. For parsimony, the models were trained using a subset of predictor variables as evaluated within the exploratory analysis. The predictor variables were selected by clustering highly correlated variables (|r| > 0.7) using both correlograms and principal component analysis (PCA) and retaining only the most important variable in each cluster. Variable importance was evaluated through the increase in the mean squared error (MSE) when the variable was removed from the predictor pool during RF modeling. To better understand predictor variables' role for relevant fire risk scenarios, lower values (LFMC < 120%) were modeled independently. LFMC below 120% is associated with increased fire occurrence in Mediterranean vegetation [9,11].
Following the preliminary analysis (see Section 3.1), LMFC estimation (model fitting) was based on the absolute LFMC values (i.e., as opposed to using relative LMFC values) with a maximum difference of ±6 days. In the case of the Sentinel-1 SAR data, the temporally filtered backscatter coefficient was used. All models were fitted with a minimum common dataset (1486 samples: 516 oak, 520 rockrose, 450 pine) using 1250 decision trees. The proportion of samples used for training was 60%, with the remaining samples being used for out-of-bag (OOB) validation. The tested models were split into seven sets depending on the included predictors: all groups of variables, only remote sensing variables (S1, S2, and S1 + S2), only static variables, and combinations of remote sensing and static variables (i.e., S1 + St, S2 + St). Models were fitted to three different datasets, namely the entire set, by vegetation type (i.e., oak, pine, and rockrose) subsets as well as to a subset corresponding to potentially higher fire risk (LFMC < 120%).
Model assessment (goodness of fit) was based on metrics computed based on the OOB set and included the correlation between actual and predicted values (r), the explained variance, the mean absolute error (MAE, 3), and the bias (4). In addition to the OOB-based validation, an ex-situ cross-validation was carried out by training the models using data from all but one location and using the remaining location for validation.
where P is the predicted values, O is the in situ observed values, and n is the number of samples, and O and P are the mean values.

Exploratory Analysis
There were statistically significant differences between the LFMC values depending on the sampled species and location, with mean LFMC values being highest for pines and lowest for oaks ( Figure 2). Across locations, mean LFMC values varied for the same vegetation types, with samples from location ID (LID) 1 and 6 showing higher mean values when compared to the remaining ones. Over time, rockrose (Cistus sp.) showed the highest variability, although oak trees showed a similar temporal pattern (highs and lows) albeit with a lower amplitude, particularly during dry years ( Figure 3). Pine trees showed higher LFMC in summer, while the temporal trend was the opposite (peaks instead of lows) when compared to rockrose and oaks. Pine samples from lower elevations (LID 3) showed smaller LFMC values, while samples from pines located on northern aspects and higher elevations (LIDs 1, 6) were characterized by higher LFMC values. Notice that such patterns were not observed for oak and rockrose samples. Lastly, the forest structural parameters influenced LFMC (pines and oak stands), with increased values being related to increased canopy height and cover.
Sentinel-2 SIs were highly correlated (r > 0.7) with the main group associated with the first PCA axis and a remaining few SIs (e.g., VIgreen, TCARI, ARVI, TC_B, RVI, MSI) orthogonal to the rest (see Figure S1 in the Supplementary Materials). Similarly, the Sentinel-1 polarizations (VV, VH), the SPAN (VV + VH), and the polarization's ratio (VV/VH) were also correlated and covaried in the PCA analysis, suggesting three main groups related to (1) the ascending (asc) satellite pass, (2) the descending (des) satellite pass, and (3) the polarization ratio for each pass ( Figure S2). Filtered and unfiltered Sentinel-1 data were highly correlated (r > 0.75), with mean values and dispersion being similar. A comparison of RF models based on Sentinel-2 SIs showed very similar results regardless where P is the predicted values, O is the in situ observed values, and n is the number of samples, and ̃ and ̃ are the mean values.

Exploratory Analysis
There were statistically significant differences between the LFMC values depending on the sampled species and location, with mean LFMC values being highest for pines and lowest for oaks ( Figure 2). Across locations, mean LFMC values varied for the same vegetation types, with samples from location ID (LID) 1 and 6 showing higher mean values when compared to the remaining ones. Over time, rockrose (Cistus sp.) showed the highest variability, although oak trees showed a similar temporal pattern (highs and lows) albeit with a lower amplitude, particularly during dry years ( Figure 3). Pine trees showed higher LFMC in summer, while the temporal trend was the opposite (peaks instead of lows) when compared to rockrose and oaks. Pine samples from lower elevations (LID 3) showed smaller LFMC values, while samples from pines located on northern aspects and higher elevations (LIDs 1, 6) were characterized by higher LFMC values. Notice that such patterns were not observed for oak and rockrose samples. Lastly, the forest structural parameters influenced LFMC (pines and oak stands), with increased values being related to increased canopy height and cover.
Sentinel-2 SIs were highly correlated (r > 0.7) with the main group associated with the first PCA axis and a remaining few SIs (e.g., VIgreen, TCARI, ARVI, TC_B, RVI, MSI) orthogonal to the rest (see Figure S1 in the Supplementary Materials). Similarly, the Sentinel-1 polarizations (VV, VH), the SPAN (VV + VH), and the polarization`s ratio (VV/VH) were also correlated and covaried in the PCA analysis, suggesting three main groups related to (1) the ascending (asc) satellite pass, (2) the descending (des) satellite pass, and (3) the polarization ratio for each pass ( Figure S2). Filtered and unfiltered Sentinel-1 data were highly correlated (r > 0.75), with mean values and dispersion being similar. A comparison of RF models based on Sentinel-2 SIs showed very similar results regardless .   of the time difference between in situ and remote sensing data. Marginal improvements in the explained variance (62% vs. 61%) were observed when using the longer (6-days) time difference. One should note that data pairs with time differences above six days were limited (<2%) and mostly occurred for Sentinel-2 during winter/spring, due to increasing clouds, when the accuracy of LFMC prediction is less relevant. Removing such data from the analysis would have little impact on the modeling outcomes.
The use of relative LFMC as a dependent variable rendered lower prediction accuracies when compared to using absolute values (62% vs. 45%). Using either absolute or relative satellite variables as predictors did not render differences in model fits for Sentinel-1 (16% vs. 16%) nor Sentinel-2 (54% vs. 56%) data but including both the relative and the absolute satellite variables increased the explained variance by approximately 10% (R 2 = 28% Sentinel-1, R 2 = 63% Sentinel-2). Therefore, all subsequent analyses used a time difference of six days, absolute LFMC values, temporally filtered SAR data, and both absolute and relative satellite variables as predictors.

Predictor Variables
The evaluation of the static predictor variables showed that FCC and CH were highly correlated (r = 0.99), with the latter being retained as it had a higher importance in RF modeling ( Figure S3). The most important radar variables for LFMC estimation were the VH polarization and the SPAN acquired during both ascending and descending passes ( Figure S4). After removing the highly correlated radar variables, nine were kept for modeling (Table 4). Among the Sentinel-2 SIs, the most important variable was the VARI. In general, the relative version of the SIs had lower importance except for the relative version of the EVI and VARI ( Figure S5). A total of 11 variables were retained for RF modeling after removing the highly correlated ones in each PCA cluster.  of the time difference between in situ and remote sensing data. Marginal improvements in the explained variance (62% vs. 61%) were observed when using the longer (6-days) time difference. One should note that data pairs with time differences above six days were limited (<2%) and mostly occurred for Sentinel-2 during winter/spring, due to increasing clouds, when the accuracy of LFMC prediction is less relevant. Removing such data from the analysis would have little impact on the modeling outcomes.
The use of relative LFMC as a dependent variable rendered lower prediction accuracies when compared to using absolute values (62% vs. 45%). Using either absolute or relative satellite variables as predictors did not render differences in model fits for Sentinel-1 (16% vs. 16%) nor Sentinel-2 (54% vs. 56%) data but including both the relative and the absolute satellite variables increased the explained variance by approximately 10% (R 2 = 28% Sentinel-1, R 2 = 63% Sentinel-2). Therefore, all subsequent analyses used a time difference of six days, absolute LFMC values, temporally filtered SAR data, and both absolute and relative satellite variables as predictors.

Predictor Variables
The evaluation of the static predictor variables showed that FCC and CH were highly correlated (r = 0.99), with the latter being retained as it had a higher importance in RF modeling ( Figure S3). The most important radar variables for LFMC estimation were the VH polarization and the SPAN acquired during both ascending and descending passes ( Figure S4). After removing the highly correlated radar variables, nine were kept for modeling (Table 4). Among the Sentinel-2 SIs, the most important variable was the VARI. In general, the relative version of the SIs had lower importance except for the relative version of the EVI and VARI ( Figure S5). A total of 11 variables were retained for RF modeling after removing the highly correlated ones in each PCA cluster.

LFMC Estimation
The RF model based on predictor variables from all groups (i.e., optic, radar, static) and using all data (full model) performed similarly (Table 5) when compared to the model containing only the optic and static (S2 + St) variables (r = 0.79 and MAE = 13.3%). By type of predictor variables, the most accurate LFMC estimates were obtained when using the optic variables (S2 model), with the r values decreasing slightly (0.74) and MAE values slightly increasing (15%) when compared to the Full model. Radar-derived LFMC (S1 model) showed higher errors, with MAE reaching 19.0% and r decreasing to 0.53. For the St model, canopy height was the most important variable when the entire dataset (all locations, all samples) was considered. The most important optic predictor variable (S2 model) was, by a significant margin, the VARI followed by the relative version of the EVI and the RVI2 (Figure 4a). The VH polarization and the SPAN from the ascending pass were the most important radar variables when radar metrics alone were used for LFMC Table 5. LFMC retrieval error as a function of the predictor variables. S1 stand for Sentinel-1 variables, S2 for Sentinel-2 variables, and St for static variables. In bold are the most accurate estimates.

LFMC Estimation
The RF model based on predictor variables from all groups (i.e., optic, radar, static) and using all data (full model) performed similarly (Table 5) when compared to the model containing only the optic and static (S2 + St) variables (r = 0.79 and MAE = 13.3%). By type of predictor variables, the most accurate LFMC estimates were obtained when using the optic variables (S2 model), with the r values decreasing slightly (0.74) and MAE values slightly increasing (15%) when compared to the Full model. Radar-derived LFMC (S1 model) showed higher errors, with MAE reaching 19.0% and r decreasing to 0.53. For the St model, canopy height was the most important variable when the entire dataset (all locations, all samples) was considered. The most important optic predictor variable (S2 model) was, by a significant margin, the VARI followed by the relative version of the EVI and the RVI2 (Figure 4a). The VH polarization and the SPAN from the ascending pass were the most important radar variables when radar metrics alone were used for LFMC Table 5. LFMC retrieval error as a function of the predictor variables. S1 stand for Sentinel-1 variables, S2 for Sentinel-2 variables, and St for static variables. In bold are the most accurate estimates.

Data Set
Metric S1 S2 St S1 + S2 S1 + St S2 + St Full  SPANasc_r S1 S2 St S1+St S2+St S1+S2 estimation (S1 model). In general, the full model showed that LFMC was overestimated for values below 75% and underestimated for values above 150%, with the highest variability of residual errors being observed for the rockrose. The residual error was homogeneous throughout the year for all species ( Figure S6).
When models were trained using data from all but one location and tested on the excluded location (ex situ), the average performance decreased ( Figure 5) for all locations except LID1. For the full model, r decreased by 0.24 points (from 0.79 to a minimum of 0.49), while MAE increased by 12% (to a maximum of 25%). On average, the use of radar (S1 model) or optical (S2 model) data made little difference in the LFMC retrieval error across locations (2.7%) except for LID5, where the use of SIs resulted in almost 10% MAE improvement. The LFMC difference between the most and the least accurate models at each location was, on average, 6.1%, with the highest difference (~9%) being observed for LID5 and the lowest (~3%) for LID 2 and LID3. The variables' importance varied slightly across locations, although the most important ones (VARI, EVI_r, and CH) remained unchanged (Figure 6b).
Generating individual models for each vegetation type resulted in similar trends across species (10% < MAE < 27%), with the most accurate models (10% < MAE < 12%) being observed for oak and pine. Compared to the model based on the entire dataset (MAE = 13.4%), by species MAE increased for rockrose (16.6%) and decreased for oak (11.2%) and pine (11.9%). For species-specific models, the most important variables for LFMC estimation differed, with the relative EVI being the most important one for rockrose, the VARI for oak, and the static ones (CH, elevation, northness) for pines (Figure 6a). estimation (S1 model). In general, the full model showed that LFMC was overestimated for values below 75% and underestimated for values above 150%, with the highest variability of residual errors being observed for the rockrose. The residual error was homogeneous throughout the year for all species ( Figure S6).
When models were trained using data from all but one location and tested on the excluded location (ex situ), the average performance decreased ( Figure 5) for all locations except LID1. For the full model, r decreased by 0.24 points (from 0.79 to a minimum of 0.49), while MAE increased by 12% (to a maximum of 25%). On average, the use of radar (S1 model) or optical (S2 model) data made little difference in the LFMC retrieval error across locations (2.7%) except for LID5, where the use of SIs resulted in almost 10% MAE improvement. The LFMC difference between the most and the least accurate models at each location was, on average, 6.1%, with the highest difference (~9%) being observed for LID5 and the lowest (~3%) for LID 2 and LID3. The variables' importance varied slightly across locations, although the most important ones (VARI, EVI_r, and CH) remained unchanged (Figure 6b).
Generating individual models for each vegetation type resulted in similar trends across species (10% < MAE < 27%), with the most accurate models (10% < MAE < 12%) being observed for oak and pine. Compared to the model based on the entire dataset (MAE = 13.4%), by species MAE increased for rockrose (16.6%) and decreased for oak (11.2%) and pine (11.9%). For species-specific models, the most important variables for LFMC estimation differed, with the relative EVI being the most important one for rockrose, the VARI for oak, and the static ones (CH, elevation, northness) for pines (Figure 6a).  estimation (S1 model). In general, the full model showed that LFMC was overestimated for values below 75% and underestimated for values above 150%, with the highest variability of residual errors being observed for the rockrose. The residual error was homogeneous throughout the year for all species ( Figure S6). When models were trained using data from all but one location and tested on the excluded location (ex situ), the average performance decreased ( Figure 5) for all locations except LID1. For the full model, r decreased by 0.24 points (from 0.79 to a minimum of 0.49), while MAE increased by 12% (to a maximum of 25%). On average, the use of radar (S1 model) or optical (S2 model) data made little difference in the LFMC retrieval error across locations (2.7%) except for LID5, where the use of SIs resulted in almost 10% MAE improvement. The LFMC difference between the most and the least accurate models at each location was, on average, 6.1%, with the highest difference (~9%) being observed for LID5 and the lowest (~3%) for LID 2 and LID3. The variables' importance varied slightly across locations, although the most important ones (VARI, EVI_r, and CH) remained unchanged (Figure 6b).
Generating individual models for each vegetation type resulted in similar trends across species (10% < MAE < 27%), with the most accurate models (10% < MAE < 12%) being observed for oak and pine. Compared to the model based on the entire dataset (MAE = 13.4%), by species MAE increased for rockrose (16.6%) and decreased for oak (11.2%) and pine (11.9%). For species-specific models, the most important variables for LFMC estimation differed, with the relative EVI being the most important one for rockrose, the VARI for oak, and the static ones (CH, elevation, northness) for pines (Figure 6a). When only lower (<120%) LFMC values were modeled, the MAE varied between 9.5% and 12.5% depending on model configuration, with the highest errors being observed when using SAR metrics (S1 model). The importance of the variables did not change (Figure 4b) when compared to modeling the entire dataset.

Discussion
Overall, when static variables were included, the performance of the radar-based LFMC model (S1 + St model, MAE = 17.0%) was largely similar when compared to the optical-based model (S2 + St model, MAE = 13.4%). The lowest LFMC errors were observed at LID1 (15%), with all the remaining locations showing higher MAE (19%-25%). The ordination of locations by environmental conditions (data not shown) indicated largely different conditions for locations 3 and 4 which were drier areas dominated by vegetation with lower canopy height and cover. As vegetation types varied from low shrubs to medium-height oaks and taller pine forests, the static variables had an important impact on LFMC estimation, with canopy height being the second most important variable after VARI despite the reduced variability of climatic and edaphic conditions and species compositions. Removing the static variables degraded model performance by about 2% when using radar (MAE = 19.0%) and, respectively, optical (MAE = 15.1%) data. However, the large differences in the explained variance (27 vs. 54%) suggested that SIs reflectance provided additional information not available in the backscatter coefficient.
The full model estimated LFMC with a cross-validated accuracy of 0.62 (R 2 ), a mean absolute error (MAE) of 13.4%, and an RMSE of 18.0% when jointly using optic, radar, and static variables. Contrary to [23], the addition of Sentinel-1 backscatter data did not improve LFMC estimation, i.e., model performance did not decrease when removing the radar-derived variables. The relatively high importance of the VV polarization and the open Mediterranean vegetation suggests that a large part of the SAR backscatter may originate at the soil surface, thus explaining the reduced importance of the radar metrics for LFMC estimation as soil surface moisture may not linearly relate to vegetation water content, particularly for forest tree species [15]. Further, the inclusion of the so-called "handcrafted inputs", ratios of microwave and spectral reflectance (or indices), did not provide any improvements for LFMC retrieval as suggested in [23]. Such disagreement may be related to the different vegetation types and spatial extents and thus increased variability, spatial resolution (20 m vs. 250 m), or to the inadequacy of such unconventional indices hardly used in the remote sensing literature. Further research should be carried out to validate the utility of such indices.
The importance of individual SI contributions for pooled models (all sites) confirmed previous results [21,25] as three of the SIs selected here (EVI, NMDI, TCARI/OSAVI) were also identified as the best-performing indices, either in their original or relative form, by these previous studies suggesting similar spectral sensitivity to variation in LFMC over a range of species and fire regimes. Although [25] suggested using relative SI and LFMC values to increase model performance (i.e., they observed a decrease in the RMSE from 33% to 19%), such improvements were not observed with our data when using relative indices alone. The discrepancy may be related to the estimation of LFMC over both shrub and forest vegetation as opposed to only shrublands. Indeed, a more detailed analysis of the variables' importance by species revealed that relative indices were the most important for rockrose but not for oaks and pines, for which the original indices and structural variables, respectively, were more important. Overall, using both original and relative SIs resulted in the most accurate estimates, although the improvement of the RMSE was rather marginal (from 17.9% to 13.4%, data not shown).
Individual models for each vegetation type performed slightly worse for rockrose (MAE of 16.6%) but not for oak (10.4%) and pine (12.0%). Such differences were attributed to the smaller dataset available for training at the species level (~500 observations). Indeed, the use of random subsample (500 observations) from all sites and vegetation types for LFMC estimation resulted in an MAE of 16.1% for the full model, a value similar to or worse when compared to LFMC estimation at species level.
The use of predictor variables with high seasonality (e.g., day of the year), such as those used in [22], may increase overall model precision but with negligible effect on LFMC retrieval errors during higher fire risk periods. Indeed, estimation errors observed when using static variables alone (St model) decreased when the day of the year (DOY) was included (25% vs. 20%), while the explained variance doubled, reaching 53.6% (data not shown). However, the inclusion of DOY in the full model improved performance only marginally (e.g., MAE decreased by 0.6%), suggesting that remote sensing data can characterize the LFMC variability along the year. Further, when estimating only low LMFC values (<120%), the inclusion of DOY did not improve model performance, with MAE changing by only 0.1%. Such differences suggest that seasonality is important for LFMC prediction only when remote sensing information is not available and that LFMC estimates based on static variables alone may be affected by larger errors.
Different patterns related to the predictor variables' importance were observed for rockrose and pine. For rockrose, the most important variables were the relative indices which may be related to the higher interannual variability in shrublands and the potential compensatory effect relativization has on such variations. In contrast, for pines, the static variables were by far the most important ones, which may be related to more stable LFMC values at the species level regardless of variation in environmental conditions [51].
Our results are slightly more accurate than those of [23,25] (RMSE of 20%-33%) and similar to those from [22] (RMSE 16%-20% and MAE 13%-15%). When compared to [21] (RMSE = 12.5%), our errors were higher, but the explained variance was significantly lower (53.8% vs. 29%) despite the use of a much longer data series. One should note, though, that like-for-like comparisons are difficult due to the different areal extents (continental vs. subcontinental vs. regional), the spatial resolution of estimates (250 m vs. 500 m vs. 20 m), and differences in predictor variables (e.g., day of year, land surface temperature) and modelling approaches (e.g., random forest, recurrent neural network). Each of these factors may influence the precision of the estimates. For example, LFMC estimation over larger areas may result in an increased error as models need to cater to more species and environmental conditions. Conversely, estimation errors at lower spatial resolution (i.e., 250-500 m) improve when compared to higher (20 m) spatial resolution products due to the increased correlation among spatial phenomena as areal unit size increases (i.e., the modifiable areal unit problem), partly related to decreased variability caused by averaging over many pixels [52].
Our study was limited by several factors, including the number of locations used for in situ data collection. Although their spatial density was an order of magnitude higher when compared to other studies [23,25], a limited number of locations may fail to comprehensively characterize spatial and temporal LFMC patterns within the entire region of interest, which could affect any wall-to-wall mapping product. The relatively high proportion (~30%) of in situ samples excluded from the analysis, coupled with the need to establish a common dataset for the Sentinel-1 and Sentinel-2 imagery, has reduced the number of samples available for model training and validation, which may have positively influenced the overall accuracy estimates as a drop in r and MAE was observed for the ex situ analysis (Table 5 and Figure 5). Lastly, extrapolating LFMC from a few sampled individuals to all trees within one pixel may increase uncertainties of the modeled relationships when sampled trees fail to represent the surrounding conditions accurately. Such limitations may be partially addressed by using only the Sentinel-2 data or the Harmonized Landsat Sentinel-2 (HLS) dataset [53] and thus increasing the number of samples available for model training and validation. In addition, the operational LFMC content predictions may be recalibrated on-the-fly with in situ data collected within the target year.

Conclusions
This study is only the second assessing the utility of medium-resolution Sentinel-1 and Sentinel-2 sensors for LFMC retrieval in a typical Mediterranean environment, as well as their synergy. We showed that machine learning techniques, particularly random forests, may be used to estimate LFMC on a weekly basis in Mediterranean vegetation from remote sensing data with acceptable model performance when input satellite variables (optical) and ancillary information related to vegetation structure and site (static variables) are available. Model performance decreased when static variables were excluded, but opticalbased models were able to largely compensate for such shortcomings. The use of static variables alone to estimate LFMC resulted in increased error estimates, particularly over the lower LFMC ranges, with only 20.6% of variance explained when compared to over 47% of the optical-based model and 54% of the combined optical-static-based model. Such results suggest that spectral indices add unique information that cannot be obtained from the static or the radar variables alone. In contrast with previous studies, the information provided by the radar variables was not relevant nor unique in our region, as removing such variables did not result in decreased model performance. However, our results also showed that in areas of persistent cloud cover, radar-based models might be used to estimate LFMC, albeit with lower accuracies.
Supplementary Materials: The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/f13111846/s1, Figure S1: Relationships between Sentinel 2 spectral indices (SI). Left panel shows the contribution of each absolute SI to the PCA axes. Right panel shows the correlogram of absolute SIs; Figure S2: Figure S2 Relationships between Sentinel 1 polarizations acquired within the ascending (asc) and descending (des) satellite passes. The left panel shows the contributions to the first PCA axes. The right panel shows the correlogram (VV-vertical transmit vertical receive, VH-vertical transmit horizontal receive); Figure S3: Correlogram of static variables (left panel) together with the utility as predictor variables (increase in mean squared error, node purity) for LFMC estimation using random forests. Eastness and northness were computed as the sine and cosine, respectively, of the aspect angle; Figure S4: Correlogram of Sentinel-1 variables (left panel) together with their importance as predictor variables estimated as mean decrease accuracy (i.e., increment in MSE when variable is excluded from the model) for LFMC estimation using random forests; Figure S5: Correlogram of Sentinel-2 spectral indices (left panel) together with their utility as predictor variables (increase in mean squared error, node purity) for LFMC estimation using random forests. Only values above 0.7 are shown in the correlogram; Figure  Funding: This work was funded by the Madrid regional government (grant CM/JIN/2021-024) and the Spanish Ministry for Science and Innovation (grants PID2020-114062RA-I00, RYC-2017-22555 and RYC2018-024614-I, projects RTA2017-00042-C05-01, PID2020-116494RR-C41).

Informed Consent Statement: Not applicable.
Data Availability Statement: Sentinel-1 and Sentinel-2 data are freely available. The in situ data was obtained from the Madrid regional government under a specific research agreement.