1. Introduction
Wildfires are among the most significant natural disturbances affecting terrestrial ecosystems globally [1,2]. In the Mediterranean Basin, fire represents both a natural ecological process and a recurrent socio-environmental hazard [3,4]. A unique combination of long dry summers, flammable sclerophyllous vegetation, and dense human settlements creates highly favourable conditions for ignition and rapid fire spread. This “fire-prone equilibrium” has been further destabilized in recent decades by increasing anthropogenic pressure, land-use change, and climate variability [5,6]. As a result, Mediterranean countries regularly experience some of the world’s most intense and damaging wildfire seasons, with profound implications for ecosystems, public safety, and carbon dynamics.
Over the last two decades, growing population density, urban expansion, and agricultural intensification have increased human interactions with fire-prone landscapes. Anthropogenic ignitions now account for the vast majority of wildfire events across southern Europe and North Africa, exceeding 80–90% of recorded ignitions in several countries [7,8,9]. Ignition sources are diverse and include agricultural residue burning, power-line failures, transportation activities, waste burning, and intentional land clearing [10,11,12]. This growing dominance of human-caused ignitions illustrates the transition of the Mediterranean Basin from a predominantly climate-driven fire regime to a human-dominated fire regime [13].
Despite this well-established human influence, many operational fire danger systems, including those used by civil protection and forest agencies, still rely primarily on meteorological and vegetation-based indices such as the Fire Weather Index (FWI) or the Normalized Difference Vegetation Index (NDVI) [14,15]. While these indices are valuable for estimating fuel moisture and flammability, they inadequately capture the spatial heterogeneity of human access and the land-use intensity that govern ignition likelihood. Consequently, existing fire risk maps may underestimate ignition hotspots associated with roads, peri-urban corridors, and agricultural interfaces, where most ignitions actually occur [16]. This limitation highlights the need to complement climatic and vegetation indicators with proxies of anthropogenic activity, such as night-time light emissions, population density, or human modification indices.
Recent advances in open access satellite imagery and cloud-based geospatial platforms have transformed wildfire research, enabling the integration of diverse environmental and socio-economic data layers within a unified analytical framework. Platforms such as Google Earth Engine (GEE) facilitate large-scale computation of Earth observation products, including Sentinel-2 reflectance composites, MODIS or VIIRS burned-area records, and ancillary predictors such as vegetation indices (NDVI), land surface temperature, terrain slope, ERA5 reanalysis climate data, or human modification indices (gHM) [17]. These datasets, when combined with machine-learning (ML) algorithms, allow for modelling non-linear interactions between human accessibility and environmental flammability at unprecedented spatial scales [18].
Ensemble learning techniques such as Random Forest (RF) [19] and Extreme Gradient Boosting (XGBoost, XGB) [20] have emerged as powerful tools for wildfire susceptibility and ignition modelling. These algorithms can capture complex, non-linear relationships between predictor variables while remaining robust to noise and overfitting. In wildfire research, RF and XGB have demonstrated strong performance in mapping ignition likelihood, burned-area susceptibility, and post-fire recovery [21,22]. Their ability to integrate multi-source predictors such as topography, vegetation structure, weather, and human footprint makes them particularly suitable for trans-regional geospatial modelling.
Although numerous studies have examined wildfire susceptibility in individual Mediterranean countries or subregions [23,24,25], cross-basin comparative approaches remain rare. Most existing models are geographically constrained to southern Europe, whereas North African regions such as Morocco, Algeria, and Tunisia remain under-represented in the fire science literature [26]. This geographic asymmetry hinders the understanding of whether anthropogenic drivers of ignition exhibit consistent spatial signatures across the Mediterranean or vary under distinct climatic and socio-economic conditions. A trans-Mediterranean approach spanning both the northern and southern shores of the basin is essential to characterize fire risk as a coupled human–environmental process rather than a set of isolated regional phenomena.
Research Rationale and Objectives
To address these gaps, this study develops an open and reproducible multi-model framework for mapping and comparing anthropogenic wildfire ignition risk across four representative Mediterranean regions: Morocco, Italy, France, and Spain. The framework integrates open access remote-sensing data with two complementary machine-learning algorithms (RF and XGB) to quantify the combined influence of human and environmental predictors on ignition probability. All models are implemented within a spatially explicit, cross-validated workflow designed to test transferability across diverse biogeographical contexts.
The specific objectives of this study are to achieve the following:
Quantify and compare the relative influence of anthropogenic and biophysical factors on wildfire ignition probability across Mediterranean landscapes;
Evaluate model performance and spatial transferability through within-country cross-validation and leave-one-country-out (LOCO) cross-validation;
Generate harmonized ignition-risk maps at 500 m resolution to support fire prevention, land-use planning, and climate adaptation strategies in the Mediterranean Basin.
2. Data and Methods
2.1. Study Areas
Four Areas of Interest (AOIs) were selected across the Mediterranean Basin, located in Spain, Italy, Morocco, and France (Figure 1). These AOIs were chosen as representative test sites capturing strong bioclimatic and anthropogenic gradients characteristic of Mediterranean fire-prone landscapes, rather than as exhaustive representations of national-scale variability.
The Mediterranean Basin constitutes one of the world’s most fire-prone regions, where recurrent wildfires are driven by the combined effects of climatic variability, vegetation flammability, and human activity [3,4,27]. It is also recognized as a global biodiversity hotspot [28], where fire functions both as a natural ecological process and as a socio-environmental hazard amplified by land abandonment, urban expansion, and climate change [5,29].
The selected AOIs encompass contrasting physiographic and socio-ecological conditions that typify distinct Mediterranean fire-regime contexts and provide a controlled setting to evaluate model transferability across regions with differing environmental and human pressure characteristics:
Seville (Spain): A densely populated and fire-prone landscape dominated by pine–oak woodlands and agricultural mosaics. The area represents a strongly anthropogenic ignition context, where agricultural burning, infrastructure density, and urban expansion are major fire drivers.
Sicily (Italy): An insular Mediterranean system with complex topography and mixed land use. Fire activity in Sicily reflects the combined influence of human pressure and seasonal drought, illustrating intermediate ignition controls between climatic and anthropogenic factors [30].
Tangier (Morocco): A North African coastal–mountain interface characterized by steep slopes, maquis vegetation, and expanding peri-urban areas. This AOI serves as a test case for model transferability under comparatively limited data conditions, representing the under-studied southern Mediterranean context.
Corsica (France): A mountainous island dominated by maquis shrublands and pine–oak forests. Hot, dry summers and strong seasonal winds (Libeccio and Mistral), combined with expanding coastal settlements and tourism-driven land use, contribute to elevated human-induced ignition susceptibility [31].
Collectively, these AOIs span a broad range of Mediterranean environmental and socio-economic settings. While they do not capture the full heterogeneity within each country, they provide a robust comparative framework for assessing wildfire ignition susceptibility and evaluating cross-regional model transferability across selected Mediterranean landscapes.
2.2. Ignition and Background Samples
Wildfire ignition events were approximated using the MODIS MCD64A1 Collection 6.1 burned-area product [32]. Pixels with a positive BurnDate were converted to point centroids, which were treated as ignition proxies representing the first detected burned location rather than the exact fire origin. Given the 500 m spatial resolution of MODIS, the true ignition point may have occurred anywhere within the pixel, including near pixel edges or along linear anthropogenic features (e.g., roads or agricultural boundaries). This spatial uncertainty is explicitly acknowledged and considered in the interpretation of model results.
It is further noted that MCD64A1 under-detects small fires (typically < ha), which are often human-caused and concentrated in peri-urban or agricultural landscapes. As a result, the ignition dataset likely under-represents such events, and estimated relationships with anthropogenic predictors should be interpreted as conservative.
To construct a balanced modelling dataset, an equal number of background (non-ignition) samples were randomly drawn from unburned burnable land-cover classes within and adjacent to each AOI. Permanent water bodies, bare rock, and other non-burnable surfaces were excluded to ensure that model discrimination reflected relative ignition likelihood within vegetated landscapes rather than fuel presence alone. A spatially stratified sampling scheme was applied to background points to reduce spatial clustering and sampling bias. Each observation was labelled as 1 (ignition proxy) or 0 (background).
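The balanced design described above can be illustrated with a minimal sketch. It assumes pixels are referred to by hypothetical integer identifiers and draws background samples by simple random sampling; the spatial stratification step is not reproduced here, and the function name and seed are illustrative only.

```python
import random


def sample_background(ignition_ids, burnable_ids, seed=0):
    """Build a balanced, labelled sample (1 = ignition proxy, 0 = background).

    ignition_ids: identifiers of pixels flagged as ignition proxies.
    burnable_ids: identifiers of all burnable pixels in the AOI.
    Both inputs are hypothetical; the study's actual sampling code may differ.
    """
    ignitions = set(ignition_ids)
    # Background candidates are burnable pixels that were never burned.
    candidates = sorted(set(burnable_ids) - ignitions)
    if len(candidates) < len(ignitions):
        raise ValueError("not enough unburned burnable pixels to balance classes")
    rng = random.Random(seed)
    background = rng.sample(candidates, len(ignitions))
    labelled = [(pid, 1) for pid in sorted(ignitions)]
    labelled += [(pid, 0) for pid in background]
    return labelled
```

A stratified variant would first partition the candidates into spatial strata and draw from each in turn.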
2.3. Predictor Variables
Environmental and anthropogenic predictors were compiled from open access datasets processed in Google Earth Engine and normalized to a common range at 500 m resolution. Predictors were grouped as described below.
2.3.1. Topography
Slope is a driving factor for wildfires. In this study, the topographic variable is derived from the Digital Elevation Model (DEM) product ASTER GDEM [33]. The SRTM DEM is a high-accuracy, high-resolution product at 30 m resolution. Slope was computed in GEE.
Topography and wildfire risk are strongly correlated in Mediterranean ecosystems [34]. Topography influences local climate and vegetation distribution, directly affecting ignition patterns and therefore the likelihood of fire occurrence [35,36].
Additionally, topography directly affects the ability of civil protection services to access actively burning areas and to distribute human and logistical resources in response to a fire event [37].
The slope statistics for the selected AOIs show varied topography, with mean values ranging from 2.6° to ≈10° (Table 1).
2.3.2. NDVI
NDVI was computed directly from the Landsat–Sentinel-2 harmonized dataset available on the GEE platform. NDVI served as an indicator of vegetation health and greenness, which reflects the chlorophyll content of the leaves [38]. The NDVI mean, median, and standard deviation are reported in Table 2.
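For reference, NDVI follows the standard normalized-difference definition, (NIR − Red) / (NIR + Red). The sketch below applies it to a single pair of reflectance values; the function name and the handling of a zero denominator are illustrative choices, not taken from the study's GEE scripts.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir, red: surface reflectance in the near-infrared and red bands.
    Returns None when both reflectances are zero (undefined ratio).
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total
```

Dense green vegetation reflects strongly in the near-infrared and absorbs red light, so values approach 1, whereas bare soil and built-up surfaces yield values near 0.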
2.3.3. Weather Variables
Weather data were obtained from the ERA5-Land reanalysis dataset [39]. The meteorological variables considered in this study included near-surface air temperature at 2 m, wind speed at 10 m, and relative humidity.
For each variable, long-term seasonal averages representative of typical fire-season conditions were extracted, rather than instantaneous weather conditions at the time of ignition. These variables therefore characterized the broader climatic context influencing wildfire susceptibility, such as fuel dryness and prevailing atmospheric conditions, rather than short-term ignition triggers driven by day-to-day weather variability (Table 3).
Meteorological conditions influence wildfire occurrence primarily through their effects on fuel moisture and landscape flammability. Air temperature and relative humidity jointly regulate vegetation moisture content, with higher temperatures and lower humidity promoting fuel drying and increasing susceptibility to ignition [40]. Wind speed further contributes to fuel desiccation by enhancing evapotranspiration and atmospheric mixing, in addition to its well-known role in fire spread dynamics [41]. Within this framework, weather variables are treated as contextual controls on long-term ignition susceptibility rather than predictors of event-specific fire behaviour.
2.4. Anthropogenic Variables
Human activity plays a critical role in defining wildfire patterns, wildfire start points, and fuel distribution. Therefore, human activity is considered one of the most important factors driving wildfires [37]. For this study, we chose three variables: the Global Human Modification (gHM) index, VIIRS night-time lights, and population density.
2.4.1. Global Human Modification (gHM)
The Global Human Modification (gHM) dataset provides a cumulative measure of human modification of terrestrial lands globally at a one-square-kilometre resolution [42]. Several studies identify gHM as one of the most influential factors in wildfire ignition [43,44]. Human modification shapes the distribution of the wildland–urban interface and fire-use patterns (e.g., burning for agriculture). The gHM variation in each AOI is summarized in Table 4.
2.4.2. VIIRS Night-Time Light (NTL)
VIIRS night-time light (NTL) data form a time series produced from monthly cloud-free average radiance grids spanning 2013 to 2021. Studies have found that NTL directly affects ignition patterns [45]. The VIIRS NTL statistics for each AOI are given in Table 5.
2.4.3. Population Density
Population density was obtained from an open access archive of high-resolution gridded population estimates provided on 100 m tiles [46]. A detailed description of the population density is presented in Table 6.
2.5. Correlation Study of the Different Predictor Variables
A correlation analysis was conducted to assess the degree of association between the environmental and anthropogenic predictors used in the wildfire ignition-modelling framework. Understanding inter-variable correlations is essential for detecting potential redundancy among predictors and for ensuring model robustness by minimizing multicollinearity effects [47].
This analysis included eight variables: the Normalized Difference Vegetation Index (NDVI), slope, near-surface air temperature (T2M), wind speed at 10 m (WS10M), relative humidity (RH), the Global Human Modification index (gHM), night-time lights (NTL), and population density (Pop_km2). A log-transformed version of the NTL variable (NTL_log1p) was also included to account for the strongly right-skewed distribution of radiance values (Figure 2).
Pairwise Pearson correlation coefficients (r) were computed for each Area of Interest (AOI) based on all valid raster pixels retained after preprocessing.
Overall, the correlations were moderate, indicating that each predictor contributed distinct information to the modelling framework. In particular, NDVI exhibited a weak negative correlation with gHM and NTL, suggesting contrasting gradients between vegetation cover and human-modified areas, while meteorological variables (T2M, WS10M, RH) showed limited interdependence across AOIs.
The anthropogenic predictors (NTL, NTL_log1p, gHM, and Pop_km2) displayed strong inter-correlations, which is consistent with their derivation from similar underlying phenomena, namely, the spatial distribution of human settlements, infrastructure, and land-use modification.
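The pairwise analysis can be sketched in a few lines of standard-library Python. The example below computes Pearson's r between two equally long value lists and applies the log1p transform to a small set of radiance values; the numbers are hypothetical and serve only to show the mechanics.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical night-time light radiances with a long right tail.
ntl = [0.0, 0.5, 2.0, 40.0]
# log1p compresses the tail while keeping zero radiance at zero.
ntl_log1p = [math.log1p(v) for v in ntl]
```

Because log1p is monotonic, it preserves the ranking of pixels while reducing the leverage of a few very bright urban cores on the linear correlation.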
2.6. Data Preprocessing
All satellite datasets were preprocessed to retain valid terrestrial pixels only. Sentinel-2 Level-2A surface reflectance imagery, preprocessed within the Google Earth Engine (GEE) environment, was filtered for low cloud cover and used to generate seasonal (summer) NDVI composites. Cloud masking relied on the preprocessing algorithms embedded in GEE, ensuring consistent removal of cloud- and shadow-contaminated pixels.
To exclude non-terrestrial areas, the MODIS MOD44W Water Mask product was employed to remove both oceanic and permanent inland water pixels. This filtering ensured that NDVI and related spectral indices accurately represented vegetation conditions over land, minimizing artefacts caused by clouds or water bodies.
All raster datasets retained for analysis were resampled and reprojected to a common spatial resolution of 500 m using the default GEE projection framework (EPSG:4326). To enable cross-variable comparison, all raster layers were subsequently normalized to a 0–1 range, with normalization applied individually per Area of Interest (AOI).
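The per-AOI rescaling to a 0–1 range corresponds to min–max normalization. A minimal sketch follows, assuming each layer is available as a flat list of pixel values with masked pixels stored as None; this data layout and the function name are assumptions made for illustration.

```python
def min_max_normalize(values):
    """Rescale valid values to [0, 1]; masked pixels (None) are kept as None."""
    valid = [v for v in values if v is not None]
    if not valid:
        return list(values)
    lo, hi = min(valid), max(valid)
    span = hi - lo
    result = []
    for v in values:
        if v is None:
            result.append(None)
        elif span == 0:
            # A constant layer carries no contrast; map it to 0.
            result.append(0.0)
        else:
            result.append((v - lo) / span)
    return result
```

Normalizing each AOI separately means that a value of 1 marks the local maximum of that AOI, so normalized values express relative rather than absolute levels when compared across regions.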
The complete preprocessing and harmonization workflow is illustrated in Figure 3.
2.7. Modelling Framework
Wildfire susceptibility was modelled using two ensemble-based machine-learning algorithms: Random Forest (RF) and Extreme Gradient Boosting (XGB). Both algorithms are particularly well-suited for geospatial modelling tasks due to their ability to capture complex, non-linear relationships between predictors; their robustness to noise; and their flexibility in handling heterogeneous environmental data.
2.7.1. Random Forest (RF)
The Random Forest algorithm [19] is an ensemble of decision trees trained on bootstrap samples of data, with random subsets of predictors considered at each split. This bootstrap aggregation (bagging) strategy reduces model variance and mitigates overfitting, yielding stable predictions even in the presence of correlated variables or missing data. In geospatial applications, RF has proven effective in modelling environmental phenomena such as vegetation dynamics, land cover change, and wildfire occurrence, primarily because it can represent high-dimensional interactions among climatic, topographic, and anthropogenic predictors without requiring strong parametric assumptions.
2.7.2. Extreme Gradient Boosting (XGB)
Extreme Gradient Boosting (XGBoost) [20] is a scalable, tree-based boosting algorithm that sequentially builds an ensemble of weak learners to minimize a differentiable loss function. By iteratively fitting new trees to the residuals of previous models, XGB captures subtle patterns in the data and achieves strong predictive performance. Regularization techniques (e.g., L1 and L2 penalties) help control model complexity and prevent overfitting, while feature importance metrics derived from gain or split frequency facilitate interpretability. In geospatial contexts, XGB is particularly valuable for integrating multi-source environmental datasets (e.g., remote-sensing, climate, and socio-economic layers) and for producing spatially explicit probability surfaces.
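For orientation, the regularized objective minimized by gradient-boosted trees of this kind can be written as follows. The notation is the conventional one for XGBoost and is not taken from this study: n is the number of samples, K the number of trees, T the number of leaves of a tree f, w its vector of leaf weights, and γ, α, and λ the penalty strengths (exposed in the xgboost library as gamma, reg_alpha, and reg_lambda).

```latex
\mathcal{L} = \sum_{i=1}^{n} \ell\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \alpha \lVert w \rVert_1 + \tfrac{1}{2}\,\lambda \lVert w \rVert_2^2
```

The first sum measures fit to the ignition labels, while the second penalizes trees with many leaves or large leaf weights.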
2.8. Validation and Interpretation
Both models were implemented in Python 3.10 using the scikit-learn and xgboost libraries. The modelling framework employed balanced datasets to address class imbalance between ignition and non-ignition pixels. Data were partitioned into training and validation subsets using an 80/20 split. Hyper-parameters, including tree depth, learning rate, and subsampling ratio, were optimized through a five-fold spatial cross-validation scheme to account for spatial autocorrelation and ensure generalization across distinct geographic areas.
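The text does not specify how the spatial folds were constructed. One common approach, sketched here purely as an assumption, assigns every sample to a regular grid block and keeps all samples of a block in the same fold, so that neighbouring points do not appear on both sides of a train/validation split. The block size of 0.5° and the block-to-fold mapping are hypothetical choices.

```python
import math


def spatial_fold(lon, lat, block_deg=0.5, n_folds=5):
    """Return the fold index of the grid block containing (lon, lat)."""
    col = math.floor(lon / block_deg)
    row = math.floor(lat / block_deg)
    # Deterministic mapping from block position to fold; 7919 is an
    # arbitrary prime used to spread adjacent rows across folds.
    return (row * 7919 + col) % n_folds


def spatial_cv_splits(points, block_deg=0.5, n_folds=5):
    """Yield (train_indices, validation_indices) for each spatial fold.

    points: sequence of (lon, lat) tuples in decimal degrees.
    """
    folds = [spatial_fold(lon, lat, block_deg, n_folds) for lon, lat in points]
    for k in range(n_folds):
        train = [i for i, f in enumerate(folds) if f != k]
        validation = [i for i, f in enumerate(folds) if f == k]
        yield train, validation
```

Each pair of index lists can then be passed to any estimator's fit and predict steps; candidate hyper-parameter settings are compared on the validation folds only.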
2.9. Spatial Prediction and Reproducibility
Trained models were applied to continuous predictor rasters to generate ignition-probability maps, which were subsequently classified into five risk categories (very low to very high) using Jenks natural breaks. All scripts and input datasets are openly available to ensure reproducibility and facilitate model scalability across other Mediterranean regions.
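Once break values are available, assigning each predicted probability to a risk class is a simple lookup. The sketch below assumes the four break values have already been computed (for instance by a Jenks natural-breaks routine, which is not implemented here); the break values shown and the three intermediate class names are hypothetical, since the text names only the "very low" and "very high" end classes.

```python
from bisect import bisect_left

# End labels follow the text; the intermediate labels are assumed.
RISK_LABELS = ["very low", "low", "moderate", "high", "very high"]


def classify_risk(probability, breaks):
    """Map an ignition probability to one of five risk classes.

    breaks: four ascending upper bounds separating the five classes.
    A probability equal to a break value falls in the lower class.
    """
    if len(breaks) != len(RISK_LABELS) - 1:
        raise ValueError("exactly four break values are required")
    return RISK_LABELS[bisect_left(breaks, probability)]


# Hypothetical, evenly spaced breaks used only for demonstration.
example_breaks = [0.2, 0.4, 0.6, 0.8]
```

Natural breaks would generally produce unevenly spaced thresholds that follow the distribution of predicted probabilities in each map.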
3. Results
3.1. XGBoost Model Performance
The Extreme Gradient Boosting (XGBoost, XGB) classifier demonstrates strong predictive capability in discriminating ignition-prone locations from background conditions. The confusion matrix (Table 7) shows clear separation between ignition and non-ignition samples, with relatively few misclassifications. The model achieves an overall accuracy of 0.88, an AUC of 0.954, and a PR-AUC of 0.925, indicating a high capacity to distinguish locations with elevated ignition susceptibility within burnable landscapes (Figure 4).
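As a reminder of what the AUC measures, it equals the probability that a randomly chosen ignition sample receives a higher score than a randomly chosen background sample, with ties counted as one half. The sketch below computes it directly from that definition; it is a quadratic-time illustration on toy inputs, not the routine used to obtain the reported values.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for ignition proxies, 0 for background samples.
    scores: predicted ignition probabilities, in the same order.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.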
Although AUC values above 0.95 can sometimes reflect overly permissive background sampling, this interpretation is unlikely here because background samples are explicitly restricted to burnable land-cover classes. Consequently, model performance reflects discrimination among locations with comparable fuel availability rather than a trivial separation between burnable and non-burnable surfaces.
The F1-score of 0.85 reflects a balanced trade-off between Precision (0.80) and Recall (0.91), with notably higher sensitivity to ignition events than the Random Forest model. The five-fold cross-validation yields a mean AUC of 0.945, indicating stable performance across spatial partitions. These results suggest that XGBoost effectively captures non-linear interactions among anthropogenic pressure, vegetation condition, topography, and climatic context.
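The reported F1-score can be checked against the stated Precision and Recall, since F1 is their harmonic mean. A short sketch:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and Recall as reported for the XGBoost model.
xgb_f1 = f1_score(0.80, 0.91)
```

Rounded to two decimals, this gives 0.85, in agreement with the value reported above.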
Accordingly, the model outputs should be interpreted as relative indicators of long-term ignition susceptibility within vegetated areas, rather than absolute predictions of wildfire occurrence.
3.2. Random Forest Model Performance
The Random Forest (RF) classifier also achieved strong predictive performance across the Mediterranean ignition dataset. The confusion matrix (Table 8) indicated balanced discrimination between ignition and non-ignition samples. The model achieved an AUC of 0.955 (Figure 5), an overall accuracy of 0.90, and a Cohen’s κ of 0.78, signifying substantial agreement beyond random classification.
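Cohen's kappa compares the observed accuracy with the agreement expected by chance given the row and column totals of the confusion matrix. The sketch below computes it from the four cells of a binary confusion matrix; the counts in the example are hypothetical and are not those of Table 8.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a binary confusion matrix."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and labels.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical balanced example: 90% accuracy, 50% chance agreement.
example_kappa = cohens_kappa(tp=45, fp=5, fn=5, tn=45)
```

For a perfectly balanced sample, chance agreement is 0.5, so kappa reduces to twice the accuracy minus one.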
As with the XGBoost model, high classification accuracy should be interpreted in the context of background sampling restricted to burnable land-cover classes. This design ensures that model performance reflects relative differences in ignition susceptibility within vegetated landscapes, rather than contrasts between fuel and non-fuel areas.
Both omission (false-negative) and commission (false-positive) errors remain below 10%, indicating consistent discrimination across classes. These results confirm that the RF model provides a robust baseline for estimating spatial patterns of ignition susceptibility, capturing non-linear relationships between anthropogenic accessibility and environmental variables with high reliability.
3.3. Model Comparison and Interpretation
A comparative assessment of the Random Forest (RF) and XGBoost (XGB) models highlights consistent and complementary performance patterns (Table 9). Both ensemble methods achieve high predictive accuracy and substantial agreement beyond chance, with overall accuracies of at least 0.88 and AUC values exceeding 0.95. While RF achieves a marginally higher AUC (0.955 vs. 0.954) and an identical F1-score (0.85 for both models), XGBoost exhibits higher Recall (0.91 vs. 0.85), capturing more ignition-prone pixels and reducing omission errors relative to the Random Forest model.
Despite these differences, both models reveal strong consistency in their predictive structure and rank anthropogenic variables (night-time lights and the gHM index) as the most influential ignition drivers. The slightly lower false-positive rate in the RF model suggests it generalizes more conservatively, whereas XGBoost, with its gradient boosting optimization, captures more subtle ignition patterns at the cost of a modest increase in commission errors. Overall, XGBoost provides marginally higher sensitivity and generalization, while Random Forest offers enhanced stability and interpretability, making the two approaches complementary within a reproducible ignition-modelling framework.
3.4. Cross-Country Transferability
To evaluate model generalization across regions, a leave-one-country-out (LOCO) transfer–validation strategy was applied. In this setting, models were trained on data from three countries and evaluated on the held-out country, iteratively rotating the test region.
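The rotation of the held-out region can be expressed as a small generator. The sketch below only enumerates the train/test partitions; model fitting and scoring are left out, and the function name is illustrative.

```python
def leave_one_country_out(countries):
    """Yield (training_countries, held_out_country) for each rotation."""
    for held_out in countries:
        training = [c for c in countries if c != held_out]
        yield training, held_out


study_countries = ["Spain", "Italy", "Morocco", "France"]
loco_splits = list(leave_one_country_out(study_countries))
```

With four countries this yields four experiments, each training on three countries and testing on the fourth.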
While the mean transfer AUC across all LOCO experiments was 0.85, this aggregated value concealed substantial variability in cross-country transfer performance. To provide a more detailed assessment, Table 10 reports a full 4 × 4 transfer matrix, where each entry represents the AUC obtained when training on one country and testing on another.
The transfer matrix reveals clear patterns in model portability. Higher transfer performance is generally observed between countries sharing similar Mediterranean bioclimatic conditions and land-use structures, such as Spain, Italy, and southern France. In contrast, transfer performance to Morocco is systematically lower, particularly when models are trained exclusively on European countries.
This degradation likely reflects differences in anthropogenic drivers, fire-management practices, data-reporting mechanisms, and socio-economic conditions, rather than purely climatic factors. Nevertheless, the fact that all cross-country transfers maintain AUC values above 0.80 indicates that the model captures robust, structurally meaningful relationships between ignition susceptibility and its predictors.
From an operational perspective, these results suggest that cross-border model reuse is most reliable among biogeographically and socio-environmentally similar Mediterranean regions, while transfer to more distinct contexts may benefit from regional recalibration or limited local retraining.
4. Discussion
4.1. Anthropogenic Versus Biophysical Drivers
Variable-importance analyses derived from the Random Forest and XGBoost models (Figure 6 and Figure 7) indicate that anthropogenic proxies consistently contribute strongly to wildfire ignition susceptibility across all study areas. In particular, night-time light intensity and the Global Human Modification (gHM) index rank among the most influential predictors, together accounting for approximately 40–60% of cumulative model importance.
It is important to emphasize that variable-importance metrics in tree-based ensemble models do not represent direct causal effects, but rather quantify the relative contribution of predictors to model discrimination. Anthropogenic variables such as gHM, night-time lights, and population-related indicators are strongly correlated and should therefore be interpreted collectively as proxies for human presence, infrastructure density, and land-use intensity, rather than as independent drivers of ignition processes.
In the Random Forest model (Figure 6), gHM emerges as the dominant predictor, followed by NDVI and terrain slope. This pattern suggests that ignition susceptibility is shaped by the interaction between human pressure and biophysical context, where fuel availability, vegetation continuity, and landscape accessibility jointly influence the spatial distribution of ignitions. Similarly, the XGBoost model (Figure 7) assigns high importance to gHM and night-time light metrics, reinforcing the central role of human activity intensity and spatial distribution in modulating ignition susceptibility across regions.
Vegetation condition, represented by NDVI, contributes by capturing variations in fuel load and moisture status that influence flammability and ignition potential. Topographic slope influences ignition susceptibility indirectly by shaping land-use patterns, accessibility, and fire-suppression effectiveness, with ignitions more frequently occurring in accessible low- to mid-slope environments rather than in steep or remote terrain.
4.2. Spatial Patterns of Ignition Susceptibility
Predicted ignition-susceptibility maps (Figure 8 and Figure 9) reveal consistent spatial clustering of high-susceptibility zones near transportation networks, peri-urban interfaces, and agricultural frontiers. Across all regions, elevated susceptibility aligns with areas characterized by dense human activity and fragmented land use, including coastal Andalusia (Spain), lowland corridors in Italy, and peri-urban belts surrounding Tangier.
In contrast, mountainous and remote areas generally exhibit lower ignition susceptibility, highlighting the mediating role of accessibility and human presence. These spatial patterns are consistent with the observed burned-area and ignition distributions reported in previous Mediterranean studies and support the ecological plausibility of the modelled susceptibility gradients.
In more densely populated regions such as Italy and France, the sharp transition from high-susceptibility peri-urban belts to lower-susceptibility forest interiors reflects strong socio-spatial gradients in land use, infrastructure density, and human access. Such patterns underscore the importance of human–landscape interactions in shaping ignition susceptibility beyond purely biophysical controls.
4.3. Role of Anthropogenic Pressure and the gHM Index
While the overall dominance of anthropogenic proxies is evident from the global importance rankings, the prominence of the Global Human Modification (gHM) index warrants specific discussion. Across both Random Forest and XGBoost models, gHM consistently emerges as the most influential predictor of wildfire ignition susceptibility. Given that the majority of fires in the Mediterranean basin are human-caused, this result is not unexpected; nevertheless, the use of a composite anthropogenic index requires careful interpretation.
Importantly, gHM does not encode fire occurrences or ignition mechanisms directly. Rather, it represents cumulative human pressure on landscapes by integrating information on built-up areas, agriculture, infrastructure, and population influence. Its strong contribution therefore reflects the spatial coupling between human-modified environments and ignition likelihood, rather than a tautological prediction of human-caused fires.
The dominance of gHM may also partially mask the contribution of more specific anthropogenic proxies, such as population density or night-time light intensity, due to substantial shared variance among these predictors. Tree-based ensemble methods are known to preferentially select integrative variables when multicollinearity is present, which can lead to a concentration of importance in composite indices. In this context, gHM acts as an efficient surrogate capturing multiple correlated dimensions of human pressure, rather than introducing circularity or inflating predictive performance.
Moreover, if human presence alone were sufficient to explain ignition patterns, purely demographic proxies would be expected to consistently outperform biophysical variables. The persistent contribution of vegetation condition and topography across models indicates that ignition susceptibility emerges from the interaction between anthropogenic pressure and environmental context, rather than from human factors alone.
From an applied perspective, the use of gHM offers a pragmatic advantage by capturing cumulative human influence in a single, spatially consistent variable. However, for interpretability and policy-oriented analyses, decomposing anthropogenic pressure into its constituent drivers remains valuable and should be considered in future work.
4.4. Interpretation and Comparison with Previous Studies
The prominence of anthropogenic proxies is consistent with a large body of Mediterranean wildfire research identifying human activity as the dominant ignition source [10,48,49]. This study extends prior work by quantifying and comparing these effects across multiple regions within a harmonized and reproducible modelling framework, enabling direct cross-regional comparison.
The stability of model performance under cross-region transfer further suggests that common structural relationships between human pressure and ignition susceptibility exist across Mediterranean landscapes, despite regional differences in climate, vegetation, and socio-economic conditions. Rather than implying uniform basin-wide behaviour, these findings point to partially transferable ignition patterns under comparable human–environment configurations.
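A cross-region transfer test of this kind can be sketched as a leave-one-country-out (LOCO) loop. The example below is an illustration only: it assumes scikit-learn, uses synthetic placeholder features and labels, and the sample sizes and model hyperparameters are arbitrary choices, not those of the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(42)
countries = ["Morocco", "Spain", "France", "Italy"]
n_per_country = 300  # arbitrary sample size for the sketch

# Synthetic placeholder features and labels; real inputs would be the
# remote-sensing predictors sampled at ignition and non-ignition points.
groups = np.repeat(countries, n_per_country)
X = rng.normal(size=(len(groups), 5))
logit = 1.5 * X[:, 0] + 1.0 * X[:, 1]
y = (rng.random(len(groups)) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

loco_auc = {}
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    held_out = str(groups[test_idx][0])  # country withheld from training
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    scores = model.predict_proba(X[test_idx])[:, 1]
    loco_auc[held_out] = roc_auc_score(y[test_idx], scores)

mean_loco_auc = float(np.mean(list(loco_auc.values())))

for country, auc in loco_auc.items():
    print(f"{country:8s} AUC = {auc:.3f}")
print(f"mean LOCO AUC = {mean_loco_auc:.3f}")
```

Reporting the per-country scores alongside the mean makes it visible whether transferability is roughly uniform or driven by a subset of regions.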
From a fire-management and policy perspective, the dominance of anthropogenic proxies highlights the importance of prevention strategies targeting human activities, particularly in peri-urban and agricultural landscapes. Ignition-susceptibility maps produced by this framework can support targeted awareness campaigns, infrastructure monitoring, and land-use planning in areas where human pressure and flammable vegetation intersect.
4.5. Limitations and Future Work
Several limitations should be considered when interpreting the results. First, ignition locations are approximated using MODIS burned-area pixels at 500 m resolution, which may introduce spatial uncertainty and under-represent small or short-lived fires, many of which are human-caused. Second, night-time light intensity may under-represent rural or low-luminance communities, potentially biasing estimates of human pressure in certain contexts. Third, despite the use of spatial cross-validation, residual spatial autocorrelation may persist due to unmodelled socio-economic or infrastructural factors. Finally, the use of a static population density layer may not fully capture temporal dynamics over the study period.
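Residual spatial autocorrelation of the kind mentioned above is commonly diagnosed with Moran's I. The following sketch uses NumPy only and assumes binary distance-band weights; the `bandwidth`, the synthetic coordinates, and the two residual fields are hypothetical and serve only to contrast a spatially structured pattern with an unstructured one.

```python
import numpy as np


def morans_i(values, coords, bandwidth):
    """Moran's I with binary weights for point pairs within `bandwidth`."""
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)

    # Pairwise Euclidean distances between all points.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Neighbours lie within the band; a point is not its own neighbour.
    weights = ((dist > 0) & (dist <= bandwidth)).astype(float)

    z = values - values.mean()
    numerator = (weights * np.outer(z, z)).sum()
    denominator = (z ** 2).sum()
    return len(values) / weights.sum() * numerator / denominator


rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(400, 2))  # synthetic point locations

# Spatially smooth "residuals" versus residuals with no spatial structure.
clustered = np.sin(coords[:, 0]) + np.cos(coords[:, 1])
unstructured = rng.normal(size=400)

i_clustered = morans_i(clustered, coords, bandwidth=1.0)
i_unstructured = morans_i(unstructured, coords, bandwidth=1.0)

print(f"Moran's I (clustered residuals):    {i_clustered:.3f}")
print(f"Moran's I (unstructured residuals): {i_unstructured:.3f}")
```

Values close to zero suggest little remaining spatial structure, whereas clearly positive values indicate that nearby residuals remain similar and that some spatially organized driver is not captured by the model.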
Future work should incorporate higher-resolution ignition datasets (e.g., VIIRS 375 m or Sentinel-2 MSI), dynamic meteorological and fuel-moisture variables, and additional accessibility metrics such as distance to settlements or transportation networks. Expanding the framework to a larger number of Mediterranean regions and other semi-arid environments would further improve the understanding of human-driven ignition processes under changing climatic and socio-economic conditions.
5. Conclusions
This study developed an open, reproducible, and multi-model framework to quantify and compare the anthropogenic and biophysical drivers of wildfire ignition across four representative Mediterranean countries: Morocco, Spain, France, and Italy. By integrating globally available remote-sensing indicators (vegetation greenness, terrain slope, land-surface temperature anomalies, night-time light intensity, and the Global Human Modification index) with Random Forest and gradient-boosting models, we produced harmonized ignition-risk maps and assessed model transferability across diverse socio-environmental contexts.
The results demonstrate that anthropogenic accessibility and land-use intensity remain the primary determinants of ignition probability throughout the Mediterranean Basin. Human proxies such as night-time light intensity and the gHM index consistently outperformed environmental predictors, while temperature anomalies and vegetation greenness played secondary roles by modulating fuel dryness, availability, and flammability. Spatial cross-validation confirmed robust model performance (AUC > 0.9 for tree-based models), and cross-country transfer tests revealed substantial generalization (mean LOCO AUC ≈ 0.85), underscoring the structural similarity of human-driven ignition processes across Mediterranean landscapes.
Scientific contribution: The proposed workflow unifies data processing, modelling, and validation under an open-access framework, offering a transparent and replicable methodology for multi-country, large-scale ignition-risk mapping. Its scalability and interpretability make it adaptable to other fire-prone regions where data scarcity limits conventional hazard modelling.
Practical implications: The resulting ignition-risk maps provide actionable information for national forestry and civil-protection agencies. They can guide the prioritization of surveillance and prevention resources toward high-accessibility corridors, urban–rural interfaces, and agricultural frontiers where human ignition pressure is greatest. Because all inputs rely solely on publicly available data and cloud computing infrastructure, the framework can be operationalized with minimal technical investment.
Future directions: Subsequent research should couple ignition models with dynamic meteorological variables (fuel moisture, wind fields, and drought indices) to improve temporal realism. Integrating higher-resolution burned-area detections (e.g., VIIRS 375 m and Sentinel-2 MSI) and socio-economic layers (e.g., distance to roads, road traffic, or tourism intensity) will further refine ignition localization. In addition, validation data obtained from local organizations would help improve the precision of mapped ignition locations and fire-size estimates. Finally, expanding the framework toward spatio-temporal or probabilistic forecasting could support early-warning systems and cross-border fire-management policies across the broader Mediterranean region.