Generalizing Human-Driven Wildfire Ignition Models Across Mediterranean Regions Using Harmonized Remote-Sensing and Machine-Learning Data

Dimarco, Nicola Aimane; Faraji, Ibtissam; Wahbi, Miriam; Maatouk, Mustapha; Boulaassal, Hakim; Aalaoui, Otman Yazidi; El Kharki, Omar

doi:10.3390/geomatics6010013

by Nicola Aimane Dimarco *, Ibtissam Faraji, Miriam Wahbi, Mustapha Maatouk, Hakim Boulaassal, Otman Yazidi Aalaoui and Omar El Kharki

Geomatic, Remote Sensing and Cartography Research Group (GeoTeCa), Faculty of Sciences and Techniques of Tangier (FSTT), Abdelmalek Essaadi University, Tetouan 93000, Morocco

* Author to whom correspondence should be addressed.

Geomatics 2026, 6(1), 13; https://doi.org/10.3390/geomatics6010013

Submission received: 22 November 2025 / Revised: 15 January 2026 / Accepted: 17 January 2026 / Published: 1 February 2026

(This article belongs to the Special Issue Editorial Board Members’ Collection Series: GeoAI in Disaster)

Abstract

Wildfires represent a growing environmental and socio-economic threat across Mediterranean landscapes, where prolonged summer droughts and human activity increasingly shape ignition susceptibility. This study presents an open and reproducible modelling framework for comparing the relative influence of anthropogenic and biophysical drivers of wildfire ignition susceptibility across selected Mediterranean regions. Using harmonized 500 m predictors derived from global remote-sensing datasets, we integrate vegetation condition, topography, climatic context, and human pressure indicators within a cloud-based Google Earth Engine workflow. Two tree-based machine-learning models (Random Forest and Extreme Gradient Boosting) are trained and evaluated using spatial cross-validation and cross-region transfer experiments. Results consistently highlight the dominant role of anthropogenic pressure in shaping ignition susceptibility across all study areas, with night-time lights and human modification indices contributing the largest share of model importance. Both models achieve high predictive performance (AUC > 0.90) and retain stable accuracy under cross-region transfer (mean transfer AUC ≈ 0.85), indicating partial generalization of human-driven ignition patterns across Mediterranean landscapes. Beyond predictive performance, the principal contribution of this work lies in its harmonized cross-regional comparison and explicit evaluation of model transferability using open data and scalable cloud processing. The resulting susceptibility maps provide a transparent and operational basis for comparative wildfire risk assessment and prevention planning within comparable Mediterranean contexts.

Keywords:

wildfire ignition; anthropogenic drivers; machine learning; remote sensing; Mediterranean Basin

1. Introduction

Wildfires are among the most significant natural disturbances affecting terrestrial ecosystems globally [1,2]. In the Mediterranean Basin, fire represents both a natural ecological process and a recurrent socio-environmental hazard [3,4]. A unique combination of long dry summers, flammable sclerophyllous vegetation, and dense human settlements creates highly favourable conditions for ignition and rapid fire spread. This “fire-prone equilibrium” has been further destabilized in recent decades by increasing anthropogenic pressure, land-use change, and climate variability [5,6]. As a result, Mediterranean countries regularly experience some of the world’s most intense and damaging wildfire seasons, with profound implications for ecosystems, public safety, and carbon dynamics.

Over the last two decades, growing population density, urban expansion, and agricultural intensification have increased human interactions with fire-prone landscapes. Anthropogenic ignitions now account for the vast majority of wildfire events across southern Europe and North Africa, exceeding 80–90% of recorded ignitions in several countries [7,8,9]. Ignition sources are diverse and include agricultural residue burning, power-line failures, transportation activities, waste burning, and intentional land clearing [10,11,12]. This growing dominance of human-caused ignitions illustrates the transition of the Mediterranean Basin from a predominantly climate-driven fire regime to a human-dominated fire regime [13].

Despite this well-established human influence, many operational fire danger systems, including those used by civil protection and forest agencies, still rely primarily on meteorological and vegetation-based indices such as the Fire Weather Index (FWI) or the Normalized Difference Vegetation Index (NDVI) [14,15]. While these indices are valuable for estimating fuel moisture and flammability, they inadequately capture the spatial heterogeneity of human access and the land-use intensity that govern ignition likelihood. Consequently, existing fire risk maps may underestimate ignition hotspots associated with roads, peri-urban corridors, and agricultural interfaces, where most ignitions actually occur [16]. This limitation highlights the need to complement climatic and vegetation indicators with proxies of anthropogenic activity, such as night-time light emissions, population density, or human modification indices.

Recent advances in open access satellite imagery and cloud-based geospatial platforms have transformed wildfire research, enabling the integration of diverse environmental and socio-economic data layers within a unified analytical framework. Platforms such as Google Earth Engine (GEE) facilitate large-scale computation of Earth observation products, including Sentinel-2 reflectance composites, MODIS or VIIRS burned-area records, and ancillary predictors such as vegetation indices (NDVI), land surface temperature, terrain slope, ERA5 reanalysis climate data, or the Global Human Modification (gHM) index [17]. When combined with machine-learning (ML) algorithms, these datasets allow non-linear interactions between human accessibility and environmental flammability to be modelled at unprecedented spatial scales [18].

Ensemble learning techniques such as Random Forest (RF) [19] and Extreme Gradient Boosting (XGBoost, XGB) [20] have emerged as powerful tools for wildfire susceptibility and ignition modelling. These algorithms can capture complex, non-linear relationships between predictor variables while remaining robust to noise and overfitting. In wildfire research, RF and XGB have demonstrated strong performance in mapping ignition likelihood, burned-area susceptibility, and post-fire recovery [21,22]. Their ability to integrate multi-source predictors such as topography, vegetation structure, weather, and human footprint makes them particularly suitable for trans-regional geospatial modelling.

Although numerous studies have examined wildfire susceptibility in individual Mediterranean countries or subregions [23,24,25], cross-basin comparative approaches remain rare. Most existing models are geographically constrained to southern Europe, whereas North African regions such as Morocco, Algeria, and Tunisia remain under-represented in the fire science literature [26]. This geographic asymmetry hinders the understanding of whether anthropogenic drivers of ignition exhibit consistent spatial signatures across the Mediterranean or vary under distinct climatic and socio-economic conditions. A trans-Mediterranean approach spanning both the northern and southern shores of the basin is essential to characterize fire risk as a coupled human–environmental process rather than a set of isolated regional phenomena.

Research Rationale and Objectives

To address these gaps, this study develops an open and reproducible multi-model framework for mapping and comparing anthropogenic wildfire ignition risk across four representative Mediterranean regions: Morocco, Italy, France, and Spain. The framework integrates open access remote-sensing data with two complementary machine-learning algorithms (RF and XGB) to quantify the combined influence of human and environmental predictors on ignition probability. All models are implemented within a spatially explicit, cross-validated workflow designed to test transferability across diverse biogeographical contexts.

The specific objectives of this study are as follows:

Quantify and compare the relative influence of anthropogenic and biophysical factors on wildfire ignition probability across Mediterranean landscapes;
Evaluate model performance and spatial transferability through within-country cross-validation and leave-one-country-out (LOCO) transfer validation;
Generate harmonized ignition-risk maps at 500 m resolution to support fire prevention, land-use planning, and climate adaptation strategies in the Mediterranean Basin.

2. Data and Methods

2.1. Study Areas

Four Areas of Interest (AOIs) were selected across the Mediterranean Basin, located in Spain, Italy, Morocco, and France (Figure 1). These AOIs were chosen as representative test sites capturing strong bioclimatic and anthropogenic gradients characteristic of Mediterranean fire-prone landscapes, rather than as exhaustive representations of national-scale variability.

The Mediterranean Basin constitutes one of the world’s most fire-prone regions, where recurrent wildfires are driven by the combined effects of climatic variability, vegetation flammability, and human activity [3,4,27]. It is also recognized as a global biodiversity hotspot [28], where fire functions both as a natural ecological process and as a socio-environmental hazard amplified by land abandonment, urban expansion, and climate change [5,29].

The selected AOIs encompass contrasting physiographic and socio-ecological conditions that typify distinct Mediterranean fire-regime contexts and provide a controlled setting to evaluate model transferability across regions with differing environmental and human pressure characteristics:

Seville (Spain): A densely populated and fire-prone landscape dominated by pine–oak woodlands and agricultural mosaics. The area represents a strongly anthropogenic ignition context, where agricultural burning, infrastructure density, and urban expansion are major fire drivers.
Sicily (Italy): An insular Mediterranean system with complex topography and mixed land use. Fire activity in Sicily reflects the combined influence of human pressure and seasonal drought, illustrating intermediate ignition controls between climatic and anthropogenic factors [30].
Tangier (Morocco): A North African coastal–mountain interface characterized by steep slopes, maquis vegetation, and expanding peri-urban areas. This AOI serves as a test case for model transferability under comparatively limited data conditions, representing the under-studied southern Mediterranean context.
Corsica (France): A mountainous island dominated by maquis shrublands and pine–oak forests. Hot, dry summers and strong seasonal winds (Libeccio and Mistral), combined with expanding coastal settlements and tourism-driven land use, contribute to elevated human-induced ignition susceptibility [31].

Collectively, these AOIs span a broad range of Mediterranean environmental and socio-economic settings. While they do not capture the full heterogeneity within each country, they provide a robust comparative framework for assessing wildfire ignition susceptibility and evaluating cross-regional model transferability across selected Mediterranean landscapes.

2.2. Ignition and Background Samples

Wildfire ignition events were approximated using the MODIS MCD64A1 Collection 6.1 burned-area product [32]. Pixels with a positive BurnDate were converted to point centroids, which were treated as ignition proxies representing the first detected burned location rather than the exact fire origin. Given the 500 m spatial resolution of MODIS, the true ignition point may have occurred anywhere within the pixel, including near pixel edges or along linear anthropogenic features (e.g., roads or agricultural boundaries). This spatial uncertainty is explicitly acknowledged and considered in the interpretation of model results.

It is further noted that MCD64A1 under-detects small fires (typically <25 ha), which are often human-caused and concentrated in peri-urban or agricultural landscapes. As a result, the ignition dataset likely under-represents such events, and estimated relationships with anthropogenic predictors should be interpreted as conservative.

To construct a balanced modelling dataset, an equal number of background (non-ignition) samples were randomly drawn from unburned burnable land-cover classes within and adjacent to each AOI. Permanent water bodies, bare rock, and other non-burnable surfaces were excluded to ensure that model discrimination reflected relative ignition likelihood within vegetated landscapes rather than fuel presence alone. A spatially stratified sampling scheme was applied to background points to reduce spatial clustering and sampling bias. Each observation was labelled as 1 (ignition proxy) or 0 (background).
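The balanced, spatially stratified background sampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sample_background`, the grid-cell stratification, and the `cell_size` value are all hypothetical, since the article does not specify how strata were defined.

```python
import random
from collections import defaultdict

def sample_background(candidates, n_samples, cell_size, seed=0):
    """Draw n_samples background points spread across grid cells.

    candidates: (x, y) coordinates of unburned, burnable pixels.
    cell_size:  side length of the stratification grid (hypothetical value).
    """
    rng = random.Random(seed)
    # Group candidate points by the grid cell they fall in.
    cells = defaultdict(list)
    for x, y in candidates:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    for pts in cells.values():
        rng.shuffle(pts)
    # Round-robin over cells so that no single cell dominates the sample.
    chosen = []
    keys = sorted(cells)
    while len(chosen) < n_samples and keys:
        remaining = []
        for key in keys:
            if len(chosen) == n_samples:
                break
            chosen.append(cells[key].pop())
            if cells[key]:
                remaining.append(key)
        keys = remaining
    return chosen

# Hypothetical coordinates: three ignition proxies and a 10 x 10 grid of candidates.
ignitions = [(1.0, 1.0), (2.0, 7.5), (8.0, 3.0)]
candidates = [(x + 0.5, y + 0.5) for x in range(10) for y in range(10)]
background = sample_background(candidates, len(ignitions), cell_size=5.0)

# Label ignition proxies as 1 and background points as 0.
labelled = [(p, 1) for p in ignitions] + [(p, 0) for p in background]
```

Drawing one point per cell in turn keeps the background sample from clustering in whichever part of the AOI has the most candidate pixels.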

2.3. Predictor Variables

Environmental and anthropogenic predictors were compiled from open access datasets processed in Google Earth Engine and normalized to a common [0, 1] range at 500 m resolution. Predictors were grouped as described below.

2.3.1. Topography

Slope is a driving factor for wildfires. In this study, the topographic variable was derived from the ASTER GDEM Digital Elevation Model (DEM) product [33]. The SRTM DEM is a high-accuracy, high-resolution product at 30 m resolution. Slope was computed in GEE.

Topography and wildfire risk are strongly related in Mediterranean ecosystems [34]. Topography influences climate and vegetation distribution, which directly affects ignition patterns and therefore the likelihood of fire occurrence [35,36].

Additionally, topography directly affects the ability of civil protection services to access actively burning areas and to deploy human and logistical resources in response to a fire event [37].

The slope statistics for the selected AOIs show varied topography, with mean values ranging from 2.6° to approximately 10° (Table 1).
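To make the slope variable concrete, the sketch below derives slope in degrees from a small elevation grid using central differences. It is an illustrative stand-in written in plain Python, not the GEE routine used in the study; the function name `slope_degrees` and the toy DEM are assumptions.

```python
import math

def slope_degrees(dem, cell_size):
    """Slope in degrees for interior cells of a DEM given as a list of rows.

    Edge cells are left as None because central differences need both neighbours.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Elevation change per metre along the two grid axes.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out

# Toy DEM (hypothetical): elevation rises 30 m per 30 m cell along each row.
toy_dem = [[30.0 * c for c in range(4)] for _ in range(4)]
toy_slope = slope_degrees(toy_dem, cell_size=30.0)
```

A rise equal to the horizontal distance gives a gradient of 1, so the interior cells of this toy grid have a slope of 45°.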

2.3.2. NDVI

NDVI was computed directly from the harmonized Landsat–Sentinel-2 dataset available on the GEE platform. NDVI served as an indicator of vegetation health and greenness, which reflects the chlorophyll content of the leaves [38]. The NDVI mean, median, and standard deviation are reported in Table 2.

2.3.3. Weather Variables

Weather data were obtained from the ERA5-Land reanalysis dataset [39]. The meteorological variables considered in this study included near-surface air temperature at 2 m, wind speed at 10 m, and relative humidity.

For each variable, long-term seasonal averages representative of typical fire-season conditions were extracted, rather than instantaneous weather conditions at the time of ignition. These variables therefore characterized the broader climatic context influencing wildfire susceptibility, such as fuel dryness and prevailing atmospheric conditions, rather than short-term ignition triggers driven by day-to-day weather variability (Table 3).

Meteorological conditions influence wildfire occurrence primarily through their effects on fuel moisture and landscape flammability. Air temperature and relative humidity jointly regulate vegetation moisture content, with higher temperatures and lower humidity promoting fuel drying and increasing susceptibility to ignition [40]. Wind speed further contributes to fuel desiccation by enhancing evapotranspiration and atmospheric mixing, in addition to its well-known role in fire spread dynamics [41]. Within this framework, weather variables are treated as contextual controls on long-term ignition susceptibility rather than predictors of event-specific fire behaviour.

2.4. Anthropogenic Variables

Human activity plays a critical role in defining wildfire patterns, wildfire start points, and fuel distribution. Therefore, human activity is considered one of the most important factors driving wildfires [37]. For this study, we chose three variables: Global Human Modification (gHM) index, VIIRS night-time lights, and population density.

2.4.1. Global Human Modification (gHM)

The Global Human Modification (gHM) dataset provides a cumulative measure of human modification of terrestrial lands globally at a one-square-kilometre resolution [42]. Various studies suggest that gHM is one of the most influential factors in wildfire ignition [43,44]. gHM affects the distribution of the wildland–urban interface and fire patterns (e.g., agricultural burning). The gHM variation in each AOI is described in Table 4.

2.4.2. VIIRS Night-Time Light (NTL)

The VIIRS night-time light (NTL) product is a time series of monthly cloud-free average radiance grids spanning 2013 to 2021. Studies have found that NTL directly affects ignition patterns [45]. VIIRS night-time light statistics for each AOI are reported in Table 5.

2.4.3. Population Density

Population density was obtained from an open access archive of high-resolution gridded population estimates provided on 100 m tiles [46]. A detailed description of the population density is presented in Table 6.

2.5. Correlation Study of the Different Predictor Variables

A correlation analysis was conducted to assess the degree of association between the environmental and anthropogenic predictors used in the wildfire ignition-modelling framework. Understanding inter-variable correlations is essential to detecting potential redundancy among predictors and to ensuring model robustness by minimizing multicollinearity effects [47].

This analysis included eight variables: the Normalized Difference Vegetation Index (NDVI), slope, near-surface air temperature (T2M), wind speed at 10 m (WS10M), relative humidity (RH), Global Human Modification index (gHM), night-time lights (NTL), and population density (Pop_km²). A log-transformed version of the NTL variable (NTL_log1p) was also included to account for the strong right-skewed distribution of radiance values (Figure 2).

Pairwise Pearson correlation coefficients (r) were computed for each Area of Interest (AOI) based on all valid raster pixels retained after preprocessing.

Overall, the correlations were moderate, indicating that each predictor contributed distinct information to the modelling framework. In particular, NDVI exhibited a weak negative correlation with gHM and NTL, suggesting contrasting gradients between vegetation cover and human-modified areas, while meteorological variables (T2M, WS10M, RH) showed limited interdependence across AOIs.

The anthropogenic predictors (NTL, NTL_log1p, gHM, and Pop_km²) displayed strong inter-correlations, which is consistent with their derivation from similar underlying phenomena, namely, the spatial distribution of human settlements, infrastructure, and land-use modification.
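The pairwise Pearson correlation and the log1p transform of NTL used in this section can be illustrated with a short sketch. The sample values below are invented for illustration only and are not taken from the study's rasters; `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Invented pixel values: right-skewed radiance and a bounded modification index.
ntl = [0.0, 0.5, 2.0, 10.0, 60.0]
ghm = [0.05, 0.10, 0.30, 0.55, 0.80]

# log1p(v) = log(1 + v) compresses the long right tail and is defined at zero.
ntl_log1p = [math.log1p(v) for v in ntl]

r_raw = pearson_r(ntl, ghm)
r_log = pearson_r(ntl_log1p, ghm)
```

In this toy example the log-transformed radiance correlates more strongly with gHM than the raw radiance does, because a single very bright pixel no longer dominates the calculation.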

2.6. Data Preprocessing

All satellite datasets were preprocessed to retain valid terrestrial pixels only. Sentinel-2 Level-2A surface reflectance imagery, preprocessed within the Google Earth Engine (GEE) environment, was filtered for low cloud cover and used to generate seasonal (summer) NDVI composites. Cloud masking relied on the preprocessing algorithms embedded in GEE, ensuring consistent removal of cloud- and shadow-contaminated pixels.

To exclude non-terrestrial areas, the MODIS MOD44W Water Mask product was employed to remove both oceanic and permanent inland water pixels. This filtering ensured that NDVI and related spectral indices accurately represented vegetation conditions over land, minimizing artefacts caused by clouds or water bodies.

All raster datasets retained for analysis were resampled and reprojected to a common spatial resolution of 500 m using the default GEE projection framework (EPSG:4326). To enable cross-variable comparison, all raster layers were subsequently normalized to a 0–1 range, with normalization applied individually per Area of Interest (AOI).

The complete preprocessing and harmonization workflow is illustrated in Figure 3.
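As a minimal sketch of the per-AOI normalization step, the snippet below applies min–max scaling separately to each AOI. The AOI names come from the article, but the raw values and the helper name `minmax_per_aoi` are hypothetical.

```python
def minmax_per_aoi(values_by_aoi):
    """Rescale each AOI's values to [0, 1] using that AOI's own min and max."""
    scaled = {}
    for aoi, values in values_by_aoi.items():
        lo, hi = min(values), max(values)
        span = hi - lo
        # A constant layer has no range; map it to 0.0 rather than divide by zero.
        scaled[aoi] = [0.0 if span == 0 else (v - lo) / span for v in values]
    return scaled

# Hypothetical raw slope values (degrees) for two AOIs.
raw_slope = {
    "Seville": [0.0, 2.0, 4.0, 8.0],
    "Corsica": [5.0, 15.0, 25.0, 45.0],
}
norm_slope = minmax_per_aoi(raw_slope)
```

One consequence of scaling per AOI is that the same normalized value corresponds to different raw values in different regions (here, 1.0 means 8° in one AOI and 45° in the other), which is worth keeping in mind when interpreting cross-region transfer.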

2.7. Modelling Framework

Wildfire ignition susceptibility was modelled using two ensemble-based machine-learning algorithms: Random Forest (RF) and Extreme Gradient Boosting (XGB). Both algorithms are particularly well-suited for geospatial modelling tasks due to their ability to capture complex, non-linear relationships between predictors; their robustness to noise; and their flexibility in handling heterogeneous environmental data.

2.7.1. Random Forest (RF)

The Random Forest algorithm [19] is an ensemble of decision trees trained on bootstrap samples of data, with random subsets of predictors considered at each split. This bootstrap aggregation (bagging) strategy reduces model variance and mitigates overfitting, yielding stable predictions even in the presence of correlated variables or missing data. In geospatial applications, RF has proven effective in modelling environmental phenomena such as vegetation dynamics, land cover change, and wildfire occurrence, primarily because it can represent high-dimensional interactions among climatic, topographic, and anthropogenic predictors without requiring strong parametric assumptions.
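The three ingredients of bagging named above (bootstrap samples, random predictor subsets, and aggregation of tree votes) can be sketched in a few lines. This is a didactic illustration, not the scikit-learn implementation used in the study; the helper names are hypothetical, and the square-root rule for the subset size is a common convention for classification that is assumed here rather than reported by the authors.

```python
import math
import random

def bootstrap_indices(n, rng):
    """Sample n row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]

def feature_subset(features, rng):
    """Pick a random subset of predictors to consider at a split."""
    k = max(1, int(math.sqrt(len(features))))  # assumed sqrt(p) convention
    return rng.sample(features, k)

def majority_vote(votes):
    """Aggregate binary tree votes; ties fall back to class 0."""
    return 1 if sum(votes) * 2 > len(votes) else 0

rng = random.Random(42)
predictors = ["NDVI", "slope", "T2M", "WS10M", "RH", "gHM", "NTL", "Pop_km2"]
rows = bootstrap_indices(100, rng)
split_candidates = feature_subset(predictors, rng)
```

Because each tree sees a different resample of the rows and a different subset of predictors at each split, the trees' errors are less correlated, which is what lets averaging reduce variance.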

2.7.2. Extreme Gradient Boosting (XGB)

Extreme Gradient Boosting (XGBoost) [20] is a scalable, tree-based boosting algorithm that sequentially builds an ensemble of weak learners to minimize a differentiable loss function. By iteratively fitting new trees to the residuals of previous models, XGB captures subtle patterns in the data and achieves strong predictive performance. Regularization techniques (e.g., $L_1$ and $L_2$ penalties) help control model complexity and prevent overfitting, while feature importance metrics derived from gain or split frequency facilitate interpretability. In geospatial contexts, XGB is particularly valuable for integrating multi-source environmental datasets (e.g., remote-sensing, climate, and socio-economic layers) and for producing spatially explicit probability surfaces.
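For reference, the regularized objective that XGBoost minimizes is commonly written as below. The notation follows the general XGBoost formulation and is not taken from this article: $l$ is the differentiable loss, $f_k$ the $k$-th tree, $T$ its number of leaves, $w$ its vector of leaf weights, and $\gamma$, $\lambda$, $\alpha$ the regularization strengths (the last two being the $L_2$ and $L_1$ penalties mentioned above).

```latex
\mathcal{L} = \sum_{i} l\left(y_i, \hat{y}_i\right) + \sum_{k} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \lVert w \rVert_2^2 + \alpha \lVert w \rVert_1
```

Larger values of $\gamma$, $\lambda$, or $\alpha$ penalize deeper trees and larger leaf weights, trading some training fit for smoother, more conservative predictions.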

2.8. Validation and Interpretation

Both models were implemented in Python 3.10 using the scikit-learn and xgboost libraries. The modelling framework employed balanced datasets to address class imbalance between ignition and non-ignition pixels. Data were partitioned into training and validation subsets using an 80/20 split. Hyper-parameters, including tree depth, learning rate, and subsampling ratio, were optimized through a five-fold spatial cross-validation scheme to account for spatial autocorrelation and ensure generalization across distinct geographic areas.
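One way to realize a five-fold spatial cross-validation is to assign whole spatial blocks, rather than individual points, to folds, so that nearby samples never end up on both sides of a split. The sketch below assumes square blocks and a hypothetical `block_size`; the article does not state how its spatial folds were constructed.

```python
def block_of(point, block_size):
    """Grid block containing an (x, y) point."""
    x, y = point
    return (int(x // block_size), int(y // block_size))

def spatial_cv_splits(points, block_size, n_folds=5):
    """Yield (train_indices, test_indices), holding out whole blocks per fold."""
    blocks = sorted({block_of(p, block_size) for p in points})
    # Deal blocks out to folds in turn so each fold gets a similar number.
    fold_of_block = {b: i % n_folds for i, b in enumerate(blocks)}
    folds = [fold_of_block[block_of(p, block_size)] for p in points]
    for k in range(n_folds):
        train = [i for i, f in enumerate(folds) if f != k]
        test = [i for i, f in enumerate(folds) if f == k]
        yield train, test

# Hypothetical sample locations on a 10 x 10 grid, grouped into 2 x 2 blocks.
pts = [(x + 0.5, y + 0.5) for x in range(10) for y in range(10)]
splits = list(spatial_cv_splits(pts, block_size=2.0, n_folds=5))
```

Hyper-parameter candidates would then be scored on each held-out fold and averaged; compared with a random split, this tends to give a less optimistic estimate when samples are spatially autocorrelated.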

2.9. Spatial Prediction and Reproducibility

Trained models were applied to continuous predictor rasters to generate ignition-probability maps, which were subsequently classified into five risk categories (very low to very high) using Jenks natural breaks. All scripts and input datasets are openly available to ensure reproducibility and facilitate model scalability across other Mediterranean regions.
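Jenks natural breaks choose class boundaries that minimize the within-class sum of squared deviations. The sketch below implements that criterion with a small dynamic programme and then maps probabilities to the five risk labels; it is an illustrative implementation on invented probabilities, and the function names are hypothetical rather than the tooling used by the authors.

```python
from bisect import bisect_left

def jenks_breaks(values, n_classes):
    """Upper bounds of n_classes groups minimizing within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    # Prefix sums let us compute the squared deviation of data[i:j] in O(1).
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best cost of splitting the first j values into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    cut = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j], cut[k][j] = c, i
    # Walk back through the stored cut positions to recover class upper bounds.
    breaks, j = [], n
    for k in range(n_classes, 0, -1):
        breaks.append(data[j - 1])
        j = cut[k][j]
    return breaks[::-1]

LABELS = ["very low", "low", "moderate", "high", "very high"]

def risk_class(prob, breaks):
    """Label of the first class whose upper bound is >= prob."""
    return LABELS[min(bisect_left(breaks, prob), len(LABELS) - 1)]

# Invented ignition probabilities forming five visible groups.
probs = [0.01, 0.02, 0.03, 0.2, 0.21, 0.5, 0.52, 0.7, 0.72, 0.95, 0.97]
breaks = jenks_breaks(probs, 5)
```

Because the breaks depend on the distribution of predicted values, the same label can correspond to different probability ranges in different maps, so classes are best read as relative rather than absolute risk levels.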

3. Results

3.1. XGBoost Model Performance

The Extreme Gradient Boosting (XGBoost, XGB) classifier demonstrates strong predictive capability in discriminating ignition-prone locations from background conditions. The confusion matrix (Table 7) shows clear separation between ignition and non-ignition samples, with relatively few misclassifications. The model achieves an overall accuracy of 0.88, an AUC of 0.954, and a PR-AUC of 0.925, indicating a high capacity to distinguish locations with elevated ignition susceptibility within burnable landscapes (Figure 4).

Although AUC values above 0.95 can sometimes reflect overly permissive background sampling, this interpretation is unlikely here because background samples are explicitly restricted to burnable land-cover classes. Consequently, model performance reflects discrimination among locations with comparable fuel availability rather than a trivial separation between burnable and non-burnable surfaces.

The F1-score of 0.85 reflects a balanced trade-off between Precision (0.80) and Recall (0.91), with notably higher sensitivity to ignition events than the Random Forest model. The five-fold cross-validation yields a mean AUC of 0.945, indicating stable performance across spatial partitions. These results suggest that XGBoost effectively captures non-linear interactions among anthropogenic pressure, vegetation condition, topography, and climatic context.
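The reported F1-score can be checked directly from the reported Precision and Recall, since F1 is their harmonic mean. The sketch below does that check and also shows how the metrics follow from confusion-matrix counts; the counts are hypothetical and are not those of Table 7.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }

# Check against the values reported in the text: Precision 0.80, Recall 0.91.
reported_f1 = f1_score(0.80, 0.91)  # 2 * 0.728 / 1.71, about 0.851

# Hypothetical counts, for illustration only.
example = classification_metrics(tp=40, fp=10, fn=5, tn=45)
```

The harmonic mean evaluates to roughly 0.851, which rounds to the F1-score of 0.85 given above.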

Accordingly, the model outputs should be interpreted as relative indicators of long-term ignition susceptibility within vegetated areas, rather than absolute predictions of wildfire occurrence.

3.2. Random Forest Model Performance

The Random Forest (RF) classifier also achieved strong predictive performance across the Mediterranean ignition dataset. The confusion matrix (Table 8) indicated balanced discrimination between ignition and non-ignition samples. The model achieved an AUC of 0.955 as shown in Figure 5, an overall accuracy of 0.90, and a Cohen’s $κ$ of 0.78, signifying substantial agreement beyond random classification.

As with the XGBoost model, high classification accuracy should be interpreted in the context of background sampling restricted to burnable land-cover classes. This design ensures that model performance reflects relative differences in ignition susceptibility within vegetated landscapes, rather than contrasts between fuel and non-fuel areas.

Both omission (false-negative) and commission (false-positive) errors remain below 10%, indicating consistent discrimination across classes. These results confirm that the RF model provides a robust baseline for estimating spatial patterns of ignition susceptibility, capturing non-linear relationships between anthropogenic accessibility and environmental variables with high reliability.
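Cohen's κ, reported above alongside accuracy, compares observed agreement with the agreement expected by chance from the row and column totals of the confusion matrix. A minimal sketch follows; the counts are hypothetical and are not those of Table 8.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a binary confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement: product of predicted and actual marginals, per class.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical balanced test set with 10% omission and 10% commission errors.
kappa = cohens_kappa(tp=45, fp=5, fn=5, tn=45)
```

With perfectly balanced classes the chance agreement is 0.5, so κ reduces to twice the accuracy minus one; this is why κ is always lower than accuracy for an imperfect classifier.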

3.3. Model Comparison and Interpretation

A comparative assessment of the Random Forest (RF) and XGBoost (XGB) models highlights consistent and complementary performance patterns (Table 9). Both ensemble methods achieve high predictive accuracy and substantial agreement beyond chance, with overall accuracies of at least 0.88 and AUC values exceeding 0.95. While RF achieves a marginally higher AUC (0.955 vs. 0.954) and the same F1-score (0.85 for both models), XGBoost exhibits higher Recall (0.91 vs. 0.85), capturing more ignition-prone pixels and reducing omission errors relative to the Random Forest model.

Despite these differences, both models reveal strong consistency in their predictive structure and rank anthropogenic variables (night-time lights and the gHM index) as the most influential ignition drivers. The slightly lower false-positive rate in the RF model suggests it generalizes more conservatively, whereas XGBoost, with its gradient boosting optimization, captures more subtle ignition patterns at the cost of a modest increase in commission errors. Overall, XGBoost provides marginally higher sensitivity and generalization, while Random Forest offers enhanced stability and interpretability, making the two approaches complementary within a reproducible ignition-modelling framework.

3.4. Cross-Country Transferability

To evaluate model generalization across regions, a leave-one-country-out (LOCO) transfer–validation strategy was applied. In this setting, models were trained on data from three countries and evaluated on the held-out country, iteratively rotating the test region.
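The LOCO rotation can be sketched as a loop that holds out one country at a time, together with a rank-based AUC for scoring the held-out predictions. The country labels follow the article; the helper names and the toy labels and scores are hypothetical, and model fitting is omitted.

```python
def loco_splits(regions):
    """Yield (held_out, train_indices, test_indices) for each region in turn."""
    for held_out in sorted(set(regions)):
        train = [i for i, r in enumerate(regions) if r != held_out]
        test = [i for i, r in enumerate(regions) if r == held_out]
        yield held_out, train, test

def auc_rank(labels, scores):
    """ROC AUC: probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# One region label per sample (two samples per country, for illustration).
regions = ["Spain", "Spain", "Italy", "Italy", "Morocco", "Morocco", "France", "France"]
rotations = list(loco_splits(regions))

# Toy held-out labels and predicted probabilities.
toy_auc = auc_rank([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

In a full run, a model would be fitted on the `train` indices of each rotation and `auc_rank` applied to its predictions on the `test` indices; averaging the four resulting values gives a mean transfer AUC.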

While the mean transfer AUC across all LOCO experiments was 0.85, this aggregated value concealed substantial variability in cross-country transfer performance. To provide a more detailed assessment, Table 10 reports a full 4 × 4 transfer matrix, where each entry represents the AUC obtained when training on one country and testing on another.

The transfer matrix reveals clear patterns in model portability. Higher transfer performance is generally observed between countries sharing similar Mediterranean bioclimatic conditions and land-use structures, such as Spain, Italy, and southern France. In contrast, transfer performance to Morocco is systematically lower, particularly when models are trained exclusively on European countries.

This degradation likely reflects differences in anthropogenic drivers, fire-management practices, data-reporting mechanisms, and socio-economic conditions, rather than purely climatic factors. Nevertheless, the fact that all cross-country transfers maintain AUC values above 0.80 indicates that the model captures robust, structurally meaningful relationships between ignition susceptibility and its predictors.

From an operational perspective, these results suggest that cross-border model reuse is most reliable among biogeographically and socio-environmentally similar Mediterranean regions, while transfer to more distinct contexts may benefit from regional recalibration or limited local retraining.

4. Discussion

4.1. Anthropogenic Versus Biophysical Drivers

Variable-importance analyses derived from the Random Forest and XGBoost models (Figure 6 and Figure 7) indicate that anthropogenic proxies consistently contribute strongly to wildfire ignition susceptibility across all study areas. In particular, night-time light intensity and the Global Human Modification (gHM) index rank among the most influential predictors, together accounting for approximately 40–60% of cumulative model importance.

It is important to emphasize that variable-importance metrics in tree-based ensemble models do not represent direct causal effects, but rather quantify the relative contribution of predictors to model discrimination. Anthropogenic variables such as gHM, night-time lights, and population-related indicators are strongly correlated and should therefore be interpreted collectively as proxies for human presence, infrastructure density, and land-use intensity, rather than as independent drivers of ignition processes.
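Because correlated anthropogenic predictors are best read collectively, it can help to report their combined share of normalized importance rather than individual ranks. The sketch below shows that bookkeeping; the importance values are invented for illustration and are not the values behind Figures 6 and 7.

```python
def group_share(importances, group):
    """Fraction of total importance attributable to the predictors in group."""
    total = sum(importances.values())
    return sum(v for name, v in importances.items() if name in group) / total

# Invented importance scores (any non-negative scale works; they are normalized).
toy_importances = {
    "gHM": 0.30, "NTL": 0.20, "NDVI": 0.20, "slope": 0.15,
    "T2M": 0.05, "WS10M": 0.05, "RH": 0.05,
}
anthropogenic = {"gHM", "NTL", "Pop_km2"}
anthropogenic_share = group_share(toy_importances, anthropogenic)
```

Summing over the group sidesteps the question of how a tree ensemble happened to divide credit among predictors that carry largely the same information.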

In the Random Forest model (Figure 6), gHM emerges as the dominant predictor, followed by NDVI and terrain slope. This pattern suggests that ignition susceptibility is shaped by the interaction between human pressure and biophysical context, where fuel availability, vegetation continuity, and landscape accessibility jointly influence the spatial distribution of ignitions. Similarly, the XGBoost model (Figure 7) assigns high importance to gHM and night-time light metrics, reinforcing the central role of human activity intensity and spatial distribution in modulating ignition susceptibility across regions.

Vegetation condition, represented by NDVI, contributes by capturing variations in fuel load and moisture status that influence flammability and ignition potential. Topographic slope influences ignition susceptibility indirectly by shaping land-use patterns, accessibility, and fire-suppression effectiveness, with ignitions more frequently occurring in accessible low- to mid-slope environments rather than in steep or remote terrain.

4.2. Spatial Patterns of Ignition Susceptibility

Predicted ignition-susceptibility maps (Figure 8 and Figure 9) reveal consistent spatial clustering of high-susceptibility zones near transportation networks, peri-urban interfaces, and agricultural frontiers. Across all regions, elevated susceptibility aligns with areas characterized by dense human activity and fragmented land use, including coastal Andalusia (Spain), lowland corridors in Italy, and peri-urban belts surrounding Tangier.

In contrast, mountainous and remote areas generally exhibit lower ignition susceptibility, highlighting the mediating role of accessibility and human presence. These spatial patterns are consistent with the observed burned-area and ignition distributions reported in previous Mediterranean studies and support the ecological plausibility of the modelled susceptibility gradients.

In the Italian and French study areas, the sharp transition from high-susceptibility peri-urban belts to lower-susceptibility forest interiors reflects strong socio-spatial gradients in land use, infrastructure density, and human access. Such patterns underscore the importance of human–landscape interactions in shaping ignition susceptibility beyond purely biophysical controls.

4.3. Role of Anthropogenic Pressure and the gHM Index

While the overall dominance of anthropogenic proxies is evident from the global importance rankings, the prominence of the Global Human Modification (gHM) index warrants specific discussion. Across both Random Forest and XGBoost models, gHM consistently emerges as the most influential predictor of wildfire ignition susceptibility. Given that the majority of fires in the Mediterranean basin are human-caused, this result is not unexpected; nevertheless, the use of a composite anthropogenic index requires careful interpretation.

Importantly, gHM does not encode fire occurrences or ignition mechanisms directly. Rather, it represents cumulative human pressure on landscapes by integrating information on built-up areas, agriculture, infrastructure, and population influence. Its strong contribution therefore reflects the spatial coupling between human-modified environments and ignition likelihood, rather than a tautological prediction of human-caused fires.

The dominance of gHM may also partially mask the contribution of more specific anthropogenic proxies, such as population density or night-time light intensity, due to substantial shared variance among these predictors. Tree-based ensemble methods are known to preferentially select integrative variables when multicollinearity is present, which can lead to a concentration of importance in composite indices. In this context, gHM acts as an efficient surrogate capturing multiple correlated dimensions of human pressure, rather than introducing circularity or inflating predictive performance.

If human presence alone were sufficient to explain ignition patterns, purely demographic proxies would be expected to consistently outperform biophysical variables. The persistent contribution of vegetation condition and topography across models indicates that ignition susceptibility emerges from the interaction between anthropogenic pressure and environmental context, rather than from human factors alone.

From an applied perspective, the use of gHM offers a pragmatic advantage by capturing cumulative human influence in a single, spatially consistent variable. However, for interpretability and policy-oriented analyses, decomposing anthropogenic pressure into its constituent drivers remains valuable and should be considered in future work.

4.4. Interpretation and Comparison with Previous Studies

The prominence of anthropogenic proxies is consistent with a large body of Mediterranean wildfire research identifying human activity as the dominant ignition source [10,48,49]. This study extends prior work by quantifying and comparing these effects across multiple regions within a harmonized and reproducible modelling framework, enabling direct cross-regional comparison.

The stability of model performance under cross-region transfer further suggests that common structural relationships between human pressure and ignition susceptibility exist across Mediterranean landscapes, despite regional differences in climate, vegetation, and socio-economic conditions. Rather than implying uniform basin-wide behaviour, these findings point to partially transferable ignition patterns under comparable human–environment configurations.
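The degree of transfer can be summarized directly from the cross-country AUC matrix in Table 10. The following minimal sketch copies the tabulated values and contrasts the off-diagonal (transfer) mean with the diagonal (within-country) mean; the variable names are illustrative only.

```python
countries = ["Spain", "France", "Italy", "Morocco"]
# Rows: training country; columns: test country (values copied from Table 10).
auc = [
    [0.89, 0.86, 0.87, 0.84],
    [0.85, 0.90, 0.86, 0.80],
    [0.87, 0.88, 0.91, 0.82],
    [0.83, 0.81, 0.82, 0.88],
]
k = len(countries)

# Off-diagonal cells are true transfer experiments; the diagonal is within-country CV.
off_diag = [auc[i][j] for i in range(k) for j in range(k) if i != j]
mean_transfer = sum(off_diag) / len(off_diag)
mean_within = sum(auc[i][i] for i in range(k)) / k

# Mean AUC obtained on each test country by models trained in the other three.
as_target = {
    countries[j]: sum(auc[i][j] for i in range(k) if i != j) / (k - 1)
    for j in range(k)
}

print(f"mean within-country AUC: {mean_within:.3f}")
print(f"mean transfer AUC:       {mean_transfer:.3f}")
for name, value in as_target.items():
    print(f"{name} as test country: {value:.3f}")
```

With the Table 10 values, the transfer mean is about 0.84 against a within-country mean of about 0.895. Morocco is the most difficult target, with a mean of about 0.82 compared with about 0.85 for each of the three European regions, which is consistent with the interpretation of partial rather than uniform transferability.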

From a fire-management and policy perspective, the dominance of anthropogenic proxies highlights the importance of prevention strategies targeting human activities, particularly in peri-urban and agricultural landscapes. Ignition-susceptibility maps produced by this framework can support targeted awareness campaigns, infrastructure monitoring, and land-use planning in areas where human pressure and flammable vegetation intersect.

4.5. Limitations and Future Work

Several limitations should be considered when interpreting the results. First, ignition locations are approximated using MODIS burned-area pixels at 500 m resolution, which may introduce spatial uncertainty and under-represent small or short-lived fires, many of which are human-caused. Second, night-time light intensity may under-represent rural or low-luminance communities, potentially biasing estimates of human pressure in certain contexts. Third, despite the use of spatial cross-validation, residual spatial autocorrelation may persist due to unmodelled socio-economic or infrastructural factors. Finally, the use of a static population density layer may not fully capture temporal dynamics over the study period.
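One common diagnostic for the residual spatial autocorrelation mentioned above is global Moran's I computed on model residuals. The sketch below is an assumption-laden illustration rather than an analysis performed in this study: it uses binary rook (four-neighbour) weights on a small regular grid and two hypothetical residual surfaces.

```python
def morans_i(grid):
    """Global Moran's I with binary rook (4-neighbour) weights on a 2-D grid."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    mean = sum(values) / len(values)
    dev = [[v - mean for v in row] for row in grid]
    cross, weight_sum = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]  # w_ij = 1 for adjacent cells
                    weight_sum += 1
    variance_sum = sum(d * d for row in dev for d in row)
    return (len(values) / weight_sum) * (cross / variance_sum)

# Hypothetical residual surfaces (e.g., observed label minus predicted probability).
clustered = [[float(c) for c in range(4)] for _ in range(4)]             # smooth west-east trend
alternating = [[(-1.0) ** (r + c) for c in range(4)] for r in range(4)]  # checkerboard

print(f"clustered residuals:   I = {morans_i(clustered):.3f}")
print(f"alternating residuals: I = {morans_i(alternating):.3f}")
```

Values well above zero, as for the smooth trend, would indicate that neighbouring cells share similar residuals and that spatial structure remains unexplained; values near zero suggest little residual clustering, and the checkerboard case reaches the negative extreme of -1.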

Future work should incorporate higher-resolution ignition datasets (e.g., VIIRS 375 m or Sentinel-2 MSI), dynamic meteorological and fuel-moisture variables, and additional accessibility metrics such as distance to settlements or transportation networks. Expanding the framework to a larger number of Mediterranean regions and other semi-arid environments would further improve the understanding of human-driven ignition processes under changing climatic and socio-economic conditions.

5. Conclusions

This study developed an open, reproducible, and multi-model framework to quantify and compare the anthropogenic and biophysical drivers of wildfire ignition across four representative Mediterranean countries: Morocco, Spain, France, and Italy. By integrating globally available remote-sensing indicators (vegetation greenness, terrain slope, land-surface temperature anomalies, night-time light intensity, and the Global Human Modification index) with Random Forest and gradient-boosting models, we produced harmonized ignition-risk maps and assessed model transferability across diverse socio-environmental contexts.

The results indicate that anthropogenic accessibility and land-use intensity are the primary determinants of ignition susceptibility across the four study regions. Human proxies such as night-time light intensity and the gHM index accounted for the largest share of model importance, while temperature anomalies and vegetation greenness played secondary roles by modulating fuel dryness, availability, and flammability. Spatial cross-validation confirmed robust model performance (AUC > 0.90 for both tree-based models), and cross-country transfer tests revealed substantial generalization (mean LOCO AUC ≈ 0.85), pointing to structural similarities in human-driven ignition processes across Mediterranean landscapes.

Scientific contribution: The proposed workflow unifies data processing, modelling, and validation under an open-access framework, offering a transparent and replicable methodology for multi-country, large-scale ignition-risk mapping. Its scalability and interpretability make it adaptable to other fire-prone regions where data scarcity limits conventional hazard modelling.

Practical implications: The resulting ignition-risk maps provide actionable information for national forestry and civil-protection agencies. They can guide the prioritization of surveillance and prevention resources toward high-accessibility corridors, urban–rural interfaces, and agricultural frontiers where human ignition pressure is greatest. Because all inputs rely solely on publicly available data and cloud computing infrastructure, the framework can be operationalized with minimal technical investment.

Future directions: Subsequent research should couple ignition models with dynamic meteorological variables (fuel moisture, wind fields, and drought indices) to improve temporal realism. Integrating higher-resolution burned-area detections (e.g., VIIRS 375 m and Sentinel-2 MSI) and socio-economic layers (e.g., distance to roads, road traffic, or tourism intensity) will further refine ignition localization. Validation data obtained from local organizations would also help to constrain ignition locations and fire sizes more precisely. Finally, expanding the framework toward spatio-temporal or probabilistic forecasting could support early-warning systems and cross-border fire-management policies across the broader Mediterranean region.

Author Contributions

Conceptualization: N.A.D.; Methodology: N.A.D. and M.W.; Data curation and preprocessing: N.A.D. and I.F.; Model development and analysis: N.A.D. and I.F.; Visualization and interpretation: N.A.D. and M.W.; Writing—original draft: N.A.D., I.F., M.W. and M.M.; Supervision: M.W., M.M., H.B., O.Y.A. and O.E.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was conducted without direct financial support from specific funding agencies in the public, commercial, or not-for-profit sectors. Computational resources were provided through institutional access to Google Earth Engine and open cloud-based platforms.

Data Availability Statement

All datasets used in this study are publicly available and were accessed through open repositories. Specifically, MODIS MCD64A1 Burned Area (NASA LP DAAC): https://lpdaac.usgs.gov/products/mcd64a1v061/, accessed on 13 September 2025; VIIRS night-time lights (NOAA): https://eogdata.mines.edu/products/vnl/, accessed on 10 September 2025; Sentinel-2 Surface Reflectance (COPERNICUS): https://dataspace.copernicus.eu/, accessed on 14 September 2025; SRTM Digital Elevation Model (USGS): https://earthexplorer.usgs.gov/, accessed on 13 September 2025; and Global Human Modification index (CSP): https://developers.google.com/earth-engine/datasets/catalog/CSP_HM_GlobalHumanModification, accessed on 13 September 2025. All preprocessing and modelling scripts are available upon request to the authors.

Acknowledgments

We acknowledge the Google Earth Engine development team for maintaining the cloud-based geospatial infrastructure that made this research computationally feasible. Constructive feedback from colleagues and reviewers helped to improve the clarity and scope of this study. All analyses were conducted using open-source software, including Python 3, scikit-learn, and statsmodels.

Conflicts of Interest

The authors declare that there are no commercial or financial relationships that could be construed as a potential conflict of interest. No part of this study was influenced by funding entities or external organizations with vested interests in the study outcomes.

References

1. Milanović, S.; Marković, N.; Pamučar, D.; Gigović, L.; Kostić, P.; Milanović, S.D. Forest Fire Probability Mapping in Eastern Serbia: Logistic Regression versus Random Forest Method. Forests 2020, 12, 5.
2. Moritz, M.A.; Batllori, E.; Bradstock, R.A.; Gill, A.M.; Handmer, J.; Hessburg, P.F.; Leonard, J.; McCaffrey, S.; Odion, D.C.; Schoennagel, T.; et al. Learning to coexist with wildfire. Nature 2014, 515, 58–66.
3. Dempsey, D.A.; Klessig, D.F. SOS—Too many signals for systemic acquired resistance? Trends Plant Sci. 2012, 17, 538–545.
4. San-Miguel-Ayanz, J.; Schulte, E.; Schmuck, G.; Camia, A.; Strobl, P.; Liberta, G.; Giovando, C.; Boca, R.; Sedano, F.; Kempeneers, P.; et al. Comprehensive Monitoring of Wildfires in Europe: The European Forest Fire Information System (EFFIS). In Approaches to Managing Disaster—Assessing Hazards, Emergencies and Disaster Impacts; InTech: London, UK, 2012.
5. Ruffault, J.; Curt, T.; Moron, V.; Trigo, R.M.; Mouillot, F.; Koutsias, N.; Pimont, F.; Martin-StPaul, N.K.; Barbero, R.; Dupuy, J.-L. Increased likelihood of heat-induced large wildfires in the Mediterranean Basin. Sci. Rep. 2020, 10, 13790.
6. Turco, M.; Bedia, J.; Di Liberto, F.; Fiorucci, P.; von Hardenberg, J.; Koutsias, N.; Llasat, M.C.; Xystrakis, F.; Provenzale, A. Decreasing Fires in Mediterranean Europe. PLoS ONE 2016, 11, e0150663.
7. Essaghi, S.; Hachmi, M.; Yessef, M.; Dehhaoui, M.; Sesbou, A. Litter and biomass traits of some dominant Moroccan understorey fuels in five fire-prone forest regions. Bois Forets Trop. 2019, 342, 3–16.
8. Boubekraoui, H.; Maouni, Y.; Ghallab, A.; Draoui, M.; Maouni, A. Wildfires Risk Assessment Using Hotspot Analysis and Results Application to Wildfires Strategic Response in the Region of Tangier-Tetouan-Al Hoceima, Morocco. Fire 2023, 6, 314.
9. Ochoa, C.; Bar-Massada, A.; Chuvieco, E. A European-scale analysis reveals the complex roles of anthropogenic and climatic factors in driving the initiation of large wildfires. Sci. Total Environ. 2024, 917, 170443.
10. Ganteaume, A.; Camia, A.; Jappiot, M.; San-Miguel-Ayanz, J.; Long-Fournel, M.; Lampin, C. A Review of the Main Driving Factors of Forest Fire Ignition Over Europe. Environ. Manag. 2012, 51, 651–662.
11. Fusco, E.J.; Abatzoglou, J.T.; Balch, J.K.; Finn, J.T.; Bradley, B.A. Quantifying the human influence on fire ignition across the western USA. Ecol. Appl. 2016, 26, 2390–2401.
12. Haas, O.; Keeping, T.; Gomez-Dans, J.; Prentice, I.C.; Harrison, S.P. The global drivers of wildfire. Front. Environ. Sci. 2024, 12, 1438262.
13. Nagy, R.C.; Fusco, E.; Bradley, B.; Abatzoglou, J.T.; Balch, J. Human-Related Ignitions Increase the Number of Large Wildfires across U.S. Ecoregions. Fire 2018, 1, 4.
14. Amraoui, M.; Pereira, M.G.; DaCamara, C.C.; Calado, T.J. Atmospheric conditions associated with extreme fire activity in the Western Mediterranean region. Sci. Total Environ. 2015, 524–525, 32–39.
15. Alves, D.; Almeida, M.; Viegas, D.X.; Novo, I.; Luna, M.Y. Fire Danger Harmonization Based on the Fire Weather Index for Transboundary Events between Portugal and Spain. Atmosphere 2021, 12, 1087.
16. Ruffault, J.; Mouillot, F. Contribution of human and biophysical factors to the spatial distribution of forest fire ignitions and large wildfires in a French Mediterranean region. Int. J. Wildland Fire 2017, 26, 498–508.
17. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27.
18. El Amarty, F.; Chakir, A.; Purohit, S.; Elhag, M.; Lakhili, F.; Benaabidate, L.; Lahrach, A. Comparative Evaluation of Machine Learning Models and Remote Sensing Data for Wildfire Prediction in the Atlas Middle Forest, Morocco. Earth Syst. Environ. 2025.
19. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
20. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System; KDD ’16. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
21. Rodriguez-Galiano, V.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J. An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS J. Photogramm. Remote Sens. 2012, 67, 93–104.
22. Grbčić, L.; Družeta, S.; Mauša, G.; Lipić, T.; Lušić, D.V.; Alvir, M.; Lučin, I.; Sikirica, A.; Davidović, D.; Travaš, V.; et al. Coastal water quality prediction based on machine learning with feature interpretation and spatio-temporal analysis. Environ. Model. Softw. 2022, 155, 105458.
23. Çolak, E.; Sunar, F. Evaluation of forest fire risk in the Mediterranean Turkish forests: A case study of Menderes region, Izmir. Int. J. Disaster Risk Reduct. 2020, 45, 101479.
24. Chai-Allah, A.; Maillé, E. Mapping Forest Fire Risk in Mediterranean forests—A Case Study of SUD-Provence-Alpes-Côte d’Azur Region (SE, France). Environ. Sci. Proc. 2021, 3, 69.
25. Gerberding, K.; Schirpke, U. Mapping the probability of forest fire hazard across the European Alps under climate change scenarios. J. Environ. Manag. 2025, 377, 124600.
26. Nunes, A.R. The state of wildfire and health research: Emerging trends, challenges and gaps. Int. Health 2025, 17, 922–933.
27. Moreira, F.; Ascoli, D.; Safford, H.; Adams, M.A.; Moreno, J.M.; Pereira, J.M.C.; Catry, F.X.; Armesto, J.; Bond, W.; González, M.E.; et al. Wildfire management in Mediterranean-type regions: Paradigm change needed. Environ. Res. Lett. 2020, 15, 011001.
28. Myers, N.; Mittermeier, R.A.; Mittermeier, C.G.; da Fonseca, G.A.B.; Kent, J. Biodiversity hotspots for conservation priorities. Nature 2000, 403, 853–858.
29. Fernandes, P.M. Fire-smart management of forest landscapes in the Mediterranean basin under global change. Landsc. Urban Plan. 2013, 110, 175–182.
30. Lovreglio, R.; Leone, V.; Giaquinto, P.; Notarnicola, A. Wildfire cause analysis: Four case-studies in southern Italy. iFor. Biogeosci. For. 2010, 3, 8–15.
31. Vecchiato, D.; Tempesta, T. Valuing the benefits of an afforestation project in a peri-urban area with choice experiments. For. Policy Econ. 2013, 26, 111–120.
32. Giglio, L.; Boschetti, L.; Roy, D.P.; Humber, M.L.; Justice, C.O. The Collection 6 MODIS burned area mapping algorithm and product. Remote Sens. Environ. 2018, 217, 72–85.
33. Tachikawa, T.; Hato, M.; Kaku, M.; Iwasaki, A. Characteristics of ASTER GDEM version 2. In Proceedings of the 2011 IEEE International Geoscience and Remote Sensing Symposium, IEEE, Vancouver, BC, Canada, 24–29 July 2011.
34. Carmo, M.; Moreira, F.; Casimiro, P.; Vaz, P. Land use and topography influences on wildfire occurrence in northern Portugal. Landsc. Urban Plan. 2011, 100, 169–176.
35. Harris, L.; Taylor, A.H. Previous burns and topography limit and reinforce fire severity in a large wildfire. Ecosphere 2017, 8, e02019.
36. Povak, N.A.; Hessburg, P.F.; Salter, R.B. Evidence for scale-dependent topographic controls on wildfire spread. Ecosphere 2018, 9, e02443.
37. Calviño-Cancela, M.; Chas-Amil, M.L.; García-Martínez, E.D.; Touza, J. Interacting effects of topography, vegetation, human activities and wildland-urban interfaces on wildfire ignition risk. For. Ecol. Manag. 2017, 397, 10–17.
38. Pettorelli, N.; Vik, J.O.; Mysterud, A.; Gaillard, J.M.; Tucker, C.J.; Stenseth, N.C. Using the satellite-derived NDVI to assess ecological responses to environmental change. Trends Ecol. Evol. 2005, 20, 503–510.
39. Vitolo, C.; Di Giuseppe, F.; Barnard, C.; Coughlan, R.; San-Miguel-Ayanz, J.; Libertá, G.; Krzeminski, B. ERA5-based global meteorological wildfire danger maps. Sci. Data 2020, 7, 216.
40. Huang, X.; Ding, K.; Liu, J.; Wang, Z.; Tang, R.; Xue, L.; Wang, H.; Zhang, Q.; Tan, Z.M.; Fu, C.; et al. Smoke-weather interaction affects extreme wildfires in diverse coastal regions. Science 2023, 379, 457–461.
41. Ghodrat, M.; Shakeriaski, F.; Fanaee, S.A.; Simeoni, A. Software-Based Simulations of Wildfire Spread and Wind-Fire Interaction. Fire 2022, 6, 12.
42. Kennedy, C.M.; Oakleaf, J.R.; Theobald, D.M.; Baruch-Mordo, S.; Kiesecker, J. Managing the middle: A shift in conservation priorities based on the global human modification gradient. Glob. Change Biol. 2019, 25, 811–826.
43. Pourmohamad, Y.; Abatzoglou, J.T.; Fleishman, E.; Short, K.C.; Shuman, J.; AghaKouchak, A.; Williamson, M.; Seydi, S.T.; Sadegh, M. Inference of Wildfire Causes From Their Physical, Biological, Social and Management Attributes. Earth’s Future 2025, 13, e2024EF005187.
44. Teymoor Seydi, S.; Abatzoglou, J.T.; Jones, M.W.; Kolden, C.A.; Filippelli, G.; Hurteau, M.D.; AghaKouchak, A.; Luce, C.H.; Miao, C.; Sadegh, M. Increasing global human exposure to wildland fires despite declining burned area. Science 2025, 389, 826–829.
45. Freeborn, P.H.; Jolly, W.M.; Cochrane, M.A.; Roberts, G. Large wildfire driven increases in nighttime fire activity observed across CONUS from 2003–2020. Remote Sens. Environ. 2022, 268, 112777.
46. Sorichetta, A.; Hornby, G.M.; Stevens, F.R.; Gaughan, A.E.; Linard, C.; Tatem, A.J. High-resolution gridded population datasets for Latin America and the Caribbean in 2010, 2015, and 2020. Sci. Data 2015, 2, 150045.
47. Dormann, C.F.; Elith, J.; Bacher, S.; Buchmann, C.; Carl, G.; Carré, G.; Marquéz, J.R.G.; Gruber, B.; Lafourcade, B.; Leitão, P.J.; et al. Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography 2012, 36, 27–46.
48. Vilar, L.; Camia, A.; San-Miguel-Ayanz, J.; Martín, M.P. Modeling temporal changes in human-caused wildfires in Mediterranean Europe based on Land Use-Land Cover interfaces. For. Ecol. Manag. 2016, 378, 68–78.
49. Ribotta, C.; Costa-Saura, J.M.; Bacciu, V.; Sirca, C.; Spano, D. Spatial Variability of Wildfire Causes in French Eastern Mediterranean Regions. Environ. Sci. Proc. 2022, 17, 110.

Figure 1. Locations of the four Mediterranean Areas of Interest (AOIs). The red squares mark the areas where the data were acquired and processed.

Figure 2. Correlation matrix of the predictor variables over the four AOIs.

Figure 3. Workflow of the wildfire ignition-modelling framework, illustrating data integration, model training, and model validation.

Figure 4. Receiver Operating Characteristic (ROC) and Precision–Recall (PR) curves for the XGBoost model. The classifier achieved an AUC of 0.954 and a PR-AUC of 0.925, indicating excellent discrimination and a good balance between precision and recall in wildfire ignition prediction.

Figure 5. Receiver Operating Characteristic (ROC) and Precision–Recall (PR) curves for the Random Forest model, showing an AUC of 0.96.

Figure 6. Random Forest feature importance, showing the dominant role of gHM, followed by NDVI and slope.

Figure 7. XGBoost feature importance, showing the dominant role of gHM, followed by NTL and NTL_log1p.

Figure 8. Predicted ignition-susceptibility map.

Figure 9. Predicted ignition-susceptibility map.

Table 1. Slope statistics for selected regions.

Region	Mean (°)	Median (°) [IQR]	Std Dev (°) [P05–P95]
Spain—Sevilla	2.62	1.94 (IQR: 0.94–3.56)	2.25 (P05–P95: 0.32–7.18)
Italy—Sicily	4.59	3.38 (IQR: 0.39–7.13)	4.78 (P05–P95: 0.01–13.62)
Morocco—Tangier	5.97	4.38 (IQR: 1.13–9.38)	5.96 (P05–P95: 0.01–17.62)
France—Corsica	9.62	8.87 (IQR: 2.63–15.13)	7.85 (P05–P95: 0.01–23.87)

Table 2. NDVI summary statistics for selected regions.

Region	Mean NDVI	Median NDVI	Std Dev
Spain—Sevilla	0.3780	0.3789	0.1276
Italy—Sicily	0.3356	0.3164	0.1239
Morocco—Tangier	0.3920	0.3574	0.1676
France—Corsica	0.6415	0.6680	0.1602

Table 3. Meteorological statistics for selected regions.

Region	Temperature Mean (°C)	Temperature Median (°C)	Temperature Range (°C)	Wind Mean (m/s)	Wind Range (m/s)	RH Mean (%)	RH Range (%)
Spain—Sevilla	18.58	18.69	16.83–19.66	2.03	1.71–2.36	59.84	58.04–63.22
Italy—Sicily	16.95	17.02	13.96–19.02	2.09	1.52–3.87	70.67	67.81–74.90
Morocco—Tangier	17.41	17.65	15.36–18.60	1.93	0.94–4.81	73.82	68.95–78.09
France—Corsica	14.86	15.23	10.82–17.93	1.80	0.82–4.80	73.30	68.31–76.13

Table 4. Global Human Modification (gHM) statistics per Area of Interest (AOI).

Region	Mean gHM	Median gHM	IQR	Low (%)	Medium (%)	High (%)
Spain—Sevilla	0.344	0.283	0.197–0.459	0.0	66.8	33.2
Italy—Sicily	0.597	0.619	0.502–0.701	0.0	4.4	48.9
Morocco—Tangier	0.460	0.424	0.287–0.611	0.0	28.0	33.2
France—Corsica	0.261	0.256	0.150–0.346	3.4	42.5	7.3

Table 5. VIIRS night-time light statistics per Area of Interest (AOI).

Region	Mean Radiance (nW·cm⁻²·sr⁻¹)	Bright Areas (%)	Dim Areas (%)	Dark Areas (%)
Spain—Sevilla	2.32	4.5	12.8	82.7
Italy—Sicily	1.38	2.8	13.4	83.9
Morocco—Tangier	1.88	3.1	10.8	86.1
France—Corsica	0.43	0.4	4.0	95.8

Table 6. Population density (WorldPop 2020) statistics per Area of Interest (AOI).

Region	Mean Population Density (People/km²)	Urban (%)	Peri-Urban (%)	Rural (%)
Spain—Sevilla	132.7	4.1	7.8	88.2
Italy—Sicily	67.5	1.6	1.7	96.7
Morocco—Tangier	157.2	3.4	19.1	77.5
France—Corsica	14.2	0.4	3.5	96.2

Table 7. Confusion matrix for the XGBoost classifier on the validation dataset.

	Predicted: Non-Ignition (0)	Predicted: Ignition (1)
True: Non-Ignition (0)	1387	213
True: Ignition (1)	88	837

Table 8. Confusion matrix for the Random Forest classifier on the validation dataset.

	Predicted: Non-Ignition (0)	Predicted: Ignition (1)
True: Non-Ignition (0)	1484	127
True: Ignition (1)	122	749

Table 9. Comparison of key performance metrics between the Random Forest (RF) and XGBoost (XGB) classifiers.

Metric	Random Forest (RF)	XGBoost (XGB)
Accuracy	0.900	0.880
Cohen’s κ	0.780	0.820
AUC (ROC)	0.960	0.954
PR-AUC	0.910 (est.)	0.925
F1-score	0.850	0.850
Recall (Ignition)	0.85	0.91
Cross-val. AUC (mean ± SD)	–	0.945 ± 0.002
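
The threshold-based metrics in Table 9 are functions of the confusion-matrix counts in Tables 7 and 8. As a minimal sketch of those standard definitions (the function and variable names are illustrative and not taken from the authors' scripts), the counts can be converted as follows:

```python
def metrics(tn, fp, fn, tp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    n = tn + fp + fn + tp
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    # Agreement expected by chance, from the true (row) and predicted (column) marginals.
    expected = ((tn + fp) * (tn + fn) + (fn + tp) * (fp + tp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }

# Counts copied from Tables 7 and 8, in the order (TN, FP, FN, TP).
tables = {
    "XGBoost (Table 7)": (1387, 213, 88, 837),
    "Random Forest (Table 8)": (1484, 127, 122, 749),
}
results = {name: metrics(*cells) for name, cells in tables.items()}

for name, values in results.items():
    summary = ", ".join(f"{key} = {value:.3f}" for key, value in values.items())
    print(f"{name}: {summary}")
```

Recomputed in this way, accuracy is about 0.88 for XGBoost and 0.90 for Random Forest, and Cohen's κ for Random Forest is about 0.78, in line with Table 9. Other entries, κ for XGBoost in particular, need not match the tabulated values exactly, for example if those were obtained on a different split or averaged across folds.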

Table 10. Cross-country transfer AUC matrix. Rows indicate the training country and columns indicate the test country. Diagonal elements correspond to within-country spatial cross-validation and are reported for reference.

Train\Test	Spain	France	Italy	Morocco
Spain	0.89	0.86	0.87	0.84
France	0.85	0.90	0.86	0.80
Italy	0.87	0.88	0.91	0.82
Morocco	0.83	0.81	0.82	0.88

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Dimarco, N.A.; Faraji, I.; Wahbi, M.; Maatouk, M.; Boulaassal, H.; Aalaoui, O.Y.; El Kharki, O. Generalizing Human-Driven Wildfire Ignition Models Across Mediterranean Regions Using Harmonized Remote-Sensing and Machine-Learning Data. Geomatics 2026, 6, 13. https://doi.org/10.3390/geomatics6010013

