Article

Applying an Interpretable Deep Learning Model to Identify Wildfire-Prone Areas in Southwest China

1 National Institute of Natural Hazards, Ministry of Emergency Management of China, Beijing 100085, China
2 School of Emergency Management Science and Engineering, University of Chinese Academy of Sciences, Beijing 101408, China
3 Department of Geography, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
* Author to whom correspondence should be addressed.
Fire 2026, 9(3), 107; https://doi.org/10.3390/fire9030107
Submission received: 28 December 2025 / Revised: 1 February 2026 / Accepted: 25 February 2026 / Published: 1 March 2026

Abstract

Assessing wildfire susceptibility requires integrating environmental and anthropogenic factors to quantify the probability and vulnerability of fires in a given area. Many existing machine-learning models offer high predictive power but limited interpretability, restricting their utility for operational decision-making. This study is the first to apply the intrinsically interpretable deep network TabNet to wildfire susceptibility modeling. By fusing multi-source data and leveraging TabNet’s feature-mask matrix, we achieve accurate prediction and built-in explanation without relying on auxiliary tools. On a dataset of 133,811 samples, the proposed model achieves an Area Under the Curve (AUC) of 0.760, recall of 0.883, precision of 0.395, and an F1.5 score of 0.640, outperforming XGBoost (version 1.5.0) and other baseline models. The importance rankings derived from the feature-mask matrix align with the Shapley Additive Explanations (SHAP) results, confirming the reliability of the explanations. This approach combines predictive accuracy with transparency, providing a deployable framework for wildfire early warning, risk management, and ecosystem conservation.

1. Introduction

Wildfires are uncontrolled fires that burn in wildland vegetation, typically in rural or sparsely populated areas. In recent decades, intensified global warming and escalating human activities have increased the frequency of extreme weather events, making wildfires a major natural hazard that threatens ecosystem security [1]. Beyond immediate damage to natural resources, wildfires can induce long-term and cascading ecological consequences, including biodiversity loss [2], reductions in carbon stocks [3], and elevated atmospheric PM2.5 emissions [4], thereby aggravating regional and global air pollution [5]. Globally, the wildfire-burned area in 2024 reached 367 Mha (ranking 17th since 2001), with an estimated 1965 Tg of carbon emitted [6].
Wildfire susceptibility assessment aims to quantify the likelihood of wildfire occurrence across a geographic region by integrating environmental and anthropogenic determinants. Susceptibility maps can support targeted prevention, resource allocation, and the development of predictive tools for wildfire risk management. Nevertheless, susceptibility assessments remain challenging due to the stochasticity and multidimensional causality of wildfire ignition (e.g., human activities, lightning, and fuel-related spontaneous combustion), which introduce substantial uncertainty. Moreover, wildfire-prone ecosystems are governed by synergistic interactions among thermal conditions, drought intensity, wind, and topography, making single-factor analyses insufficient for comprehensive risk characterization.
To address these complexities, machine learning has been widely adopted for wildfire susceptibility mapping. The core objective is to identify areas that are more likely to be affected by wildfires by learning the relationship between wildfire occurrences and multidimensional environmental drivers. Some studies frame susceptibility mapping as an anomaly detection problem, identifying locations that resemble fire-affected regions [7,8]. More commonly, susceptibility assessment is formulated as a supervised learning task, where classification or regression models are trained using wildfire occurrence labels and environmental features [9,10,11,12,13,14,15]. The selection of environmental variables is critical in this setting; commonly used predictors include Land Surface Temperature (LST), the Normalized Difference Vegetation Index (NDVI), the Temperature Vegetation Dryness Index (TVDI), aspect, and wind, which jointly describe fire-prone conditions from multiple perspectives [16]. A variety of algorithms have been explored, including Kernel Logistic Regression (KLR) and Random Forest (RF) [17,18]. Ensemble learning methods, particularly XGBoost and LightGBM, are frequently reported to yield strong performance due to the iterative optimization of weak learners [16,19,20,21]. More recently, deep learning models such as Artificial Neural Networks (ANN) and U-Net have also been applied and, in some studies, have outperformed ensemble approaches [22,23,24]. Overall, these efforts demonstrate that data-driven models can substantially enhance susceptibility prediction by capturing complex nonlinear relationships between wildfire occurrence and environmental controls.
Despite these advances, model interpretability remains a key bottleneck for practical deployment. Interpretability refers to the extent to which a model’s decision process can be understood, traced, and trusted by humans, thereby enabling the identification of influential drivers and supporting actionable decision-making. Post hoc explanation techniques such as Shapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) are commonly used in wildfire susceptibility studies. SHAP, in particular, attributes predictions to individual features based on Shapley values and decomposes outputs into additive contributions [25,26,27]. However, post hoc explanations can be sensitive to modeling assumptions and baseline selections, and may yield approximate or unstable interpretations, which can limit stakeholder confidence and complicate operational use.
Accordingly, a critical research gap persists in developing wildfire susceptibility models that can simultaneously achieve high predictive accuracy and provide transparent, reliable explanations. To address this gap, we propose a TabNet-based wildfire susceptibility assessment framework. TabNet [28] is a neural network designed for tabular data that integrates a sequential attention mechanism with a differentiable feature-selection module, offering intrinsic interpretability while modeling complex nonlinear relationships. Using Google Earth Engine (GEE), we compile multisource environmental predictors and remote-sensing fire-hotspot observations for the study area from 2010 to 2020, and train an optimal TabNet model using k-fold cross-validation. We evaluate the proposed approach across multiple wildfire events and benchmark it against commonly used machine learning models. In addition to susceptibility mapping, we provide interpretation outputs derived from TabNet’s built-in feature selection and attention, enabling the identification of key environmental drivers. In summary, the main contribution of this study is the introduction and validation of an accurate yet intrinsically interpretable deep learning framework for wildfire susceptibility mapping, which is intended to support wildfire prevention and early-warning applications.
Article Structure Overview: This study applies an intrinsically interpretable deep learning framework (TabNet) to wildfire susceptibility mapping in Southwest China. The article first delineates the study area, describes the data sources, and outlines the preprocessing workflow, in which multi-source climatic, vegetation, topographic, and anthropogenic predictors are integrated with Moderate Resolution Imaging Spectroradiometer (MODIS) FireMask-derived fire occurrence labels on a 1 km spatial grid. It then specifies the TabNet model architecture, sampling strategy, and evaluation protocol, and presents comparative performance analyses against several baseline models. Finally, the article examines model interpretability using TabNet’s sparse feature-mask matrices, conducts consistency checks with SHAP-based explanations, discusses the implications and limitations of the approach, and concludes with the principal findings relevant to wildfire risk assessment and management.

2. Materials and Methods

2.1. Study Area

The study area is located in southwestern China (21°08′–34°19′ N, 97°21′–110°11′ E), covering approximately 962,500 km² (Figure 1). The region contains extensive forest resources and serves as an important ecological barrier for the upper Yangtze River Basin (China) [29]. It is dominated by a subtropical plateau monsoon climate with distinct dry and wet seasons [30]. The dry season (November–April) features low precipitation, strong winds, prolonged sunshine, and low fuel moisture, resulting in elevated fire danger [31]. In contrast, the wet season (May–October) is characterized by high humidity and frequent rainfall.
The rugged terrain—high mountains, deep valleys, and steep slopes—facilitates wildfire occurrence, especially in mountainous areas at elevations of 2000–3500 m [32]. Seasonal hydroclimatic contrasts and the complex topography jointly increase regional wildfire susceptibility, threatening livelihoods, infrastructure, and ecosystem stability [33].
Wildfires increasingly affect wildland–urban interfaces, posing risks to critical infrastructure and biodiversity hotspots [34]. Fire suppression is further constrained by hypoxic high-elevation conditions, which can substantially reduce firefighters’ physical work capacity compared with lowland environments. Consequently, the region remains one of the most challenging areas in China for wildfire prevention and control. These conditions motivate the development of high-precision and interpretable wildfire susceptibility models to reduce escalating ecological and socioeconomic losses [35].

2.2. Data and Data Preprocessing

This study incorporated both natural and anthropogenic determinants directly associated with forest fire occurrence for modeling analyses. The natural determinants encompassed climatic, vegetative, and topographic attributes, whereas anthropogenic determinants primarily reflected the presence of ignition sources and the potential for fire spread. Fire mask data were obtained from the MODIS (MOD14A1) thermal anomaly product. Climatic variables included air temperature, precipitation, wind speed, wind direction, and soil moisture. Vegetation-related parameters comprised the Normalized Difference Vegetation Index (NDVI), the 3-month Standardized Precipitation Evapotranspiration Index (SPEI3), the Normalized Difference Infrared Index 7 (NDII7) [36], and land cover categories. Topographic characteristics were represented by elevation, and anthropogenic influence was proxied by population density. In addition, the day-of-year corresponding to the fire occurrence time was incorporated to characterize seasonal variability in fire activity. Detailed parameter descriptions and associated metadata are presented in Table 1.
All data aggregation and preprocessing were conducted on the Google Earth Engine (GEE) platform. Prior to model development, preprocessing comprised two primary tasks: (1) the acquisition and georeferencing of raw data and (2) the sampling and assembly of the modeling dataset. Among the candidate features, the Normalized Difference Infrared Index (NDII) was computed using the standard formulation (Equation (1)) and served as a key fire-related variable. In this study, NDII7 denotes the NDII, computed using the MODIS band 7 reflectance as the shortwave-infrared (SWIR) component, hence the suffix “7”. Climate variables (precipitation, temperature, and wind speed) were imported directly from their respective datasets without further processing.
The modeling dataset was constructed in three steps. First, all raster layers covering the study area were resampled to a uniform 1 km × 1 km grid to ensure consistent spatial resolution. Second, samples were generated using the MOD14A1 FireMask layer, whose pixel classes explicitly distinguish fire, non-fire, and no-observation conditions. Specifically, pixels with FireMask values 7–9 were labeled as positive samples (fire detections with confidence levels), while pixels with FireMask value 5 (non-fire land pixel) were labeled as negative samples. Pixels flagged as not processed, water, cloud, or unknown (e.g., FireMask values 1–4 and 6) were excluded from negative sampling to avoid introducing label noise from unobserved/invalid conditions. This choice follows the official FireMask class definitions and ensures that the negative class represents valid non-fire land observations [37]. To reduce spatial dependence and potential near-fire label contamination, we applied a 5 km buffer around historical fire detections. Positive samples were randomly selected within the buffered regions, while negative samples were drawn from areas outside all buffers. The 5 km radius represents an operational trade-off given the 1 km spatial resolution of the harmonized predictors and the MOD14A1 fire mask, and is consistent with the distance-based separation practices commonly used to mitigate spatial autocorrelation effects in spatial prediction. Third, because the focus of this study is forest fire occurrence, we applied an IGBP-based land-cover mask to restrict sampling to forest-dominated fuel environments and excluded non-target or essentially non-burnable classes (e.g., urban areas, water bodies, barren/sparsely vegetated surfaces, and non-forest vegetation types such as grasslands and wetlands). This masking reduces the risk of trivially separable negatives (e.g., open water) inflating model performance and keeps model interpretation centered on forest-fuel fire drivers. 
We note that the resulting model is intended for forested landscapes and should not be directly interpreted as a universal fire model across all land-cover regimes.
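The FireMask labeling rule described above can be sketched in a few lines (a minimal illustration under the MOD14A1 class codes cited in the text; the function name is ours):

```python
import numpy as np

def label_from_firemask(firemask):
    """Map MOD14A1 FireMask codes to labels: 1 = fire (codes 7-9),
    0 = valid non-fire land (code 5), -1 = excluded (codes 1-4, 6)."""
    labels = np.full(firemask.shape, -1, dtype=int)
    labels[(firemask >= 7) & (firemask <= 9)] = 1
    labels[firemask == 5] = 0
    return labels

print(label_from_firemask(np.array([3, 5, 6, 7, 9])))  # [-1  0 -1  1  1]
```

Excluded pixels (label −1) are then dropped before drawing negative samples outside the 5 km buffers.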
Missing values and outliers were identified and treated using the Isolation Forest algorithm to improve data quality.
Following these procedures, we obtained a structured binary wildfire dataset—labeled by FireMask values—suitable for model training. The final dataset comprises N = 133,811 samples with d = 13 features, as shown in Figure 2.
$$\mathrm{NDII7} = \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{SWIR7}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{SWIR7}}} \tag{1}$$
where:
  • $\rho_{\mathrm{NIR}}$ is the reflectance in the near-infrared band (dimensionless);
  • $\rho_{\mathrm{SWIR7}}$ is the reflectance in the shortwave-infrared band from MODIS band 7 (dimensionless).
Following the aforementioned procedure, the heterogeneous environmental parameters obtained from multiple sources were converted into labeled, structured datasets suitable for subsequent input into the model.
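As a quick sanity check, Equation (1) can be computed directly from band reflectances (illustrative values; `ndii7` is our helper name):

```python
import numpy as np

def ndii7(nir, swir7):
    """NDII7 = (NIR - SWIR7) / (NIR + SWIR7), following Equation (1)."""
    nir, swir7 = np.asarray(nir, float), np.asarray(swir7, float)
    return (nir - swir7) / (nir + swir7)

# Moist vegetation reflects more NIR than SWIR7, giving a positive index.
print(round(float(ndii7(0.30, 0.10)), 4))  # 0.5
```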

Reproducibility and Sensitivity to Key Assumptions

All data aggregation, preprocessing, and sampling were implemented on the Google Earth Engine (GEE) platform with explicitly recorded dataset identifiers (collection/version IDs), spatial harmonization rules (continuous vs. categorical resampling), FireMask labeling and exclusion rules, buffer radius, random seed(s), and outlier screening settings. These key parameters are summarized in Table 2 to facilitate the exact reproduction of the modeling dataset.
To support reproducibility, we have released the processed wildfire dataset (including labels and predictors) and accompanying documentation on GitHub (https://github.com/machenyu2023/wildfire-dataset, accessed on 28 December 2025). This repository enables readers to reconstruct the modeling inputs and to conduct straightforward sensitivity checks of key assumptions (e.g., alternative buffer radii or land-cover masks) using the same workflow.

2.3. Model Selection and Methodology

This study utilizes TabNet for a wildfire susceptibility assessment. The TabNet architecture is shown in Figure 3, where the encoder–decoder-based framework consists of sequential decision steps. Each decision step incorporates two key components: the Attentive Transformer and the Feature Transformer. The Attentive Transformer dynamically selects key features using a sparse feature mask, while the Feature Transformer processes features through shared and independent layers to extract global and local patterns. The decoder aggregates contributions from all steps and employs residual connections to optimize gradient propagation, ensuring efficient end-to-end learning.
To address class imbalance in the wildfire dataset, we apply three strategies: class weighting, SMOTE oversampling, and random undersampling. Class weighting adjusts the loss function to amplify minority class errors. SMOTE generates synthetic minority class samples, enriching decision boundaries and reducing overfitting. Random undersampling reduces majority class samples, balancing class representation.
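For instance, the class-weighting strategy can be implemented with simple inverse-frequency weights (a sketch; the 9:1 toy labels and function name are ours):

```python
import numpy as np

def balanced_class_weights(y):
    """Inverse-frequency weights w_c = N / (K * n_c), so rarer classes
    contribute proportionally more to the loss."""
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

y = np.array([0] * 90 + [1] * 10)    # 9:1 imbalanced toy labels
print(balanced_class_weights(y))     # {0: 0.555..., 1: 5.0}
```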
Following feature standardization via Batch Normalization, features are processed through the TabNet encoder. Each decision step computes attention weights using the previous step’s feature vector and scale information, followed by sparse feature masking via the Sparsemax function. The selected features undergo transformation through GLU-based feature transformers, producing decision vectors that are aggregated to form the final output.
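The Sparsemax projection mentioned above can be sketched as follows (our own minimal implementation of the sparsemax of Martins and Astudillo; unlike softmax, it can drive some feature weights exactly to zero, which is what produces the sparse masks):

```python
import numpy as np

def sparsemax(z):
    """Project logits onto the probability simplex, allowing exact zeros."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # descending order
    k = np.arange(1, z.size + 1)
    cumsum = np.cumsum(z_sorted)
    support = 1 + k * z_sorted > cumsum      # support-set condition
    k_max = k[support][-1]
    tau = (cumsum[k_max - 1] - 1) / k_max    # threshold
    return np.maximum(z - tau, 0.0)

mask = sparsemax([2.0, 1.0, -1.0])
print(mask, mask.sum())  # sums to 1, with exact zeros on weak features
```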
Model training minimizes a weighted binary cross-entropy loss to account for class imbalance:
$$L_{\mathrm{CE}} = -\sum_i w_i \left[\, y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i) \,\right]$$
where $w_i$ denotes the class weight of sample $i$, and $y_i$ and $\hat{y}_i$ represent the true labels and predicted probabilities, respectively. We use the Adam optimizer with learning rate scheduling and early stopping. Hyperparameter optimization is conducted using Bayesian optimization (TPE) to maximize the $F_\beta$ score:
$$F_\beta = (1+\beta^2)\,\frac{\mathrm{Precision} \times \mathrm{Recall}}{\beta^2 \cdot \mathrm{Precision} + \mathrm{Recall}}$$
This process fine-tunes the hyperparameters $n_d$, $n_a$, $n_{\mathrm{steps}}$, $\gamma$, and the classification threshold.
TabNet’s interpretability is facilitated by its sparse feature masking, providing both local and global insights. Local interpretability is achieved by examining the sparse feature masks at each step, revealing the features used for decision-making at each step. Global interpretability is derived by aggregating these masks across all steps, weighted by the magnitude of the decision vectors:
$$M_{\mathrm{agg}} = \sum_{i=1}^{N_{\mathrm{steps}}} \eta_i\, M_i$$
Sparse feature selection is encouraged by a regularization term that enforces sparsity on the masks:
$$L_{\mathrm{sparse}} = -\sum_{i=1}^{N_{\mathrm{steps}}} \sum_{b=1}^{B} \sum_{j=1}^{D} M_{i,b,j} \log\left(M_{i,b,j} + \varepsilon\right)$$
where ε ensures numerical stability. Visualization of the aggregated global importance and individual step masks enables better understanding of the model’s decision pathways.
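Per batch, the global aggregation reduces to a weighted sum of step masks followed by normalization (a sketch with toy masks; the `eta` values stand in for the decision-vector magnitudes):

```python
import numpy as np

def global_importance(masks, etas):
    """Aggregate step masks M_i weighted by eta_i, then normalize the
    per-feature totals into a global importance distribution."""
    m_agg = sum(e * m for e, m in zip(etas, masks))  # M_agg
    totals = m_agg.sum(axis=0)                       # sum over samples
    return totals / totals.sum()

# two decision steps, 2 samples x 3 features (toy values)
m1 = np.array([[0.7, 0.3, 0.0], [0.5, 0.5, 0.0]])
m2 = np.array([[0.0, 0.2, 0.8], [0.1, 0.1, 0.8]])
imp = global_importance([m1, m2], etas=[1.0, 1.0])
print(imp)  # the third feature dominates
```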

2.4. Model Evaluation and Comparison

Model performance is evaluated using standard classification metrics—Precision, Recall, and the $F_\beta$-score—each characterizing a distinct dimension of predictive capability. These metrics are defined as
$$\mathrm{Precision} = \frac{TP}{TP+FP}, \qquad \mathrm{Recall} = \frac{TP}{TP+FN}$$
$$F_\beta\text{-score} = (1+\beta^2)\cdot\frac{\mathrm{Precision}\cdot\mathrm{Recall}}{\beta^2\cdot\mathrm{Precision}+\mathrm{Recall}}$$
where TP, FP, TN, and FN denote True Positives, False Positives, True Negatives, and False Negatives, respectively.
In the context of wildfire susceptibility assessment, the cost of false negatives (i.e., missed ignitions) typically exceeds that of false positives, as a single undetected fire can rapidly escalate into a large-scale incident, whereas an occasional false alarm primarily results in the unnecessary deployment of monitoring or suppression resources. The $F_\beta$-score explicitly encodes this asymmetry by weighting Recall $\beta$ times more strongly than Precision; for $\beta = 1$, both contribute equally. In this study, we set $\beta = 1.5$, effectively assigning a 50% higher weight to Recall relative to Precision. This choice favors models that minimize missed fire events, even at the cost of a moderate increase in false alarms, aligning the evaluation criterion with the operational preferences of wildfire management agencies.
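Plugging the reported test-set precision and recall into the F-beta definition reproduces the headline score (a one-line check; `f_beta` is our helper name):

```python
def f_beta(precision, recall, beta=1.5):
    """F_beta score; beta > 1 weights recall above precision."""
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Abstract values: precision 0.395, recall 0.883 -> F1.5 of about 0.640
print(round(f_beta(0.395, 0.883), 3))  # 0.64
```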
In this study, a hierarchical random sampling strategy was employed to partition the original dataset into three subsets: a training set (60%), a validation set (20%), and an independent test set (20%). The process involved the following steps:
  • Independent test set retention: 20% of the original samples were set aside as an independent test set, with fixed random seeds to ensure reproducibility.
  • Category-balanced processing: The remaining 80% of the data underwent category-balancing treatment, which included optional oversampling (SMOTE) or undersampling techniques to address class imbalance.
  • Stratified sampling: The balanced set was then divided into a training subset (75%) and a validation subset (25%) using stratified sampling, ensuring consistent class distribution across the subsets and minimizing sampling bias.
This careful division of data helps ensure the model’s robustness, reduces potential biases, and improves the generalizability of results.
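Under these proportions (and assuming sklearn-style stratified splitting with fixed seeds, as described), the 60/20/20 partition follows from two successive splits:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(1000).reshape(-1, 1)
y = (np.arange(1000) % 5 == 0).astype(int)   # toy labels, 20% positive

# Step 1: hold out 20% as the independent test set (fixed seed).
X_rem, X_test, y_rem, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)

# Step 2: class balancing (e.g., SMOTE) would be applied to X_rem here.

# Step 3: split the remainder 75/25, i.e., 60/20 of the original data.
X_train, X_val, y_train, y_val = train_test_split(
    X_rem, y_rem, test_size=0.25, stratify=y_rem, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```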
For model training, a three-stage optimization framework was applied:
  • Hyperparameter search: Hyperparameters were optimized using Bayesian optimization (Hyperopt TPE) combined with K-fold hierarchical cross-validation. In each iteration, the features were standardized using Z-score normalization, and the mean $F_{1.5}$ score across validation folds was used as the optimization objective.
  • Retraining with complete training data: The model was then retrained using the entire training set, including both training and validation data. The optimal classification threshold was determined via grid search with a step size of 0.01.
  • Final evaluation: The final model performance was assessed on the independent test set using metrics such as the $F_{1.5}$ score, precision, recall, and AUC.
To ensure reliability, all random processes used fixed seeds, and early stopping (with a patience of 10 epochs) was implemented to prevent overfitting. A total of 30 optimization rounds with 10-fold cross-validation were performed to fine-tune the model and assess its performance.
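The threshold grid search from step 2 (step size 0.01, maximizing F1.5) can be sketched as follows (toy probabilities; `best_threshold` is our name):

```python
import numpy as np

def best_threshold(y_true, y_prob, beta=1.5, step=0.01):
    """Scan thresholds in (0, 1) and keep the one maximizing F_beta."""
    best_t, best_f = 0.5, -1.0
    for t in np.arange(step, 1.0, step):
        pred = (y_prob >= t).astype(int)
        tp = int(((pred == 1) & (y_true == 1)).sum())
        fp = int(((pred == 1) & (y_true == 0)).sum())
        fn = int(((pred == 0) & (y_true == 1)).sum())
        if tp == 0:
            continue
        p, r = tp / (tp + fp), tp / (tp + fn)
        f = (1 + beta**2) * p * r / (beta**2 * p + r)
        if f > best_f:
            best_t, best_f = float(t), f
    return best_t, best_f

y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.20, 0.40, 0.60, 0.90])
t, f = best_threshold(y_true, y_prob)
print(round(t, 2), round(f, 2))  # any t in (0.40, 0.60] separates perfectly
```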
Additional diagnostic information is obtained from the Receiver Operating Characteristic (ROC) curve, which plots the True Positive Rate (TPR) against the False Positive Rate (FPR) across varying decision thresholds. The Area Under the Curve (AUC) provides a scalar summary of overall discriminative ability, with values ranging from 0.5 (no skill, equivalent to random guessing) to 1.0 (perfect discrimination). Together, these metrics afford a multidimensional characterization of model robustness under varying class distributions and decision thresholds.
Model validation is performed using spatiotemporal wildfire event datasets sourced from Google Earth Engine (GEE), which integrate environmental covariates with validated FireMasks. A multi-event evaluation protocol is employed to rigorously assess the model’s generalization capability, thereby increasing confidence in its applicability for operational wildfire risk forecasting. For the case study validation, a decision threshold of 0.6 was selected, where regions with a fire likelihood greater than 0.6 are classified as high-risk (fire occurrence). This threshold was chosen to balance the model’s ability to identify high-risk areas while minimizing both false positives and false negatives. A higher threshold, such as 0.7 or above, could result in too many areas being incorrectly categorized as low-risk, whereas a lower threshold (e.g., 0.5 or below) would increase the occurrence of false positives. The threshold of 0.6 thus provides a suitable compromise. In creating the wildfire susceptibility maps, kriging interpolation and natural breaks classification were employed for visualization.
For comparative analysis, three baseline models were considered:
  • Random Forest (RF): A classical ensemble learning algorithm based on bootstrap aggregation of decision trees.
  • XGBoost: A gradient boosting framework employing second-order optimization of the loss function.
  • LightGBM: A high-efficiency gradient boosting implementation using leaf-wise tree growth with histogram-based splitting.
All models were trained using an identical preprocessing pipeline and evaluated under a five-fold cross-validation scheme to ensure a fair and statistically robust comparison.

3. Results and Case Studies

3.1. Results Overview

This section reports (i) the training and evaluation results of the proposed TabNet model, (ii) spatial validation using three historical wildfire incidents, and (iii) comparisons against benchmark machine-learning models.

3.2. Model Training Results

As summarized in Table 3, the final TabNet configuration was selected from a predefined search space, including $n_d = n_a = 16$, $n_{\mathrm{steps}} = 6$, $\gamma = 1.0$, $n_{\mathrm{independent}} = 1$, $n_{\mathrm{shared}} = 2$, $\lambda_{\mathrm{sparse}} = 0.001$, and a learning rate of 0.005. Using this configuration, the model achieved strong validation performance (Table 4) with $F_{1.5} = 0.9701$, precision of 0.9274, and recall of 0.9903 on 48,433 samples with 13 features. On the independent test set (26,763 samples; 13 features), the performance was $F_{1.5} = 0.7746$, with precision of 0.5416 and recall of 0.9577.

3.3. Case Study Results

This section evaluates several significant wildfire events by applying the optimized model to the environmental features observed on the respective event days. These features were derived from historical data and spatially matched against fire-point observations from the MODIS (MOD14A1) product. The primary objective was to examine the correspondence between model outputs and actual wildfire occurrences. The resulting predictions were visualized using ordinary kriging interpolation to generate wildfire susceptibility maps. While five case studies were initially selected, the analysis focuses on three cases for visual representation.
Table 5 summarizes the fire-point prediction accuracy of the selected cases (where samples with a confidence greater than 0.6 are defined as fire points, see Section 2.4), showing the number of MODIS fire points, the matched points, and the matching rate. Figure 4, Figure 5 and Figure 6 display comparisons between optical imagery and model-derived susceptibility for the Muli, Kangding, and Lijiang wildfire events. These figures provide insight into the spatial distribution of fire risk within the regions, with high-risk areas often aligning with observed fire points.

3.4. Model Comparison

Table 6 shows that TabNet produces relatively fewer positive predictions while maintaining a non-trivial proportion of high-confidence predictions, which benefits map credibility. Figure 7 shows that TabNet achieves the highest AUC (0.76), indicating better class separability. Figure 8 suggests that Random Forest produces overly homogeneous regions, while gradient-boosting baselines show limited recognition of adjacent similar environments; TabNet yields more coherent susceptibility gradients in complex geographical settings.
The evaluation of these three case studies reveals that the model generally provides good spatial generalization, with fire points frequently located within high-risk zones. However, the Muli (2017) case, despite a large dataset, shows no matches (0%) with MODIS fire points. This discrepancy warrants further exploration in the Discussion section, as it suggests the influence of additional factors not accounted for in the model’s current features.

4. Model Interpretability and Discussion

4.1. Model Performance and Weakness

By comparing precision and recall and calculating $F_{1.5}$, we evaluated the suitability of different models for wildfire susceptibility analysis. As shown in Figure 9, TabNet outperforms the other models in both precision and recall. Compared with XGBoost, TabNet improves precision from 0.3661 to 0.395 (+7.89%) and recall from 0.8675 to 0.8835 (+1.84%). Compared with LightGBM, precision improves by 12.36% (0.3515 → 0.395) and recall by 2.49% (0.8620 → 0.8835). Against Random Forest, precision nearly doubles (0.2001 → 0.395) and recall increases by 24.45% (0.7098 → 0.8835). Since, in many real-world scenarios, missed detections can be more costly than false alarms, $F_{1.5}$ is often more aligned with practical needs, and TabNet performs well on this metric.
TabNet’s advantages are related to its “Feature Transformer–Sparse Attention–Stepwise Decision” architecture, which can suppress overfitting on noisy and imbalanced structured data while retaining an interpretable feature-selection chain. For example, Figure 8 indicates that TabNet assigns the Muli ground-truth area to a higher-susceptibility zone even when direct matching is poor, which may imply improved spatial generalization in complex environments.

4.1.1. Weakness Discussion Based on the Muli Case

The Muli incident also reveals a practical weakness of the alternative baselines: for this event, the high-susceptibility zones produced by XGBoost, LightGBM, and Random Forest do not provide effective spatial matching with the ground-truth burned area (i.e., most of the affected region is not covered by their predicted high-risk classes). This non-matching behavior may stem from their reliance on relatively global, non-stepwise feature interactions and split-based decision boundaries, which can be sensitive to local domain shifts (e.g., micro-topography, heterogeneous vegetation mosaics, or strong spatial autocorrelation). In addition, if the Muli area contains atypical combinations of drivers (e.g., elevation-slope-fuel-meteorology patterns) that are under-represented in the training samples, tree-ensemble baselines may regress toward more common patterns and thus fail to highlight the event footprint. Therefore, beyond aggregate metrics, the Muli case emphasizes the necessity of event-level evaluation: a model can achieve reasonable overall recall while still missing critical extreme events in specific geomorphic or climatic contexts.

4.1.2. Limitations and Stability Considerations

A key limitation of interpretability analyses based on feature masks (or SHAP values) is the potential instability of feature rankings under different data splits, random seeds, or parameter settings. In practice, wildfire data may exhibit spatial autocorrelation and class imbalance, which can further amplify variance in estimations.

4.2. Feature Importance and Geographical Distribution

The TabNet model offers intrinsic interpretability by producing feature-mask matrices during inference. As shown in Figure 10, the horizontal axis corresponds to the input features and the vertical axis corresponds to samples; each sub-panel represents a decision step. Brighter values indicate that the corresponding feature is more heavily used at that step. Aggregating these masks yields a global ranking of feature importance (Figure 11). Overall, thermal-related variables (e.g., LST) and vegetation-related factors (e.g., LandCover and NDVI) contribute most to the susceptibility prediction, followed by seasonal timing (day_of_year). Meteorological and moisture-related variables (e.g., precipitation, soil moisture, wind) provide secondary but non-negligible support, suggesting that TabNet learns a driver hierarchy that aligns with wildfire-prone conditions driven by heat, fuel availability, and seasonality.
We further evaluate event-level interpretability to reveal geographic heterogeneity in the learned drivers. Figure 12 visualizes the masks for three representative incidents, while Figure 13 summarizes the corresponding feature-importance patterns. Although the incidents share a broadly consistent structure (i.e., thermal and vegetation-related variables remain influential), the relative emphasis on specific features varies across events, reflecting differences in local climate, topography, and human activity. For example, mountainous and strongly seasonal regions tend to show stronger coupling between elevation/seasonal indicators and vegetation-related signals, whereas other events exhibit comparatively higher reliance on moisture or wind-related variables, consistent with event-dependent ignition and spread mechanisms.
Combining the spatial susceptibility maps (Figure 8) with the above interpretability results, Xichang—a typical mountainous area with pronounced dry–wet seasonality—illustrates that TabNet produces a more coherent and targeted depiction of localized risk in valleys, on slopes, and around settlements. In contrast, other models sometimes yield fragmented high-risk patches or miss key areas, which may reflect limitations in capturing multi-factor coupling under spatial heterogeneity.
Overall, the global and event-level masks indicate that TabNet can capture coupled environmental drivers while preserving geographically differentiated susceptibility signals, which is beneficial for operational risk zoning and for prioritizing monitoring and mitigation efforts.

4.3. Interpretable Methods and Their Applications

TabNet generates embedded feature masks through sequential sparse attention, while SHAP is a widely adopted post hoc interpretability method for tree-based models. Using the Kangding region as a case study (Figure 14), both approaches reveal similar overall patterns: thermal and moisture-sensitive factors (such as LST and NDII7) and vegetation indices (such as NDVI) contribute strongly, while precipitation has minimal impact. However, the methods diverge in their detailed rankings, for example in the influence of population density. This discrepancy reflects the differing mechanisms of the models and their explanation frameworks: TabNet highlights dominant factors that are consistently selected across decision steps, whereas SHAP aggregates contributions from multiple tree paths and may emphasize features that frequently trigger localized splits. In practice, combining both methods helps identify the most stable environmental drivers while still accounting for occasional human-driven or extreme events.
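One simple way to quantify the agreement between the mask-based and SHAP rankings discussed above is a rank correlation. A minimal Spearman sketch (ties broken by index for simplicity; a production version should average tied ranks) could look like:

```python
# Hedged sketch: Spearman rank correlation between two feature-importance
# vectors (e.g., TabNet mask-based vs. SHAP-based).

def rank(values):
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for pos, i in enumerate(order):
        ranks[i] = pos            # 0 = most important
    return ranks

def spearman(a, b):
    ra, rb = rank(a), rank(b)
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Identical orderings give 1.0; a fully reversed ordering gives -1.0.
imp_masks = [0.30, 0.25, 0.20, 0.15, 0.10]   # hypothetical values
imp_shap  = [0.28, 0.26, 0.21, 0.13, 0.12]   # same ordering
print(spearman(imp_masks, imp_shap))          # -> 1.0
```

A coefficient near 1 indicates that the two explanation methods agree on which drivers dominate, even if the individual importance magnitudes differ.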
The interpretability framework proposed here sets itself apart from purely post hoc methods by leveraging TabNet’s intrinsic sparse feature selection. By tracking the attention masks during inference, it provides a transparent decision trace and facilitates feature-space refinement (for example, by eliminating consistently low-importance variables). The reduced feature subset can then be transferred to traditional models, such as Random Forest or GBDT, improving efficiency and enabling cross-model comparison.
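The feature-space refinement step described above can be sketched as a simple importance cutoff; the reduced feature set is then passed to a conventional model such as Random Forest. The importance values and the 0.02 cutoff below are hypothetical:

```python
# Sketch of feature-space refinement: drop variables whose aggregated mask
# importance stays below a cutoff, then hand the reduced feature set to a
# conventional model. Values and the threshold are hypothetical.

def prune_features(names, importances, threshold=0.02):
    kept = [n for n, w in zip(names, importances) if w >= threshold]
    dropped = [n for n, w in zip(names, importances) if w < threshold]
    return kept, dropped

names = ["LST_Kelvin", "NDVI", "LandCover_Type", "WindDirection_deg"]
importances = [0.30, 0.22, 0.18, 0.01]        # hypothetical values
kept, dropped = prune_features(names, importances)
print(kept)     # -> ['LST_Kelvin', 'NDVI', 'LandCover_Type']
print(dropped)  # -> ['WindDirection_deg']
```

Training a tree-based model on `kept` only enables the cross-model comparison mentioned in the text while reducing input dimensionality.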

Practical Application Value

The interpretability of this model strengthens its practical value in fields such as environmental monitoring, disaster management, and policy-making. By revealing the key factors driving environmental change, the framework helps policymakers and environmental scientists prioritize areas that require attention, allocate resources more effectively, and anticipate potential future events. For instance, understanding the relative importance of factors such as temperature, vegetation indices, and precipitation can guide more targeted interventions in regions prone to natural hazards such as wildfires, floods, or droughts. Furthermore, the ability to refine the feature space and transfer the reduced feature set to other models improves adaptability and scalability, making the approach suitable for applications ranging from climate modeling to urban planning.

5. Conclusions

This study developed an intrinsically interpretable wildfire-susceptibility assessment framework for forested landscapes in Southwest China by integrating multi-source environmental and anthropogenic predictors with the TabNet deep learning architecture. Motivated by the increasing frequency and severity of wildfires under climate change and intensifying human activities, the proposed approach addresses a key limitation of conventional machine-learning pipelines—namely, the trade-off between predictive accuracy and interpretability—by providing built-in, step-wise explanations through TabNet’s sparse feature-selection mechanism.
Using Google Earth Engine, we constructed a harmonized 1 km × 1 km dataset (2010–2020) that combines MODIS FireMask-derived labels with climatic (air temperature, precipitation, wind, soil moisture), vegetation (NDVI, NDII7, land cover), topographic (elevation), anthropogenic (population density), and seasonal (day-of-year) variables. A strict sampling strategy, including FireMask-based label filtering, a 5 km buffer to reduce spatial dependence, forest-land-cover masking, and data cleaning with Isolation Forest, was adopted to improve label reliability and reduce trivial separability. To mitigate class imbalance, class weighting and resampling strategies were incorporated during model training and evaluation.
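The 5 km buffer rule for negative sampling described above can be illustrated with a haversine distance check. The coordinates below are hypothetical, and this is only a sketch of the acceptance test, not the GEE workflow itself:

```python
import math

# Illustrative check for the 5 km negative-sampling buffer: a candidate
# non-fire cell is accepted only if it lies farther than buffer_km from
# every historical fire detection. Coordinates are hypothetical.

def haversine_km(a, b):
    """Great-circle distance between (lat, lon) pairs in degrees."""
    r = 6371.0  # mean Earth radius, km
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(h))

def valid_negative(candidate, fire_points, buffer_km=5.0):
    return all(haversine_km(candidate, f) > buffer_km for f in fire_points)

fire_points = [(27.0, 101.0)]                       # hypothetical detection
print(valid_negative((27.0, 101.01), fire_points))  # ~1 km away -> False
print(valid_negative((27.0, 101.10), fire_points))  # ~10 km away -> True
```

Enforcing a minimum separation like this reduces spatial dependence between positive and negative samples, which is the motivation stated in the text.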
The optimized TabNet model demonstrated strong wildfire-detection capabilities, achieving an AUC of 0.760 and a high recall (0.9577) on the independent test set, with an F1.5 score of 0.7746, outperforming Random Forest, XGBoost, and LightGBM under the same preprocessing and validation protocol. Event-level case studies further indicated that high-susceptibility zones generally aligned with observed MODIS fire points, suggesting that the model can capture spatially coherent risk gradients in complex mountainous environments. Importantly, TabNet’s mask matrices provided transparent global and event-specific interpretations, consistently highlighting thermal and vegetation-related controls (e.g., LST, land cover, NDVI) together with seasonality, while moisture and wind variables offered secondary but non-negligible contributions. The agreement between mask-based rankings and SHAP analyses further supports the reliability of the intrinsic explanations.
Overall, the proposed framework combines accuracy with operational transparency, offering a deployable pathway for wildfire early warning, risk zoning, and resource prioritization in forested regions. Future work should examine the sensitivity of susceptibility patterns to key assumptions (e.g., the buffer radius, land-cover masking, and sampling design).

Author Contributions

Conceptualization, C.M. and S.Y.; methodology, C.M. and J.C.; software, C.M. and D.Z.; validation, C.M., S.Y. and J.C.; formal analysis, C.M.; investigation, C.M., Q.L. and Q.Y.; resources, S.Y. and J.G.; data curation, C.M. and J.C.; writing—original draft preparation, C.M. and C.Q.; writing—review and editing, C.M. and S.Y.; visualization, C.M., X.W. and Q.L.; supervision, S.Y. and J.C.; project administration, S.Y.; funding acquisition, S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Civil Aerospace Technology Advanced Research Project of China under Grant D040306.

Data Availability Statement

Data supporting the results of this study are available at https://github.com/machenyu2023/wildfire-dataset (accessed on 28 December 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
GEE: Google Earth Engine
NDVI: Normalized Difference Vegetation Index
NDII: Normalized Difference Infrared Index

Figure 1. Location of the study area in Southwestern China. The (left) figure shows the location of the study area, with the study region highlighted in blue. The land-cover map (right) was derived from the MODIS Land Cover Type product (MCD12Q1) following the IGBP classification scheme. Major geographical features and the boundary of the primary study area are annotated.
Figure 2. Schematic diagram of the sampling process used to construct the modeling dataset from multi-source inputs. Red polygons denote positive-sample regions, red circles indicate the 5 km buffer zones around historical fire points, and negative samples are drawn from areas outside the buffers.
Figure 3. TabNet architecture: (a) Encoder, (b) Decoder, (c) Feature Transformer, (d) Attentive Transformer.
Figure 4. Comparison of optical imagery and model-derived susceptibility for the Muli wildfire event. (a) MODIS false-color composite (SWIR–Green–Blue) over the Muli region, acquired on 30 March 2020; the white overlay highlights the burning area. (b) Wildfire susceptibility map from model inference, based on MODIS data acquired on 30 March 2020; polygons denote FireMask vectors extracted from MOD14A1.
Figure 5. Comparison of optical imagery and model-derived susceptibility for the Kangding wildfire event. (a) MODIS false-color composite (SWIR–Green–Blue) acquired on 28 February 2013 over the Kangding region; the white overlay highlights the burning area. (b) Wildfire susceptibility map from model inference; polygons denote FireMask vectors extracted from MOD14A1.
Figure 6. Comparison of optical imagery and model-derived susceptibility for the Lijiang wildfire event. (a) MODIS false-color composite (SWIR–Green–Blue) acquired on 7 February 2017 over the Lijiang region; the white overlay highlights the burning area. (b) Wildfire susceptibility map from model inference; polygons denote FireMask vectors extracted from MOD14A1.
Figure 7. Receiver Operating Characteristic (ROC) curves for the comparison of model performance across multiple evaluation metrics, highlighting the trade-off between sensitivity and specificity.
Figure 8. Comparison of wildfire susceptibility maps generated by different machine learning models (left to right: LightGBM, Random Forest, TabNet, and XGBoost), showcasing the spatial variations in predicted susceptibility across models.
Figure 9. Performance comparison across models using precision, recall, and F1.5.
Figure 10. Mask visualizations for wildfire datasets (133,811 samples). Brighter bands indicate greater involvement of the corresponding feature in classification at each decision step. The x-axis labels for each step, from left to right, are: AirTemperature_Kelvin, Elevation_m, LST_Kelvin, LandCover_Type, NDVI, PopulationDensity, Precipitation_mm_per_hr, SoilMoisture, WindDirection_deg, WindSpeed_m_per_s, SPEI_03, and day_of_year.
Figure 11. Global feature importance derived from TabNet masks.
Figure 12. Mask visualizations for three wildfire incidents. Brighter bands indicate greater involvement of the corresponding feature in classification at each decision step. The x-axis labels for each step, from left to right, are: AirTemperature_Kelvin, Elevation_m, LST_Kelvin, LandCover_Type, NDVI, PopulationDensity, Precipitation_mm_per_hr, SoilMoisture, WindDirection_deg, WindSpeed_m_per_s, SPEI_03, and day_of_year.
Figure 13. Global and step-wise feature importance for three wildfire incidents (MuLi, XiChang, and KangDing) derived from TabNet masks. For each incident, the left panel shows aggregated (global) feature importance, while the right panel presents the step-wise feature-mask strength across decision steps (Steps 1–5). Larger bars and warmer colors indicate higher contribution of the corresponding predictors (AirTemperature, Elevation, LST, LandCover, NDVI, PopulationDensity, Precipitation, SoilMoisture, WindDirection, WindSpeed, SPEI_03, and day_of_year) to the final classification.
Figure 14. Comparison of interpretability methods in the Kangding incident: TabNet masks vs. XGBoost-SHAP. TabNet leverages sparse attention to generate embedded feature masks, highlighting dominant factors that are consistently selected during the decision-making process. SHAP, on the other hand, aggregates feature contributions across various decision paths in tree models. Both methods identify thermal and moisture-sensitive factors (e.g., LST and NDII7) and vegetation indices (e.g., NDVI) as key drivers, while precipitation has a limited impact. Differences arise in the detailed ranking of features, such as population density, due to the models’ differing mechanisms: TabNet emphasizes factors selected across decision steps, while SHAP focuses on aggregated contributions from multiple tree paths.
Table 1. Detailed description of data sources.
Data Type | Description | Data Source
Soil Moisture | Monthly average soil moisture at 0–10 cm depth | NASA/FLDAS/NOAH01/C/GL/M/V001
SPEI_03 | 3-month Standardized Precipitation Evapotranspiration Index (SPEI) | CSIC/SPEI/2_9
FireMask | Binary mask indicating fire points | MODIS/061/MOD14A1
NDII7 | Short-wave infrared moisture index, calculated from MODIS reflectance bands | MODIS/006/MOD09GA
Precipitation | Daily total precipitation in mm | NASA/GPM_L3/IMERG_V07
LST | Daily land surface temperature in Kelvin | MODIS/061/MOD11A1
Wind Speed | Daily maximum wind speed at 10 m height in m/s | ECMWF/ERA5/DAILY
Wind Direction | Wind direction at 10 m height in degrees | ECMWF/ERA5/DAILY
Air Temperature | Daily mean air temperature at 2 m height in Celsius | ECMWF/ERA5/DAILY
Elevation | 30 m digital elevation model in meters | USGS/SRTMGL1_003
NDVI | 16-day composite vegetation index | MODIS/061/MOD13A1
Land Cover | IGBP classification of land cover types | MODIS/061/MCD12Q1
Population Density | Gridded population density (persons per unit area) | CIESIN/GPWv411/GPW_Population_Density
Day of the Year | Fire occurrence day of the year | MODIS/061/MCD64A1
Table 2. Key preprocessing, sampling, and reproducibility parameters (GEE workflow).
Component | Setting/Rule | Value/Notes
Study period | Temporal coverage of all variables and labels | January 2010 to September 2020
Spatial grid | Target spatial resolution and grid definition | 1 km × 1 km; CRS: EPSG:4326 (WGS84)
Resampling (continuous) | Resampling/aggregation method for continuous predictors | Nearest-neighbor aggregation/resampling on GEE (applied to continuous predictors after harmonization to 1 km)
Resampling (categorical) | Resampling method for categorical predictors (land cover) | As described in Section 2.2 (IGBP land-cover classes preserved during 1 km harmonization); exact categorical resampling operator not separately reported
Fire label definition | Positive and negative label mapping from MOD14A1 FireMask | Positive (fire): FireMask = 7, 8, 9; negative (non-fire): FireMask = 5
Fire label exclusions | Excluded FireMask categories (invalid/unobserved conditions) | Not explicitly reported in the main text; see the released dataset documentation and sampling script/configuration in the GitHub repository https://github.com/machenyu2023/wildfire-dataset (accessed on 28 December 2025)
Sampling design | Spatial sampling rule for positives and negatives | Positive samples randomly selected within buffers around historical fire detections; negative samples randomly drawn from areas outside all buffers
Buffer radius | Radius for spatial separation around historical fire detections | 5 km
Sampling balance | Class balance and sampling size control | Final dataset size: N = 133,811; class ratio/stratification scheme not separately reported (see repository documentation)
Randomness control | Random seed(s) for sampling and preprocessing | Not separately reported in the manuscript; see repository documentation
Missing value handling | Masking/imputation rule prior to modeling | Not separately reported; missing values and outliers were handled during data cleaning (see Section 2.2 and repository documentation)
Outlier screening | Outlier detection algorithm and hyperparameters | Isolation Forest used for outlier identification; hyperparameters not separately reported (see repository documentation)
Outlier impact | Fraction of samples removed as outliers | Not separately reported
Data versioning | Dataset identifiers (GEE collection/version IDs) | All input datasets and GEE collection/version IDs are listed in Table 1 (Data Sources)
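The FireMask label mapping in Table 2 can be sketched as a small helper. The class semantics (7, 8, 9 commonly documented as low/nominal/high-confidence fire; 5 as non-fire land) follow the MOD14A1 convention; other classes are left out of the sample pool, as the table notes:

```python
# Sketch of the MOD14A1 FireMask label rule from Table 2: classes 7-9 map
# to positives, class 5 to negatives; remaining classes (e.g., cloud,
# water, unobserved) are excluded from sampling.

def firemask_label(firemask_value):
    if firemask_value in (7, 8, 9):
        return 1       # fire
    if firemask_value == 5:
        return 0       # non-fire land
    return None        # excluded (not sampled)

print([firemask_label(v) for v in (5, 7, 8, 9, 3)])  # -> [0, 1, 1, 1, None]
```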
Table 3. TabNet hyperparameters, including their definitions, search spaces, and the selected values after tuning for optimal model performance.
Hyperparameter | Definition | Search Space | Selected
n_d | Decision layer dimension in feature processing | {8, 16, 24} | 16
n_a | Attention layer width for feature selection | {8, 16, 24} | 16
n_steps | Number of sequential processing steps | {4, 5, 6} | 6
γ | Mask relaxation coefficient (sparsity control) | {1.0, 1.2} | 1.0
n_independent | Independent GLU blocks per step | {1, 2} | 1
n_shared | Shared GLU blocks across steps | {2, 3} | 2
λ_sparse | Regularization for sparse feature selection | {0.001, 0.005, 0.01} | 0.001
Learning rate | Learning rate configuration | {0.002, 0.005} | 0.005
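The selected values from Table 3 can be collected into a plain configuration dict. The key names follow pytorch-tabnet's conventions (n_d, n_a, gamma, lambda_sparse, ...), but this is a sketch of how the tuned settings might be assembled, not a verified training script:

```python
# Selected TabNet hyperparameters from Table 3 as a configuration dict.
# Key names follow pytorch-tabnet conventions; treat this as a sketch.

TABNET_PARAMS = {
    "n_d": 16,               # decision layer dimension
    "n_a": 16,               # attention layer width
    "n_steps": 6,            # sequential decision steps
    "gamma": 1.0,            # mask relaxation coefficient
    "n_independent": 1,      # independent GLU blocks per step
    "n_shared": 2,           # shared GLU blocks across steps
    "lambda_sparse": 0.001,  # sparsity regularization
}
LEARNING_RATE = 0.005

# e.g., TabNetClassifier(**TABNET_PARAMS,
#                        optimizer_params={"lr": LEARNING_RATE})
```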
Table 4. Model evaluation results on validation and independent test datasets, reporting performance metrics including F1.5 score, precision, and recall for wildfire detection.
Dataset | Samples (n) | Features (d) | F1.5 | Precision | Recall
Validation | 48,433 | 13 | 0.9701 | 0.9274 | 0.9903
Test | 26,763 | 13 | 0.7746 | 0.5416 | 0.9577
Note: F1.5 emphasizes recall over precision, reflecting the higher cost of missing wildfire detections in practical applications.
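The F1.5 scores in Table 4 follow the standard F-beta definition with β = 1.5, which weights recall 2.25× as heavily as precision: F_β = (1 + β²)·P·R / (β²·P + R). A minimal check against the table's rows:

```python
# Standard F-beta score; beta = 1.5 reproduces the F1.5 values in Table 4.

def f_beta(precision, recall, beta=1.5):
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(round(f_beta(0.5416, 0.9577), 4))  # test-set row -> 0.7746
print(round(f_beta(0.9274, 0.9903), 4))  # validation row -> 0.9701
```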
Table 5. Fire-point prediction accuracy comparison between model predictions and MODIS (MOD14A1) observations, showing the number of MODIS fire points, the number of matched points, and the matching rate for different locations and dates.
Location | Date | MODIS Fire Points (n) | Matched (n) | Rate (%)
XiChang | 30 March 2020 | 41 | 13 | 31.7
MuLi | 16 January 2017 | 28 | 0 | 0.0
MuLi | 30 March 2020 | 109 | 86 | 78.9
LiJiang | 7 February 2017 | 89 | 45 | 50.6
KangDing | 28 February 2013 | 45 | 20 | 44.4
Table 6. Performance comparison of forest fire prediction models.
Region | Samples | Random Forest (Fires / >70% / >90%) | LightGBM (Fires / >70% / >90%) | XGBoost (Fires / >70% / >90%) | TabNet (Fires / >70% / >90%)
Xichang | 2434 | 1814 / 74.5 / 0.0 | 1510 / 62.0 / 0.2 | 1397 / 12.6 / 0.1 | 807 / 13.8 / 2.9
Kangding | 13,355 | 8508 / 63.7 / 0.0 | 4197 / 5.7 / 0.5 | 3676 / 6.8 / 0.3 | 4443 / 19.5 / 4.5
Muli (2017) | 15,000 | 205 / 1.4 / 0.0 | 395 / 1.0 / 0.2 | 839 / 1.0 / 0.2 | 456 / 1.2 / 0.3
Note: “Fires” denotes the number of samples predicted as fire; “>70%” and “>90%” denote the percentage of predictions whose confidence exceeds the given threshold. The TabNet columns correspond to the model proposed in this study.

Share and Cite

MDPI and ACS Style

Ma, C.; Yang, S.; Cui, J.; Li, Q.; Yao, Q.; Zhang, D.; Guo, J.; Wang, X.; Qu, C. Applying an Interpretable Deep Learning Model to Identify Wildfire-Prone Areas in Southwest China. Fire 2026, 9, 107. https://doi.org/10.3390/fire9030107

