Seasonal Driving Mechanisms and Spatial Patterns of Danger of Forest Wildfires in the Dongjiang Basin, Southern China

Xuewen He; Zhiwei Wan; Bin Yuan; Ji Zeng; Lingyue Liu; Keyuan Zhong; Hong Wu

doi:10.3390/f16060986

¹ School of Geography and Environmental Engineering, Gannan Normal University, Ganzhou 341000, China

² Development Research Center of the State Council, Beijing 100010, China

* Author to whom correspondence should be addressed.

Forests 2025, 16(6), 986; https://doi.org/10.3390/f16060986

This article belongs to the Special Issue Forest Fire: Landscape Patterns, Risk Prediction and Fuels Management

Abstract

Global forest wildfires are increasing in both frequency and intensity, resulting in significant ecological degradation and posing substantial threats to human health. This study focused on the Dongjiang River Basin in southern China and investigated the seasonal and spatial distribution patterns of forest wildfires in the research region from 2003 to 2023 using geographic information system technology. This study employed the random forest (RF) model, a machine learning algorithm, to predict the danger level of wildfire across different seasons and quantitatively interpret the seasonal wildfire driving mechanisms using the SHapley Additive exPlanations (SHAP) values. The results indicated that forest wildfires in the Dongjiang Basin were predominantly concentrated in the eastern region of the Dongjiang Basin, with significant seasonal variation in the spatial distribution. The frequency of fire events exhibited distinct seasonal patterns, with higher incidence in spring and winter and relatively lower frequency in summer and autumn. The random forest model demonstrated high predictive accuracy for the wildfire danger in all the seasons. Furthermore, the analysis of the driving factors showed that, despite some seasonal variability, the underlying mechanisms of wildfire occurrence could be effectively quantified using the SHAP values. Notably, the Normalized Difference Vegetation Index and anthropogenic disturbances consistently emerged as the dominant driving forces behind forest wildfires across all the seasons.

Keywords:

Dongjiang River Basin; forest wildfire; spatial and temporal distribution; random forest; spatial pattern of danger; SHAP

1. Introduction

Driven by global climate change and anthropogenic disturbances, forest wildfires are increasing in frequency and intensity, posing a serious threat to ecosystem integrity [1,2,3,4]. Statistics indicate that approximately 350 million hectares of land are affected by wildfires annually, resulting in substantial biodiversity loss [5,6], depletion of carbon stocks [7,8], and emission of large quantities of greenhouse gases, which further exacerbate climate warming [9,10,11,12]. Faced with the increasingly severe threat of forest fires, countries around the world are actively developing forest fire prediction and early warning systems to reduce fire risks and minimize losses.

Several mature forest fire prediction systems have been established internationally. The National Fire Danger Rating System (NFDRS) of the United States is one of the earliest comprehensive fire danger forecasting systems in the world [13]. The system establishes a fire danger index based on the meteorological conditions, fuel characteristics and terrain factors, and it provides daily fire danger warnings for US forest fire management agencies [14]. The Canadian Forest Fire Danger Rating System (CFFDRS) is widely used in North America and around the world. Its core components are the Fire Weather Index (FWI) System and the Fire Behavior Prediction (FBP) System, which are used to predict the fire danger level and fire behavior characteristics [15,16]. The European Forest Fire Information System (EFFIS) integrates satellite remote sensing, meteorological data and ground observation information to provide real-time fire monitoring and risk assessment services for EU countries [17]. The ISDM-Rosleskhoz system of the Russian Federation is adapted to the country’s vast forest resources and achieves large-scale forest fire monitoring and early warning through multi-source data fusion [18].

Approaches to forecasting forest fires have progressively evolved, from initial qualitative characterizations to data-driven quantitative modeling, and from reliance on standalone predictive frameworks to the adoption of comprehensive, multi-source integrated methodologies. Deterministic methods establish mathematical models based on physical processes, such as fire spread velocity models and burning intensity models, which can better describe the physical mechanism of fire behavior, but have strict requirements in terms of the input parameters and are complex to calculate [19,20]. Probabilistic methods statistically analyze historical fire data to establish a fire risk prediction model based on the probability distribution. They have the advantages of simple calculation and easy operation but often ignore the physical process of fire occurrence [20,21]. The deterministic–probabilistic hybrid method combines the advantages of the two methods, taking into account both physical processes and statistical laws, improving the prediction accuracy and applicability [22].

Wildfire emergence is governed by a multifaceted interaction between natural environmental drivers and human interventions. Among these, meteorological variables—such as the temperature, humidity, and wind—are critical in shaping the ignition potential and fire behavior [23,24,25,26]. For instance, wind modulates the spatial dynamics of fire propagation, while rainfall influences the combustibility of vegetation through changes in the fuel moisture [24,27]. Topographic factors indirectly affect the fire process by affecting the local climate and fuel distribution. The slope, aspect and altitude are the most important topographic variables [23].

The vegetation structure, including the biomass quantity, composition, and spatial arrangement, further determines fire-prone conditions [28,29]. The NDVI, a remote sensing index indicative of vegetation vigor, has been extensively applied in modeling regional fire risks due to its reliability in capturing fuel-related variability [30]. Human activity factors include direct ignition sources and indirect landscape changes [31,32,33]. Socioeconomic indicators such as the population density, economic development level, and road density are closely related to the frequency of fires.

The Dongjiang River Basin is located at the junction of Guangdong, Jiangxi and Fujian provinces in southern China. Its geographical coordinates are between 113°12′–115°50′ E and 22°39′–25°12′ N. Spanning approximately 25,325 square kilometers, the basin flows through major urban agglomerations within the Pearl River Delta. It is the main water source for the Hong Kong Special Administrative Region and the core city clusters in the Pearl River Delta [34]. The Dongjiang River Basin is not only an important environmental buffer zone but also a regional water conservation hub, making a huge contribution to ecological resilience [35]. However, the region’s complex climatic and topographic conditions, combined with the intense anthropogenic activities, result in pronounced seasonal variation and spatial heterogeneity in wildfire patterns [36].

Existing research on wildfires has primarily focused on the global or climatic-zone scales [37,38,39], while limited attention has been paid to the seasonal drivers of wildfires at the watershed scale, particularly in China’s subtropical monsoon zone. Traditional assessment approaches often rely on weight-based linear models, which are inadequate for capturing the complex nonlinear relationships among influencing factors [40,41]. With advances in computer science, machine learning algorithms—such as LR [42], ANN [43], SVM [44,45], RF [46,47], XGBoost [48], GBDT [49], and LGBM [50]—have been increasingly applied to wildfire danger assessments [51]. Among these, random forest algorithms exhibit particular strength in discerning complex, nonlinear interactions embedded in high-dimensional datasets, coupled with a notable resilience against overfitting [52,53,54,55]. However, their intrinsic black-box characteristics pose challenges to understanding the rationale behind the predictions. To address this limitation, SHapley Additive exPlanations (SHAP) offer a systematic approach to revealing the model’s internal reasoning processes [53,56,57].

Based on this, this study selected the Dongjiang River Basin in southern China as the study area, integrating Moderate Resolution Imaging Spectroradiometer (MODIS) fire point data with multi-source remote sensing and socioeconomic data from 2003 to 2023. The wildfire driving factors were identified using the Geodetector method, a seasonal random forest prediction model was developed, and the SHAP value technique was applied to analyze the seasonal driving mechanisms of wildfire occurrence. This research sets out to explore a series of core questions concerning the wildfire dynamics in the Dongjiang Basin. Specifically, it investigates whether seasonal variability leads to marked changes in both the spatial and temporal patterns of forest fire occurrence. It further analyzes how wildfire danger zones are distributed across different seasons, seeks to identify convergences and divergences in the seasonal ignition drivers, and determines the dominant environmental and anthropogenic factors that contribute to wildfire events.

The present work enhances conceptual insights into the seasonal behavior of forest wildfires within subtropical monsoon-influenced basins. It proposes an integrative methodological framework that combines machine learning algorithms with interpretative modeling techniques, and it provides actionable recommendations for wildfire mitigation and ecological danger governance specific to the Dongjiang River Basin.

2. Study Area and Methodology

2.1. Overview of the Study Area

Situated at the tri-provincial convergence of Guangdong, Jiangxi, and Fujian in southern China (Figure 1), the Dongjiang Basin encompasses a 25,325 km² catchment area. This region lies within a canonical subtropical monsoon climatic zone, exhibiting mean annual precipitation of 1600–2200 mm with pronounced seasonal heterogeneity. The summer monsoon period (June–August) concentrates 60%–70% of the annual rainfall through convective systems, whereas the winter months (December–February) under the continental high-pressure influence typically receive <50 mm monthly precipitation. The spring and autumn seasonal transitions exhibit moderate but highly variable rainfall regimes. The topography is primarily composed of hills and low mountains, with the highest altitude of 1360 m and the lowest altitude of −17 m. The rivers crisscross each other, forming a complex watershed system. This complex water system pattern not only affects the spatial differentiation of regional water and heat conditions but also forms different microclimate environments through terrain barrier effects and local circulation, which in turn affects the vegetation moisture content and combustible material distribution, becoming an important controlling factor in the spatial differences in the wildfire frequency and intensity. Vegetation is dominated by subtropical evergreen broad-leaved forests, supplemented by extensive areas of plantation forests (e.g., eucalyptus and cedar) and agricultural land (including rice paddies and orchards), reflecting a landscape where natural and anthropogenic ecosystems are closely interwoven. The region is subject to intensive human activity; urban expansion, agricultural burning, and the construction of transportation infrastructure have intensified the surface disturbances, emerging as major contributors to wildfire ignition.

Figure 1. Location map of the Dongjiang River Basin (a), and spatial distribution of fire points in different seasons (b).

Wildfires in the basin exhibit pronounced seasonal variation (Figure 2). During spring and winter, the dry climate significantly increases the fire danger. Although precipitation also decreases in autumn and winter, localized drought conditions elevate the fire danger in specific areas. In contrast, the wildfire occurrence sharply declines in summer due to the suppressive effect of abundant rainfall. Spatially, wildfire hotspots are primarily concentrated in forest-dense zones, urban–rural fringe areas, and regions with frequent agricultural activities. In these areas, farmland burning and urban expansion contribute to the accumulation of combustible material and proliferation of ignition sources, exacerbating the wildfire danger. Thus, the interplay between seasonal climatic variation and anthropogenic disturbance shapes the spatiotemporal dynamics of wildfire activity in the basin.

Figure 2. Seasonal variation in the geographic distribution of wildfire occurrences: (a–d) spring, summer, autumn, and winter, respectively.

2.2. Data Sources

(1): Wildfire data

Wildfire data were sourced from the NASA LANCE FIRMS platform (https://firms.modaps.eosdis.nasa.gov, accessed on 17 January 2025), based on MODIS Collection 61 thermal anomaly/fire location products, processed in near real-time (NRT). These data are captured by MODIS sensors aboard the Aqua and Terra satellites using swath observation mode (i.e., raw single-transit data with product codes MOD14 and MYD14), rather than composite products (e.g., MOD14A1 and MYD14A1). The core detection algorithm identifies thermal anomalies on the land surface via thermal infrared bands and assigns a fire or heat anomaly designation to the center of each 1 km pixel [58]. The pixel center is not necessarily the actual fire location, because one or more fires may be detected anywhere within a 1 km pixel. These products are widely used for global wildfire monitoring, emergency response, and resource management and offer timely updates, but they may include non-fire thermal anomalies. Therefore, this study selected only those data points with a confidence level ≥80 and type field = 0 (vegetation fires), restricted to forested areas, for the period from 2003 to 2023. The seasonal grouping adhered to the climatic context of the Dongjiang Basin, where spring, summer, autumn, and winter correspond to March–May, June–August, September–November, and December–February, respectively.
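The selection rule described above (confidence ≥80, type = 0, meteorological seasons) can be illustrated with a minimal Python sketch. The records below are invented for illustration, and the field names (`acq_date`, `confidence`, `type`, `frp`) are assumed to follow the FIRMS archive export; the restriction to forested areas requires a land cover mask and is not shown.

```python
# Hypothetical records; values are invented, field names are assumed.
records = [
    {"acq_date": "2010-01-15", "confidence": 92, "type": 0, "frp": 35.2},
    {"acq_date": "2010-07-03", "confidence": 85, "type": 0, "frp": 12.0},
    {"acq_date": "2011-04-20", "confidence": 60, "type": 0, "frp": 20.1},  # low confidence
    {"acq_date": "2012-12-02", "confidence": 95, "type": 2, "frp": 50.0},  # not a vegetation fire
    {"acq_date": "2013-10-11", "confidence": 81, "type": 0, "frp": 18.4},
]

# Season definition used in this study: Mar-May, Jun-Aug, Sep-Nov, Dec-Feb.
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}

def season_of(acq_date):
    """Map an ISO date string (YYYY-MM-DD) to its season."""
    return SEASON_BY_MONTH[int(acq_date[5:7])]

def select_fires(rows, min_confidence=80):
    """Keep high-confidence vegetation fires and tag each with its season."""
    return [
        dict(row, season=season_of(row["acq_date"]))
        for row in rows
        if row["confidence"] >= min_confidence and row["type"] == 0
    ]

selected = select_fires(records)
seasons = [row["season"] for row in selected]
```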

(2): Physical environment data

Monthly precipitation data for 2003–2023 were downloaded from the NTPDC (National Tibetan Plateau Data Center, http://data.tpdc.ac.cn, accessed on 17 January 2025) at a resolution of 1 km. Monthly temperature data were sourced from the NESSDC (National Earth System Science Data Center, http://data.tpdc.ac.cn), also at a 1 km resolution. Monthly potential evapotranspiration data were acquired from the same platform, with the same spatial resolution. As shown in Table 1, the temperature data were seasonally averaged, while the precipitation and potential evapotranspiration data were seasonally aggregated. NDVI (Normalized Difference Vegetation Index) data for 2003–2023 were derived from MOD13A2 at a resolution of 1 km and seasonally averaged using the maximum NDVI values obtained via the Google Earth Engine (GEE) platform. Digital elevation model (DEM) data were also obtained from the GEE, based on the Shuttle Radar Topography Mission conducted by the United States National Aeronautics and Space Administration, the National Geospatial-Intelligence Agency, and space agencies in Germany and Italy, which used radar interferometry to generate near-global elevation data at a 30 m resolution. Using these DEM datasets, the slope, aspect, and Terrain Wetness Index (TWI) were calculated on the GEE platform. Land use data for 2023 were retrieved from the CLCD 2023 China land cover dataset released by Wuhan University, with a resolution of 30 m. The distance from each fire point to farmland was calculated using the Euclidean Distance tool.

Table 1. Physical environmental and anthropogenic factors and their abbreviations in this study.

(3): Human activity data

Raster data on the population distribution (2003–2023) were obtained from the LandScan dataset via GEE, at a 1 km resolution, developed by the ORNL (Oak Ridge National Laboratory) and provided by East View Cartographic. Gross domestic product (GDP) data were acquired from the RESDC (Resource and Environment Science and Data Center, http://www.resdc.cn, accessed on 17 January 2025), available for the years 2000, 2005, 2010, 2015, and 2020, and averaged across these time points. Transportation network data for 2023 were extracted from OpenStreetMap, and the ED (Euclidean Distance) tool was used to calculate the proximity of fire points to roadways.

This study screened 20 natural environmental factors and 4 human activity factors (Table 1), built seasonal prediction models based on the retained factors and fire point data, and analyzed the driving mechanisms of wildfire occurrence.

2.3. Research Methods

2.3.1. Kernel Density Analysis

The probability density function of a random variable can be estimated nonparametrically using kernel density estimation [59]. It transforms discrete data points into a smooth, continuous density surface, thereby revealing the spatial or numerical distribution characteristics of the dataset. The density function for kernel density estimation is defined as follows:

\hat{f} (x) = \frac{1}{n h} \sum_{i = 1}^{n} K (\frac{x - X_{i}}{h})

(1)

where $X_{1}, X_{2}, \dots, X_{n}$ are the n observed data points; K(·) is the kernel function, which must be non-negative and integrate to 1, i.e., $\int_{- \infty}^{\infty} K (u) d u = 1$; and h (where h > 0) is the bandwidth, which controls the degree of kernel smoothing. To elucidate the spatial characteristics of wildfire clustering, a kernel density estimation (KDE) approach was adopted. This method enabled the generation of a continuous density surface representing the intensity of fire events, which was subsequently visualized via a heat map for clearer spatial pattern recognition.
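As an illustration of Equation (1), the following one-dimensional sketch evaluates the estimator with a Gaussian kernel. The kernel choice, the sample values, and the bandwidth are assumptions made for the example; the study itself applied a two-dimensional kernel density tool in ArcGIS, whose kernel and bandwidth settings are not reproduced here.

```python
import math

def gaussian_kernel(u):
    """Standard normal density: non-negative and integrates to 1."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, samples, h):
    """Equation (1): f_hat(x) = 1/(n h) * sum_i K((x - X_i) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / h) for xi in samples) / (n * h)

# Hypothetical 1-D observations (e.g., positions along a transect, in km).
samples = [1.0, 1.2, 1.9, 4.0, 4.1]
h = 0.5  # bandwidth; larger values give a smoother density surface

# Numerical check that the estimated density integrates to about 1.
step = 0.01
grid = [-5.0 + i * step for i in range(1501)]
area = sum(kde(x, samples, h) for x in grid) * step
```

The density is higher near the cluster around 1.0–1.2 than in the gap near 3.0, which is the behavior the heat maps rely on.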

2.3.2. Data Pre-Processing

All the raster layers underwent a uniform resampling procedure to achieve a standardized spatial granularity of 1 km. In order to ensure a class balance within the dataset, 4629 artificial non-fire reference points—matching the count of true fire incidents—were stochastically distributed throughout the forested regions, while maintaining a buffer of at least 1 km from any documented fire event. These non-fire points were then divided into four seasonal subsets—1055 (spring), 142 (summer), 739 (autumn), and 2693 (winter)—to correspond to the seasonal distribution of the fire points. Factor values were extracted for both the fire and non-fire points using the “Multi-value Extraction to Points” tool. Fire points were assigned a value of 1, and non-fire points were assigned a value of 0.
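The non-fire sampling step can be sketched as simple rejection sampling. This is an illustrative assumption rather than the authors' procedure: coordinates are treated as planar kilometres, a rectangle stands in for the forest mask, and the fire locations are invented.

```python
import math
import random

def sample_non_fire(fire_points, n, bounds, min_dist_km=1.0, seed=0):
    """Draw n random points that lie at least min_dist_km from every fire point."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    accepted = []
    while len(accepted) < n:
        candidate = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        # Reject candidates that fall inside the buffer of any fire point.
        if all(math.dist(candidate, fire) >= min_dist_km for fire in fire_points):
            accepted.append(candidate)
    return accepted

# Hypothetical fire locations (km) inside a 20 km x 20 km window.
fire_points = [(2.0, 3.0), (5.5, 5.0), (10.0, 12.0), (15.0, 4.0), (18.0, 18.0)]
non_fire_points = sample_non_fire(fire_points, n=len(fire_points), bounds=(0, 0, 20, 20))

# Label fire points 1 and non-fire points 0, as in the study.
labelled = [(p, 1) for p in fire_points] + [(p, 0) for p in non_fire_points]
```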

2.3.3. Factor Screening

(1): Geodetector

Geodetector is a spatial statistics tool that enables the identification of potential drivers behind geographic variation by evaluating the consistency between variable distributions and outcomes [60]. It identifies spatial distribution patterns by quantifying the heterogeneity in spatial data, thereby revealing the influence of explanatory variables on the observed geographic outcomes [61]. The method is grounded in the principle of spatial heterogeneity, which posits that regional differences in a geographic phenomenon may be closely associated with the spatial distribution of specific driving factors, such as natural conditions or socioeconomic variables. If a factor significantly influences a geographic phenomenon, the spatial distributions of the two should exhibit a high degree of consistency. In this study, we use Geodetector to quantify the explanatory power of natural environmental and anthropogenic variables (denoted as X) in terms of the seasonal wildfire danger (Y) by calculating the q-value. The q-value is computed using the following formula:

q = 1 - \frac{\sum_{h = 1}^{L} N_{h} \sigma_{h}^{2}}{N \sigma^{2}}

(2)

where L is the number of strata (classes) of the factor, $N_{h}$ is the number of samples in stratum h, $\sigma_{h}^{2}$ is the variance in stratum h, N is the total number of samples, and $\sigma^{2}$ is the overall variance. The q-value ranges from 0 to 1, and a larger q-value indicates stronger explanatory power of the factor for Y.
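Equation (2) can be checked on a toy example. The sketch below assumes population variances and uses invented labels and strata; it is not the Geodetector software itself.

```python
def variance(values):
    """Population variance (assumed here for both stratum and overall variance)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def q_statistic(y, strata):
    """Equation (2): q = 1 - sum_h(N_h * var_h) / (N * var)."""
    groups = {}
    for value, stratum in zip(y, strata):
        groups.setdefault(stratum, []).append(value)
    within = sum(len(g) * variance(g) for g in groups.values())
    total = len(y) * variance(y)
    return 1.0 - within / total

# Hypothetical fire (1) / non-fire (0) labels.
y = [1, 1, 1, 0, 0, 0]

# A factor whose classes separate the labels perfectly gives q = 1.
q_perfect = q_statistic(y, ["a", "a", "a", "b", "b", "b"])

# A factor whose classes mix the labels explains little of the variance.
q_mixed = q_statistic(y, ["a", "b", "a", "b", "a", "b"])
```

In the second case each stratum holds a 2:1 mix of labels, so the within-stratum variance is almost as large as the overall variance and q falls to 1/9.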

(2): Multicollinearity test

To mitigate any multicollinearity issues [62], the Pearson and Spearman correlation coefficients between the driving factors were calculated for each season, referencing the Geodetector q-values. Factors with smaller q-values were excluded when the absolute value of Pearson’s r or Spearman’s ρ exceeded 0.75 [63]. Subsequently, multicollinearity diagnostics were performed using VIF analysis on the retained predictors, and only those with VIF scores under 10 were considered acceptable [64].
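One way to read the screening rule is as a greedy pass over the factors in descending order of q-value, dropping any factor that correlates too strongly with one already kept. The sketch below shows this for Pearson's r only; the greedy ordering, the factor names, and all values are assumptions for illustration, and the Spearman and VIF checks are omitted.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b)

def screen(factors, q_values, threshold=0.75):
    """Keep factors in descending q order; drop one if |r| > threshold with a kept factor."""
    kept = []
    for name in sorted(factors, key=q_values.get, reverse=True):
        if all(abs(pearson(factors[name], factors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical factor samples and q-values.
factors = {
    "ndvi": [0.2, 0.4, 0.5, 0.7, 0.9],
    "tmp": [10, 12, 13, 15, 17],   # perfectly correlated with ndvi in this toy data
    "pop": [5, 1, 4, 2, 3],
}
q_values = {"ndvi": 0.30, "tmp": 0.10, "pop": 0.20}

kept = screen(factors, q_values)
```

Here `tmp` is removed because it is collinear with `ndvi` and has the smaller q-value, while `pop` is retained.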

2.3.4. Wildfire Danger Modeling

Random forest, introduced by Breiman in 2001, is an ensemble learning method that combines multiple decision trees to improve classification or regression accuracy [65]. Its core principle involves constructing multiple heterogeneous decision trees and aggregating their outputs, thereby reducing the risk of overfitting and enhancing both the robustness and predictive accuracy. Unlike traditional single decision tree models, random forest captures nonlinear relationships and complex feature interactions through a dual-randomization strategy: bootstrap sampling and random feature subset selection. Additionally, its inherent flexibility allows for effective handling of missing values and categorical attributes, making it particularly well-suited for geospatial and ecological applications such as wildfire danger assessment. The mathematical basis of the random forest algorithm is as follows.

Assume that the training dataset is $D = \{(x_{i}, y_{i})\}_{i = 1}^{N}$, where $x_{i} \in \mathbb{R}^{d}$ is the input feature vector and $y_{i}$ is the target label.

(1): Draw B bootstrap samples from the dataset to obtain B training subsets $\{D_{b}\}_{b = 1}^{B}$.
(2): For each training subset $D_{b}$, train a decision tree $h_{b} (\cdot)$. At each node split, randomly select $m \ll d$ candidate features from all d features. Select the best split feature and threshold, and continue splitting until the stopping condition (such as the maximum depth or the minimum number of samples per leaf node) is met.
(3): The final random forest model consists of B trees: $\{h_{1} (\cdot), h_{2} (\cdot), \dots, h_{B} (\cdot)\}$.
(4): The predicted category of each tree is $\hat{y}_{b} = h_{b} (x)$. The final prediction of the random forest is $\hat{y} = \mathrm{majority\ vote} \{h_{b} (x)\}_{b = 1}^{B}$.
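Steps (1)–(4) can be illustrated with a deliberately small stand-in: bagged one-split trees (decision stumps) with a random feature subset per tree and a majority vote. This is a toy sketch on invented data, not the model used in this study; because a stump has a single node, drawing the feature subset once per tree coincides with drawing it per split.

```python
import random
from collections import Counter

def fit_stump(X, y, feature_ids):
    """Pick the single-threshold rule with the fewest training errors."""
    best = None
    for j in feature_ids:
        for t in sorted({row[j] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum((left if row[j] <= t else right) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    _, j, t, left, right = best
    return lambda row: left if row[j] <= t else right

def fit_forest(X, y, n_trees=25, m=1, seed=0):
    """Steps (1)-(3): bootstrap samples, random feature subsets, B trees."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample D_b
        feats = rng.sample(range(d), m)              # m candidate features out of d
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return trees

def predict(trees, row):
    """Step (4): majority vote over the individual tree predictions."""
    votes = Counter(tree(row) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature samples: low values are non-fire (0), high values fire (1).
X = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.7, 0.8), (0.8, 0.9), (0.9, 0.7)]
y = [0, 0, 0, 1, 1, 1]

trees = fit_forest(X, y)
low_prediction = predict(trees, (0.0, 0.0))
high_prediction = predict(trees, (1.0, 1.0))
```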

In this study, the spatial distribution of the wildfire danger was modeled using the random forest algorithm, a supervised ensemble learning approach. The dataset was split with a 70/30 ratio for training and validation, respectively. Model tuning was performed through a grid search strategy, refining key parameters such as the number of decision trees, maximum tree depth, feature selection criteria, and minimum number of samples for node splitting. Raster-based environmental and anthropogenic predictors were fed into the model to generate a high-resolution fire danger map. The modeling was implemented in Python 3.12 using the Spyder development environment.

2.3.5. Evaluation of Model Performance

The predictive competence of the wildfire models was quantified through a multidimensional assessment framework comprising the accuracy, precision, recall, F1-score, and AUC metrics. Classification accuracy measures the ratio of correctly classified instances relative to the total sample population. Precision quantifies the reliability of positive predictions by computing the fraction of correctly identified wildfire events among all the predicted positives. Recall (sensitivity) captures the model’s detection capability as the proportion of actual wildfires successfully captured. The F1-score constitutes a harmonic equilibrium between precision and recall, integrating both metrics into a unified performance indicator. ROC analysis visualizes the trade-off between the true positive rates and the false positive rates across classification thresholds, with the enclosed AUC metric characterizing the overall discriminative power: an AUC value approaching 1 signifies optimal separability, whereas 0.5 indicates discriminative performance equivalent to random assignment. The corresponding formulas are as follows:

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(3)

\mathrm{Precision} = \frac{TP}{TP + FP}

(4)

\mathrm{Recall} = \frac{TP}{TP + FN}

(5)

F1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(6)

FPR = \frac{FP}{FP + TN}

(7)

TPR = \frac{TP}{TP + FN}

(8)

where TP is a correctly predicted wildfire, TN is a correctly predicted non-wildfire, FP indicates that a non-wildfire was incorrectly predicted as a wildfire, and FN indicates that a wildfire was incorrectly predicted as a non-wildfire.
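The formulas above can be verified with a short worked example. The confusion-matrix counts below are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the metrics of Equations (3)-(8) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # identical to the true positive rate (TPR)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": fpr,
        "tpr": recall,
    }

# Hypothetical counts: 100 fire samples and 100 non-fire samples.
metrics = classification_metrics(tp=80, tn=70, fp=30, fn=20)
```

With these counts the accuracy is 150/200 = 0.75, the precision is 80/110 ≈ 0.727, the recall is 0.8, and the F1-score is 160/210 ≈ 0.762.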

2.3.6. Interpretation of SHAP Values

SHAP derives its theoretical foundations from cooperative game theory, providing an axiomatic framework for quantifying individual feature contributions to predictive outcomes. This method decomposes the model predictions into additive components by computing each feature’s weighted average marginal contribution over all subsets of the remaining features [66]. For each prediction result, SHAP considers all the possible feature subsets, calculates the prediction difference with and without a certain feature, and determines the weighted averages of these differences to obtain the Shapley value of the feature. The above steps are repeated to calculate the Shapley values of all the features. The SHAP value is defined as follows:

\Phi_{i} = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|! \, (|F| - |S| - 1)!}{|F|!} \left( f (S \cup \{i\}) - f (S) \right)

(9)

where F denotes the set of all features, S represents any feature subset excluding feature i, and f(S) constitutes the model output based on the subset S.
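Equation (9) can be evaluated exactly when the number of features is small. The sketch below enumerates all subsets for a hypothetical three-feature value function; the feature names and the function `toy_value` are invented for illustration, and practical SHAP implementations for tree models rely on more efficient algorithms than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values by enumerating every subset S of F without feature i."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi

def toy_value(subset):
    """Hypothetical model output f(S): additive effects plus one interaction."""
    out = 0.1                                  # baseline output with no features
    if "ndvi" in subset:
        out += 0.4
    if "pop" in subset:
        out += 0.2
    if "ndvi" in subset and "tmp" in subset:
        out += 0.1                             # interaction shared by ndvi and tmp
    return out

features = ["ndvi", "pop", "tmp"]
phi = shapley_values(features, toy_value)
```

The interaction term is split equally between the two features involved, and the values sum to f(F) − f(∅) = 0.7, which is the additivity property that makes SHAP attributions comparable across features.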

3. Results

3.1. Spatial and Temporal Patterns of Wildfires in the Dongjiang Basin

According to Table 2 and Figure 2, forest fire occurrence within the Dongjiang River Basin showed significant seasonal variation during 2003–2023. The number of fires in winter (2693 events) and spring (1055 events) far exceeded that in summer (142 events) and autumn (739 events). Winter was the season with the highest frequency and summer the season with the lowest. The seasonal distribution of the fire radiative power (FRP), used here as a measure of fire intensity, was consistent with the frequency changes: the minimum FRP (10.6 MW), maximum FRP (1009.7 MW) and average FRP (65.9 MW) in winter were the highest among the seasons, and the corresponding values in summer (7.1 MW, 172.9 MW, 36.1 MW) were the lowest.

Table 2. The number and intensity of the MODIS active fire events in each season. The unit of the FRP is MW.

The observed seasonal wildfire disparity in the Dongjiang River Basin is primarily attributable to the region’s subtropical monsoon regime. During winter and spring, continental high-pressure systems dominate, creating a favorable meteorological milieu characterized by diminished precipitation, reduced atmospheric moisture, and declining vegetation hydration levels. These conditions collectively desiccate combustible materials, elevating the fire danger. Concurrently, these seasons coincide with peak agricultural activity periods, wherein anthropogenic ignition sources—including crop residue burning and forest land clearance operations—significantly amplify the fire exposure danger. In summer, under the influence of subtropical high-pressure and monsoon systems, precipitation is abundant and concentrated, and the high humidity environment effectively suppresses the occurrence and spread of wildfires. Although precipitation gradually decreases in autumn, there is sufficient moisture in the early stage, the moisture content of vegetation is relatively high, and the fire danger is at a medium level.

To investigate the spatial distribution patterns more comprehensively, kernel density analysis in ArcGIS was applied to the wildfire locations for each season (Figure 3). The results indicate a common spatial characteristic across all the seasons: wildfire points were predominantly concentrated in the forested regions on the eastern side of the Dongjiang Basin, suggesting the presence of persistent fire danger factors in this area. However, seasonal differences were evident in the spatial distribution patterns. In spring and winter, the wildfire distribution exhibited a pattern of large aggregation and small dispersion, with most fires concentrated on the eastern side, but with some spread toward the western portion of the basin. In contrast, the wildfires during summer and autumn were almost entirely confined to the eastern forested areas, with minimal occurrence on the western side.

Figure 3. (a–d) The kernel density distributions of the fire points in spring, summer, fall, and winter, respectively.

3.2. Spatial Distribution of Wildfire Danger

3.2.1. Regional Projections of Wildfire Danger

Following screening procedures using the Pearson correlation coefficients (Figure 4), Spearman correlation coefficients (Figure 5), and Geodetector q-values (Figure 6a), a total of 10 factors were selected for spring, 9 for summer, 11 for autumn, and 11 for winter wildfire danger prediction. A VIF-based collinearity test (Figure 6b) confirmed that all the input variables exhibited acceptable independence (VIF < 10). These validated predictors were subsequently introduced into a random forest framework, with model optimization achieved via grid-search-based hyperparameter tuning. The optimal parameter settings for each seasonal wildfire danger model are presented in Table 3.

Figure 4. (a–d) Pearson correlation coefficient matrices for 24 wildfire drivers in spring, summer, fall, and winter, respectively.

Figure 5. (a–d) Spearman correlation coefficient matrices for the 24 wildfire drivers in spring, summer, fall, and winter, respectively.

Figure 6. (a) The ranked q-values of the wildfire drivers by season; black labels denote drivers with p-values < 0.01, while gray labels indicate p-values > 0.01. (b) The VIF values for selected wildfire drivers across all the seasons.

Table 3. Optimal parameter settings for the random forest model in spring, summer, fall, and winter.

Seasonal mapping of the wildfire danger levels revealed marked spatial heterogeneity across the Dongjiang Basin (Figure 7, Table 4). The southern region of the basin, predominantly consisting of built-up land, consistently exhibited Level I (low danger) in all four seasons, with few or no wildfire occurrences. In spring, the highest-danger zones (Levels IV and V) were predominantly concentrated in the northern and northeastern areas. The summer danger pattern remained largely consistent with that of spring, although some central regions transitioned slightly toward Level I, indicating reduced danger. In autumn, the high-danger areas in the north and northeast diminished significantly, and the spatial distribution became more fragmented, with smoother transitions between danger levels. In winter, Levels IV and V expanded again, and the central region exhibited sharper danger boundaries, suggesting more pronounced spatial differentiation. This seasonal variation reflects the integrated influence of climate conditions, vegetation dynamics, and human activity intensity.

Figure 7. (a–d) The seasonal patterns of the wildfire danger distribution across spring, summer, autumn, and winter in sequence.

Table 4. Wildfire danger classification levels.

3.2.2. Random Forest Model Performance Evaluation

As indicated in Table 5, the winter model surpassed the other seasonal configurations in predictive effectiveness, achieving top-tier values across the accuracy, precision, recall, and F1-score—particularly excelling in the latter two metrics. The spring and fall models also delivered strong results, exhibiting comparable F1-scores and precision, though the spring model stood out for its heightened recall, reflecting improved wildfire detection sensitivity. In contrast, while the summer model maintained relatively high recall, it suffered from diminished precision and a lower F1-score, indicating a tendency toward over-prediction and reduced classification reliability.

Table 5. Evaluation criteria used to quantify the predictive performance of the random forest algorithm under varying seasonal conditions. The indicators (accuracy, precision, recall, and the F1-score) allow the model’s predictive capability to be compared across the four seasons.
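The four indicators follow directly from the counts in a binary confusion matrix. The sketch below uses the standard definitions. The counts are hypothetical and are not the values reported in Table 5.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted fires that were real fires
    recall = tp / (tp + fn)      # share of real fires that were detected
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 40 fire points detected, 5 missed,
# 10 false alarms, 45 non-fire points correctly rejected.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A model with high recall but lower precision, as described for summer, corresponds to a small `fn` combined with a comparatively large `fp`.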

The AUC is widely regarded as a robust indicator for measuring the predictive capability of binary classifiers, especially in scenarios where the class distributions are highly skewed. A higher AUC value indicates more consistent classification capability under varying decision thresholds. According to the results displayed in Figure 8, the winter model achieved the strongest classification performance, with an AUC of 0.943, thereby validating its superior ability to distinguish between fire and non-fire cases across varying threshold settings. The fall and spring models also showed strong performance, while the summer model, although slightly weaker, still demonstrated acceptable discriminative power.
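The threshold-independent nature of the AUC can be illustrated with its rank-based interpretation: it is the probability that a randomly chosen fire sample receives a higher predicted score than a randomly chosen non-fire sample, with ties counted as one half. The sketch below is a minimal illustration with hypothetical labels and scores, not the evaluation code used in the study.

```python
def auc(labels, scores):
    """AUC as the fraction of (fire, non-fire) pairs ranked correctly."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical test set: 1 = fire point, 0 = non-fire point.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))
```

Because only the ordering of the scores matters, the value does not change when a different probability cut-off is used to assign danger classes.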

Figure 8. AUC validation results for the spring, summer, fall, and winter models using the test datasets.

3.3. Major Wildfire Drivers

The SHAP value analysis results for the spring, summer, fall, and winter wildfire prediction models are illustrated in Figure 9a–d. These visualizations provide detailed insights into how individual factors contribute to the wildfire occurrence across different seasons. In each subfigure, the horizontal axis represents the magnitude and direction of the SHAP values—where negative values indicate a decrease in the wildfire danger and positive values indicate an increase. The vertical axis lists the corresponding features. The color gradient from dark blue to bright red reflects the feature value range from low to high. Each point represents a sample, and the scatter distribution illustrates the complex relationships between the variable values and the wildfire danger.
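The sign convention described above can be made concrete with an exact Shapley computation on a toy model. This sketch enumerates all feature subsets and replaces "absent" features with a single baseline value. It is an illustrative simplification and is not the tree-based estimator typically used with random forests, nor the study's code. The model, feature values, and baseline are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis

def toy_danger(v):
    """Hypothetical danger score from NDVI, temperature and distance."""
    ndvi, temp, dist = v
    return 0.5 * ndvi + 0.2 * ndvi * temp - 0.3 * dist

sample = [0.8, 1.0, 0.5]     # hypothetical grid cell
reference = [0.4, 0.0, 1.0]  # hypothetical baseline cell
phi = shapley_values(toy_danger, sample, reference)
print([round(p, 3) for p in phi])
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, so a positive value pushes the predicted danger above the reference and a negative value pulls it below.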

Figure 9. The SHAP summary diagrams illustrating the relative importance of the wildfire predictors during spring, summer, autumn, and winter, labeled as panels (a–d).

3.3.1. Analysis of Drivers of Spring Wildfires

In the spring model (Figure 9a), SR_NDVI exhibited the most significant influence. High SR_NDVI values (represented by red scatter points in Figure 9a) indicated lush vegetation cover during spring and were strongly associated with increased wildfire danger in those areas. This may be attributed to the drying of the abundant spring vegetation later in the season, which provides ample fuel for fire ignition and spread. SM_TMP (summer temperature from the previous year) showed a clear bidirectional effect: high temperatures (red scatter) promoted wildfire danger, while low temperatures (blue scatter) suppressed it. This pattern reflects a lagged effect of the previous summer’s heat on the vegetation dryness in the following spring. The influence of the DTF (distance to farmland) was widely distributed across both the positive and negative SHAP values, suggesting spatial heterogeneity in the wildfire danger near agricultural zones, potentially influenced by activities such as crop residue burning or sparks from machinery. Both the SLO (slope) and DTR (distance to roads) also had notable effects, each displaying bimodal distributions. Steeper slopes may accelerate the fire spread, while roads may act both as firebreaks and as ignition sources due to increased human activity.

3.3.2. Analysis of Drivers of Summer Wildfires

In the summer model (Figure 9b), SM_NDVI was the dominant factor, with the high values (red scatter) contributing strongly and positively to the wildfire predictions. This finding contrasts with the spring model, indicating that the vegetation condition during summer has a more direct and immediate effect on the fire danger. SLO and SM_TMP had a moderate influence but displayed more dispersed scatter distributions, suggesting that their effects vary across different geographic settings. Notably, the DTF primarily showed negative SHAP values, indicating that areas near farmland generally had lower wildfire danger during summer. This may be due to the high moisture content and vigorous growth of crops in this season. The GDP, representing socioeconomic activity, had a significant positive impact in the summer model, possibly reflecting an increased likelihood of anthropogenic ignition in economically active regions.

3.3.3. Analysis of Drivers of Autumn Wildfires

In the fall model (Figure 9c), SR_NDVI again emerged as the most influential variable, surpassing the summer vegetation index. The high spring NDVI values (red scatter) had a strong positive association with the wildfire danger in autumn, highlighting a delayed effect of spring vegetation growth on the fall fire potential—dense spring vegetation may result in an accumulation of combustible material later in the year. The DTF and SM_TMP were also significant, with a broad range of SHAP values. Post-harvest activities such as straw burning may account for the importance of proximity to farmland. WN_EVA (winter evapotranspiration) and the GDP also exhibited moderate effects. Specifically, the GDP had a significant negative influence in low-value regions (blue scatter), suggesting that economically less-developed areas may face higher wildfire danger, possibly due to less efficient waste management and limited fire prevention infrastructure.

3.3.4. Analysis of Drivers of Winter Wildfires

In the winter model (Figure 9d), SR_NDVI continued to have the strongest influence, reinforcing the persistent impact of spring vegetation growth on the wildfire danger throughout the year. Distinct from the other seasonal models, the ELE (elevation) appeared as a key predictor, showing that the wildfire danger was higher in low-elevation areas (blue scatter) and reduced at higher elevations (red scatter). This may be due to microclimatic differences: higher elevations generally experience cooler temperatures, greater humidity, and distinct vegetation conditions with higher fuel moisture, all of which reduce the fire likelihood. Additionally, human activity is typically less frequent at higher elevations, further reducing the ignition sources. The DTF remained important, indicating that agricultural proximity continues to affect the fire danger in winter. The DTR and SLO retained a moderate influence, affirming the continued relevance of both topography and human access routes in shaping the wildfire occurrence.

3.3.5. Comparison of Drivers in Four Seasons

A comparative analysis of the seasonal models revealed both shared and distinct wildfire drivers. Across all four seasons, the NDVI consistently emerged as the most critical factor, highlighting the central role of vegetation fuel loads in wildfire occurrence. Notably, the spring NDVI influenced the fire danger not only in spring but also in the subsequent seasons, underscoring its value as a long-term predictive indicator and a basis for early warning strategies.

Distance-related variables, particularly the DTF and DTR, were also consistently important across all the models, reflecting the continuous impact of human activities—such as agricultural production and transportation—on the wildfire danger. Moreover, the directionality of most factor effects remained stable across the seasons, suggesting underlying consistency in the environmental influence mechanisms.

However, seasonal distinctions were evident. The winter model uniquely emphasized the elevation, pointing to the relevance of topographic and microclimatic variation during colder months. The influence of the GDP varied across the seasons in both magnitude and direction, indicating that socioeconomic conditions exert season-specific effects on the wildfire danger. Additionally, the relative importance of the features differed between seasons, providing valuable insights for developing seasonally tailored wildfire prevention strategies.

These SHAP-based findings not only elucidate the seasonal drivers of wildfire occurrence but also offer a scientific foundation for differentiated wildfire danger management. Targeted monitoring and early warning systems should be developed in alignment with the dominant seasonal drivers to enhance fire prevention and mitigation efforts throughout the year.

4. Discussion

4.1. Seasonal Wildfire Patterns and Their Causal Mechanisms

4.1.1. Climate-Driven Mechanisms of Seasonal Distribution Characteristics

Forest fires in the Dongjiang River Basin show significant seasonal differences, with far more fires in winter (2693) and spring (1055) than in summer (142) and autumn (739). This temporal asymmetry arises directly from regime shifts within the subtropical monsoonal system. In winter and spring, the region is controlled by a continental high-pressure system, and cold air from the north moves south, bringing dry weather with little rainfall. These meteorological conditions cause a sharp drop in the vegetation moisture content, creating ideal fuel conditions for wildfires.
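The seasonal asymmetry can be expressed as shares of the total. The short calculation below uses only the four counts quoted above; the variable and function names are illustrative. Winter alone accounts for roughly 58% of the 4629 fire points, and winter and spring together for roughly 81%.

```python
# Fire counts by season as quoted in the text above.
fire_counts = {"spring": 1055, "summer": 142, "autumn": 739, "winter": 2693}

def seasonal_shares(counts):
    """Return each season's fraction of the total fire count."""
    total = sum(counts.values())
    return {season: count / total for season, count in counts.items()}

shares = seasonal_shares(fire_counts)
for season, share in shares.items():
    print(f"{season}: {share:.1%}")

dry_season_share = shares["winter"] + shares["spring"]
print(f"winter + spring: {dry_season_share:.1%}")
```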

In contrast to drier periods, the summer precipitation becomes both frequent and spatially concentrated, primarily due to the synergistic interaction between the subtropical high-pressure belt and the southeast monsoon circulation. The high-humidity environment not only maintains a high moisture content in the vegetation but also increases the water vapor content in the air, effectively inhibiting the generation and spread of flames. In addition, the frequent precipitation events in summer form a “wet pulse”, and even in the intervals between precipitation, the soil and vegetation still maintain high humidity, significantly reducing the probability of wildfires.

4.1.2. Analysis of Geographical Causes of Spatial Heterogeneity

The observed spatial clustering of the wildfire activity in the eastern sector of the Dongjiang River Basin can be attributed to the region’s distinctive topographic and geomorphological characteristics. This part of the basin, characterized by low-elevation hills (200–600 m) and pronounced terrain undulations, fosters physical conditions that favor both the ignition and lateral propagation of fires. On the one hand, the complex terrain increases the intensity and variability of local wind systems, especially the valley wind and slope wind effects, which provide momentum for the spread of fires; on the other hand, areas with steeper slopes (slope > 15°) are more likely to form a “chimney effect” under dry conditions, accelerating the spread of fires to the top of the slope.

4.1.3. Driving Mechanisms of Seasonal Spatial Patterns

The wildfires in spring and winter show a spatial pattern of “large clusters and small dispersions”, while in summer and autumn they are almost completely confined to the east. This difference reflects the complexity of the interaction between seasonal climate conditions, terrain, and vegetation. Notably, despite the generally higher humidity in the western areas, the dry atmospheric conditions prevalent in the early months of the year, especially under prolonged sunshine and intense wind, can still create localized ignition scenarios. In the humid summer and autumn seasons, wildfires can occur only where the most flammable eastern terrain coincides with human ignition sources, so the distribution of fire points is highly concentrated.

4.2. Seasonal Characteristics of Wildfire Drivers and Their Ecological Significance

4.2.1. The Dominant Mechanism of Vegetation Index

The SHAP value analysis revealed that the NDVI was the strongest predictor in all four seasons, but its mechanism of action differed by season. The persistent impact of the spring NDVI (SR_NDVI) on the wildfire danger throughout the year reflects the lagged relationship between the vegetation phenology and the fuel load. Spring is a critical period for vegetation growth, and high NDVI values indicate that vegetation in the region is growing vigorously and biomass is abundant. When the dry season begins, this abundant vegetation gradually loses water and dries, turning into combustible material that provides ample fuel for subsequent wildfires.

It is particularly noteworthy that this “fuel preloading” effect of the spring NDVI is of great ecological significance. It shows that wildfire danger assessment cannot rely solely on the vegetation status of the season but needs to consider the interannual accumulation effect of vegetation. This observation corroborates findings from international studies—particularly those focusing on Mediterranean climates and arid regions of western North America—which highlight the seasonal vegetation index as a key predictor of the fire danger.

4.2.2. Temporal and Spatial Heterogeneity of Human Activity Factors

The influence of the distance to farmland factor (DTF) differs significantly among the seasons, which is closely related to the seasonal cycle of agricultural activities. Spring marks the beginning of agricultural activities, and field management, fertilization, plowing, and other activities are frequent, which increases the possibility of ignition by machinery; at the same time, weed clearing and burning at the edges of farmland are also common agricultural activities. Therefore, the DTF shows a broad bidirectional influence in the spring model.

In summer, the DTF generally exhibits a suppressive effect on the fire occurrence, a trend that closely mirrors the actual field conditions during crop development. At this time, farmland vegetation is typically lush and moisture-rich, forming a natural “green firebreak.” Concurrently, farmers commonly enforce tight control over fire-related agricultural practices to protect crop yields, thereby reducing the human-induced ignition potential. This finding has important guiding significance for fire prevention management in the agricultural and forestry interlaced areas, indicating that differentiated fire prevention strategies around farmland should be adopted in different seasons.

4.2.3. Seasonal Regulation of Topographic Factors

The prominent role of the altitude factor (ELE) in the winter model reflects the seasonal variation of microclimate effects. In winter, the temperature drops more significantly with increasing altitude (an average of 0.6 °C for every 100 m of altitude increase), while the relative humidity increases. This altitude gradient effect is more obvious in the low-temperature season. Low-altitude areas are more likely to reach critical conditions for wildfires due to the relatively high temperatures and low humidity.
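The elevation gradient quoted above amounts to simple arithmetic. The sketch below applies the stated average lapse rate of 0.6 °C per 100 m. The reference temperature and the elevations are hypothetical and serve only to show the size of the effect across the 200–600 m hills mentioned earlier.

```python
LAPSE_RATE_C_PER_100M = 0.6  # average value quoted in the text

def temperature_at(elevation_m, ref_temp_c, ref_elevation_m=0.0,
                   lapse=LAPSE_RATE_C_PER_100M):
    """Estimate air temperature at an elevation from a reference station."""
    return ref_temp_c - lapse * (elevation_m - ref_elevation_m) / 100.0

# Hypothetical winter reading of 12.0 °C at a valley station at 100 m.
for elevation in (200, 400, 600):
    t = temperature_at(elevation, ref_temp_c=12.0, ref_elevation_m=100.0)
    print(f"{elevation} m: {t:.1f} °C")
```

Under these assumed values, a 500 m climb lowers the estimated temperature by 3 °C, which illustrates why low-lying areas reach warm, dry ignition conditions more readily in winter.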

The slope factor (SLO) is important in all the seasons, but its mechanism of action is slightly different. In the dry season (winter and spring), the slope mainly plays a role by affecting the speed of fire spread; in the wet season (summer and autumn), the slope regulates the local fuel moisture content more by affecting drainage conditions and microclimate. Steep slopes (>25°) have good drainage and can quickly return to a relatively dry state even in the rainy season, so they maintain a higher fire danger level.

4.3. Methodological Contributions and Model Performance Evaluation

4.3.1. Advantages of Random Forest Models in Wildfire Prediction

The random forest model in this study showed excellent prediction performance; in particular, the winter model reached an AUC value of 0.943, which is comparable to the level of similar international studies. The high accuracy of the model is mainly attributed to the following characteristics of the random forest algorithm. First, its ensemble learning mechanism can effectively handle the complex nonlinear relationships among multiple variables, which is particularly important for wildfires, a natural phenomenon driven by multi-factor coupling. Second, the algorithm is robust to noise and outliers, and it can maintain stable predictions even when there is a certain detection error in the MODIS fire point data. Finally, the random forest offers built-in feature importance evaluation, which provides a basis for the subsequent mechanism explanation.

The difference in model performance among the seasons (winter > autumn > spring > summer) reflects the complexity of the wildfire mechanism in each season. The predictive performance varies markedly by season, with the non-summer models achieving higher accuracy. This likely stems from the dominant influence of relatively stable climatic patterns and topographic factors on the fire behavior during these periods, which yields clearer relationships between the explanatory and response variables. Summer wildfires occur less frequently and are more affected by random human factors, which increases the difficulty of prediction, so the model performance is relatively low.

4.3.2. Innovative Application of SHAP Value Interpretation Method

The application of SHAP value analysis in wildfire research provides an effective way to understand the “black box” machine learning model. Compared with the traditional feature importance ranking, the SHAP values quantify how each input variable contributes to an individual model output, while also uncovering the complex, nonlinear dependencies between the feature magnitudes and their directional effects on the prediction outcomes. The feature impact patterns shown in the SHAP summary plots in this study (for example, high NDVI values promoting wildfires and low altitudes increasing the winter fire danger) are highly consistent with wildfire ecology theory, supporting the interpretability and credibility of the model.

It is particularly worth emphasizing that the seasonal differences revealed by the SHAP value analysis provide refined guidance for wildfire management. Specifically, the ability of the spring Normalized Difference Vegetation Index (NDVI) to predict the fire danger in the ensuing seasons provides an empirical foundation for developing early-warning mechanisms. In addition, the seasonal variation in the influence of the anthropogenic drivers calls for distinct regulatory approaches, informing the design of seasonally targeted fire management strategies.

4.4. Management Implications

The random forest models demonstrated strong performance in predicting the wildfire danger, with the winter model achieving the highest accuracy (AUC = 0.943). The prediction results indicated a high wildfire danger in the northern and northeastern forested areas and a consistently low danger in the southern urbanized regions, with substantial seasonal variation in the extent of the high-danger zones. Based on these findings, we propose the following management measures: (1) establish a seasonal wildfire early warning system based on NDVI monitoring, with a particular emphasis on the role of spring vegetation growth as a key indicator of the fire danger throughout the year; (2) implement differentiated management strategies for high-danger areas, including the construction of additional firebreaks during spring and winter and the enhancement of patrol and surveillance efforts; (3) strengthen the seasonal management of agricultural activities, particularly through improved straw disposal practices after harvest in spring and autumn; (4) optimize the fire prevention measures along transportation corridors to reduce the likelihood of anthropogenic ignition; and (5) incorporate topographic factors into fire prevention planning and apply tailored prevention and control strategies for areas with varying slopes and elevations.

4.5. Research Limitations and Future Prospects

Despite the promising outcomes, this study faces several methodological and data-related limitations. First, the spatial granularity of the MODIS fire point data (1 km) may hinder the detection of localized wildfire events, leading to a possible underestimation of the fire frequency. Second, the analysis primarily considers the static characteristics of the natural environment and human activities, without accounting for the long-term effects of dynamic processes such as climate change and vegetation succession on the wildfire danger. Third, although the model predictions demonstrate high accuracy, they are not yet fully suitable for direct practical application and require adjustment based on local knowledge and field experience. Future research could be expanded in the following areas: (1) enhance the fire detection accuracy by incorporating higher-resolution datasets, such as the VIIRS 375 m product; (2) introduce climate change scenarios to forecast potential shifts in the wildfire danger; (3) integrate additional socioeconomic variables, including changes in population density and land-use transitions, to construct a more comprehensive wildfire danger assessment framework; (4) extend the study area to include other watersheds, enabling comparative analysis of the long-term wildfire dynamics and region-specific danger factors; and (5) incorporate field investigations to validate the model predictions and improve their practical applicability. Additionally, establishing a holistic wildfire danger assessment system that incorporates hydrological protection concerns would enhance understanding of fire–water interactions, ultimately delivering policy-relevant insights for safeguarding ecological water resources, particularly in the Dongjiang River Basin.

5. Conclusions

Drawing upon the MODIS fire point records and multi-source environmental datasets from 2003 to 2023, this study utilized a random forest model enhanced with SHAP value interpretation to explore the spatiotemporal dynamics and drivers of forest fire occurrence in the Dongjiang River Basin. The principal conclusions are as follows:

(1): The wildfire activity demonstrates marked seasonality, peaking in winter and spring while declining in summer and autumn. Spatially, fire events tend to be concentrated in the basin’s eastern forest zones. Notably, clear seasonal divergences exist in the distribution patterns: the spring and winter seasons show a “large aggregation, small dispersion” spatial pattern, while the summer and autumn seasons are almost completely confined to the east side. This finding shows that the wildfire danger in the Dongjiang River Basin has obvious spatiotemporal differentiation characteristics, which provides a scientific basis for the formulation of differentiated seasonal fire prevention strategies.
(2): The random forest models in all four seasons showed excellent prediction performance, among which the winter model performed best (AUC = 0.943), followed by the spring and autumn models (AUCs of 0.929 and 0.924, respectively). The summer model was relatively low but still had strong discrimination ability (AUC = 0.895). The spatial configuration of the wildfire danger reveals that the heavily forested areas in the northern and northeastern portions of the basin are consistently exposed to elevated fire threat levels, in contrast to the lower-danger southern zones characterized by dense urban development. Furthermore, the observed seasonal variation in the high-danger zones underscores the value of seasonally adaptive modeling approaches, which not only yield more accurate predictions but also enhance the operational relevance of wildfire management strategies.
(3): SHAP value analysis showed that the Normalized Difference Vegetation Index (NDVI) was the most important predictive variable in the four seasonal models, but the impact mechanism of the NDVI on the wildfire danger in different seasons was different. In particular, the spring NDVI not only had a strong predictive power for wildfires in the current season but also had a continuous positive impact on the wildfire dangers in autumn and winter. This finding reveals the long-term lag effect of vegetation growth on the wildfire danger and provides important theoretical support for the establishment of a long-term wildfire warning system based on vegetation monitoring.
(4): The influence of the “distance to farmland” (DTF) variable varies considerably across seasons. In spring, its effect is bidirectional, reflecting the multifaceted nature of agricultural activity. In contrast, a negative association emerges during summer, likely due to the increased crop moisture. Autumn and winter, however, reveal a strong positive link, potentially stemming from open-field biomass burning practices. Additionally, factors such as the proximity to road networks (DTR) and broader socioeconomic metrics (e.g., GDP) display seasonally fluctuating impacts. Collectively, these findings emphasize the seasonal heterogeneity of human-driven wildfire determinants, highlighting the need for temporally adaptive fire source control strategies.

In the Dongjiang River Basin, wildfire occurrences arise from the interplay between anthropogenic influences and inherent environmental variables; nonetheless, the vegetation status consistently emerges as the predominant determinant influencing the fire dynamics. The seasonal modeling framework established in this study not only improves the prediction accuracy but, more importantly, reveals the seasonal variation and cross-seasonal lag effects of the wildfire driving mechanisms. These findings provide scientific support for forest fire prevention and management in the Dongjiang River Basin, help to establish a more accurate and effective system for wildfire danger warning, prevention, and control, and have important practical significance for protecting forest ecosystems and water sources in the region. Future work should strengthen the construction of a long-term early warning system based on dynamic vegetation monitoring and formulate differentiated fire management strategies based on the dominant driving factors in each season.

Author Contributions

Conceptualization, X.H. and Z.W.; methodology, X.H. and B.Y.; software, X.H. and B.Y.; validation, H.W., L.L. and J.Z.; formal analysis, X.H., K.Z. and H.W.; investigation, Z.W.; resources, Z.W.; data curation, X.H.; writing—original draft preparation, X.H. and Z.W.; writing—review and editing, Z.W.; visualization, X.H. and J.Z.; supervision, Z.W. and H.W.; project administration, Z.W. and K.Z.; funding acquisition, Z.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (42267068); Jiangxi Provincial University Humanities and Social Sciences Research Project (JC21113); and “Digital Intelligence and Humanities, Arts Integration and Innovation Interdisciplinary Research Cluster at Gannan Normal University”.

Data Availability Statement

Data will be made available upon request from the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Archibald, S. Managing the human component of fire regimes: Lessons from Africa. Philos. Trans. R. Soc. Biol. Sci. 2016, 371, 1–11. [Google Scholar] [CrossRef]
Camp, P.E.; Krawchuk, M.A. Spatially varying constraints of human-caused fire occurrence in British Columbia, Canada. Int. J. Wildland Fire 2017, 26, 219–229. [Google Scholar] [CrossRef]
Cardil, A.; De-Miguel, S.; Silva, C.A.; Reich, P.B.; Calkin, D.; Brancalion, P.H.; Vibrans, A.C.; Gamarra, J.G.; Zhou, M.; Pijanowski, B.C. Recent deforestation drove the spike in Amazonian fires. Environ. Res. Lett. 2020, 15, 121003. [Google Scholar] [CrossRef]
Zeng, X.; He, W. Government Cooperation in Cross border Water Pollution Control from the Perspective of Network Structure Analysis. Lingnan J. 2023, 1, 58–67. [Google Scholar] [CrossRef]
Karimian, H.; Zou, W.; Chen, Y.; Xia, J.; Wang, Z. Landscape ecological risk assessment and driving factor analysis in Dongjiang river watershed. Chemosphere 2022, 307, 135835. [Google Scholar] [CrossRef] [PubMed]
Bajocco, S.; Koutsias, N.; Ricotta, C. Linking fire ignitions hotspots and fuel phenology: The importance of being seasonal. Ecol. Indic. 2017, 82, 433–440. [Google Scholar] [CrossRef]
Chuvieco, E.; Pettinari, M.L.; Koutsias, N.; Forkel, M.; Hantson, S.; Turco, M. Human and climate drivers of global biomass burning variability. Sci. Total Environ. 2021, 779, 146361. [Google Scholar] [CrossRef] [PubMed]
Forkel, M.; Andela, N.; Harrison, S.P.; Lasslop, G.; Van Marle, M.; Chuvieco, E.; Dorigo, W.; Forrest, M.; Hantson, S.; Heil, A. Emergent relationships with respect to burned area in global satellite observations and fire-enabled vegetation models. Biogeosciences 2019, 16, 57–76. [Google Scholar] [CrossRef]
Westerling, A.L.; Bryant, B.P. Climate change and wildfire in California. Clim. Change 2008, 87, 231–249. [Google Scholar] [CrossRef]
Yuan, X.; Liu, C.; Nie, R.; Yang, Z.; Li, W.; Dai, X.; Cheng, J.; Zhang, J.; Ma, L.; Fu, X. A comparative analysis of certainty factor-based machine learning methods for collapse and landslide susceptibility mapping in Wenchuan County, China. Remote Sens. 2022, 14, 3259. [Google Scholar] [CrossRef]
Mandallaz, D.; Ye, R. Prediction of forest fires with Poisson models. Can. J. For. Res. 1997, 27, 1685–1694. [Google Scholar] [CrossRef]
Cao, X.; Cui, X.; Yue, M.; Chen, J.; Tanikawa, H.; Ye, Y. Evaluation of wildfire propagation susceptibility in grasslands using burned areas and multivariate logistic regression. Int. J. Remote Sens. 2013, 34, 6679–6700. [Google Scholar] [CrossRef]
Dutta, R.; Das, A.; Aryal, J. Big data integration shows Australian bush-fire frequency is increasing significantly. R. Soc. Open Sci. 2016, 3, 150241. [Google Scholar] [CrossRef]
Ghorbanzadeh, O.; Valizadeh Kamran, K.; Blaschke, T.; Aryal, J.; Naboureh, A.; Einali, J.; Bian, J. Spatial prediction of wildfire susceptibility using field survey GPS data and machine learning approaches. Fire 2019, 2, 43. [Google Scholar] [CrossRef]
Gholamnia, K.; Gudiyangada Nachappa, T.; Ghorbanzadeh, O.; Blaschke, T. Comparisons of diverse machine learning approaches for wildfire susceptibility mapping. Symmetry 2020, 12, 604. [Google Scholar] [CrossRef]
Arpaci, A.; Malowerschnig, B.; Sass, O.; Vacik, H. Using multi variate data mining techniques for estimating fire susceptibility of Tyrolean forests. Appl. Geogr. 2014, 53, 258–270. [Google Scholar] [CrossRef]
Bustillo Sánchez, M.; Tonini, M.; Mapelli, A.; Fiorucci, P. Spatial assessment of wildfires susceptibility in Santa Cruz (Bolivia) using random forest. Geosciences 2021, 11, 224. [Google Scholar] [CrossRef]
Shmuel, A.; Heifetz, E. Global wildfire susceptibility mapping based on machine learning models. Forests 2022, 13, 1050. [Google Scholar] [CrossRef]
He, Q.; Jiang, Z.; Wang, M.; Liu, K. Landslide and wildfire susceptibility assessment in Southeast Asia using ensemble machine learning methods. Remote Sens. 2021, 13, 1572. [Google Scholar] [CrossRef]
Yue, W.; Ren, C.; Liang, Y.; Lin, X.; Yin, A.; Liang, J. Wildfire Risk Assessment Considering Seasonal Differences: A Case Study of Nanning, China. Forests 2023, 14, 1616. [Google Scholar] [CrossRef]
Jain, P.; Coogan, S.C.; Subramanian, S.G.; Crowley, M.; Taylor, S.; Flannigan, M.D. A review of machine learning applications in wildfire science and management. Environ. Rev. 2020, 28, 478–505. [Google Scholar] [CrossRef]
Qiu, J.; Wang, H.; Shen, W.; Zhang, Y.; Su, H.; Li, M. Quantifying forest fire and post-fire vegetation recovery in the Daxin’anling area of northeastern China using landsat time-series data and machine learning. Remote Sens. 2021, 13, 792. [Google Scholar] [CrossRef]
Iban, M.C.; Sekertekin, A. Machine learning based wildfire susceptibility mapping using remotely sensed fire data and GIS: A case study of Adana and Mersin provinces, Turkey. Ecol. Inform. 2022, 69, 101647. [Google Scholar] [CrossRef]
Lan, Y.; Wang, J.; Hu, W.; Kurbanov, E.; Cole, J.; Sha, J.; Jiao, Y.; Zhou, J. Spatial pattern prediction of forest wildfire susceptibility in Central Yunnan Province, China based on multivariate data. Nat. Hazards 2023, 116, 565–586. [Google Scholar] [CrossRef]
Jaafari, A.; Zenner, E.K.; Panahi, M.; Shahabi, H. Hybrid artificial intelligence models based on a neuro-fuzzy system and metaheuristic optimization algorithms for spatial prediction of wildfire probability. Agric. For. Meteorol. 2019, 266, 198–207. [Google Scholar] [CrossRef]
Abdollahi, A.; Pradhan, B. Explainable artificial intelligence (XAI) for interpreting the contributing factors feed into the wildfire susceptibility prediction model. Sci. Total Environ. 2023, 879, 163004. [Google Scholar] [CrossRef] [PubMed]
Al-Bashiti, M.K.; Naser, M.Z. Machine learning for wildfire classification: Exploring blackbox, eXplainable, symbolic, and SMOTE methods. Nat. Hazards Res. 2022, 2, 154–165. [Google Scholar] [CrossRef]
Giglio, L.; Descloitres, J.; Justice, C.O.; Kaufman, Y.J. An enhanced contextual fire detection algorithm for MODIS. Remote Sens. Environ. 2003, 87, 273–282. [Google Scholar] [CrossRef]
Brunsdon, C.; Corcoran, J.; Higgs, G. Visualising space and time in crime patterns: A comparison of methods. Comput. Environ. Urban Syst. 2007, 31, 52–75. [Google Scholar] [CrossRef]
Wang, J.; Xu, C. Geodetector: Principle and prospective. J. Geogr. Sci. 2017, 72, 116–134. [Google Scholar]
Wang, J.F.; Li, X.H.; Christakos, G.; Liao, Y.L.; Zhang, T.; Gu, X.; Zheng, X.Y. Geographical detectors-based health risk assessment and its application in the neural tube defects study of the Heshun Region, China. Int. J. Geogr. Inf. Sci. 2010, 24, 107–127. [Google Scholar] [CrossRef]
Dormann, C.F.; Elith, J.; Bacher, S.; Buchmann, C.; Carl, G.; Carré, G.; Marquéz, J.R.G.; Gruber, B.; Lafourcade, B.; Leitão, P.J. Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography 2013, 36, 27–46. [Google Scholar] [CrossRef]
Zhang, K.; Yao, L.; Meng, J.; Tao, J. Maxent modeling for predicting the potential geographical distribution of two peony species under climate change. Sci. Total Environ. 2018, 634, 1326–1334. [Google Scholar] [CrossRef]
Hong, H.; Tsangaratos, P.; Ilia, I.; Liu, J.; Zhu, A.; Xu, C. Applying genetic algorithms to set the optimal combination of forest fire related variables and model forest fire susceptibility based on data mining models. The case of Dayu County, China. Sci. Total Environ. 2018, 630, 1044–1056. [Google Scholar] [CrossRef]
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Kavzoglu, T.; Teke, A.; Yilmaz, E.O. Shared blocks-based ensemble deep learning for shallow landslide susceptibility mapping. Remote Sens. 2021, 13, 4776. [Google Scholar] [CrossRef]

Figure 1. Location map of the Dongjiang River Basin (a), and spatial distribution of fire points in different seasons (b).

Figure 2. (a–d) Geographic distribution of wildfire occurrences in spring, summer, autumn, and winter, respectively.

Figure 3. (a–d) Kernel density distributions of the fire points in spring, summer, autumn, and winter, respectively.

Figure 4. (a–d) Pearson correlation coefficient matrices for the 24 wildfire drivers in spring, summer, autumn, and winter, respectively.

Figure 5. (a–d) Spearman correlation coefficient matrices for the 24 wildfire drivers in spring, summer, autumn, and winter, respectively.

Figure 6. (a) The ranked q-values of the wildfire drivers by season; black labels denote drivers with p-values < 0.01, while gray labels indicate p-values > 0.01. (b) The VIF values for selected wildfire drivers across all the seasons.

Figure 7. (a–d) The seasonal patterns of the wildfire danger distribution across spring, summer, autumn, and winter in sequence.

Figure 8. AUC validation results for the spring, summer, autumn, and winter models using the test datasets.

Figure 9. (a–d) SHAP summary plots showing the relative importance of the wildfire predictors in spring, summer, autumn, and winter, respectively.

Table 1. Physical environmental and anthropogenic factors and their abbreviations in this study.

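Type	Season	Indicator	Abbreviation	Unit
Natural environment	Spring	Temperature	SR_TEM	0.1 °C
Natural environment	Summer	Temperature	SM_TEM	0.1 °C
Natural environment	Autumn	Temperature	AU_TEM	0.1 °C
Natural environment	Winter	Temperature	WN_TEM	0.1 °C
Natural environment	Spring	Accumulated precipitation	SR_PRE	mm
Natural environment	Summer	Accumulated precipitation	SM_PRE	mm
Natural environment	Autumn	Accumulated precipitation	AU_PRE	mm
Natural environment	Winter	Accumulated precipitation	WN_PRE	mm
Natural environment	Spring	Potential evapotranspiration	SR_EVA	0.1 mm
Natural environment	Summer	Potential evapotranspiration	SM_EVA	0.1 mm
Natural environment	Autumn	Potential evapotranspiration	AU_EVA	0.1 mm
Natural environment	Winter	Potential evapotranspiration	WN_EVA	0.1 mm
Natural environment	Spring	Normalized Difference Vegetation Index	SR_NDVI	-
Natural environment	Summer	Normalized Difference Vegetation Index	SM_NDVI	-
Natural environment	Autumn	Normalized Difference Vegetation Index	AU_NDVI	-
Natural environment	Winter	Normalized Difference Vegetation Index	WN_NDVI	-
Natural environment	-	Elevation	ELE	m
Natural environment	-	Slope	SLO	°
Natural environment	-	Aspect index	ASP	-
Natural environment	-	Topographic wetness index	TWI	-
Human activities	-	Population density	POP	people/km²
Human activities	-	Gross domestic product	GDP	million CNY/km²
Human activities	-	Distance to farmland	DTF	m
Human activities	-	Distance to road	DTR	m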

Table 2. The number and intensity of the MODIS active fire events in each season. The unit of the FRP is MW.

Season	Number of Fires	Minimum FRP	Maximum FRP	Average FRP
Spring	1055	8.0	660.2	55.7
Summer	142	7.1	172.9	36.1
Autumn	739	8.3	753.8	58.3
Winter	2693	10.6	1009.7	65.9
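
As a quick arithmetic check on Table 2, the sketch below derives each season's share of all detections and a count-weighted mean FRP. The counts and mean FRP values are copied from the table; the variable names and the weighted mean itself are illustrative and are not quantities reported by the authors.

```python
# Seasonal MODIS active-fire counts and average FRP (MW), copied from Table 2.
TABLE_2 = {
    "Spring": (1055, 55.7),
    "Summer": (142, 36.1),
    "Autumn": (739, 58.3),
    "Winter": (2693, 65.9),
}

# Total number of fire detections across the four seasons.
total_fires = sum(count for count, _ in TABLE_2.values())

# Percentage of all detections that fall in each season.
seasonal_share = {
    season: 100.0 * count / total_fires
    for season, (count, _) in TABLE_2.items()
}

# Count-weighted mean FRP over all seasons (a derived value, not from the paper).
overall_mean_frp = (
    sum(count * frp for count, frp in TABLE_2.values()) / total_fires
)

if __name__ == "__main__":
    for season, share in seasonal_share.items():
        print(f"{season}: {share:.1f}% of {total_fires} detections")
    print(f"Count-weighted mean FRP: {overall_mean_frp:.1f} MW")
```

With these figures, winter and spring together account for roughly 81% of the 4629 detections, which matches the seasonal pattern described in the abstract.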

Table 3. Optimal parameter settings for the random forest model in spring, summer, fall, and winter.

Season	Max_Depth	Max_Features	Min_Samples_Leaf	Min_Samples_Split	n_Estimators
SR	10	sqrt	1	2	100
SM	10	sqrt	1	5	200
AU	20	sqrt	1	5	200
WN	20	sqrt	1	10	100
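
The column names in Table 3 match the argument names of scikit-learn's `RandomForestClassifier`, so the settings can be written down as a per-season configuration. The sketch below assumes scikit-learn was the implementation used, which this section does not state; `RF_PARAMS`, `build_model`, and the `random_state` value are hypothetical names and choices made for illustration.

```python
# Per-season random forest settings copied from Table 3.
# Keys follow scikit-learn's RandomForestClassifier argument names
# (an assumption; the section does not name the library used).
RF_PARAMS = {
    "SR": {"max_depth": 10, "max_features": "sqrt", "min_samples_leaf": 1,
           "min_samples_split": 2, "n_estimators": 100},
    "SM": {"max_depth": 10, "max_features": "sqrt", "min_samples_leaf": 1,
           "min_samples_split": 5, "n_estimators": 200},
    "AU": {"max_depth": 20, "max_features": "sqrt", "min_samples_leaf": 1,
           "min_samples_split": 5, "n_estimators": 200},
    "WN": {"max_depth": 20, "max_features": "sqrt", "min_samples_leaf": 1,
           "min_samples_split": 10, "n_estimators": 100},
}


def build_model(season, random_state=0):
    """Return an unfitted classifier configured for one season.

    The import is local so the configuration above can be inspected
    even where scikit-learn is not installed. random_state is an
    illustrative choice, not a value taken from the paper.
    """
    from sklearn.ensemble import RandomForestClassifier

    return RandomForestClassifier(**RF_PARAMS[season], random_state=random_state)
```

Calling `build_model("WN")` would return an unfitted model for winter; fitting it requires the seasonal training samples, which are not part of this section.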

Table 4. Wildfire danger classification levels.

Level	Corresponding Interval	Description
Level I	Below 0.2	Almost no fire
Level II	0.2–0.4	Unlikely to have a fire
Level III	0.4–0.6	Fire possible
Level IV	0.6–0.8	Higher likelihood
Level V	Above 0.8	Highly susceptible
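
The classification in Table 4 can be sketched as a lookup from a predicted fire probability to a danger level. The table does not say which level a boundary value such as 0.2 belongs to, so the code below assumes each boundary falls into the higher level; the function and constant names are hypothetical.

```python
from bisect import bisect_right

# Upper bounds separating the five danger levels in Table 4.
THRESHOLDS = [0.2, 0.4, 0.6, 0.8]
LEVELS = ["I", "II", "III", "IV", "V"]
DESCRIPTIONS = {
    "I": "Almost no fire",
    "II": "Unlikely to have a fire",
    "III": "Fire possible",
    "IV": "Higher likelihood",
    "V": "Highly susceptible",
}


def danger_level(probability):
    """Map a predicted fire probability in [0, 1] to a Table 4 level.

    Assumption: a value exactly on a boundary (e.g. 0.2) is assigned
    to the higher level, since the table leaves this unspecified.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return LEVELS[bisect_right(THRESHOLDS, probability)]


if __name__ == "__main__":
    for p in (0.05, 0.35, 0.5, 0.75, 0.93):
        level = danger_level(p)
        print(f"{p:.2f} -> level {level}: {DESCRIPTIONS[level]}")
```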

Table 5. Performance of the random forest model in each season, evaluated by accuracy, precision, recall, and F1-score.

Season	Accuracy	Precision	Recall	F1-Score
SR	0.857	0.835	0.882	0.858
SM	0.824	0.766	0.900	0.828
AU	0.852	0.867	0.831	0.849
WN	0.878	0.856	0.905	0.880
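
The F1-score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short check below recomputes it from the precision and recall columns of Table 5 and compares the result with the reported F1-score; it is a consistency check on the published figures, not a reproduction of the authors' evaluation code, and the names are illustrative.

```python
# (precision, recall, reported F1-score) per season, copied from Table 5.
TABLE_5 = {
    "SR": (0.835, 0.882, 0.858),
    "SM": (0.766, 0.900, 0.828),
    "AU": (0.867, 0.831, 0.849),
    "WN": (0.856, 0.905, 0.880),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Recompute F1 for each season from the tabulated precision and recall.
recomputed = {
    season: f1_score(precision, recall)
    for season, (precision, recall, _) in TABLE_5.items()
}

if __name__ == "__main__":
    for season, (_, _, reported) in TABLE_5.items():
        print(f"{season}: recomputed {recomputed[season]:.3f}, reported {reported:.3f}")
```

For all four seasons the recomputed value agrees with the reported F1-score to three decimal places.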

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

